Issue 1384: Mapping for wide strings (cxx_revision)

Source: (, )
Nature: Uncategorized Issue
Severity:
Summary: The spec doesn't say what the type CORBA::WChar should map to. Presumably, it should be wchar_t? If so, I think this should be stated. It is important to nail this down because of overloading ambiguities. For example, if CORBA::WChar * is allowed to be the same as char *, then we cannot use the overloaded <<= operator to insert unbounded wide strings into an Any.

Resolution: Fix the mapping to specify what WChar maps to.

Revised Text: On page 20-24 of ptc/98-09-03, change the paragraph:

    Except for boolean, char, and octet, the mappings for basic types must be distinguishable from each other for the purposes of overloading. That is, one can safely write overloaded C++ functions on Short, UShort, Long, ULong, Float, and Double.

To read:

    Types Boolean, Char, and Octet may all map to the same underlying C++ type. This means that these types may not be distinguishable for the purposes of overloading. Type WChar maps to wchar_t in standard C++ environments or, for non-standard C++ environments, may also map to one of the integer types. This means that WChar may not be distinguishable from integer types for the purposes of overloading. All other mappings for basic types are distinguishable for the purposes of overloading; that is, one can safely write overloaded C++ functions on Short, UShort, Long, ULong, LongLong, Float, Double, and LongDouble.

On page 20-59, add the following text to the paragraph introducing the overloaded string insertion operators:

    Note that insertion of wide strings in this manner depends on standard C++, in which wchar_t is a distinct type. Code that must be portable across standard and older C++ compilers must use the from_wstring helper type.

Add a footnote to the second bullet on page 20-62:

    Note that extraction of wide strings in this manner depends on standard C++, in which wchar_t is a distinct type.
    Code that must be portable across standard and older C++ compilers must use the to_wstring helper type.

Add the missing insertion and extraction operators for wide strings to the Any class on page 20-159:

    void operator<<=(const WChar *);
    Boolean operator>>=(WChar * &) const;

Actions taken:
May 19, 1998: received issue
March 19, 1999: closed issue

Discussion:

End of Annotations:=====

Date: Tue, 19 May 1998 10:38:42 +1000 (EST)
From: Michi Henning
To: cxx_revision@omg.org, issues@omg.org
Subject: Mapping for wide strings?

Hi,

the spec doesn't say what the type CORBA::WChar should map to. Presumably, it should be wchar_t? If so, I think this should be stated.

It is important to nail this down because of overloading ambiguities. For example, if CORBA::WChar * is allowed to be the same as char *, then we cannot use the overloaded <<= operator to insert unbounded wide strings into an Any.

The spec says that

    "Except for boolean, char, and octet, the mappings for basic types must be distinguishable from each other for the purposes of overloading. That is, one can safely write overloaded C++ functions on Short, UShort, Long, ULong, Float, and Double."

The first sentence seems to indicate that it would be illegal to map WChar to char, or any one of Short, UShort, Long, ULong, Float, and Double. If so, this should be stated.

Does it make sense to allow WChar to map to anything but wchar_t? I don't think so, so I suspect we should *require* WChar to map to wchar_t?

                                Cheers,

                                        Michi.
-- 
Michi Henning                   +61 7 33654310
DSTC Pty Ltd                    +61 7 33654311 (fax)
University of Qld 4072          michi@dstc.edu.au
AUSTRALIA                       http://www.dstc.edu.au/BDU/staff/michi-henning.html

Date: Tue, 19 May 1998 13:15:21 +1000 (EST)
From: Michi Henning
To: cxx_revision@omg.org, issues@omg.org
Subject: Re: Mapping for wide strings?

On Tue, 19 May 1998, Michi Henning wrote:

> The spec says that
>
>     "Except for boolean, char, and octet, the mappings for basic types must be distinguishable from each other for the purposes of overloading. That is, one can safely write overloaded C++ functions on Short, UShort, Long, ULong, Float, and Double."
>
> The first sentence seems to indicate that it would be illegal to map WChar to char, or any one of Short, UShort, Long, ULong, Float, and Double. If so, this should be stated.
>
> Does it make sense to allow WChar to map to anything but wchar_t? I don't think so, so I suspect we should *require* WChar to map to wchar_t?

Having looked at this and the ANSI C++ spec a bit more, it's clear now that WChar may map to, for example, CORBA::Long in pre-ANSI environments. This also explains the presence of the from_wchar and from_wstring helpers for type Any.

So, it looks like the para about distinguishing types needs updating. I would suggest:

    Types Boolean, Char, and Octet may all map to the same underlying C++ type. This means these types may not be distinguishable for the purposes of overloading.

    Type WChar may map to one of the integer types, or it may map to wchar_t, so WChar may not be distinguishable from an integer type for the purposes of overloading.

    All other mappings for basic types must be distinguishable for the purposes of overloading. That is, one can safely write overloaded C++ functions on Short, UShort, Long, ULong, LongLong, ULongLong, Float, Double, and LongDouble.

I think that captures the intent. Note that I've also added (U)LongLong and LongDouble to the list -- this seems to have been the intent of the spec, because footnote 4 says that these types either map to native types or to classes that provide the required semantics. Together with the size requirements made by IDL, this implies that Short, Long, and LongLong must be three distinct C++ types (at least I think it does...)

                                Cheers,

                                        Michi.

-- 
Michi Henning                   +61 7 33654310
DSTC Pty Ltd                    +61 7 33654311 (fax)
University of Qld 4072          michi@dstc.edu.au
AUSTRALIA                       http://www.dstc.edu.au/BDU/staff/michi-henning.html

Date: Tue, 19 May 1998 16:16:38 +1000 (EST)
From: Michi Henning
To: cxx_revision@omg.org, issues@omg.org
Subject: Re: Mapping for wide strings?

On Tue, 19 May 1998, Michi Henning wrote:

> Having looked at this and the ANSI C++ spec a bit more, it's clear now that WChar may map to, for example, CORBA::Long in pre-ANSI environments.

Hmmm... I'm not so sure now...

On page 20-56:

    Copying insertion of a string type or wide string type causes one of the following functions to be invoked:

        void operator<<=(Any&, const char *);
        void operator<<=(Any&, const WChar *);

On page 20-58:

        Boolean operator>>=(const Any&, T&);

    [ ... ]

    The first form of this function is used only for the following types:

    [ ... ]

    - Unbounded strings and wide strings (char* and WChar* passed by reference, i.e., char*& and WChar*&)

This appears to be impossible for pre-ANSI environments, where wchar_t is not a distinct type? In particular, if the stubs also use an array of integers that has the same element type as the integer wchar_t maps to, then we would get redefinition errors?

Also, the Any class on page 20-115 does not show <<= or >>= for wide strings, so I'm not sure now what the intent really is...

I think we need to double-check everywhere that wide string support really works for both ANSI and non-ANSI environments. Right now, it appears there is at least some inconsistency in the spec as to exactly how it should work.

                                Cheers,

                                        Michi.
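
The overloading concern raised in this message can be illustrated with a small self-contained sketch. It does not use a real ORB: the overload_sketch namespace, the stand-in Any class, its kind member, and the two insert_* functions are hypothetical names introduced only for illustration. The sketch assumes a standard C++ compiler (C++11 or later, for std::is_same and static_assert), where wchar_t is a distinct built-in type and the two insertion operators are therefore separate overloads. In a pre-standard environment where WChar is merely a typedef for an integer type, the const WChar * overload would have the same signature as an overload taking a pointer to that integer type, and the compiler would reject it as a redefinition.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

namespace overload_sketch {

// Assumption: standard C++ environment, so WChar maps to wchar_t.
typedef char    Char;
typedef wchar_t WChar;

// In standard C++, wchar_t is distinct from char and from the integer types,
// even though it shares its representation with one of them.
static_assert(!std::is_same<WChar, Char>::value, "WChar must differ from Char");
static_assert(!std::is_same<WChar, unsigned short>::value, "WChar is not unsigned short");
static_assert(!std::is_same<WChar, int>::value, "WChar is not int");
static_assert(!std::is_same<WChar, long>::value, "WChar is not long");

// Hypothetical stand-in for CORBA::Any; it only records which overload ran.
struct Any {
    std::string kind;
    void operator<<=(const Char *)  { kind = "string"; }
    void operator<<=(const WChar *) { kind = "wstring"; }
};

// Insert a narrow string literal and report which overload was selected.
inline std::string insert_narrow() {
    Any a;
    a <<= "hello";
    return a.kind;
}

// Insert a wide string literal and report which overload was selected.
inline std::string insert_wide() {
    Any a;
    a <<= L"hello";
    return a.kind;
}

// Small self-check that ties the two cases together.
inline void check_overloads() {
    assert(insert_narrow() == "string");
    assert(insert_wide() == "wstring");
}

}  // namespace overload_sketch
```

If the WChar typedef is changed to an integer type and an overload for a pointer to that same integer type is added to the stand-in class, the class no longer compiles, which is the redefinition problem described above.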

-- 
Michi Henning                   +61 7 33654310
DSTC Pty Ltd                    +61 7 33654311 (fax)
University of Qld 4072          michi@dstc.edu.au
AUSTRALIA                       http://www.dstc.edu.au/BDU/staff/michi-henning.html

Date: Sun, 21 Feb 1999 09:42:35 +1000 (EST)
From: Michi Henning
To: C++ Revision Task Force
Subject: Proposal for 1384
Organization: Triodia Technologies

On page 20-24, change the para:

    Except for boolean, char, and octet, the mappings for basic types must be distinguishable from each other for the purposes of overloading. That is, one can safely write overloaded C++ functions on Short, UShort, Long, ULong, Float, and Double.

To read:

    Types Boolean, Char, and Octet may all map to the same underlying C++ type. This means that these types may not be distinguishable for the purposes of overloading. Type WChar maps to wchar_t in standard C++ environments or, for non-standard C++ environments, may also map to one of the integer types. This means that WChar may not be distinguishable from integer types for the purposes of overloading. All other mappings for basic types are distinguishable for the purposes of overloading; that is, one can safely write overloaded C++ functions on Short, UShort, Long, ULong, LongLong, Float, Double, and LongDouble.

On page 20-59, add a footnote to the para introducing the overloaded string insertion operators:

    Note that insertion of wide strings in this manner depends on standard C++, in which wchar_t is a distinct type. Code that must be portable across standard and older C++ compilers must use the from_wstring helper type.

Add a similar footnote to the second bullet on page 20-62:

    Note that extraction of wide strings in this manner depends on standard C++, in which wchar_t is a distinct type. Code that must be portable across standard and older C++ compilers must use the to_wstring helper type.

Add the missing insertion and extraction operators for wide strings to the Any class on page 20-159:

    void operator<<=(const WChar *);
    Boolean operator>>=(WChar * &) const;

                                Cheers,

                                        Michi.
-- 
Michi Henning                   +61 7 3236 1633
Triodia Technologies            +61 4 1118 2700 (mobile)
PO Box 372                      +61 7 3211 0047 (fax)
Annerley 4103                   michi@triodia.com
AUSTRALIA                       http://www.triodia.com/staff/michi-henning.html
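
The portability point in the proposed footnotes can be sketched as follows. This is a minimal illustration, not the specification's Any class: the helper_sketch namespace, the simplified from_wstring and to_wstring wrappers, and the storage inside the stand-in Any are all assumptions made for this example, and the helper types in the actual mapping may take further arguments (such as a bound) that are left out here. The sketch deliberately simulates an older environment by making WChar a typedef for Long, so that const WChar * and const Long * are the same type. Because the wrappers are distinct class types, wide-string insertion and extraction still get their own overloads.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace helper_sketch {

// Assumption: an older environment in which WChar is only a typedef for an
// integer type and is therefore indistinguishable from Long.
typedef long Long;
typedef Long WChar;

// Hypothetical stand-in for CORBA::Any that holds either a Long or a
// zero-terminated wide string.
class Any {
public:
    // Wrapper types: overloading on these works even though
    // `const WChar *` and `const Long *` are the same type here.
    struct from_wstring {
        explicit from_wstring(const WChar *s) : val(s) {}
        const WChar *val;
    };
    struct to_wstring {
        explicit to_wstring(const WChar *&s) : val(s) {}
        const WChar *&val;
    };

    Any() : holds_wstring_(false), long_(0) {}

    // Insert a plain Long.
    void operator<<=(Long v) {
        long_ = v;
        holds_wstring_ = false;
    }

    // Copying insertion of a wide string via the wrapper type.
    void operator<<=(from_wstring w) {
        text_.clear();
        for (const WChar *p = w.val; *p != 0; ++p) {
            text_.push_back(*p);
        }
        text_.push_back(0);  // keep the stored copy zero-terminated
        holds_wstring_ = true;
    }

    // Extraction via the wrapper type; the Any keeps ownership of the data.
    bool operator>>=(to_wstring w) const {
        if (!holds_wstring_) {
            return false;
        }
        w.val = &text_[0];
        return true;
    }

    Long as_long() const { return long_; }

private:
    bool holds_wstring_;
    Long long_;
    std::vector<WChar> text_;
};

}  // namespace helper_sketch
```

A caller would write a <<= Any::from_wstring(text) and a >>= Any::to_wstring(out) rather than passing the pointer directly. The same calling code then compiles whether or not wchar_t is a distinct type, which is the reason the footnotes point portable code at the helper types.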