Issue 3075: Length of wstring in GIOP 1.1 (interop)

Source: Fujitsu (Mr. Masayoshi Shimamura, shima.masa(at)jp.fujitsu.com)
Nature: Uncategorized Issue
Severity:
Summary:
I have a question about GIOP wstring encoding. Section "15.3.2.7 Strings and Wide Strings" in CORBA 2.3.1 says:

    For GIOP version 1.1, a wide string is encoded as an unsigned long
    indicating the length of the string in octets or unsigned integers
    (determined by the transfer syntax for wchar) followed by the
    individual wide characters. Both the string length and contents
    include a terminating null. The terminating null character for a
    wstring is also a wide character.

In the sentence above, I believe that the "length" represents the number of octets (in the case of a byte-oriented codeset) or the number of unsigned integers (in the case of a non-byte-oriented codeset).

For example,

    "abc" (ASCII code) ----> length is 4 (including one null terminator)
    L"abc" (Unicode, UCS-2) ----> length is 4 (including one wchar null terminator)

Is my understanding right?

Resolution: see above
Revised Text:
Actions taken:
December 3, 1999: received issue
May 13, 2002: closed issue

Discussion:
Rationale for Rejection
This is correct for GIOP 1.1. GIOP 1.2 changed the meaning of length to always indicate the number of octets, in order to reduce problems with bridging in environments where the codeset is unknown in the intermediate bridges. The specification is unambiguous with respect to string length, so no change is required.

End of Annotations:=====

Date: Fri, 03 Dec 1999 20:41:25 +0900
From: Masayoshi Shimamura
To: interop@omg.org
Subject: Length of wstring in GIOP 1.1
Message-Id: <3847AC652D0.0788SHIMA@margaux>

Dear Interop members,

I have a question about GIOP wstring encoding.
Section "15.3.2.7 Strings and Wide Strings" in CORBA 2.3.1 says:

    For GIOP version 1.1, a wide string is encoded as an unsigned long
    indicating the length of the string in octets or unsigned integers
    (determined by the transfer syntax for wchar) followed by the
    individual wide characters. Both the string length and contents
    include a terminating null. The terminating null character for a
    wstring is also a wide character.

In the sentence above, I believe that the "length" represents the number of octets (in the case of a byte-oriented codeset) or the number of unsigned integers (in the case of a non-byte-oriented codeset).

For example,

    "abc" (ASCII code) ----> length is 4 (including one null terminator)
    L"abc" (Unicode, UCS-2) ----> length is 4 (including one wchar null terminator)

Is my understanding right?

Regards,

--
Masayoshi SHIMAMURA
TEL: +81-45-476-4581  FAX: +81-45-476-4726
Planning Department I, Strategic Planning Division,
Strategy and Planning Group, FUJITSU LIMITED
E-mail: shima@rp.open.cs.fujitsu.co.jp

Sender: jbiggar@cisco.com
Message-ID: <38480195.8760BC79@floorboard.com>
Date: Fri, 03 Dec 1999 09:44:53 -0800
From: Jonathan Biggar
To: Masayoshi Shimamura
CC: interop@omg.org
Subject: Re: Length of wstring in GIOP 1.1
References: <3847AC652D0.0788SHIMA@margaux>

Masayoshi Shimamura wrote:
>
> Dear Interop members,
>
> I have a question about GIOP wstring encoding. Section "15.3.2.7 Strings
> and Wide Strings" in CORBA 2.3.1 says:
>
>     For GIOP version 1.1, a wide string is encoded as an unsigned long
>     indicating the length of the string in octets or unsigned integers
>     (determined by the transfer syntax for wchar) followed by the
>     individual wide characters. Both the string length and contents
>     include a terminating null.
>     The terminating null character for a wstring is also a wide character.
>
> In the sentence above, I believe that the "length" represents the number of
> octets (in the case of a byte-oriented codeset) or the number of unsigned
> integers (in the case of a non-byte-oriented codeset).
>
> For example,
>
>     "abc" (ASCII code) ----> length is 4 (including one null terminator)
>     L"abc" (Unicode, UCS-2) ----> length is 4 (including one wchar null terminator)
>
> Is my understanding right?

I believe that you are correct for GIOP 1.1. Of course, GIOP 1.2 changed the meaning of length to always indicate the number of octets, in order to reduce problems with bridging in environments where the codeset is unknown in the intermediate bridges.

--
Jon Biggar
Floorboard Software
jon@floorboard.com
jon@biggar.org
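
The length rules discussed in this thread can be illustrated with a small sketch. It is not taken from the specification or from any ORB: the function names are hypothetical, the wchar transfer syntax is assumed to be UCS-2 with 2-octet code units, and CDR alignment padding and byte-order-mark handling are left out. The GIOP 1.2 function also assumes, based on my reading of later specification text rather than anything stated in this thread, that GIOP 1.2 sends no terminating null.

```python
import struct


def encode_string(text: str, big_endian: bool = True) -> bytes:
    """Narrow string: length counts octets, including the terminating null."""
    order = ">" if big_endian else "<"
    payload = text.encode("ascii") + b"\x00"
    return struct.pack(f"{order}I", len(payload)) + payload


def _ucs2_units(text: str) -> list:
    """Return 16-bit code units; this sketch rejects non-BMP characters."""
    units = [ord(ch) for ch in text]
    if any(u > 0xFFFF for u in units):
        raise ValueError("sketch handles only UCS-2 (BMP) characters")
    return units


def encode_wstring_giop_1_1(text: str, big_endian: bool = True) -> bytes:
    """GIOP 1.1: length counts wide characters (unsigned integers),
    including the terminating wide null."""
    order = ">" if big_endian else "<"
    units = _ucs2_units(text) + [0]  # terminating null is itself a wchar
    return struct.pack(f"{order}I{len(units)}H", len(units), *units)


def encode_wstring_giop_1_2(text: str, big_endian: bool = True) -> bytes:
    """GIOP 1.2: length always counts octets.
    Assumption: no terminating null is sent in GIOP 1.2."""
    order = ">" if big_endian else "<"
    units = _ucs2_units(text)
    payload = struct.pack(f"{order}{len(units)}H", *units)
    return struct.pack(f"{order}I", len(payload)) + payload


if __name__ == "__main__":
    for name, data in [
        ("string   'abc'", encode_string("abc")),
        ("GIOP 1.1 L'abc'", encode_wstring_giop_1_1("abc")),
        ("GIOP 1.2 L'abc'", encode_wstring_giop_1_2("abc")),
    ]:
        (length,) = struct.unpack(">I", data[:4])
        print(f"{name}: length field = {length}, total octets = {len(data)}")
```

Under these assumptions the length field is 4 for "abc" and for L"abc" in GIOP 1.1, matching the examples in the question, while the GIOP 1.2 function writes 6 for L"abc". A bridge can skip the GIOP 1.2 payload from the length field alone, whereas the GIOP 1.1 count is only usable once the wchar size is known.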