Issue 3075: Length of wstring in GIOP 1.1 (interop)
Source: Fujitsu (Mr. Masayoshi Shimamura, shima.masa(at)jp.fujitsu.com)
Nature: Uncategorized Issue
Severity: 
Summary: I have a question about GIOP wstring encoding. Section "15.3.2.7 Strings
and Wide Strings" in CORBA 2.3.1 says:

    For GIOP version 1.1, a wide string is encoded as an unsigned long
    indicating the length of the string in octets or unsigned integers
    (determined by the transfer syntax for wchar) followed by the
    individual wide characters. Both the string length and contents
    include a terminating null. The terminating null character for a
    wstring is also a wide character.

In the sentence above, I believe that the "length" represents number of
octets (in the case of byte oriented codeset) or number of unsigned
integers (in the case of non-byte oriented codeset).

For example,

      "abc" (ASCII code) ----> length is 4 (including one null terminate)
      L"abc" (Unicode, UCS2) ----> length is 4 (including one null terminate of wchar)

Is my understanding right?

Resolution: see above
Revised Text: 
Actions taken:
December 3, 1999: received issue
May 13, 2002: closed issue
Discussion: 
Rationale for Rejection
This is correct for GIOP 1.1. GIOP 1.2 changed the meaning of length to
always indicate the number of octets, in order to reduce problems with
bridging in environments where the codeset is unknown in the intermediate
bridges. The specification is unambiguous with respect to string length, so
no change is required.
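
The difference between the two length conventions can be sketched in a few
lines of Python. This is an illustration only, not text from the
specification. It assumes a 2-octet UCS-2 transfer syntax for wchar with no
byte-order mark, and it assumes that a GIOP 1.2 wstring carries no terminating
null, which should be checked against the GIOP 1.2 text. The function names
encode_string and encode_wstring are hypothetical.

```python
import struct


def encode_string(text, big_endian=True):
    """Sketch: narrow string, length in octets including the terminating null."""
    order = ">" if big_endian else "<"
    data = text.encode("ascii") + b"\x00"
    return struct.pack(order + "L", len(data)) + data


def encode_wstring(text, giop_minor, big_endian=True):
    """Sketch: wstring with an assumed 2-octet UCS-2 transfer syntax.

    giop_minor == 1: length counts 2-octet code units, including a wide null.
    giop_minor == 2: length counts octets; no terminating null is written
                     (assumption, verify against the GIOP 1.2 specification).
    """
    order = ">" if big_endian else "<"
    units = [ord(ch) for ch in text]
    if any(u > 0xFFFF for u in units):
        raise ValueError("character does not fit in one UCS-2 code unit")

    if giop_minor == 1:
        units.append(0)           # terminating null is itself a wide character
        length = len(units)       # counted in code units, not octets
    elif giop_minor == 2:
        length = 2 * len(units)   # always counted in octets
    else:
        raise ValueError("only GIOP 1.1 and 1.2 are sketched here")

    # unsigned long length field, then one unsigned short per code unit
    return struct.pack(f"{order}L{len(units)}H", length, *units)


if __name__ == "__main__":
    for label, encoded in [
        ('"abc"  (string)', encode_string("abc")),
        ('L"abc" (GIOP 1.1)', encode_wstring("abc", 1)),
        ('L"abc" (GIOP 1.2)', encode_wstring("abc", 2)),
    ]:
        (length,) = struct.unpack(">L", encoded[:4])
        print(f"{label}: length field = {length}, payload = {encoded[4:].hex()}")
```

Under these assumptions the length field is 4 for "abc" and 4 for L"abc" in
GIOP 1.1, matching the example in the summary, while the GIOP 1.2 encoding of
L"abc" carries a length of 6 octets. A bridge that does not know the wchar
codeset can skip the GIOP 1.2 payload from the length alone, which it cannot
do when the length is a count of code units of unknown size.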
End of Annotations:=====
Date: Fri, 03 Dec 1999 20:41:25 +0900
From: Masayoshi Shimamura <shima@rp.open.cs.fujitsu.co.jp>
To: interop@omg.org
Subject: Length of wstring in GIOP 1.1
Message-Id: <3847AC652D0.0788SHIMA@margaux>
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Mailer: Becky! ver 1.24
Content-Type: text/plain; charset=US-ASCII

Dear Interop members,

I have a question about GIOP wstring encoding. Section "15.3.2.7 Strings
and Wide Strings" in CORBA 2.3.1 says:

    For GIOP version 1.1, a wide string is encoded as an unsigned long
    indicating the length of the string in octets or unsigned integers
    (determined by the transfer syntax for wchar) followed by the
    individual wide characters. Both the string length and contents
    include a terminating null. The terminating null character for a
    wstring is also a wide character.

In the sentence above, I believe that the "length" represents number of
octets (in the case of byte oriented codeset) or number of unsigned
integers (in the case of non-byte oriented codeset).

For example,

      "abc" (ASCII code) ----> length is 4 (including one null terminate)
      L"abc" (Unicode, UCS2) ----> length is 4 (including one null terminate of wchar)

Is my understanding right?


Regards, 

--
Masayoshi SHIMAMURA
TEL:+81-45-476-4581  FAX:+81-45-476-4726
Planning Department I, Strategic Planning Division, 
Strategy and Planning Group, FUJITSU LIMITED
E-mail: shima@rp.open.cs.fujitsu.co.jp
Sender: jbiggar@cisco.com
Message-ID: <38480195.8760BC79@floorboard.com>
Date: Fri, 03 Dec 1999 09:44:53 -0800
From: Jonathan Biggar <jon@floorboard.com>
X-Mailer: Mozilla 4.7 [en] (X11; U; SunOS 5.6 sun4u)
X-Accept-Language: en
MIME-Version: 1.0
To: Masayoshi Shimamura <shima@rp.open.cs.fujitsu.co.jp>
CC: interop@omg.org
Subject: Re: Length of wstring in GIOP 1.1
References: <3847AC652D0.0788SHIMA@margaux>
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset=us-ascii

Masayoshi Shimamura wrote:
> 
> Dear Interop members,
> 
> I have a question about GIOP wstring encoding. Section "15.3.2.7 Strings
> and Wide Strings" in CORBA 2.3.1 says:
> 
>     For GIOP version 1.1, a wide string is encoded as an unsigned long
>     indicating the length of the string in octets or unsigned integers
>     (determined by the transfer syntax for wchar) followed by the
>     individual wide characters. Both the string length and contents
>     include a terminating null. The terminating null character for a
>     wstring is also a wide character.
> 
> In the sentence above, I believe that the "length" represents number of
> octets (in the case of byte oriented codeset) or number of unsigned
> integers (in the case of non-byte oriented codeset).
> 
> For example,
> 
>       "abc" (ASCII code) ----> length is 4 (including one null terminate)
>       L"abc" (Unicode, UCS2) ----> length is 4 (including one null terminate of wchar)
> 
> Is my understanding right?

I believe that you are correct for GIOP 1.1.  Of course, GIOP 1.2
changed the meaning of length to always indicate the number of octets,
in order to reduce problems with bridging in environments where the
codeset is unknown in the intermediate bridges.

-- 
Jon Biggar
Floorboard Software
jon@floorboard.com
jon@biggar.org