Issue 585:  IDL Type Extensions: wchar and wstring CDR encoding (interop)
Source:  (, )
Nature:  Uncategorized
Severity: 
Summary: Section 4.1 GIOP CDR Transfer Syntax: The spec should cover cases where TCS-W is byte-oriented or non-byte-oriented

Resolution:  duplicate of closed issue 1096
Revised Text:  see Core RTF 2.2 changes; changed to fix marshalling for GIOP 1.2
Actions taken:
May 29, 1997: received issue
June 22, 1998: moved from orb_revision to port-rtf
June 23, 1998: moved from port-rtf to interop
February 17, 1999: closed issue
February 19, 1999: closed issue; Resolved
Discussion: 
 received issue
End of Annotations:=====
Return-Path: <leou@austin.ibm.com>
To: issues@omg.org, orb_revision@omg.org
Reply-To: leou@austin.ibm.com
Subject: IDL Type Extensions: wchar and wstring CDR encoding issues
Date: Thu, 29 May 97 15:41:44 -0500
From: Leo Uzcategui <leou@austin.ibm.com>


Reference: IDL Type Extensions V1.0 ( ptc/97-01-01 )

In section 4.1 GIOP CDR Transfer Syntax on p. 19, it states "a wide 
character is represented as the fixed number of bits used to encode any 
single codepoint in the TCS." 
What does this mean when the negotiated TCS-W is byte-oriented?  
The spec should cover cases where TCS-W is byte-oriented or non-byte 
oriented.
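
To make the question concrete, here is a minimal sketch of the difference between the two kinds of code set. It assumes UTF-16 (big-endian) as a stand-in for a non-byte-oriented code set and UTF-8 as a stand-in for a byte-oriented one; neither choice comes from the spec text quoted above, and the sample characters are arbitrary.

```python
# Illustration only: UTF-16BE stands in for a non-byte-oriented code set,
# UTF-8 for a byte-oriented one. Neither choice is taken from the spec.
samples = ["A", "\u00e9", "\u65e5"]  # Latin A, e with acute, a CJK ideograph


def widths(codec: str) -> list[int]:
    """Return the number of octets each sample character occupies in codec."""
    return [len(ch.encode(codec)) for ch in samples]


fixed = widths("utf-16-be")  # the same width for every sample character
variable = widths("utf-8")   # width depends on the character

print("utf-16-be:", fixed)
print("utf-8:    ", variable)
```

With a fixed-width code set, "the fixed number of bits used to encode any single codepoint" is well defined. With a byte-oriented code set the width varies per character, which is the case the quoted sentence does not address.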


Return-Path: <leou@austin.ibm.com>
Reply-To: leou@austin.ibm.com
To: Masayoshi Shimamura <shima@rp.open.cs.fujitsu.co.jp>
Cc: <leou@austin.ibm.com>, <davg@mfltd.co.uk>, <jfh@hal.com>,
        <janssen@parc.xerox.com>, <kline_s@apollo.hp.com>,
        <hari@austin.ibm.com>, <rabin@osf.org>, <alex_thomas@globalnet.co.uk>,
        <nicktindall@vnet.ibm.com>, <yamada@hal.com>, idltx@omg.org
Subject: Re: IDL Type Extensions Spec issues 
Date: Wed, 04 Jun 97 08:16:36 -0500
From: Leo Uzcategui <leou@austin.ibm.com>


Shimamura-san,

Thanks for your explanation. However, I believe the spec allows TCS-W to be 
either byte-oriented or non-byte oriented. 

On p. 7, under heading "Transmission Code Set" it states 

   "The intent is for TCS-C to be byte-oriented and TCS-W to be 
   non-byte-oriented. However, this specification does allow both types 
   of characters to be transmitted using the same transmission code set.
   That is, the selection of a transmission code set is orthogonal to
   the wideness or narrowness of the characters, although a given code
   set may be better suited for either narrow or wide characters."

Unfortunately, I was not involved during the spec definition and am going
by what the spec says and discussion with some at IBM who were involved,
like Hari Madduri and Gary Miller.

Was there not agreement on this point?  Can someone else please comment?

Thanks, Leo

I also cc'd idltx@omg.org in case I missed someone on the original list,
sorry if you get 2 copies.

> Dear Mr. Leo Uzcategui,
> 
> Sorry for the late response.
> 
> On Thu, 29 May 97 15:26:58 -0500
> Leo Uzcategui <leou@austin.ibm.com> wrote:
> > 
> > I need your help in trying to interpret the IDL Type Extensions V1.0
> > Spec ( ftp://ftp.omg.org/pub/docs/ptc/97-01-01.pdf [.ps] )
> > The spec is vague and open to misinterpretation with respect to the
> > CDR encoding of wchar/wstring data.  
> > 
> > I am somewhat unsure which OMG TF or RTF is responsible for definitive
> > answers (ORBOS?). But since you were all involved in the definition, you 
> > can be of immediate help.  I intend to mail these to issues@omg.org and 
> > orb_revision@omg.org so they get recorded.
> > 
> > In section 4.1 GIOP CDR Transfer Syntax on p. 19, it states "a wide 
> > character is represented as the fixed number of bits used to encode any 
> > single codepoint in the TCS." 
> > What does this mean when the negotiated TCS-W is byte-oriented?  
> > The spec should cover cases where TCS-W is byte-oriented or non-byte 
> > oriented.
> > 
> 
> Negotiated TCS-W must be only non-byte oriented codeset. See section
> "2.1 Character Processing Terminology" on page 6.
> 
> 	Byte-Oriented Code Set
> 
> 	An encoding of characters where the numeric code corresponding
> 	to a character code element can occupy one or more bytes. A byte
> 	as used in this document is synonymous with octet, which
> 	occupies 8 bits. As noted above, byte-oriented code sets use the
> 	char data type.  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> 	~~~~~~~~~~~~~~~
> 
> 	Non-Byte Oriented Code Set
> 	
> 	An encoding of characters where the numeric code corresponding
> 	to a character code element can occupy fixed 16 or 32 bits. As
> 	noted above, non-byte oriented code sets use the wchar data type.
> 	             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> 
> > In section 4.1.2 on p. 20, spec states "A wide string is encoded as
> > an unsigned long indicating the length of the string in octets or 
> > unsigned integers (determined by the transfer syntax for wchar) followed 
> > by the individual wide characters.  Both the string length and contents 
> > include a terminating null. The terminating null character for a wstring 
> > is also a wide character."
> > Again, what does this mean if TCS-W is byte-oriented?
> 
> My answer is the same as above. Negotiated TCS-W must be only non-byte oriented
> codeset. In other words, native codeset and conversion codesets of 
> *char type* must be *byte-oriented*. And also, native codeset and
> conversion codesets of *wchar type* must be *non-byte-oriented*.
> 
> > Our interpretation: 
> > If the negotiated TCS-W is non-byte oriented, then length is the number 
> > of characters, each character encoded using a fixed number of bits (as 
> > determined by the code set) plus the terminating null (also of the same 
> > width as the other characters.) 
> 
> Strictly speaking, the length means the number of code elements including
> the terminating null. It is not the number of characters including null. 
> 
> This is because, for example, Unicode specifies combining characters.
> Unicode can represent a character that has a diacritical mark, such as a
> grave accent, by using a combining character.
> 
> If the character is "À" (combination of character "A" and grave 
> accent "`"), the character may be represented by the combination: 
> code of "A" and code of "`" in Unicode. In that case, the number of
> characters is one, but the number of code elements is two. 
> 
> If the string "À" with Unicode is represented by CDR, the length should
> be three: code of "A", code of "`" and code of null. Therefore, the
> length of the string in CDR should mean the number of code elements instead
> of the number of characters.
> 
> > If the negotiated TCS-W is byte-oriented, then length is the number of 
> > bytes (not of characters), each character encoded using only the bytes 
> > required by the chosen code set (not a fixed-width representation), plus 
> > the terminating single byte null.
> > 
> 
> There is not such case.
> 
> 
> Regards,
> 
> Masayoshi SHIMAMURA
> ------------------------------------------------------------------------
> Masayoshi SHIMAMURA              FUJITSU LIMITED
> Planning Department II           Tel: +81-45-476-4591 (Ext. 7128-4221)
> IT Strategy and Planning Div.    FAX: +81-45-476-4726 (Ext. 7128-6783)
> Software Group                   E-mail: shima@rp.open.cs.fujitsu.co.jp
> ------------------------------------------------------------------------
----------------------------------------------------------------------
Leo Uzcategui, IBM Corp.
11400 Burnet Rd, Austin, TX 78758
(512) 823-9573 fax:(512) 838-1032
----------------------------------------------------------------------
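
The two readings of the wstring length discussed in the message above can be compared with a small sketch. It assumes UTF-16BE as the fixed-width code set and UTF-8 as the byte-oriented one, and it ignores CDR alignment and byte-order negotiation; `encode_wstring` and its `unit_size` parameter are hypothetical names introduced only for this illustration, not part of any ORB API.

```python
import struct


def encode_wstring(text: str, codec: str, unit_size: int) -> bytes:
    """Length-prefixed encoding following the reading discussed above.

    unit_size is the octet width of one code element in codec (assumed:
    2 for UTF-16BE, 1 for UTF-8). The length counts code elements and
    includes the terminating null, which is one code element wide.
    """
    body = text.encode(codec) + b"\x00" * unit_size
    count = len(body) // unit_size
    # Big-endian 32-bit unsigned length, followed by the encoded contents.
    return struct.pack(">I", count) + body


# 'A' followed by COMBINING GRAVE ACCENT: one character, two code elements.
a_grave = "A\u0300"

wide = encode_wstring(a_grave, "utf-16-be", 2)
narrow = encode_wstring(a_grave, "utf-8", 1)

wide_len = struct.unpack(">I", wide[:4])[0]
narrow_len = struct.unpack(">I", narrow[:4])[0]

print("fixed-width length:  ", wide_len)
print("byte-oriented length:", narrow_len)
```

Under these assumptions the fixed-width case yields a length of three (two code elements plus the null), matching the count given in the quoted reply, while the byte-oriented case yields a length in octets, as in the second interpretation.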


Return-Path: <hari@austin.ibm.com>
From: hari@austin.ibm.com
Date: Wed, 4 Jun 1997 12:28:00 -0500
To: leou@austin.ibm.com, jfh@hal.com
Subject: Re: Re[2]: IDL Type Extensions Spec issues
Cc: shima@rp.open.cs.fujitsu.co.jp, davg@mfltd.co.uk, janssen@parc.xerox.com,
        kline_s@apollo.hp.com, hari@austin.ibm.com, rabin@osf.org,
        alex_thomas@globalnet.co.uk, nicktindall@vnet.ibm.com, yamada@hal.com,
        idltx@omg.org
Content-Md5: xXpmKBl75LX0vjAzzSIJ8g==

Jim, Leo:
	The change I wanted in the referenced note below is consistent
with the paragraph referred to by Leo (on page 7). I see that the spec
is consistent with the view that a "WChar" type data could be transmitted
using either byte-oriented or non-byte oriented transmission codesets.
To the best of my understanding, that is what we (the revision committee
members) all wanted. 
	What Shimamura San is referring to in the definitions part of
the spec is not excluding the possibility of transmitting wchar type
of data using either codesets. It is describing the more likely case.
If I recall correctly, Fujitsu was one of the strong advocates of that
flexibility where one could use either type of codesets.
	I would have to disagree with Dr. Shimamura's interpretation
as noted below his remarks.
	regards,
	 -Hari
	
> From root Wed Jun  4 11:03:27 1997
> From: jfh@hal.com (Jim Hughes)
> Message-Id: <199706041602.JAA02694@kodiak.hal.com>
> Date: Wed, 4 Jun 1997 09:02:31 -0700 (PDT)
> To: leou@austin.ibm.com
> Cc: shima@rp.open.cs.fujitsu.co.jp, leou@austin.ibm.com, davg@mfltd.co.uk,
        janssen@parc.xerox.com, kline_s@apollo.hp.com, hari@austin.ibm.com,
        rabin@osf.org, alex_thomas@globalnet.co.uk, nicktindall@vnet.ibm.com,
        yamada@hal.com, idltx@omg.org
> Subject: Re[2]: IDL Type Extensions Spec issues
> In-Reply-To: <9706041316.AA15270@leou.austin.ibm.com>
> X-Mailer: Ishmail 1.3-960829-sol24 <http://www.ishmail.com>
> Mime-Version: 1.0
> Content-Type: text/plain
> Content-Length: 7652
> Status: RO
> 
> Leo,
> 
> I'll let Shimamura-san give a technical answer for Fujitsu to your response,
> but let me add some more information.
> 
> The text you reference on page 7 was proposed for removal by me when I
> created Draft 6 of the final document because I thought it was confusing. IBM
> requested that it be re-instated per the email below. Probably Hari can give
> you the best reasons why this was done and how it should be implemented.
> 
> Jim
> 
> >>>>> Forwarded from hari@austin.ibm.com
> 
> Date: Tue, 7 Jan 1997 14:23:25 -0600
> From: hari@austin.ibm.com
> To: jfh@hal.com
> Subject: My changes to draft 5
> Cc: nicktindall@VNET.IBM.COM, leou@austin.ibm.com, hari@austin.ibm.com
> 
> Jim,
> 	Here are the changes that I would like:
> 	
> 	1. Once we revert to narrow and wide characters referring to IDL
> 	   "char" and "wchar" types of data, it becomes unambiguous and
> 	   even necessary to point out that they could be transmitted using
> 	   either byte-oriented or non-byte oriented code sets.
> 	   Therefore the removed text in your notes on pages 5, 6, and
> 	   25 needs to be reintroduced. (Incidentally Mr. Shimamura didn't
> 	   like the phrase "don't have to be different" (in your note on page
> 	    25). Rephrase it as you like, but I want that point to be made.)
> 	    
> 	   Also at the bottom of page 18 your note says " previous material
> 	   suggesting that a byte-oriented code set could be used for 'wide
> 	   characters' was removed". This removed text needs to be restored.
> 	   
> (rest of email deleted)
> 
> <<<<< End forwarded message
> 
> 
> ================
> On Wed, 04 Jun 97 08:16:36 -0500, Leo Uzcategui <leou@austin.ibm.com> wrote:
>      
> > 
> > Shimamura-san,
> > 
> > Thanks for your explanation. However, I believe the spec allows TCS-W to be
> > either byte-oriented or non-byte oriented. 
> > 
> > On p. 7, under heading "Transmission Code Set" it states 
> > 
> >    "The intent is for TCS-C to be byte-oriented and TCS-W to be 
> >    non-byte-oriented. However, this specification does allow both types 
> >    of characters to be transmitted using the same transmission code set.
> >    That is, the selection of a transmission code set is orthogonal to
> >    the wideness or narrowness of the characters, although a given code
> >    set may be better suited for either narrow or wide characters."
> > 
> > Unfortunately, I was not involved during the spec definition and am going
> > by what the spec says and discussion with some at IBM who were involved,
> > like Hari Madduri and Gary Miller.
> > 
> > Was there not agreement on this point?  Can someone else please comment?
> > 
> > Thanks, Leo
> > 
> > I also cc'd idltx@omg.org in case I missed someone on the original list,
> > sorry if you get 2 copies.
> > 
> > > Dear Mr. Leo Uzcategui,
> > > 
> > > Sorry for the late response.
> > > 
> > > On Thu, 29 May 97 15:26:58 -0500
> > > Leo Uzcategui <leou@austin.ibm.com> wrote:
> > > > 
> > > > I need your help in trying to interpret the IDL Type Extensions V1.0
> > > > Spec ( ftp://ftp.omg.org/pub/docs/ptc/97-01-01.pdf [.ps] )
> > > > The spec is vague and open to misinterpretation with respect to the
> > > > CDR encoding of wchar/wstring data.  
> > > > 
> > > > I am somewhat unsure which OMG TF or RTF is responsible for definitive
> > > > answers (ORBOS?). But since you were all involved in the definition,
> > you 
> > > > can be of immediate help.  I intend to mail these to issues@omg.org and
> > > > orb_revision@omg.org so they get recorded.
> > > > 
> > > > In section 4.1 GIOP CDR Transfer Syntax on p. 19, it states "a wide 
> > > > character is represented as the fixed number of bits used to encode any
> > > > single codepoint in the TCS." 
> > > > What does this mean when the negotiated TCS-W is byte-oriented?  
> > > > The spec should cover cases where TCS-W is byte-oriented or non-byte 
> > > > oriented.
> > > > 
> > > 
> > > Negotiated TCS-W must be only non-byte oriented codeset. See section
> > > "2.1 Character Processing Terminology" on page 6.

No. TCS-W can be either.

> > > 
> > > 	Byte-Oriented Code Set
> > > 
> > > 	An encoding of characters where the numeric code corresponding
> > > 	to a character code element can occupy one or more bytes. A byte
> > > 	as used in this document is synonymous with octet, which
> > > 	occupies 8 bits. As noted above, byte-oriented code sets use the
> > > 	char data type.  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > > 	~~~~~~~~~~~~~~~
> > > 
> > > 	Non-Byte Oriented Code Set
> > > 	
> > > 	An encoding of characters where the numeric code corresponding
> > > 	to a character code element can occupy fixed 16 or 32 bits. As
> > > 	noted above, non-byte oriented code sets use the wchar data type.
> > > 	             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > > 
> > > > In section 4.1.2 on p. 20, spec states "A wide string is encoded as
> > > > an unsigned long indicating the length of the string in octets or 
> > > > unsigned integers (determined by the transfer syntax for wchar)
> > followed 
> > > > by the individual wide characters.  Both the string length and contents
> > > > include a terminating null. The terminating null character for a
> > wstring 
> > > > is also a wide character."
> > > > Again, what does this mean if TCS-W is byte-oriented?
> > > 
> > > My answer is the same as above. Negotiated TCS-W must be only non-byte oriented
> > > codeset. In other words, native codeset and conversion codesets of 
> > > *char type* must be *byte-oriented*. And also, native codeset and
> > > conversion codesets of *wchar type* must be *non-byte-oriented*.
No again.

> > > 
> > > > Our interpretation: 
> > > > If the negotiated TCS-W is non-byte oriented, then length is the number
> > > > of characters, each character encoded using a fixed number of bits (as 
> > > > determined by the code set) plus the terminating null (also of the same
> > > > width as the other characters.) 
> > > 
> > > Strictly speaking, the length means the number of code elements including
> > > the terminating null. It is not the number of characters including null. 

Yes. I agree.
> > > 
> > > This is because, for example, Unicode specifies combining characters.
> > > Unicode can represent a character that has a diacritical mark, such as a
> > > grave accent, by using a combining character.
> > > 
> > > If the character is "À" (combination of character "A" and grave 
> > > accent "`"), the character may be represented by the combination: 
> > > code of "A" and code of "`" in Unicode. In that case, the number of
> > > characters is one, but the number of code elements is two. 
> > > 
> > > If the string "À" with Unicode is represented by CDR, the length should
> > > be three: code of "A", code of "`" and code of null. Therefore, the
> > > length of the string in CDR should mean the number of code elements instead
> > > of the number of characters.
> > > 
> > > > If the negotiated TCS-W is byte-oriented, then length is the number of 
> > > > bytes (not of characters), each character encoded using only the bytes 
> > > > required by the chosen code set (not a fixed-width representation),
> > plus 
> > > > the terminating single byte null.
> > > > 
> > > 
> > > There is not such case.

I disagree. There sure is such a case.
> > > 
> > > 
> > > Regards,
> > > 
> > > Masayoshi SHIMAMURA
> > > ------------------------------------------------------------------------
> > > Masayoshi SHIMAMURA              FUJITSU LIMITED
> > > Planning Department II           Tel: +81-45-476-4591 (Ext. 7128-4221)
> > > IT Strategy and Planning Div.    FAX: +81-45-476-4726 (Ext. 7128-6783)
> > > Software Group                   E-mail: shima@rp.open.cs.fujitsu.co.jp
> > > ------------------------------------------------------------------------
> > ----------------------------------------------------------------------
> > Leo Uzcategui, IBM Corp.
> > 11400 Burnet Rd, Austin, TX 78758
> > (512) 823-9573 fax:(512) 838-1032
> > ----------------------------------------------------------------------
> 
> 

I have stated what I believe to be the consensus we arrived at. What do
the other RTF members say?
	regards,
	-Hari


Return-Path: <jfh@hal.com>
From: jfh@hal.com (Jim Hughes)
Date: Wed, 4 Jun 1997 09:02:31 -0700 (PDT)
To: leou@austin.ibm.com
Cc: shima@rp.open.cs.fujitsu.co.jp, leou@austin.ibm.com, davg@mfltd.co.uk,
        janssen@parc.xerox.com, kline_s@apollo.hp.com, hari@austin.ibm.com,
        rabin@osf.org, alex_thomas@globalnet.co.uk, nicktindall@vnet.ibm.com,
        yamada@hal.com, idltx@omg.org
Subject: Re[2]: IDL Type Extensions Spec issues

Leo,

I'll let Shimamura-san give a technical answer for Fujitsu to your response,
but let me add some more information.

The text you reference on page 7 was proposed for removal by me when I
created Draft 6 of the final document because I thought it was confusing. IBM
requested that it be re-instated per the email below. Probably Hari can give
you the best reasons why this was done and how it should be implemented.

Jim

>>>>> Forwarded from hari@austin.ibm.com

Date: Tue, 7 Jan 1997 14:23:25 -0600
From: hari@austin.ibm.com
To: jfh@hal.com
Subject: My changes to draft 5
Cc: nicktindall@VNET.IBM.COM, leou@austin.ibm.com, hari@austin.ibm.com

Jim,
	Here are the changes that I would like:
	
	1. Once we revert to narrow and wide characters referring to IDL
	   "char" and "wchar" types of data, it becomes unambiguous and
	   even necessary to point out that they could be transmitted using
	   either byte-oriented or non-byte oriented code sets.
	   Therefore the removed text in your notes on pages 5, 6, and
	   25 needs to be reintroduced. (Incidentally Mr. Shimamura didn't
	   like the phrase "don't have to be different" (in your note on page
	    25). Rephrase it as you like, but I want that point to be made.)
	    
	   Also at the bottom of page 18 your note says " previous material
	   suggesting that a byte-oriented code set could be used for 'wide
	   characters' was removed". This removed text needs to be restored.
	   
(rest of email deleted)

<<<<< End forwarded message


================
On Wed, 04 Jun 97 08:16:36 -0500, Leo Uzcategui <leou@austin.ibm.com> wrote:
     
> 
> Shimamura-san,
> 
> Thanks for your explanation. However, I believe the spec allows TCS-W to be
> either byte-oriented or non-byte oriented. 
> 
> On p. 7, under heading "Transmission Code Set" it states 
> 
>    "The intent is for TCS-C to be byte-oriented and TCS-W to be 
>    non-byte-oriented. However, this specification does allow both types 
>    of characters to be transmitted using the same transmission code set.
>    That is, the selection of a transmission code set is orthogonal to
>    the wideness or narrowness of the characters, although a given code
>    set may be better suited for either narrow or wide characters."
> 
> Unfortunately, I was not involved during the spec definition and am going
> by what the spec says and discussion with some at IBM who were involved,
> like Hari Madduri and Gary Miller.
> 
> Was there not agreement on this point?  Can someone else please comment?
> 
> Thanks, Leo
> 
> I also cc'd idltx@omg.org in case I missed someone on the original list,
> sorry if you get 2 copies.
> 
> > Dear Mr. Leo Uzcategui,
> > 
> > Sorry for the late response.
> > 
> > On Thu, 29 May 97 15:26:58 -0500
> > Leo Uzcategui <leou@austin.ibm.com> wrote:
> > > 
> > > I need your help in trying to interpret the IDL Type Extensions V1.0
> > > Spec ( ftp://ftp.omg.org/pub/docs/ptc/97-01-01.pdf [.ps] )
> > > The spec is vague and open to misinterpretation with respect to the
> > > CDR encoding of wchar/wstring data.  
> > > 
> > > I am somewhat unsure which OMG TF or RTF is responsible for definitive
> > > answers (ORBOS?). But since you were all involved in the definition,
> you 
> > > can be of immediate help.  I intend to mail these to issues@omg.org and
> > > orb_revision@omg.org so they get recorded.
> > > 
> > > In section 4.1 GIOP CDR Transfer Syntax on p. 19, it states "a wide 
> > > character is represented as the fixed number of bits used to encode any
> > > single codepoint in the TCS." 
> > > What does this mean when the negotiated TCS-W is byte-oriented?  
> > > The spec should cover cases where TCS-W is byte-oriented or non-byte 
> > > oriented.
> > > 
> > 
> > Negotiated TCS-W must be only non-byte oriented codeset. See section
> > "2.1 Character Processing Terminology" on page 6.
> > 
> > 	Byte-Oriented Code Set
> > 
> > 	An encoding of characters where the numeric code corresponding
> > 	to a character code element can occupy one or more bytes. A byte
> > 	as used in this document is synonymous with octet, which
> > 	occupies 8 bits. As noted above, byte-oriented code sets use the
> > 	char data type.  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > 	~~~~~~~~~~~~~~~
> > 
> > 	Non-Byte Oriented Code Set
> > 	
> > 	An encoding of characters where the numeric code corresponding
> > 	to a character code element can occupy fixed 16 or 32 bits. As
> > 	noted above, non-byte oriented code sets use the wchar data type.
> > 	             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > 
> > > In section 4.1.2 on p. 20, spec states "A wide string is encoded as
> > > an unsigned long indicating the length of the string in octets or 
> > > unsigned integers (determined by the transfer syntax for wchar)
> followed 
> > > by the individual wide characters.  Both the string length and contents
> > > include a terminating null. The terminating null character for a
> wstring 
> > > is also a wide character."
> > > Again, what does this mean if TCS-W is byte-oriented?
> > 
> > My answer is the same as above. Negotiated TCS-W must be only non-byte oriented
> > codeset. In other words, native codeset and conversion codesets of 
> > *char type* must be *byte-oriented*. And also, native codeset and
> > conversion codesets of *wchar type* must be *non-byte-oriented*.
> > 
> > > Our interpretation: 
> > > If the negotiated TCS-W is non-byte oriented, then length is the number
> > > of characters, each character encoded using a fixed number of bits (as 
> > > determined by the code set) plus the terminating null (also of the same
> > > width as the other characters.) 
> > 
> > Strictly speaking, the length means the number of code elements including
> > the terminating null. It is not the number of characters including null. 
> > 
> > This is because, for example, Unicode specifies combining characters.
> > Unicode can represent a character that has a diacritical mark, such as a
> > grave accent, by using a combining character.
> > 
> > If the character is "À" (combination of character "A" and grave 
> > accent "`"), the character may be represented by the combination: 
> > code of "A" and code of "`" in Unicode. In that case, the number of
> > characters is one, but the number of code elements is two. 
> > 
> > If the string "À" with Unicode is represented by CDR, the length should
> > be three: code of "A", code of "`" and code of null. Therefore, the
> > length of the string in CDR should mean the number of code elements instead
> > of the number of characters.
> > 
> > > If the negotiated TCS-W is byte-oriented, then length is the number of 
> > > bytes (not of characters), each character encoded using only the bytes 
> > > required by the chosen code set (not a fixed-width representation),
> plus 
> > > the terminating single byte null.
> > > 
> > 
> > There is not such case.
> > 
> > 
> > Regards,
> > 
> > Masayoshi SHIMAMURA
> > ------------------------------------------------------------------------
> > Masayoshi SHIMAMURA              FUJITSU LIMITED
> > Planning Department II           Tel: +81-45-476-4591 (Ext. 7128-4221)
> > IT Strategy and Planning Div.    FAX: +81-45-476-4726 (Ext. 7128-6783)
> > Software Group                   E-mail: shima@rp.open.cs.fujitsu.co.jp
> > ------------------------------------------------------------------------
> ----------------------------------------------------------------------
> Leo Uzcategui, IBM Corp.
> 11400 Burnet Rd, Austin, TX 78758
> (512) 823-9573 fax:(512) 838-1032
> ----------------------------------------------------------------------