Issue 3681: interop issue: CodeSets service context in GIOP 1.0 request (interop) Source: Xerox (Mr. William C. Janssen, Jr., janssen(at)parc.xerox.com) Nature: Uncategorized Issue Severity: Summary: Well, a new Java release is upon us, and with it comes a new CORBA implementation. I'm trying Java 2 SE 1.3 CORBA clients against an ILU 2.0beta1 CosNaming server, and we find that the Java ORB cannot reliably connect to the server. Why not? First, we must analyze the IOR provided by the ILU service: IOR:000000000000002849444C3A6F6D672E6F72672F436F734E616D696E672F4E616D696E67436F6E746578743A312E300000000002000000000000002F0001000000000016776174736F6E2E706172632E7865726F782E636F6D00270F0000000B4E616D6553657276696365000000000100000024000100000000000100000001000000140001001800010001000000000001010000000000 If we look at this (those who've received it un-truncated) we find that it advertises the following: _IIOP_ParseCDR: byte order BigEndian, repository id <IDL:omg.org/CosNaming/NamingContext:1.0>, 2 profiles _IIOP_ParseCDR: profile 1 is 47 bytes, tag 0 (INTERNET), BigEndian byte order _IIOP_ParseCDR: profile 2 is 36 bytes, tag 1 (MULTIPLE COMPONENT), BigEndian byte order (iiop.c:parse_IIOP_Profile): bo=BigEndian, version=1.0, hostname=watson.parc.xerox.com, port=9999, object_key=<NameService> (iiop.c:parse_IIOP_Profile): encoded object key is <NameService> (iiop.c:parse_IIOP_Profile): non-native cinfo is <iiop_1_0_1_NameService@tcp_watson.parc.xerox.com_9999> (iiop.c:parse_MultiComponent_Profile): profile contains 1 component (iiop.c:parse_MultiComponent_Profile): component 1 of type 1, 20 bytes (iiop.c:parse_MultiComponent_Profile): native codeset for SHORT CHARACTER is 00010001, with 0 converters (iiop.c:parse_MultiComponent_Profile): native codeset for CHARACTER is 00010100, with 0 converters That is, there's a vanilla Internet profile (bo=BigEndian, version=1.0, hostname=watson.parc.xerox.com, port=9999, object_key=<NameService>), plus a Multicomponent profile, 
noting that the ILU ORB's native codesets are Latin-1 and UCS-2. OK, great. Now we get the first message from the Java ORB: 0000 47 49 4f 50 01 00 00 00 00 00 01 00 GIOP........ 0000 00 00 00 02 00 00 00 01 00 00 00 0c 00 00 00 00 ................ 0010 00 01 00 20 00 01 01 00 00 00 00 06 00 00 00 90 ... ............ 0020 00 00 00 00 00 00 00 28 49 44 4c 3a 6f 6d 67 2e .......(IDL:omg. 0030 6f 72 67 2f 53 65 6e 64 69 6e 67 43 6f 6e 74 65 org/SendingConte 0040 78 74 2f 43 6f 64 65 42 61 73 65 3a 31 2e 30 00 xt/CodeBase:1.0. 0050 00 00 00 01 00 00 00 00 00 00 00 54 00 01 01 00 ...........T.... 0060 00 00 00 0c 31 33 2e 31 2e 31 30 33 2e 36 38 00 ....13.1.103.68. 0070 0e e9 00 00 00 00 00 18 af ab ca fe 00 00 00 02 ................ 0080 67 d5 93 95 00 00 00 08 00 00 00 00 00 00 00 00 g............... 0090 00 00 00 01 00 00 00 01 00 00 00 14 00 00 00 00 ................ 00a0 00 01 00 20 00 00 00 00 00 01 01 00 00 00 00 00 ... ............ 00b0 00 00 00 05 01 00 00 00 00 00 00 07 53 79 6e 65 ............Syne 00c0 72 67 79 00 00 00 00 06 5f 69 73 5f 61 00 00 00 rgy....._is_a... 00d0 00 00 00 00 00 00 00 28 49 44 4c 3a 6f 6d 67 2e .......(IDL:omg. 00e0 6f 72 67 2f 43 6f 73 4e 61 6d 69 6e 67 2f 4e 61 org/CosNaming/Na 00f0 6d 69 6e 67 43 6f 6e 74 65 78 74 3a 31 2e 30 00 mingContext:1.0. Note that we are seeing a CodeSets service context, even though the request is GIOP 1.0. The service context specifies a TCS-C of ASCII, and a TCS-W of UCS-2. The question is, what should the server do with it? First of all, there seems to be no way in which the algorithm in section 13.2.7.6 can result in the TCS-C specified in the service context. So perhaps this bug should be detected and signalled back to the sending ORB. How? Using CODESET_INCOMPATIBLE might make sense, but that really doesn't flag the bug in the client-side implementation of the codesets determination algorithm. Perhaps a straight COMM_FAILURE would be better. Opinions? 
Secondly, since this is GIOP 1.0, the client could reasonably ignore the service context, and go ahead with using its default codeset (Latin-1). However, to do so risks comm failure down the line, as ASCII (the TCS-C assumed by the client) does not permit many Latin-1 characters. It seems better to flag this situation up front.

Resolution: Close with revision. See below.

Revised Text: Resolution: Close with revision.

Section 13.7.2 makes it very clear that the server should not raise an exception if a codeset service context is contained in a GIOP 1.0 request message. No clarification is needed in that section.

If a client or a server sends character data which is not encoded as LATIN-1, the receiver will not be able to detect that it is not LATIN-1 by examining the received sequence of octets. Thus nothing can be stated for this case.

A client ORB should never use a GIOP 1.0 connection to send wchar or wstring data. This situation is covered by the resolution to Issue 3576, since if the objref has a version 1 IOR profile, it does not carry codeset information. However, if an operation with wchar parameter data is erroneously sent over GIOP 1.0 by a client, the server must generate a MARSHAL exception. We need a minor code for this case.

If a server sends wchar or wstring data using GIOP 1.0, the client ORB should raise a MARSHAL exception to the client application. This should have its own minor code.

Revised Text: At the end of section 15.3.1.6 add the following:

"If a client orb erroneously sends wchar or wstring data in a GIOP 1.0 message, the server shall generate a MARSHAL standard system exception, with standard minor code i. If a server erroneously sends wchar data in a GIOP 1.0 response, the client orb shall raise a MARSHAL exception to the client application with standard minor code j."

Add the following minor codes to table 4-3 for MARSHAL Standard System exception:

"
MARSHAL  i  wchar or wstring data erroneously sent by client over GIOP 1.0 connection
MARSHAL  j  wchar or wstring data erroneously returned by server over GIOP 1.0 connection
"

Actions taken:
July 5, 2000: received issue
February 27, 2001: closed issue

Discussion: The IOR does not have the codeset component in the INTEROP IOR Profile. A client may choose to ignore the MULTIPLE COMPONENTS profile safely. The Interop chapters require profiles to be complete; the MULTIPLE COMPONENTS profile is not complete, since it has no addressing information or object key.

The server should not process a codeset service context in GIOP 1.0, if present. If the client sends anything other than LATIN-1 in GIOP 1.0, errors may arise. They should be dealt with in the response to each request message, rather than in the processing of the service context in the first message.

We need to clarify what is valid within the GIOP protocol for the server to do in this situation. The client should not be including a codeset service context in a GIOP 1.0 message, and certainly cannot transmit any wide characters or strings over a GIOP 1.0 connection. The codeset service context might be considered harmless if its TCS-C matched the fixed codeset used in GIOP 1.0 for chars and strings. The client is wrong to send the service context in this case of GIOP 1.0 (irrespective of its contents).

13.7.2 of Core states:

"The following are the rules for processing a received service context:
• The service context is in the OMG defined range:
  • If it is valid for the supported GIOP version, then it must be processed correctly according to the rules associated with it for that GIOP version level.
  • If it is not valid for the GIOP version, then it may be ignored by the receiving ORB, however it must be passed on through a bridge and must be made available to interceptors. No exception shall be raised."

Based on the above quote, the server is also wrong to look at it. If the client "pumps" ASCII over the connection, the server will have no problems, since ASCII is a subset of Latin-1. However, the server might "pump back" string return results or out parameters using the fully extended 8-bit space of Latin-1, and the client has to deal with it.

We also need to determine whether the propose/dispose mechanism that has been set up between Sun and the OMG provides for feeding bug reports back to Sun and having them run through a Java standards process that is the moral equivalent of OMG's "Urgent bugfix process". That also needs to happen to fix this particular problem.

End of Annotations:=====

To: issues@omg.org, interop@omg.org
Cc: sbrawer@parc.xerox.com, janssen@parc.xerox.com
Subject: interop issue: CodeSets service context in GIOP 1.0 request
Sender: Bill Janssen
From: Bill Janssen
Mime-Version: 1.0
Message-Id: <00Jul5.145042pdt."3438"@watson.parc.xerox.com>
Date: Wed, 5 Jul 2000 14:50:41 PDT
Content-Type: text/plain; charset=US-ASCII
X-UIDL: H~S!!5gpd91GPe9hX)e9

Well, a new Java release is upon us, and with it comes a new CORBA implementation. I'm trying Java 2 SE 1.3 CORBA clients against an ILU 2.0beta1 CosNaming server, and we find that the Java ORB cannot reliably connect to the server. Why not?
First, we must analyze the IOR provided by the ILU service:

IOR:000000000000002849444C3A6F6D672E6F72672F436F734E616D696E672F4E616D696E67436F6E746578743A312E300000000002000000000000002F0001000000000016776174736F6E2E706172632E7865726F782E636F6D00270F0000000B4E616D6553657276696365000000000100000024000100000000000100000001000000140001001800010001000000000001010000000000

If we look at this (those who've received it un-truncated) we find that it advertises the following:

_IIOP_ParseCDR: byte order BigEndian, repository id <IDL:omg.org/CosNaming/NamingContext:1.0>, 2 profiles
_IIOP_ParseCDR: profile 1 is 47 bytes, tag 0 (INTERNET), BigEndian byte order
_IIOP_ParseCDR: profile 2 is 36 bytes, tag 1 (MULTIPLE COMPONENT), BigEndian byte order
(iiop.c:parse_IIOP_Profile): bo=BigEndian, version=1.0, hostname=watson.parc.xerox.com, port=9999, object_key=<NameService>
(iiop.c:parse_IIOP_Profile): encoded object key is <NameService>
(iiop.c:parse_IIOP_Profile): non-native cinfo is <iiop_1_0_1_NameService@tcp_watson.parc.xerox.com_9999>
(iiop.c:parse_MultiComponent_Profile): profile contains 1 component
(iiop.c:parse_MultiComponent_Profile): component 1 of type 1, 20 bytes
(iiop.c:parse_MultiComponent_Profile): native codeset for SHORT CHARACTER is 00010001, with 0 converters
(iiop.c:parse_MultiComponent_Profile): native codeset for CHARACTER is 00010100, with 0 converters

That is, there's a vanilla Internet profile (bo=BigEndian, version=1.0, hostname=watson.parc.xerox.com, port=9999, object_key=<NameService>), plus a Multicomponent profile, noting that the ILU ORB's native codesets are Latin-1 and UCS-2.

OK, great. Now we get the first message from the Java ORB:

0000 47 49 4f 50 01 00 00 00 00 00 01 00 GIOP........
0000 00 00 00 02 00 00 00 01 00 00 00 0c 00 00 00 00 ................
0010 00 01 00 20 00 01 01 00 00 00 00 06 00 00 00 90 ... ............
0020 00 00 00 00 00 00 00 28 49 44 4c 3a 6f 6d 67 2e .......(IDL:omg.
0030 6f 72 67 2f 53 65 6e 64 69 6e 67 43 6f 6e 74 65 org/SendingConte
0040 78 74 2f 43 6f 64 65 42 61 73 65 3a 31 2e 30 00 xt/CodeBase:1.0.
0050 00 00 00 01 00 00 00 00 00 00 00 54 00 01 01 00 ...........T....
0060 00 00 00 0c 31 33 2e 31 2e 31 30 33 2e 36 38 00 ....13.1.103.68.
0070 0e e9 00 00 00 00 00 18 af ab ca fe 00 00 00 02 ................
0080 67 d5 93 95 00 00 00 08 00 00 00 00 00 00 00 00 g...............
0090 00 00 00 01 00 00 00 01 00 00 00 14 00 00 00 00 ................
00a0 00 01 00 20 00 00 00 00 00 01 01 00 00 00 00 00 ... ............
00b0 00 00 00 05 01 00 00 00 00 00 00 07 53 79 6e 65 ............Syne
00c0 72 67 79 00 00 00 00 06 5f 69 73 5f 61 00 00 00 rgy....._is_a...
00d0 00 00 00 00 00 00 00 28 49 44 4c 3a 6f 6d 67 2e .......(IDL:omg.
00e0 6f 72 67 2f 43 6f 73 4e 61 6d 69 6e 67 2f 4e 61 org/CosNaming/Na
00f0 6d 69 6e 67 43 6f 6e 74 65 78 74 3a 31 2e 30 00 mingContext:1.0.

Note that we are seeing a CodeSets service context, even though the request is GIOP 1.0. The service context specifies a TCS-C of ASCII, and a TCS-W of UCS-2.

The question is, what should the server do with it?

First of all, there seems to be no way in which the algorithm in section 13.2.7.6 can result in the TCS-C specified in the service context. So perhaps this bug should be detected and signalled back to the sending ORB. How? Using CODESET_INCOMPATIBLE might make sense, but that really doesn't flag the bug in the client-side implementation of the codesets determination algorithm. Perhaps a straight COMM_FAILURE would be better. Opinions?

Secondly, since this is GIOP 1.0, the client could reasonably ignore the service context, and go ahead with using its default codeset (Latin-1). However, to do so risks comm failure down the line, as ASCII (the TCS-C assumed by the client) does not permit many Latin-1 characters. It seems better to flag this situation up front.
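
For readers who want to check the dump above by hand, here is a minimal sketch (not ILU or JDK code) that parses the 12-byte GIOP header and walks the service context list of a GIOP 1.0/1.1 Request until it finds a CodeSets context (context id 1). The function names, and the choice to feed it only the header plus the first 32 body bytes, are assumptions made for this example; the bytes themselves are copied from the first three rows of the dump.

```python
import struct

CODE_SETS_CONTEXT_ID = 1  # IOP::CodeSets service context id

# GIOP header plus the first 32 body bytes, copied from the dump above.
MESSAGE_PREFIX = bytes.fromhex(
    "47 49 4f 50 01 00 00 00 00 00 01 00 "              # GIOP header
    "00 00 00 02 00 00 00 01 00 00 00 0c 00 00 00 00 "  # body row 0000
    "00 01 00 20 00 01 01 00 00 00 00 06 00 00 00 90 "  # body row 0010
)


def parse_giop_header(message):
    """Return (version, little_endian, message_type, body_size)."""
    magic, major, minor, flags, message_type = struct.unpack_from(">4s4B", message, 0)
    if magic != b"GIOP":
        raise ValueError("not a GIOP message")
    # Octet 6 is a plain byte-order boolean in GIOP 1.0; bit 0 works for 1.1 too.
    little_endian = bool(flags & 1)
    (body_size,) = struct.unpack_from("<I" if little_endian else ">I", message, 8)
    return (major, minor), little_endian, message_type, body_size


def find_code_sets_context(message):
    """Return (version, tcs_c, tcs_w) for the first CodeSets context, or None."""
    version, little_endian, message_type, _ = parse_giop_header(message)
    if message_type != 0 or version > (1, 1):
        raise ValueError("sketch only handles GIOP 1.0/1.1 Request messages")
    order = "<" if little_endian else ">"
    pos = 12  # the body starts right after the 12-byte header
    (count,) = struct.unpack_from(order + "I", message, pos)
    pos += 4
    for _ in range(count):
        context_id, length = struct.unpack_from(order + "II", message, pos)
        pos += 8
        data = message[pos:pos + length]
        if context_id == CODE_SETS_CONTEXT_ID:
            # context_data is a CDR encapsulation: a byte-order octet,
            # padding to a 4-byte boundary, then char_data and wchar_data.
            inner = "<" if data[0] & 1 else ">"
            tcs_c, tcs_w = struct.unpack_from(inner + "II", data, 4)
            return version, tcs_c, tcs_w
        pos += length
        pos += -pos % 4  # the next context id is aligned to 4 bytes
    return None


version, tcs_c, tcs_w = find_code_sets_context(MESSAGE_PREFIX)
print(version, hex(tcs_c), hex(tcs_w))
```

For the request above this prints `(1, 0) 0x10020 0x10100`: a GIOP 1.0 request whose CodeSets context names code set 0x00010020 (ISO 646, i.e. ASCII) for char data and 0x00010100 (UCS-2) for wchar data, which is the observation the message goes on to discuss.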
Bill Sender: jon@corvette.floorboard.com Message-ID: <3963C325.85B60C86@floorboard.com> Date: Wed, 05 Jul 2000 16:22:13 -0700 From: Jonathan Biggar X-Mailer: Mozilla 4.73 [en] (X11; U; SunOS 5.5.1 sun4m) X-Accept-Language: en MIME-Version: 1.0 To: Bill Janssen CC: issues@omg.org, interop@omg.org, sbrawer@parc.xerox.com Subject: Re: interop issue: CodeSets service context in GIOP 1.0 request References: <00Jul5.145042pdt."3438"@watson.parc.xerox.com> Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset=us-ascii X-UIDL: N5Je99D_d9Rk\!!K#/!! Bill Janssen wrote: > Note that we are seeing a CodeSets service context, even though the > request is GIOP 1.0. The service context specifies a TCS-C of > ASCII, > and a TCS-W of UCS-2. > > The question is, what should the server do with it? > > First of all, there seems to be no way in which the algorithm in > section 13.2.7.6 can result in the TCS-C specified in the service > context. So perhaps this bug should be detected and signalled back > to > the sending ORB. How? Using CODESET_INCOMPATIBLE might make sense, > but that really doesn't flag the bug in the client-side > implementation > of the codesets determination algorithm. Perhaps a straight > COMM_FAILURE would be better. Opinions? I'd send back a GIOP MessageError to signal that the received message was incorrectly formatted and close the connection. That will also result in a COMM_FAILURE exception. > Secondly, since this is GIOP 1.0, the client could reasonably ignore > the service context, and go ahead with using its default codeset > (Latin-1). However, to do so risks comm failure down the line, as > ASCII (the TCS-C assumed by the client) does not permit many Latin-1 > characters. It seems better to flag this situation up front. The liberal in what you accept policy would suggest that you accept a correctly formatted Codeset service context, but in this case it isn't correct, so you need to reject it. 
I suppose you probably will also need to have an "accept stupid Java Codeset contexts" switch in order to allow for interoperability. :-( -- Jon Biggar Floorboard Software jon@floorboard.com jon@biggar.org Date: Wed, 05 Jul 2000 19:27:52 -0400 From: Bob Kukura Organization: IONA Technologies X-Mailer: Mozilla 4.73 [en] (Windows NT 5.0; U) X-Accept-Language: en MIME-Version: 1.0 To: Bill Janssen CC: interop@omg.org, sbrawer@parc.xerox.com Subject: Re: interop issue: CodeSets service context in GIOP 1.0 request References: <00Jul5.145042pdt."3438"@watson.parc.xerox.com> Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset=us-ascii X-UIDL: --,e9PImd9KTLe9C6O!! First, the IIOP client should absolutely without a doubt ignore the codeset component. The component is not in a profile supporting IIOP. For an IIOP client implementation to pay any attention to the component in the TAG_MULTIPLE_COMPONENTS profile would violate rules 1 and 2 quoted here from section 13.6.4 of the CORBA 2.4 draft I have, and which have been in CORBA since 2.0: 13.6.4 Profile and Component Composition in IORs The following rules augment the preceding discussion: 1. Profiles must be independent, complete, and self-contained. Their use shall not depend on information contained in another profile. 2. Any invocation uses information from exactly one profile. Second, the IIOP server should probably just ignore the service context, as it is not specified as being applicable to GIOP 1.0. But raising an exception may make more sense in this case, as you suggest. In this case, my opinion is that the client is rightfully ignoring the codeset component, and would probably include the same codeset service context even if the IOR contained no codeset component at all (have you tried this?). But I don't think the client should be including a codeset service context in a GIOP 1.0 message, and certainly cannot transmit any wide characters or strings over a GIOP 1.0 connection. 
The codeset service context might be considered harmless if its TCS-C matched the fixed codeset used in GIOP 1.0 for chars and strings. -Bob
To: Bob Kukura cc: interop@omg.org Subject: Re: interop issue: CodeSets service context in GIOP 1.0 request In-Reply-To: Your message of "Wed, 05 Jul 2000 16:27:52 PDT." <3963C478.83D3E83F@iona.com> From: Bill Janssen Message-Id: <00Jul5.164433pdt."3439"@watson.parc.xerox.com> Date: Wed, 5 Jul 2000 16:44:26 PDT Content-Type: text X-UIDL: 6:*e9^+Y!!]^ed9kY7!! > In this case, my opinion is that the client is rightfully ignoring the > codeset component, and would probably include the same codeset service > context even if the IOR contained no codeset component at all (have you > tried this?). I'll test this, and report back. > But I don't think the client should be including a codeset > service context in a GIOP 1.0 message, and certainly cannot transmit > any > wide characters or strings over a GIOP 1.0 connection. The codeset > service context might be considered harmless if its TCS-C matched > the > fixed codeset used in GIOP 1.0 for chars and strings. Yep. Bill Date: Thu, 06 Jul 2000 18:33:33 +0100 From: Simon Nash Organization: IBM X-Mailer: Mozilla 4.72 [en] (Windows NT 5.0; I) X-Accept-Language: en MIME-Version: 1.0 To: Jonathan Biggar CC: Bill Janssen , interop@omg.org, sbrawer@parc.xerox.com Subject: Re: interop issue: CodeSets service context in GIOP 1.0 request References: <00Jul5.145042pdt."3438"@watson.parc.xerox.com> <3963C325.85B60C86@floorboard.com> Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset=us-ascii X-UIDL: e= > Bill Janssen wrote: > > > Note that we are seeing a CodeSets service context, even though the > > request is GIOP 1.0.
The service context specifies a TCS-C of ASCII, > > and a TCS-W of UCS-2. > > > > The question is, what should the server do with it? > > > > First of all, there seems to be no way in which the algorithm in > > section 13.2.7.6 can result in the TCS-C specified in the service > > context. So perhaps this bug should be detected and signalled back to > > the sending ORB. How? Using CODESET_INCOMPATIBLE might make sense, > > but that really doesn't flag the bug in the client-side implementation > > of the codesets determination algorithm. Perhaps a straight > > COMM_FAILURE would be better. Opinions? > > I'd send back a GIOP MessageError to signal that the received message > was incorrectly formatted and close the connection. That will also > result in a COMM_FAILURE exception. > It is not incorrectly formatted. It is correctly formatted but its contents are wrong. I think the closest exception for this case is BAD_PARAM. The following is from section 13.7.2.6: If a client transmits wide character data and does not specify its wchar transmission code set in the service context, then the server-side ORB raises exception BAD_PARAM. This isn't the case under discussion, but it is a violation by the client of the rules, detected by the server, and as such is similar. However the point is moot in this particular case since the server should ignore this service context if sent on a GIOP 1.0 message. > > Secondly, since this is GIOP 1.0, the client could reasonably ignore > > the service context, and go ahead with using its default codeset > > (Latin-1). However, to do so risks comm failure down the line, as > > ASCII (the TCS-C assumed by the client) does not permit many Latin-1 > > characters. It seems better to flag this situation up front. > > The liberal in what you accept policy would suggest that you accept a > correctly formatted Codeset service context, but in this case it isn't > correct, so you need to reject it. 
I suppose you probably will also > need to have an "accept stupid Java Codeset contexts" switch in order to > allow for interoperability. :-( > I think the client is wrong to send the service context in this case (irrespective of its contents) and the server is also wrong to look at it. Simon -- Simon C Nash, Technology Architect, IBM Java Technology Centre Tel. +44-1962-815156 Fax +44-1962-818999 Hursley, England Internet: nash@hursley.ibm.com Lotus Notes: Simon Nash@ibmgb To: Simon Nash cc: Jonathan Biggar , interop@omg.org, sbrawer@parc.xerox.com Subject: Re: interop issue: CodeSets service context in GIOP 1.0 request In-Reply-To: Your message of "Thu, 06 Jul 2000 10:33:33 PDT." <3964C2ED.A1858F3@hursley.ibm.com> From: Bill Janssen Message-Id: <00Jul6.160213pdt."3438"@watson.parc.xerox.com> Date: Thu, 6 Jul 2000 16:02:09 PDT Content-Type: text X-UIDL: B!K!! It is not incorrectly formatted. It is correctly formatted but its contents > are wrong. I think the closest exception for this case is BAD_PARAM. The > following is from section 13.7.2.6: > If a client transmits wide character data and does not specify its wchar > transmission code set in the service context, then the server-side ORB raises > exception BAD_PARAM. > This isn't the case under discussion, but it is a violation by the client of > the rules, detected by the server, and as such is similar. However the point > is moot in this particular case since the server should ignore this service > context if sent on a GIOP 1.0 message. Yes, I think I agree with Simon on all this. However, I still hate to ignore it. The two sides of the connection are assuming different character sets. This can't be good for interoperability. Surely the client has to use Latin-1 over GIOP 1.0, not ASCII? That seems to me to lean towards Jon's suggestion of sending a MessageError. 
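
The character-set mismatch under discussion can be shown in a few lines. This is only an illustrative sketch using Python's built-in codecs, not ORB code; the helper name and the sample strings are made up for the example. A peer that treats GIOP 1.0 char data as Latin-1 may legitimately send octets above 0x7F, which a receiver assuming ASCII cannot decode, while pure ASCII data is valid under both assumptions.

```python
def decodes_as(octets, codec):
    """Return True if the octets are valid in the given Python codec."""
    try:
        octets.decode(codec)
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical sample payloads, encoded as a Latin-1 sender would encode them.
ascii_only = "NameService".encode("latin-1")
with_latin1 = "Z\u00fcrich".encode("latin-1")  # contains the octet 0xFC

print(decodes_as(ascii_only, "ascii"), decodes_as(ascii_only, "latin-1"))
print(decodes_as(with_latin1, "ascii"), decodes_as(with_latin1, "latin-1"))
```

The first line prints `True True` and the second `False True`: the two sides only disagree once a character outside the ASCII range is actually transmitted.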
Bill Date: Fri, 07 Jul 2000 09:45:29 +0100 From: Simon Nash Organization: IBM X-Mailer: Mozilla 4.72 [en] (Windows NT 5.0; I) X-Accept-Language: en MIME-Version: 1.0 To: Bill Janssen CC: Jonathan Biggar , interop@omg.org, sbrawer@parc.xerox.com Subject: Re: interop issue: CodeSets service context in GIOP 1.0 request References: <00Jul6.160213pdt."3438"@watson.parc.xerox.com> Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset=us-ascii X-UIDL: K&4e9i3Fe9?*=e9Fjnd9 Bill, I believe ASCII is a proper subset of Latin-1, so as long as only characters in the ASCII range are transmitted there should be no problems. However, I still think the server has no business looking inside this service context and attempting to validate it when it is sent on a GIOP 1.0 request. GIOP 1.0 clients are obliged to support Latin-1, whatever random nonsense they may happen to send in nonstandard service contexts. This would be a much more interesting discussion if the negotiation exchange were taking place over GIOP 1.1 or 1.2. Simon Bill Janssen wrote: > > > It is not incorrectly formatted. It is correctly formatted but its contents > > are wrong. I think the closest exception for this case is BAD_PARAM. The > > following is from section 13.7.2.6: > > If a client transmits wide character data and does not specify its wchar > > transmission code set in the service context, then the server-side ORB raises > > exception BAD_PARAM. > > This isn't the case under discussion, but it is a violation by the client of > > the rules, detected by the server, and as such is similar. However the point > > is moot in this particular case since the server should ignore this service > > context if sent on a GIOP 1.0 message. > > Yes, I think I agree with Simon on all this. > > However, I still hate to ignore it. The two sides of the connection are > assuming different character sets. This can't be good for interoperability. > Surely the client has to use Latin-1 over GIOP 1.0, not ASCII? 
That seems to me to lean towards Jon's suggestion of sending a MessageError.
>
> Bill

--
Simon C Nash, Technology Architect, IBM Java Technology Centre
Tel. +44-1962-815156  Fax +44-1962-818999
Hursley, England
Internet: nash@hursley.ibm.com  Lotus Notes: Simon Nash@ibmgb

To: Simon Nash
cc: Jonathan Biggar, interop@omg.org, sbrawer@parc.xerox.com
Subject: Re: interop issue: CodeSets service context in GIOP 1.0 request
In-Reply-To: Your message of "Fri, 07 Jul 2000 01:45:29 PDT." <396598A9.46271D1A@hursley.ibm.com>
From: Bill Janssen
Message-Id: <00Jul7.091430pdt."3438"@watson.parc.xerox.com>
Date: Fri, 7 Jul 2000 09:14:22 PDT
Content-Type: text

> Bill,
> I believe ASCII is a proper subset of Latin-1, so as long as only
> characters in the ASCII range are transmitted there should be no problems.

The key word here is `subset', Simon. What you say is of course correct, but I don't think that the two code sets are "compatible" in the sense of chapter 13.

> However, I still think the server has no business looking inside this
> service context and attempting to validate it when it is sent on a
> GIOP 1.0 request.

The spec says that GIOP 1.0 servers `may' ignore the service context, not that they are obliged to.

> GIOP 1.0 clients are obliged to support Latin-1,
> whatever random nonsense they may happen to send in nonstandard
> service contexts.

Yes, of course. I hope that the Java folks at Sun share this understanding.

Thanks.

Bill

Date: Fri, 07 Jul 2000 09:47:31 -0700
From: "M. Mortazavi"
X-Mailer: Mozilla 4.7 [en] (WinNT; I)
X-Accept-Language: en
MIME-Version: 1.0
To: Bill Janssen
CC: Simon Nash, Jonathan Biggar, interop@omg.org, sbrawer@parc.xerox.com
Subject: Re: interop issue: CodeSets service context in GIOP 1.0 request
References: <00Jul7.091430pdt."3438"@watson.parc.xerox.com>
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset=us-ascii

Bill Janssen wrote:

> > Bill,
> > I believe ASCII is a proper subset of Latin-1, so as long as only
> > characters in the ASCII range are transmitted there should be no problems.
>
> The key word here is `subset', Simon. What you say is of course correct,
> but I don't think that the two code sets are "compatible" in the sense
> of chapter 13.
>
> > However, I still think the server has no business looking inside this
> > service context and attempting to validate it when it is sent on a
> > GIOP 1.0 request.
>
> The spec says that GIOP 1.0 servers `may' ignore the service context,
> not that they are obliged to.
>
> > GIOP 1.0 clients are obliged to support Latin-1,
> > whatever random nonsense they may happen to send in nonstandard
> > service contexts.
>
> Yes, of course. I hope that the Java folks at Sun share this
> understanding.

I need to hear more about this and why it is the case. We're working on these and other issues and have been keeping an ear tuned to this conversation.

Roger - M.

> Thanks.
>
> Bill

Date: Fri, 07 Jul 2000 15:05:10 -0400
From: Jishnu Mukerji
Reply-To: jis@fpk.hp.com
Organization: Hewlett-Packard EIAL, Florham Park NJ USA
X-Mailer: Mozilla 4.61 [en] (WinNT; I)
X-Accept-Language: en
MIME-Version: 1.0
To: Peter Walker
Cc: interop@emerald.omg.org
Subject: Re: issue 3681 -- Interop RTF issue
References: <4.2.0.58.20000707141951.00bc6100@emerald.omg.org>
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset=us-ascii

> This is issue # 3681
>
> interop issue: CodeSets service context in GIOP 1.0 request
>
> Well, a new Java release is upon us, and with it comes a new CORBA
> implementation. I'm trying Java 2 SE 1.3 CORBA clients against an ILU
> 2.0beta1 CosNaming server, and we find that the Java ORB cannot
> reliably connect to the server. Why not? First, we must analyze the
> IOR provided by the ILU service:

[rest elided for brevity]

The bottom line excerpted from a different message from Bill Janssen:

> There's no way that the TCS-C determination algorithm in Chapter 13
> of the CORBA spec could result in a TCS-C of ASCII, given the situation,
> yet that's what the JDK 1.3 orb is sending in its CodeSets service
> context.
>
> In particular, when using GIOP 1.0, the client is effectively obliged
> to use Latin-1 as the TCS-C, since there is no guarantee that the
> server will notice or conform to any other TCS-C selected by the
> client.

This seems to me like a Java 2 SE 1.3 bug. Does the propose/dispose mechanism that has been set up between Sun and the OMG provide for feeding back bug reports to Sun and having them run it through a Java standards process moral equivalent of OMG's "Urgent bugfix process"? That is what needs to happen to fix this particular problem, I think.

Any thoughts on this matter, Peter?

Jishnu.

To: "M. Mortazavi"
Cc: sbrawer@parc.xerox.com, interop@omg.org
Subject: Re: interop issue: CodeSets service context in GIOP 1.0 request
In-Reply-To: Your message of "Fri, 07 Jul 2000 09:47:31 PDT." <396609A3.B9F4C37E@eng.sun.com>
From: Bill Janssen
Message-Id: <00Jul7.105033pdt."3438"@watson.parc.xerox.com>
Date: Fri, 7 Jul 2000 10:50:26 PDT
Content-Type: text

Thanks, Masood. I believe my understanding of the problem is summed up in my first message:

There's no way that the TCS-C determination algorithm in Chapter 13 of the CORBA spec could result in a TCS-C of ASCII, given the situation, yet that's what the JDK 1.3 orb is sending in its CodeSets service context.

In particular, when using GIOP 1.0, the client is effectively obliged to use Latin-1 as the TCS-C, since there is no guarantee that the server will notice or conform to any other TCS-C selected by the client.

Bill

Date: Tue, 18 Jul 2000 11:14:31 -0700
From: "M. Mortazavi"
X-Mailer: Mozilla 4.7 [en] (WinNT; I)
X-Accept-Language: en
MIME-Version: 1.0
To: Bill Janssen
CC: Simon Nash, Jonathan Biggar, interop@omg.org, sbrawer@parc.xerox.com
Subject: Re: interop issue: CodeSets service context in GIOP 1.0 request
References: <00Jul7.091430pdt."3438"@watson.parc.xerox.com>
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset=iso-8859-1

Bill Janssen wrote:

> > Bill,
> > I believe ASCII is a proper subset of Latin-1, so as long as only
> > characters in the ASCII range are transmitted there should be no problems.
>
> The key word here is `subset', Simon. What you say is of course correct,
> but I don't think that the two code sets are "compatible" in the sense
> of chapter 13.

The compatibility conditions appear to be quite vague. 13.9.1 says:

"Compatibility" is determined with respect to two code sets by examining their entries in the registry, paying special attention to the character sets encoded by each code set. For each of the two code sets, an attempt is made to see if there is at least one (fuzzy-defined) character set in common, and if such a character set is found, then the assumption is made that these code sets are "compatible." Obviously, applications which exploit parts of a character set not properly encoded in this scheme will suffer information loss when communicating with another application in this "fuzzy" scheme.

I could never grasp how a "fuzzy" scheme could be relied on when interoperability is held to high and strict standards.

M.

> > However, I still think the server has no business looking inside this
> > service context and attempting to validate it when it is sent on a
> > GIOP 1.0 request.
>
> The spec says that GIOP 1.0 servers `may' ignore the service context,
> not that they are obliged to.
>
> > GIOP 1.0 clients are obliged to support Latin-1,
> > whatever random nonsense they may happen to send in nonstandard
> > service contexts.
>
> Yes, of course. I hope that the Java folks at Sun share this
> understanding.
>
> Thanks.
>
> Bill

Date: Tue, 18 Jul 2000 12:06:00 -0700
From: "M. Mortazavi"
X-Mailer: Mozilla 4.7 [en] (WinNT; I)
X-Accept-Language: en
MIME-Version: 1.0
To: Bill Janssen
CC: Simon Nash, Jonathan Biggar, interop@omg.org, sbrawer@parc.xerox.com
Subject: Re: interop issue: CodeSets service context in GIOP 1.0 request
References: <00Jul6.160213pdt."3438"@watson.parc.xerox.com>
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset=us-ascii

Bill -

Bill Janssen wrote:

> > It is not incorrectly formatted. It is correctly formatted but its contents
> > are wrong. I think the closest exception for this case is BAD_PARAM. The
> > following is from section 13.7.2.6:
> > If a client transmits wide character data and does not specify its wchar
> > transmission code set in the service context, then the server-side ORB raises
> > exception BAD_PARAM.
> > This isn't the case under discussion, but it is a violation by the client of
> > the rules, detected by the server, and as such is similar. However the point
> > is moot in this particular case since the server should ignore this service
> > context if sent on a GIOP 1.0 message.
>
> Yes, I think I agree with Simon on all this.
>
> However, I still hate to ignore it. The two sides of the connection are
> assuming different character sets. This can't be good for interoperability.
> Surely the client ORB
> has to use Latin-1 over GIOP 1.0, not ASCII?

or throw CODESET_INCOMPATIBLE to the client thread, assuming that "conversion from the client native code set to the fallback, and the fallback to the server native code set would result in massive data loss," but then there's the apparent agreement that Latin-1 should always be supported by a GIOP 1.0 client.

13.7.2.1 says:

"Backward compatibility. In previous CORBA specifications, IDL type char was limited to ISO 8859-1. The conversion framework should be compatible with existing clients and servers that use ISO 8859-1 as the code set for char."

This seems to call for Latin-1, for the purposes of _backward_ compatibility. But the "3.2 Lexical Conventions" section still leaves me in a quandary about the "backward"ness of this compatibility.

If I'm not mistaken, there seems to be quite a mess here in the specs. (Note that I'm quoting from the formal CORBA 2.3 spec.)

Cheers,

M.

> That seems
> to me to lean towards Jon's suggestion of sending a MessageError.
>
> Bill

To: "M. Mortazavi"
cc: Simon Nash, Jonathan Biggar, interop@omg.org, sbrawer@parc.xerox.com
Subject: Re: interop issue: CodeSets service context in GIOP 1.0 request
In-Reply-To: Your message of "Tue, 18 Jul 2000 12:06:00 PDT." <3974AA98.73DFF779@eng.sun.com>
From: Bill Janssen
Message-Id: <00Jul18.122544pdt."3438"@watson.parc.xerox.com>
Date: Tue, 18 Jul 2000 12:25:34 PDT
Content-Type: text

> If I'm not mistaken, there seems to be quite a mess here in the specs. (Note that I'm
> quoting from the formal CORBA 2.3 spec.)

No, I don't think you're mistaken :-).

Bill

From: "Rutt, T E (Tom)"
To: "'interop@omg.org'"
Cc: "Rutt, T E (Tom)"
Subject: Wordsmith on issue 2681 resolution
Date: Tue, 7 Nov 2000 12:18:37 -0500
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain; charset="iso-8859-1"

Please look this over; if there are no comments I will put this Proposal out for vote at the end of the week.

--------

Issue 3681: interop issue: CodeSets service context in GIOP 1.0 request (interop)

Source: Xerox (Mr. William C. Janssen, Jr., janssen@parc.xerox.com)
Nature: Uncategorized Issue
Severity:
Summary:
Well, a new Java release is upon us, and with it comes a new CORBA implementation.
I'm trying Java 2 SE 1.3 CORBA clients against an ILU 2.0beta1 CosNaming server, and we find that the Java ORB cannot reliably connect to the server. Why not? First, we must analyze the IOR provided by the ILU service:

IOR:000000000000002849444C3A6F6D672E6F72672F436F734E616D696E672F4E616
D696E67436F6E746578743A312E300000000002000000000000002F00010000000000
16776174736F6E2E706172632E7865726F782E636F6D00270F0000000B4E616D65536
572766963650000000001000000240001000000000001000000010000001400010018
00010001000000000001010000000000

If we look at this (those who've received it un-truncated) we find that it advertises the following:

_IIOP_ParseCDR: byte order BigEndian, repository id <IDL:omg.org/CosNaming/NamingContext:1.0>, 2 profiles
_IIOP_ParseCDR: profile 1 is 47 bytes, tag 0 (INTERNET), BigEndian byte order
_IIOP_ParseCDR: profile 2 is 36 bytes, tag 1 (MULTIPLE COMPONENT), BigEndian byte order
(iiop.c:parse_IIOP_Profile): bo=BigEndian, version=1.0, hostname=watson.parc.xerox.com, port=9999, object_key=<NameService>
(iiop.c:parse_IIOP_Profile): encoded object key is <NameService>
(iiop.c:parse_IIOP_Profile): non-native cinfo is <iiop_1_0_1_NameService@tcp_watson.parc.xerox.com_9999>
(iiop.c:parse_MultiComponent_Profile): profile contains 1 component
(iiop.c:parse_MultiComponent_Profile): component 1 of type 1, 20 bytes
(iiop.c:parse_MultiComponent_Profile): native codeset for SHORT CHARACTER is 00010001, with 0 converters
(iiop.c:parse_MultiComponent_Profile): native codeset for CHARACTER is 00010100, with 0 converters

That is, there's a vanilla Internet profile (bo=BigEndian, version=1.0, hostname=watson.parc.xerox.com, port=9999, object_key=<NameService>), plus a Multicomponent profile, noting that the ILU ORB's native codesets are Latin-1 and UCS-2.

OK, great. Now we get the first message from the Java ORB:

0000 47 49 4f 50 01 00 00 00 00 00 01 00              GIOP........

0000 00 00 00 02 00 00 00 01 00 00 00 0c 00 00 00 00  ................
0010 00 01 00 20 00 01 01 00 00 00 00 06 00 00 00 90  ... ............
0020 00 00 00 00 00 00 00 28 49 44 4c 3a 6f 6d 67 2e  .......(IDL:omg.
0030 6f 72 67 2f 53 65 6e 64 69 6e 67 43 6f 6e 74 65  org/SendingConte
0040 78 74 2f 43 6f 64 65 42 61 73 65 3a 31 2e 30 00  xt/CodeBase:1.0.
0050 00 00 00 01 00 00 00 00 00 00 00 54 00 01 01 00  ...........T....
0060 00 00 00 0c 31 33 2e 31 2e 31 30 33 2e 36 38 00  ....13.1.103.68.
0070 0e e9 00 00 00 00 00 18 af ab ca fe 00 00 00 02  ................
0080 67 d5 93 95 00 00 00 08 00 00 00 00 00 00 00 00  g...............
0090 00 00 00 01 00 00 00 01 00 00 00 14 00 00 00 00  ................
00a0 00 01 00 20 00 00 00 00 00 01 01 00 00 00 00 00  ... ............
00b0 00 00 00 05 01 00 00 00 00 00 00 07 53 79 6e 65  ............Syne
00c0 72 67 79 00 00 00 00 06 5f 69 73 5f 61 00 00 00  rgy....._is_a...
00d0 00 00 00 00 00 00 00 28 49 44 4c 3a 6f 6d 67 2e  .......(IDL:omg.
00e0 6f 72 67 2f 43 6f 73 4e 61 6d 69 6e 67 2f 4e 61  org/CosNaming/Na
00f0 6d 69 6e 67 43 6f 6e 74 65 78 74 3a 31 2e 30 00  mingContext:1.0.

Note that we are seeing a CodeSets service context, even though the request is GIOP 1.0. The service context specifies a TCS-C of ASCII, and a TCS-W of UCS-2. The question is, what should the server do with it?

First of all, there seems to be no way in which the algorithm in section 13.2.7.6 can result in the TCS-C specified in the service context. So perhaps this bug should be detected and signalled back to the sending ORB. How? Using CODESET_INCOMPATIBLE might make sense, but that really doesn't flag the bug in the client-side implementation of the codesets determination algorithm. Perhaps a straight COMM_FAILURE would be better. Opinions?

Secondly, since this is GIOP 1.0, the client could reasonably ignore the service context, and go ahead with using its default codeset (Latin-1). However, to do so risks comm failure down the line, as ASCII (the TCS-C assumed by the client) does not permit many Latin-1 characters. It seems better to flag this situation up front.

Discussion:

The IOR does not have the codeset component in the INTEROP IOR Profile. A client may choose to ignore the MULTIPLE COMPONENTS profile safely. The Interop chapters require profiles to be complete; the MULTIPLE COMPONENTS profile is not complete, since it has no addressing information or object key.

The server should not process a codeset service context in GIOP 1.0, if present. If the client sends other than LATIN-1 in GIOP 1.0, errors may arise. They should be dealt with in the response to each request message, rather than in the processing of the service context in the first message.

We need to clarify what is valid within the GIOP protocol for the server to do in this situation.

The client should not be including a codeset service context in a GIOP 1.0 message, and certainly cannot transmit any wide characters or strings over a GIOP 1.0 connection.

The codeset service context might be considered harmless if its TCS-C matched the fixed codeset used in GIOP 1.0 for chars and strings.

The client is wrong to send the service context in this case of GIOP 1.0 (irrespective of its contents).

13.7.2 of Core states:

"
The following are the rules for processing a received service context:
* The service context is in the OMG defined range:
  * If it is valid for the supported GIOP version, then it must be processed correctly according to the rules associated with it for that GIOP version level.
  * If it is not valid for the GIOP version, then it may be ignored by the receiving ORB, however it must be passed on through a bridge and must be made available to interceptors. No exception shall be raised.
"

Based on the above quote, the server is also wrong to look at it.

If the client "pumps" ASCII over the connection, the server will have no problems, since ASCII is a subset of Latin-1. However, the server might "pump back" string return results or out parameters using the fully extended 8 bit space of Latin-1, and the client has to deal with it.
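For reference, the CodeSets service context under discussion can be read straight out of the first 36 bytes of the captured request. The following is only a rough Python sketch, not ORB code: the function and table names are invented for illustration, it handles just the first context in the list, and it assumes the code set ids are the registry values seen in this thread (0x00010001 for Latin-1, 0x00010100 for UCS-2) plus 0x00010020 for ASCII.

```python
import struct

# First 36 bytes of the captured request, copied from the hex dump:
# the 12-byte GIOP header, then the start of the service context list.
CAPTURE = bytes.fromhex(
    "47494f50 01 00 00 00 00000100"  # "GIOP", version 1.0, byte order 0, type 0 (Request), 256-byte body
    "00000002"                       # two service contexts follow
    "00000001 0000000c"              # context id 1 (CodeSets), 12 bytes of data
    "00000000 00010020 00010100"     # byte-order octet + padding, char id, wchar id
)

# Labels for the ids seen in this thread (an illustrative subset only).
CODE_SET_LABELS = {
    0x00010001: "ISO 8859-1 (Latin-1)",
    0x00010020: "ISO 646 IRV (ASCII)",
    0x00010100: "UCS-2",
}


def decode_first_context(data):
    """Decode the GIOP header and the first service context of a request."""
    magic, major, minor, byte_order, _msg_type = struct.unpack_from(">4sBBBB", data, 0)
    if magic != b"GIOP":
        raise ValueError("not a GIOP message")
    order = "<" if byte_order & 1 else ">"
    (body_size,) = struct.unpack_from(order + "I", data, 8)
    count, context_id, length = struct.unpack_from(order + "III", data, 12)
    # The context data is a CDR encapsulation: its first octet is its own
    # byte-order flag, and the two ids are 4-byte aligned after it.
    encap = data[24:24 + length]
    inner = "<" if encap[0] & 1 else ">"
    tcs_c, tcs_w = struct.unpack_from(inner + "II", encap, 4)
    return {
        "version": (major, minor),
        "body_size": body_size,
        "context_count": count,
        "context_id": context_id,
        "tcs_c": tcs_c,
        "tcs_w": tcs_w,
    }


def fits_in_ascii(octets):
    """True if every octet is in the 7-bit range shared by ASCII and Latin-1."""
    return all(b < 0x80 for b in octets)


if __name__ == "__main__":
    info = decode_first_context(CAPTURE)
    print("GIOP %d.%d" % info["version"])
    print("TCS-C:", CODE_SET_LABELS.get(info["tcs_c"], hex(info["tcs_c"])))
    print("TCS-W:", CODE_SET_LABELS.get(info["tcs_w"], hex(info["tcs_w"])))
    # A Latin-1 reply containing an accented character is outside ASCII.
    print(fits_in_ascii("caf\u00e9".encode("latin-1")))
```

The last call illustrates the asymmetry noted above: ASCII sent by the client is always valid Latin-1 for the server, but a Latin-1 string coming back may contain octets at or above 0x80 that a client expecting ASCII has to cope with.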
We also need to use the propose/dispose mechanism that has been set up between Sun and the OMG to feed bug reports back to Sun and have them run this through a Java standards process that is the moral equivalent of OMG's "Urgent bugfix process". That also needs to happen to fix this particular problem.

Resolution:

Section 13.7.2 makes it very clear that the server should not raise an exception if a codeset service context is contained in a GIOP 1.0 request message. No clarification is needed in that section.

If a client or a server sends character data which is not encoded as LATIN-1, the receiver will not be able to detect that it is not LATIN-1 by examining the received sequence of octets. Thus nothing can be stated for this case.

However, if an operation with Wchar parameter data is sent over GIOP 1.0 by a client, the server must generate a "BAD PARAM" exception. We need minor code for this case.

If a server sends Wchar or wstring data using GIOP 1.0 the client should close the connection.

Proposed Revised Text:

At the end of section 15.3.1.6 add the following:

"
If a client sends wchar data in a GIOP 1.0 message, the server shall generate a BAD_PARAM standard system exception, with standard minor code 25+i.

If a server sends wchar data in a GIOP 1.0 response, the client shall close the connection.
"

Add the following minor code to table 4-3 for BAD_PARAM Standard System exception:

"
BAD_PARAM 25+i wchar or wstring data not allowed in GIOP 1.0 message
"

Actions taken:
July 5, 2000: received issue

To: "Rutt, T E (Tom)"
cc: "'interop@omg.org'"
Subject: Re: Wordsmith on issue 2681 resolution
Date: Tue, 07 Nov 2000 17:57:29 +0000
From: Craig Ryan
Content-Type: text

>However, if an operation with Wchar parameter data is sent over GIOP 1.0 by
>a client,
>the server must generate a "BAD PARAM" exception. We need minor code for
>this case.
>
>If a server sends Wchar or wstring data using GIOP 1.0 the client should
>close the connection.

In either case this assumes the act of marshalling and sending the wide data is in itself legal. Why is this not simply a MARSHAL condition on the sending side when attempting to marshal wide data into a 1.0 CDR stream, ie even before the message is transmitted?

regards,
craig.

From: "Rutt, T E (Tom)"
To: "'interop@omg.org'"
Cc: "Rutt, T E (Tom)"
Subject: wordsmithing issue 3861 proposed resolution
Date: Tue, 7 Nov 2000 13:29:00 -0500
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain; charset="iso-8859-1"

Please review this proposed resolution. If I have no comments before Friday noon, I will put it out for vote of interop rtf.

Tom Rutt

Date: Tue, 07 Nov 2000 13:36:06 -0500
From: Jishnu Mukerji
Organization: Hewlett-Packard EIAL, Florham Park NJ USA
X-Mailer: Mozilla 4.73 [en] (WinNT; U)
X-Accept-Language: en
MIME-Version: 1.0
To: Craig Ryan
Cc: "Rutt, T E (Tom)", "'interop@omg.org'"
Subject: Re: Wordsmith on issue 2681 resolution
References: <200011071747.RAA18744@dublin.iona.ie>
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset=us-ascii

Craig Ryan wrote:
>
> >However, if an operation with Wchar parameter data is sent over GIOP 1.0 by
> >a client,
> >the server must generate a "BAD PARAM" exception. We need minor code for
> >this case.
> >
> >If a server sends Wchar or wstring data using GIOP 1.0 the client should
> >close the connection.
>
> In either case this assumes the act of marshalling and sending the
> wide data is in itself legal. Why is this not simply a MARSHAL
> condition on the sending side when attempting to marshal wide data
> into a 1.0 CDR stream, ie even before the message is transmitted?

It should be. However, the issue at hand is about a buggy sending side doing strange things and shipping stuff over to the hapless receiving side. The question is what should the receiving side do?

Although, I am still not convinced that it should be a BAD_PARAM. IMHO it is a MARSHAL error, since said parameter cannot be meaningfully unmarshaled, no?

BTW for the:

> >If a server sends Wchar or wstring data using GIOP 1.0 the client should
> >close the connection.

The client ORB will also need to say something to the invoker that caused the said malformed message to be sent from the server to the client. We need to specify that. Perhaps it should be a MARSHAL exception with a minor code stating something like "reply message botched by server" or some such?

Jishnu.

Sender: jon@corvette.floorboard.com
Message-ID: <3A084829.A0CA31BA@floorboard.com>
Date: Tue, 07 Nov 2000 10:21:29 -0800
From: Jonathan Biggar
X-Mailer: Mozilla 4.76 [en] (X11; U; SunOS 5.7 sun4u)
X-Accept-Language: en
MIME-Version: 1.0
To: Craig Ryan
CC: "Rutt, T E (Tom)", "'interop@omg.org'"
Subject: Re: Wordsmith on issue 2681 resolution
References: <200011071747.RAA18744@dublin.iona.ie>
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset=us-ascii

Craig Ryan wrote:
>
> >However, if an operation with Wchar parameter data is sent over GIOP 1.0 by
> >a client,
> >the server must generate a "BAD PARAM" exception. We need minor code for
> >this case.
> >
> >If a server sends Wchar or wstring data using GIOP 1.0 the client should
> >close the connection.
>
> In either case this assumes the act of marshalling and sending the
> wide data is in itself legal. Why is this not simply a MARSHAL
> condition on the sending side when attempting to marshal wide data
> into a 1.0 CDR stream, ie even before the message is transmitted?

Actually, I consider it a "should never happen" condition, since a robust ORB implementation would never choose a GIOP 1.0 connection to transmit a request that has wchar data.

--
Jon Biggar
Floorboard Software
jon@floorboard.com
jon@biggar.org

From: "Rutt, T E (Tom)"
To: Craig Ryan, "'Jishnu Mukerji'"
Cc: "Rutt, T E (Tom)", "'interop@omg.org'"
Subject: RE: Wordsmith on issue 2681 resolution
Date: Tue, 7 Nov 2000 17:15:40 -0500
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain

> >However, if an operation with Wchar parameter data is sent over GIOP 1.0 by
> >a client,
> >the server must generate a "BAD PARAM" exception. We need minor code for
> >this case.
> >
> >If a server sends Wchar or wstring data using GIOP 1.0 the client should
> >close the connection.
>
> In either case this assumes the act of marshalling and sending the
> wide data is in itself legal. Why is this not simply a MARSHAL
> condition on the sending side when attempting to marshal wide data
> into a 1.0 CDR stream, ie even before the message is transmitted?

It should be. However, the issue at hand is about a buggy sending side doing strange things and shipping stuff over to the hapless receiving side. The question is what should the receiving side do?

Although, I am still not convinced that it should be a BAD_PARAM. IMHO it is a MARSHAL error, since said parameter cannot be meaningfully unmarshaled, no?

In GIOP 1.1 and later, the BAD_PARAM exception is already specified for use when the client sends wchar/wstring without a service context being sent on the connection. See minor code 23 (or near there) for BAD_PARAM.

BTW for the:

> >If a server sends Wchar or wstring data using GIOP 1.0 the client should
> >close the connection.

The client ORB will also need to say something to the invoker that caused the said malformed message to be sent from the server to the client. We need to specify that. Perhaps should be a MARSHAL exception with a minor code stating something like "reply message botched by server" or some such?

I agree, Jishnu.

Date: Thu, 9 Nov 2000 17:20:10 +1000 (EST)
From: Michi Henning
To: Jishnu Mukerji
cc: Craig Ryan, "Rutt, T E (Tom)", "'interop@omg.org'"
Subject: Re: Wordsmith on issue 2681 resolution
In-Reply-To: <3A084B96.AE6F0AEA@hp.com>
Message-ID:
Organization: Object Oriented Concepts
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII

On Tue, 7 Nov 2000, Jishnu Mukerji wrote:

> > In either case this assumes the act of marshalling and sending the
> > wide data is in itself legal. Why is this not simply a MARSHAL
> > condition on the sending side when attempting to marshal wide data
> > into a 1.0 CDR stream, ie even before the message is transmitted?
>
> It should be. However, the issue at hand is about a buggy sending side
> doing strange things and shipping stuff over to the hapless receiving
> side. The question is what should the receiving side do?
>
> Although, I am still not convinced that it should be a BAD_PARAM. IMHO
> it is a MARSHAL error, since said parameter cannot be meaningfully
> unmarshaled, no?

I agree. BAD_PARAM would be wrong because it is meant to indicate a valid parameter with an unacceptable value. However, sending wchar over IIOP 1.0 is a fundamental protocol violation. From the spec:

"MARSHAL: A request or reply from the network is structurally invalid."

That hits the mark exactly.

> BTW for the:
>
> > >If a server sends Wchar or wstring data using GIOP 1.0 the client should
> > >close the connection.

No need to close the connection. That's at the discretion of the ORB.

> The client ORB will also need to say something to the invoker that
> caused the said malformed message to be sent from the server to the
> client. We need to specify that. Perhaps should be a MARSHAL exception
> with a minor code stating something like "reply message botched by
> server" or some such?

Yes. MARSHAL is what should be raised in the client. Whether the connection is closed or not is up to the ORB and need not be specified, IMO.

Cheers,

Michi.
--
Michi Henning              +61 7 3891 5744
Object Oriented Concepts   +61 4 1118 2700 (mobile)
Suite 4, 904 Stanley St    +61 7 3891 5009 (fax)
East Brisbane 4169         michi@ooc.com.au
AUSTRALIA                  http://www.ooc.com.au/staff/michi-henning.html
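
The receiving-side behaviour argued for in the last few messages can be sketched as a small guard. This is illustrative Python only and assumes MARSHAL is the outcome chosen (the proposed revised text earlier in the thread says BAD_PARAM instead, and the thread does not settle it here). The Marshal class and both function names are hypothetical stand-ins, not any ORB's API.

```python
class Marshal(Exception):
    """Hypothetical stand-in for the CORBA MARSHAL system exception."""


# IDL types that have no encoding in a GIOP 1.0 message.
WIDE_TYPES = {"wchar", "wstring"}


def check_type_allowed(giop_version, idl_type):
    """Raise Marshal if idl_type cannot appear in a message of giop_version.

    giop_version is a (major, minor) tuple, e.g. (1, 0).
    """
    if idl_type in WIDE_TYPES and giop_version < (1, 1):
        raise Marshal(
            "%s data not allowed in GIOP %d.%d message"
            % (idl_type, giop_version[0], giop_version[1])
        )


def describe(giop_version, idl_type):
    """Report what a receiver following this sketch would do."""
    try:
        check_type_allowed(giop_version, idl_type)
    except Marshal as exc:
        return "MARSHAL: %s" % exc
    return "ok"


if __name__ == "__main__":
    print(describe((1, 0), "wstring"))
    print(describe((1, 1), "wchar"))
    print(describe((1, 0), "string"))
```

Whether the connection is also closed is left out on purpose, since the last message argues that this should be at the discretion of the ORB.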