Issue 4591: New issue: Behavior of writeByte, writeChar, writeBytes, writeChars (java2idl-rtf)

Source: Oracle (Mr. Everett Anderson)
Nature: Uncategorized Issue
Severity:
Summary:
A Serializable or Externalizable can define methods which take java.io.ObjectOutputStreams. There are four methods on that class that are well defined in Java, but the mapping to CORBA may need to be clarified:

writeByte(int)
writeChar(int)
writeBytes(String)
writeChars(String)

Please see the Java docs for these methods:

http://java.sun.com/j2se/1.3/docs/api/java/io/DataOutput.html#writeByte(int)
http://java.sun.com/j2se/1.3/docs/api/java/io/DataOutput.html#writeChar(int)
http://java.sun.com/j2se/1.3/docs/api/java/io/DataOutput.html#writeBytes(java.lang.String)
http://java.sun.com/j2se/1.3/docs/api/java/io/DataOutput.html#writeChars(java.lang.String)

Based on those detailed definitions, I'd say the mapping would be:

writeByte(int)
  * write_octet of the lower 8 bits of the int

writeChar(int)
  * two write_octet calls in the order as shown in the java docs

writeBytes(String)
  * For each char in the String, call charAt, take the lower 8 bits, and call write_octet

writeChars(String)
  * For each char in the String, call charAt, split the char into two bytes, and make two write_octet calls in the order shown in the java docs

Another interpretation might be to use wstrings or wchar arrays for writeChars. The problem I see there is that since there isn't a readChars method, the user will be using the DataInput methods readFully or multiple readChar calls to reconstruct the String. Since the wire format of wstrings/wchars depends on the code set, he might start to see code set specific bytes instead of the expected bytes detailed in the java docs of writeChars and writeChar.
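The proposed mapping can be sketched in Java as follows. This is only an illustration under stated assumptions, not ORB code: the class name OctetMappingSketch and its write_octet method are hypothetical stand-ins for a CORBA output stream, backed here by a ByteArrayOutputStream. The byte order for writeChar and writeChars (high-order byte first) follows the java.io.DataOutput definitions, and main compares the result with java.io.DataOutputStream, which implements that interface.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Arrays;

// Hypothetical sketch: models only write_octet, not a real CORBA stream.
final class OctetMappingSketch {
    private final ByteArrayOutputStream sink = new ByteArrayOutputStream();

    // Stand-in for the CORBA write_octet operation (assumed name and shape).
    public void write_octet(byte b) {
        sink.write(b);
    }

    // writeByte(int): one octet holding the lower 8 bits of the int.
    public void writeByte(int v) {
        write_octet((byte) v);
    }

    // writeChar(int): two octets, high-order byte first, as in DataOutput.
    public void writeChar(int v) {
        write_octet((byte) (v >> 8));
        write_octet((byte) v);
    }

    // writeBytes(String): lower 8 bits of each char, one octet per char.
    public void writeBytes(String s) {
        for (int i = 0; i < s.length(); i++) {
            write_octet((byte) s.charAt(i));
        }
    }

    // writeChars(String): two octets per char, high-order byte first.
    public void writeChars(String s) {
        for (int i = 0; i < s.length(); i++) {
            writeChar(s.charAt(i));
        }
    }

    public byte[] toByteArray() {
        return sink.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        OctetMappingSketch sketch = new OctetMappingSketch();
        sketch.writeByte(0x1FF);
        sketch.writeChar('A');
        sketch.writeBytes("\u0141");
        sketch.writeChars("Hi");

        // Reference bytes produced by the JDK's own DataOutput implementation.
        ByteArrayOutputStream reference = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(reference);
        out.writeByte(0x1FF);
        out.writeChar('A');
        out.writeBytes("\u0141");
        out.writeChars("Hi");
        out.flush();

        System.out.println(Arrays.toString(sketch.toByteArray()));
        System.out.println(Arrays.equals(sketch.toByteArray(), reference.toByteArray()));
    }
}
```

Because every value goes out as plain octets, a reader using readFully or repeated readChar calls sees the same bytes regardless of the negotiated code set, which is the concern raised above about wstrings and wchars.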
Resolution: see below

Revised Text:
In the second last paragraph of section 1.4.10, add the following text before the last sentence of the paragraph:

"Java ints and strings written by the writeByte, writeChar, writeBytes, and writeChars methods of java.io.ObjectOutputStream are marshaled as specified by the definitions of these methods in the java.io.DataOutput interface."

Actions taken:
October 4, 2001: received issue
May 13, 2002: closed issue

Discussion:
As proposed in the issue summary, the stream data written by these APIs for RMI-IIOP needs to be consistent with Java serialization, since otherwise some classes that serialize and deserialize correctly using Java serialization or RMI-JRMP would not serialize and deserialize correctly using RMI-IIOP.

However, care needs to be taken when making this change to avoid breaking existing applications. This is because the revised specification for writeBytes(String) is incompatible with the way this method is currently implemented by certain JDK ORBs (specifically IBM's J2SE 1.3 and Sun's J2SE 1.3), and the change may affect application code as well as the ORB. The incompatibility is that the writeBytes() method currently converts characters to bytes using the platform's default character encoding, but the new specification doesn't do this.

Therefore, any platform whose default character encoding is not ISO8859-1 (e.g., IBM zOS) will send different data for writeBytes() with the revised specification, and any application-defined Java readObject() methods that consume this data (on any platform) will have to be modified to consume unconverted data instead of converted data. Similarly, IDL custom valuetype unmarshal method implementations that consume data written by writeBytes() may have to be modified in the same way.
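The difference between the two behaviors can be illustrated with a small Java sketch. It rests on assumptions: the class and method names are hypothetical, the older ORB behavior is modeled simply as String.getBytes with an explicit charset, and UTF-8 stands in for a non-ISO8859-1 platform default only because it is guaranteed to be present in every JDK (the zOS case mentioned above would involve an EBCDIC encoding instead).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch of the writeBytes(String) incompatibility.
final class WriteBytesEncodingSketch {

    // Revised behavior: lower 8 bits of each char, no charset conversion.
    public static byte[] lowEightBits(String s) {
        byte[] out = new byte[s.length()];
        for (int i = 0; i < s.length(); i++) {
            out[i] = (byte) s.charAt(i);
        }
        return out;
    }

    // Assumed model of the older behavior: convert through a charset
    // (the platform default in the ORBs described above).
    public static byte[] viaCharset(String s, Charset cs) {
        return s.getBytes(cs);
    }

    public static void main(String[] args) {
        String s = "caf\u00e9";
        System.out.println(Arrays.toString(lowEightBits(s)));
        System.out.println(Arrays.toString(viaCharset(s, StandardCharsets.ISO_8859_1)));
        System.out.println(Arrays.toString(viaCharset(s, StandardCharsets.UTF_8)));
    }
}
```

For this input the ISO8859-1 conversion matches the unconverted bytes, while the UTF-8 conversion does not, so a readObject() method written against converted data would need to change only on platforms of the second kind.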
Following established principles for incompatible changes to the J2SE platform, this change should only be made with adequate warning and on a major release boundary of the J2SE platform (e.g., 1.4 or 1.5) rather than on a minor release boundary (e.g., 1.4.x) or a service update (e.g., 1.4.0_xx).

End of Annotations:=====

Date: Thu, 04 Oct 2001 12:13:07 -0700
From: Everett Anderson
To: java2idl-rtf@omg.org
CC: issues@omg.org
Subject: New issue: Behavior of writeByte, writeChar, writeBytes, writeChars

Hi,

I'd like to raise a new issue on the Java to IDL RTF list.

A Serializable or Externalizable can define methods which take java.io.ObjectOutputStreams. There are four methods on that class that are well defined in Java, but the mapping to CORBA may need to be clarified:

writeByte(int)
writeChar(int)
writeBytes(String)
writeChars(String)

Please see the Java docs for these methods:

http://java.sun.com/j2se/1.3/docs/api/java/io/DataOutput.html#writeByte(int)
http://java.sun.com/j2se/1.3/docs/api/java/io/DataOutput.html#writeChar(int)
http://java.sun.com/j2se/1.3/docs/api/java/io/DataOutput.html#writeBytes(java.lang.String)
http://java.sun.com/j2se/1.3/docs/api/java/io/DataOutput.html#writeChars(java.lang.String)

Based on those detailed definitions, I'd say the mapping would be:

writeByte(int)
  * write_octet of the lower 8 bits of the int

writeChar(int)
  * two write_octet calls in the order as shown in the java docs

writeBytes(String)
  * For each char in the String, call charAt, take the lower 8 bits, and call write_octet

writeChars(String)
  * For each char in the String, call charAt, split the char into two bytes, and make two write_octet calls in the order shown in the java docs

Another interpretation might be to use wstrings or wchar arrays for writeChars.
The problem I see there is that since there isn't a readChars method, the user will be using the DataInput methods readFully or multiple readChar calls to reconstruct the String. Since the wire format of wstrings/wchars depends on the code set, he might start to see code set specific bytes instead of the expected bytes detailed in the java docs of writeChars and writeChar.

Date: Mon, 31 Dec 2001 22:43:37 +0000
From: Simon Nash
Organization: IBM
To: Jishnu Mukerji, Vijay Natarajan, Harold Carr, Jeff Mischkinsky, Yoshitaka Honishi, Andy Piper, Xudong Chen, java2idl-rtf@omg.org
Subject: Issue 4591 proposed resolution

There has been no discussion of issue 4591 since Everett submitted it in October. I can see no reasonable alternative to adopting Everett's proposal, since otherwise some classes that serialize and deserialize correctly using Java serialization or RMI-JRMP would not serialize and deserialize correctly using RMI-IIOP.

However, care needs to be taken when making this change to avoid breaking existing applications. This is because a) the revised specification for writeBytes(String) is incompatible with the way this method is currently implemented by certain JDK ORBs (specifically IBM's J2SE 1.3 and Sun's J2SE 1.3), and b) the change may affect application code as well as the ORB. The incompatibility is that the writeBytes() method currently converts characters to bytes using the platform's default character encoding, but the new specification doesn't do this.
Therefore, any platform whose default character encoding is not ISO8859-1 (e.g., IBM zOS) will send different data for writeBytes() with the revised specification, and any application-defined Java readObject() methods that consume this data (on any platform) will have to be modified to consume unconverted data instead of converted data. Similarly, IDL custom valuetype unmarshal method implementations that consume data written by writeBytes() may have to be modified in the same way.

Following established principles for incompatible changes to the J2SE platform, this change should only be made with adequate warning and on a major release boundary of the J2SE platform (e.g., 1.4 or 1.5) rather than on a minor release boundary (e.g., 1.4.x) or a service update (e.g., 1.4.0_xx).

Revised specification:

In the second last paragraph of section 1.4.10, add the following text before the last sentence of the paragraph:

Java ints and strings written by the writeByte, writeChar, writeBytes, and writeChars methods of java.io.ObjectOutputStream are marshaled as specified by the definitions of these methods in the java.io.DataOutput interface.

Simon
--
Simon C Nash, Chief Technical Officer, IBM Java Technology
Tel. +44-1962-815156  Fax +44-1962-818999
Hursley, England
Internet: nash@hursley.ibm.com  Lotus Notes: Simon Nash@ibmgb