Issue 4328: Indirections with chunking & fragmentation (interop)

Source: Oracle (Mr. Everett Anderson)
Nature: Uncategorized Issue
Severity:
Summary:
When chunking and fragmenting, it is possible for a chunk length to be inserted between the indirection tag and indirection offset. Implementations must be careful to compute the indirection offset correctly both when writing and reading to avoid errors.

For interoperability, we should eliminate this possibility. From an implementation point of view, the code for handling this special case should already be there.

Please see the original message (see attachment) for a detailed description of the problem scenario and two implementation possibilities.

Proposed resolution: Eliminate the possibility of chunk lengths between indirection tag and indirection offset by changing the following paragraph in CORBA formal 00-11-03 15.3.4.6.

Resolution: see below
Revised Text:
Section 15.3.4.6, page 15-20, change the text that currently reads:

    The data may be split into multiple chunks at arbitrary points except within primitive CDR types, arrays of primitive types, strings, and wstrings. It is never necessary to end a chunk within one of these types as the length of these types is known before starting to marshal them so they can be added to the length of the currently open chunk. It is the responsibility of the CDR stream to hide the chunking from the marshaling code.

to read:

    The data may be split into multiple chunks at arbitrary points except within primitive CDR types, arrays of primitive types, strings, wstrings, or between the tag and offset of indirections. It is never necessary to end a chunk within one of these types as the length of these types is known before starting to marshal them so they can be added to the length of the currently open chunk. It is the responsibility of the CDR stream to hide the chunking from the marshaling code.
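The problem described in the summary above can be reproduced in a few lines. The following is only a minimal sketch in Python, not code from any ORB. It assumes big-endian 4-byte longs, ignores alignment and fragment headers, and uses hypothetical helper names (`write_indirection`, `resolve`, `demo`). The offset is computed right after the tag is written, as the position of the desired entity minus the current stream position; if a chunk length is then written before the offset, the reader resolves a position that is off by 4.

```python
import struct

INDIRECTION_TAG = 0xFFFFFFFF  # tag value named in this issue


def write_indirection(stream: bytearray, target_pos: int,
                      chunk_length_after_tag: bool = False) -> int:
    """Append tag and offset; return the position of the offset field.

    Hypothetical helper. The offset is computed before any chunk length
    is inserted, which is exactly the error scenario of this issue.
    """
    stream.extend(struct.pack(">I", INDIRECTION_TAG))
    offset = target_pos - len(stream)  # desired entity minus current position
    if chunk_length_after_tag:
        # Placeholder length of a chunk opened between tag and offset.
        stream.extend(struct.pack(">i", 0))
    offset_pos = len(stream)
    stream.extend(struct.pack(">i", offset))
    return offset_pos


def resolve(stream: bytearray, offset_pos: int) -> int:
    """Return the stream position the indirection points at."""
    (offset,) = struct.unpack_from(">i", stream, offset_pos)
    return offset_pos + offset


def demo(chunk_length_after_tag: bool) -> int:
    stream = bytearray(16)  # 16 bytes already marshaled; entity assumed at 8
    offset_pos = write_indirection(stream, 8, chunk_length_after_tag)
    return resolve(stream, offset_pos)


print(demo(False))  # 8: tag and offset adjacent, offset resolves correctly
print(demo(True))   # 12: chunk length in between, result is off by 4
```

With the revised text, a writer never produces the second layout, so a reader may treat the position following the tag as the position of the offset.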
Actions taken:
May 25, 2001: received issue
May 13, 2002: closed issue

Discussion:

End of Annotations:=====

Message-ID: <3B0EEDCA.5093C190@sun.com>
Date: Fri, 25 May 2001 16:42:02 -0700
From: Everett Anderson
To: interop@omg.org
Subject: Question: Indirections with chunking & fragmentation

Hi,

This is just a question at this point, please do not raise it as an issue, yet.

Question: Is it legal to have a chunk length between the indirection tag and indirection offset?

Unless one takes steps to avoid or handle this case, it can cause an error in the following situation:

Near the end of fragment i, and while writing a custom marshaled valuetype (chunking), one needs to write an indirection. When finished writing the 0xffffffff indirection tag, the fragment is full.

The indirection offset is calculated to be <stream position of desired entity> - <current stream position>, and a call is made to write_long with this value. (Note: When I say "stream position" I'm referring to the position in the stream as if it were never fragmented, since fragment header bytes are not taken into account in indirection offsets.)

The marshaling code realizes that it needs to send the current fragment, so it

* Realizes it's chunking so updates the chunk length (had to have been in fragment i in this case).
* Sends the fragment
* Gets a new fragment buffer, marshals in the header, etc
* Opens a new chunk, reserving 4 bytes for the chunk length.

Now the marshaling code writes the long. However, the indirection offset written is now incorrect -- it doesn't take into account the 4 bytes of the new chunk length.

Here are two possible solutions:

-------------

1. Outlaw chunk lengths between indirection tag and offset.

Implementation impact:

One must update the chunk length in fragment i such that it covers the indirection tag as well as offset in fragment i+1. Thus, the new chunk in fragment i+1 is started after the indirection offset.

The tricky thing here is that the marshaling code must remember or be told to open the new chunk after the indirection offset. Implementations probably already have this kind of code since they need it to avoid splitting primitives, strings/wstrings, and arrays of such types into multiple chunks even when fragmenting.

The unmarshaling code can be written with the assumption that the stream position following the indirection tag is the same as the stream position of the beginning of the indirection offset.

-------------

2. Allow a chunk length between indirection tag and offset.

Implementation impact:

Must ensure that the code which writes the indirection is aware of whether or not it's chunking, and also knows how much space it has left. If chunking and fragmenting, and there isn't enough space in fragment i to write the indirection offset, the code must calculate the indirection offset to include the chunk length that must be added between tag and offset.
The unmarshaling code checks whether the chunk has ended before reading the indirection offset, sees that it has, reads the new chunk length, and then reads the offset and computes the stream position of the desired entity.

-------------

What do people think?

Life would be easier if indirections were merely encoded as the stream position of the desired entity. :)

Thanks,
Everett

Date: Tue, 29 May 2001 11:10:17 +0100
From: Simon Nash
Organization: IBM
To: Everett Anderson
CC: interop@omg.org
Subject: Re: Question: Indirections with chunking & fragmentation
References: <3B0EEDCA.5093C190@sun.com>

Everett,

Both of these would currently be valid implementation choices. My personal preference is for 1 since I believe this is simpler and more consistent with how other types that would cross chunk boundaries are handled.

If we could make this a rule, then it would slightly simplify decoding these indirections because the offset could be computed relative to the stream position after the indirection tag has been read, instead of having to first read the offset value and then subtract 4 from the stream position.

The same issue could arise with TypeCode indirections as well.

Simon

--
Simon C Nash, Chief Technical Officer, IBM Java Technology
Tel. +44-1962-815156 Fax +44-1962-818999
Hursley, England
Internet: nash@hursley.ibm.com
Lotus Notes: Simon Nash@ibmgb

Date: Thu, 31 May 2001 13:31:50 +1000 (EST)
From: Michi Henning
Reply-To: Interoperability RTF
To: Everett Anderson
cc: Interoperability RTF
Subject: Re: Question: Indirections with chunking & fragmentation
In-Reply-To: <3B15B29E.1563920C@sun.com>
Organization: IONA Technologies

On Wed, 30 May 2001, Everett Anderson wrote:

> Proposed resolution:
>
> Eliminate the possibility of chunk lengths between indirection tag and indirection offset by changing the following paragraph in CORBA formal 00-11-03 15.3.4.6.
>
> Old:
>
> "The data may be split into multiple chunks at arbitrary points except within primitive CDR types, arrays of primitive types, strings, and wstrings. It is never necessary to end a chunk within one of these types as the length of these types is known before starting to marshal them so they can be added to the length of the currently open chunk. It is the responsibility of the CDR stream to hide the chunking from the marshaling code."
>
> New:
>
> "The data may be split into multiple chunks at arbitrary points except within primitive CDR types, arrays of primitive types, strings, wstrings, or between the tag and offset of indirections. It is never necessary to end a chunk within one of these types as the length of these types is known before starting to marshal them so they can be added to the length of the currently open chunk. It is the responsibility of the CDR stream to hide the chunking from the marshaling code."

Just one comment: a poll as to what happens in various implementations right now might be good. (We don't want yet another urgent issue on our hands...) For what it's worth, ORBacus is not affected and the proposed resolution sounds good to me.

Cheers,

Michi.

--
Michi Henning +61 7 3324 9633
Chief CORBA Scientist +61 4 1118 2700 (mobile)
IONA Technologies +61 7 3324 9799 (fax)
Total Business Integration http://www.ooc.com.au/staff/michi

Date: Wed, 30 May 2001 19:55:26 -0700
From: Everett Anderson
To: Juergen Boldt
CC: interop@omg.org, issues@omg.org
Subject: Re: Question: Indirections with chunking & fragmentation
References: <3B0EEDCA.5093C190@sun.com>
Content-Type: multipart/mixed

Date: Thu, 31 May 2001 12:26:11 -0700
From: Everett Anderson
To: Interoperability RTF
Subject: Re: Question: Indirections with chunking & fragmentation

> Just one comment: a poll as to what happens in various implementations right now might be good. (We don't want yet another urgent issue on our hands...) For what it's worth, ORBacus is not affected and the proposed resolution sounds good to me.

Would ORBacus put chunk lengths between tag and offset?

We haven't released a non-beta JDK or J2EE implementation with fragmentation, yet. Our current betas and internal code avoid putting chunk lengths between tag and offset, and make the assumption that there won't be such chunk lengths when reading.

However, for the release versions, we'll read both cases properly, since this probably won't be addressed by then.

Awareness of how to read and write the indirections properly in this case may be sufficient.
- Everett

Date: Thu, 31 May 2001 20:53:05 +0100
From: Simon Nash
Organization: IBM
To: Everett Anderson
CC: Interoperability RTF
Subject: Re: Question: Indirections with chunking & fragmentation
References: <3B169AD3.80C11FBA@sun.com>

Everett,

If all implementations are compatible with the stricter rule (your proposal) then I think it is better to make this a spec rule rather than require all implementations to contain code to handle both forms when unmarshalling.

Simon

Date: Thu, 31 May 2001 14:51:29 -0700
From: Everett Anderson
To: Simon Nash
CC: Interoperability RTF
Subject: Re: Question: Indirections with chunking & fragmentation
References: <3B169AD3.80C11FBA@sun.com> <3B16A121.F048F4A5@hursley.ibm.com>

Hi,

> If all implementations are compatible with the stricter rule (your proposal) then I think it is better to make this a spec rule rather than require all implementations to contain code to handle both forms when unmarshalling.

Sounds good to me! :)
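
For readers who want to see what "handling both forms when unmarshalling" involves, here is a rough Python sketch of the reading steps described for option 2 in this thread: check whether the chunk has ended, read the new chunk length if so, then read the offset. It is an illustration under stated assumptions, not any ORB's implementation: the class `ChunkedReader` and the helper `longs` are hypothetical, longs are big-endian, alignment, fragment headers, end tags and nested values are ignored, and any long found where a chunk has ended is assumed to be a chunk length.

```python
import struct

INDIRECTION_TAG = -1  # 0xffffffff read as a signed long


class ChunkedReader:
    """Toy reader over one buffer; tracks the bytes left in the open chunk."""

    def __init__(self, data: bytes, pos: int, chunk_remaining: int) -> None:
        self.data = data
        self.pos = pos
        self.chunk_remaining = chunk_remaining

    def _raw_long(self) -> int:
        (value,) = struct.unpack_from(">i", self.data, self.pos)
        self.pos += 4
        return value

    def _open_next_chunk_if_ended(self) -> None:
        # Assumption: once a chunk is exhausted, the next long is a chunk length.
        if self.chunk_remaining == 0:
            self.chunk_remaining = self._raw_long()

    def read_long(self) -> int:
        self._open_next_chunk_if_ended()
        self.chunk_remaining -= 4
        return self._raw_long()

    def read_indirection(self) -> int:
        """Return the stream position of the entity the indirection refers to."""
        if self.read_long() != INDIRECTION_TAG:
            raise ValueError("not an indirection")
        # Skip a chunk length sitting between tag and offset, if there is one.
        self._open_next_chunk_if_ended()
        offset_pos = self.pos
        return offset_pos + self.read_long()


def longs(*values: int) -> bytes:
    return b"".join(struct.pack(">i", v) for v in values)


prefix = bytes(16)                          # entity assumed at position 8
no_split = prefix + longs(-1, -12)          # tag, offset
split = prefix + longs(-1, 4, -16)          # tag, chunk length, offset

print(ChunkedReader(no_split, 16, 8).read_indirection())  # 8
print(ChunkedReader(split, 16, 4).read_indirection())     # 8
```

Under the stricter rule adopted in the revised text, the `_open_next_chunk_if_ended` call inside `read_indirection` would never find an ended chunk, which is the simplification argued for above.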