Issue 3069: OMG IDL Syntax and Semantics issue (orb_revision)

Source: (, )
Nature: Uncategorized Issue
Severity:
Summary: The following has gone unnoticed in CORBA V2.3, June 1999, OMG document 99-07-07.pdf, Chapter 3, "OMG IDL Syntax and Semantics", pages 3-37..3-39, definition of the "sequence" type: There is no explicit definition of what length a sequence may have at run time. Things are perfectly well defined for the sequence bound (i.e. the maximum size at compile time), which is explicitly declared to be a positive integer. However, nothing is said about whether the length of a sequence at run time can be: (a) positive; or (b) non-negative; or even (c) negative.
Resolution:
Revised Text:
Actions taken:
November 29, 1999: received issue
October 30, 2000: closed issue
Discussion:

End of Annotations:=====

Date: Mon, 29 Nov 1999 11:45:07 +0200 (EET)
From: Alexey Mednonogov
To: issues@omg.org
Subject: New issue report (OMG IDL Syntax and Semantics)

Hello all,

The following has gone unnoticed in CORBA V2.3, June 1999, OMG document 99-07-07.pdf, Chapter 3, "OMG IDL Syntax and Semantics", pages 3-37..3-39, definition of the "sequence" type:

There is no explicit definition of what length a sequence may have at run time. Things are perfectly well defined for the sequence bound (i.e. the maximum size at compile time), which is explicitly declared to be a positive integer. However, nothing is said about whether the length of a sequence at run time can be: (a) positive; or (b) non-negative; or even (c) negative.

I guess all ORB implementations treat sequence lengths as non-negative, but even a sequence with a negative run-time length would conform to the OMG specification, which is of course nonsense.

See page 3-39, section 3.10.3.3 "Wstrings": "The actual length of a wstring is set at run-time and, if the bounded form is used, must be less than or equal to the bound". (So it can be positive? non-negative? or even negative?)

Probably it would be good to add an explicit definition to section 3.10.3.1 "Sequences", something like this:

"The length of the sequence at run-time can be any non-negative integer number that does not exceed the maximum size of the sequence defined either at compile time (bounded sequences) or at run time (unbounded sequences)".

Best regards,
Alexey Mednonogov

Sender: jis@fpk.hp.com
Message-ID: <384C43B5.6734902E@fpk.hp.com>
Date: Mon, 06 Dec 1999 18:16:05 -0500
From: Jishnu Mukerji
Organization: Hewlett-Packard EIAL
To: orb_revision@omg.org, interop@omg.org, ptc@omg.org
Subject: Issue 3069

Date: Tue, 07 Dec 1999 10:51:54 -0800
From: Peter Walker
Organization: Sun Microsystems
To: Martin von Loewis
CC: jis@fpk.hp.com, orb_revision@omg.org, interop@omg.org
Subject: Re: Issue 3069
References: <384C43B5.6734902E@fpk.hp.com> <199912071822.TAA05366@pandora>

Martin von Loewis wrote:
>
> > OMG IDL currently supports only the 16 bit (\u) form, and does not have
> > the 32 bit (\U) form. The question is, is it necessary and if so how
> > important is it to add the \U form to OMG IDL? What would it be used for
> > and by whom?
>
> Isn't that issue 3020?
>
> These escapes would be used if you want to put characters outside the
> "Basic Multilingual Plane" (BMP) into a wide character string.
>
> So far, most characters are inside the BMP (i.e. all of Unicode is).
> There are a few proposals for characters outside the BMP:
>
> - Plane 14 is proposed for (language) tagging
>   http://www.unicode.org/unicode/reports/tr7.html
>   This proposal has been approved by the Unicode consortium, but
>   it is not part of Unicode 3.
> - A number of scripts is included in Plane 1 (approved by UTC, ISO
>   approval pending in some cases)
>   (http://www.unicode.org/unicode/alloc/Pipeline.html)
>     Etruscan
>     Gothic
>     Greek Byzantine Musical Notation
>     Deseret Alphabet (phonetic English script)
>     Western Musical Symbols
> - A number of scripts is proposed for Plane 1
>   (http://www.unicode.org/pending/pending.html), in particular:
>     Basic Egyptian Hieroglyphics
>     Meroitic
>     Old Persian Cuneiform
>     Ugaritic Cuneiform
>     Tengwar
>     Cirth
>     tlhIngan Hol (Klingon)

:-) This would clearly be the realization of Orfali et al's Intergalactic Objects - I'd love to see OMG specs support Klingon :-)

They do say you can't appreciate Shakespeare to the full until you've read him in the original Klingon :-)

Pj.

Date: Tue, 7 Dec 1999 19:22:35 +0100
Message-Id: <199912071822.TAA05366@pandora>
From: Martin von Loewis
To: jis@fpk.hp.com
CC: orb_revision@omg.org, interop@omg.org, ptc@omg.org
In-reply-to: <384C43B5.6734902E@fpk.hp.com> (message from Jishnu Mukerji on Mon, 06 Dec 1999 18:16:05 -0500)
Subject: Re: Issue 3069
References: <384C43B5.6734902E@fpk.hp.com>

> OMG IDL currently supports only the 16 bit (\u) form, and does not have
> the 32 bit (\U) form. The question is, is it necessary and if so how
> important is it to add the \U form to OMG IDL? What would it be used for
> and by whom?

Isn't that issue 3020?

These escapes would be used if you want to put characters outside the "Basic Multilingual Plane" (BMP) into a wide character string.

So far, most characters are inside the BMP (i.e. all of Unicode is). There are a few proposals for characters outside the BMP:

- Plane 14 is proposed for (language) tagging
  http://www.unicode.org/unicode/reports/tr7.html
  This proposal has been approved by the Unicode consortium, but it is not part of Unicode 3.
- A number of scripts is included in Plane 1 (approved by UTC, ISO approval pending in some cases)
  (http://www.unicode.org/unicode/alloc/Pipeline.html)
    Etruscan
    Gothic
    Greek Byzantine Musical Notation
    Deseret Alphabet (phonetic English script)
    Western Musical Symbols
- A number of scripts is proposed for Plane 1
  (http://www.unicode.org/pending/pending.html), in particular:
    Basic Egyptian Hieroglyphics
    Meroitic
    Old Persian Cuneiform
    Ugaritic Cuneiform
    Tengwar
    Cirth
    tlhIngan Hol (Klingon)
    Brahmi
    Old Permic
    Sinaitic
    South Arabian
    Pollard
    Blissymbolics
    Soyombo
- Plane 2 is under investigation by UTC, for use as CJK Unified Ideographs, Extension B (41,000 characters)

There is probably little reason to ever put the ancient or fictional scripts (Plane 1) into an IDL file. Language tagging seems like a useful feature in source code as well; I'm not sure whether it has value in IDL. For the Plane 2 cells, there is probably as much good reason to put them into IDL source as for any other natural language text - if these assignments are approved by ISO.

It is possible that additional assignments will be made in the future for characters in other planes.

If these are supported in IDL, there is a problem mapping them to environments which only support the BMP, but not full ISO 10646. Most notably, it would be hard to map them to Java.

Encoding them on the wire is much simpler: both UTF-8 and UTF-16 support those planes. But that was not the issue.

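As an illustration of the point above (not part of the original message), the following minimal Python sketch shows why a character outside the BMP needs more than a 16-bit escape, and how UTF-8 and UTF-16 carry it anyway. The `\U` escape used here is Python's own string syntax, not OMG IDL; the helper names `plane` and `utf16_code_units` are hypothetical, and U+10330 was picked only because it lies in what is today the Gothic block of Plane 1.

```python
# Illustrative only: Python's own \U escape, not OMG IDL syntax.
ch = "\U00010330"  # U+10330, a Plane 1 code point (does not fit in 16 bits)


def plane(code_point: int) -> int:
    """Return the Unicode plane of a code point (0 is the BMP)."""
    return code_point >> 16


def utf16_code_units(text: str) -> list:
    """Return the 16-bit code units of text in UTF-16 (big-endian, no BOM)."""
    raw = text.encode("utf-16-be")
    return [int.from_bytes(raw[i:i + 2], "big") for i in range(0, len(raw), 2)]


utf8_bytes = ch.encode("utf-8")   # four bytes for a non-BMP character
units = utf16_code_units(ch)      # two code units: a surrogate pair

print(plane(ord(ch)))                 # plane number of the character
print(utf8_bytes.hex())               # UTF-8 bytes as hex
print([hex(u) for u in units])        # UTF-16 code units as hex
```

A single 16-bit `\u` escape can only name one code unit, so an environment limited to the BMP would have to spell such a character as two surrogate code units or reject it.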
Regards,
Martin

Sender: jis@fpk.hp.com
Message-ID: <38503F99.F32B802F@fpk.hp.com>
Date: Thu, 09 Dec 1999 18:47:37 -0500
From: Jishnu Mukerji
Organization: Hewlett-Packard EIAL
To: orb_revision@omg.org
Subject: Issue 3069

Issue 3069: OMG IDL Syntax and Semantics issue (orb_revision)

Source: (, )
Nature: Uncategorized Issue
Severity:
Summary: The following has gone unnoticed in CORBA V2.3, June 1999, OMG document 99-07-07.pdf, Chapter 3, "OMG IDL Syntax and Semantics", pages 3-37..3-39, definition of the "sequence" type: There is no explicit definition of what length a sequence may have at run time. Things are perfectly well defined for the sequence bound (i.e. the maximum size at compile time), which is explicitly declared to be a positive integer. However, nothing is said about whether the length of a sequence at run time can be: (a) positive; or (b) non-negative; or even (c) negative.
_______________________________________________________

So is it reasonable to resolve this by saying something like:

"The length of the sequence at run-time can be any non-negative integer number that does not exceed the maximum size of the sequence defined either at compile time (bounded sequences) or at run time (unbounded sequences)".

in section 3.10.3.1 "Sequence"?

Thanks,

Jishnu.
--
Jishnu Mukerji
Systems Architect                 Email: jis@fpk.hp.com
Hewlett-Packard EIAL,             Tel:   +1 973 443 7528
300 Campus Drive, 2E-62,          Fax:   +1 973 443 7422
Florham Park, NJ 07932, USA.

Date: Fri, 10 Dec 1999 11:27:08 +1000 (EST)
From: Michi Henning
To: Jishnu Mukerji
cc: orb_revision@omg.org
Subject: Re: Issue 3069
In-Reply-To: <38503F99.F32B802F@fpk.hp.com>
Organization: Object Oriented Concepts

On Thu, 9 Dec 1999, Jishnu Mukerji wrote:

> So is it reasonable to resolve this by saying something like:
>
> "The length of the sequence at run-time can be any non-negative
> integer number that does not exceed the maximum size of the sequence
> defined either at compile time (bounded sequences) or at run time
> (unbounded sequences)".
>
> in section 3.10.3.1 "Sequence"?

My feeling is that this needs no change at all. It seems pretty obvious that a sequence can't contain a negative number of elements. Also, the above suggestion doesn't quite fit the bill because it doesn't say anything about the length of bounded sequences at run time, and it talks about the maximum size of an unbounded sequence at run time, but there is no such thing as a maximum size; the maximum size is typically the limit on how much virtual memory I can allocate. (The "maximum" in the C++ mapping is not a real maximum because I can grow an unbounded sequence beyond the maximum.)

If we say anything at all, we could say that

"The length of a bounded sequence at run time is in the interval [0..bound]. The length of an unbounded sequence at run time is in the interval [0..max] (where max is implementation dependent)."

However, I am not convinced that the above needs saying because, at least to me, it is stating the obvious.

Cheers,

Michi.
--
Michi Henning               +61 7 3891 5744
Object Oriented Concepts    +61 4 1118 2700 (mobile)
PO Box 372                  +61 7 3891 5009 (fax)
Annerley 4103               michi@ooc.com.au
AUSTRALIA                   http://www.ooc.com.au/staff/michi-henning.html
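
For readers who want the two intervals from the message above spelled out, here is a minimal Python sketch (not part of the original thread). The function name `is_valid_length` and the use of `None` to model an unbounded sequence are assumptions made for illustration; the implementation-dependent `max` for unbounded sequences is deliberately not modelled, so only the lower limit of 0 is checked in that case.

```python
from typing import Optional


def is_valid_length(length: int, bound: Optional[int] = None) -> bool:
    """Check a run-time sequence length against the suggested intervals.

    bound=None stands for an unbounded sequence (hypothetical convention);
    its implementation-dependent maximum is not modelled here.
    """
    if bound is not None and bound <= 0:
        # The IDL sequence bound is declared to be a positive integer.
        raise ValueError("an IDL sequence bound must be a positive integer")
    if length < 0:
        # A sequence cannot contain a negative number of elements.
        return False
    # Bounded: length must lie in [0..bound]. Unbounded: any length >= 0.
    return bound is None or length <= bound


# Bounded sequence<long, 10>: lengths 0..10 are acceptable.
print(is_valid_length(0, bound=10), is_valid_length(11, bound=10))
# Unbounded sequence<long>: only negative lengths are rejected here.
print(is_valid_length(5), is_valid_length(-1))
```

An empty sequence (length 0) is accepted in both cases, which matches option (b), "non-negative", from the original issue report.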