Issue 6344: Ada Mapping of Sequences Too Heavy (ada-rtf)

Source: Objective Interface Systems (Mr. Victor Giddings, victor.giddings(at)mail.ois.com)
Nature: Uncategorized Issue
Severity:
Summary: The current mapping of IDL sequences to the Ada language results in an instantiation of a defined generic package. However, each instantiation results in roughly 150K of additional object code. This is excessive for embedded systems, especially if multiple sequences are used in an application. A lightweight alternative, such as the mapping defined for C++, should be defined.
Resolution:
Revised Text:
Actions taken: October 21, 2003: received issue
Discussion: Deferred due to lack of time.
Disposition: Deferred

End of Annotations:=====

Date: Tue, 21 Oct 2003 10:53:10 -0400
To: issues@omg.org
From: Victor Giddings
Subject: Ada Mapping of Sequences Too Heavy
Cc: ada-rtf@omg.org

This is an issue for the Ada RTF:

Summary: The current mapping of IDL sequences to the Ada language results in an instantiation of a defined generic package. However, each instantiation results in roughly 150K of additional object code. This is excessive for embedded systems, especially if multiple sequences are used in an application. A lightweight alternative, such as the mapping defined for C++, should be defined.
Victor Giddings mailto:victor.giddings@ois.com
Senior Product Engineer +1 703 295 6500

Subject: Re: Issue 6344: Ada Mapping of Sequences Too Heavy
Date: Thu, 26 Jun 2008 14:51:39 +0200
From: "Kellogg, Oliver"
To:
Cc: "Schroeder, Heiko, MSOP53"

If you plan to define an alternative mapping for sequences, please also consider the following:

Sequences in the C++ and Java mappings start at index 0. Sequences in the existing Ada mapping start at index 1. If a sequence index is transmitted as part of user defined IDL between C++/Java and Ada then there is potential for error due to the offset in Ada.

We propose to let sequences start at index 0 in the alternative mapping in order to avoid the inter-language problem with transmitted sequence indexes.

Thanks.

Oliver Kellogg, EADS
Heiko Schröder, EADS/Atlas

--
EADS Deutschland GmbH
Defence and Communication Systems
Dept. MSOP22
89077 Ulm
Tel: +49 (0) 731.392-7138
Fax: +49 (0) 731.392-xxxx
E-Fax: +49 (0) 731.392-20 7138
E-Mail: Oliver.Kellogg@eads.com

EADS Deutschland GmbH
Registered Office: Ottobrunn
District Court of Munich HRB107648
Chairman of the Supervisory Board: Dr. Thomas Enders
Managing Directors: Dr. Stefan Zoller (chairman), Michael Hecht

From: Oliver Kellogg
To: ada-rtf@omg.org
Subject: Re: Issue 6344: Ada Mapping of Sequences Too Heavy
Date: Fri, 4 Jul 2008 12:42:13 +0200
User-Agent: KMail/1.8
Cc: Victor Giddings

Here is my proposal for a thin version of package CORBA.Sequences.Bounded.
(I suggest we discuss the bounded sequence package first, until the basics are sorted out and agreed upon. I can then provide the corresponding package for unbounded sequences.)

I have thinned out the package CORBA.Sequences.Bounded using these premises:

* "When in doubt, leave it out".
* The most important functions are those for converting between Sequence and Element_Array. Advanced operations such as searching, insertion and deletion at selected indexes, slicing etc., would be done on the Element_Array and not on the Sequence.
* The most common usage patterns for sequences are
  1) Get / Set an element at a specified index in the sequence.
  2) Append a new element at the end of the sequence.
  Only those most common operations are supported directly on sequences.

Certainly the definition of "most common" is subject to discussion; please provide your opinions.

I tried compiling an instantiation of the original CORBA.Sequences.Bounded with GNATGPL2005 "gcc -c -O" on Linux and it yielded an object file of 72416 bytes. Compiling an instantiation of the Bounded_Simple package yields a 4224 byte object file.

Thank you.

Oliver M. Kellogg

Cc: ada-rtf@omg.org
From: Victor Giddings
To: Oliver Kellogg
Subject: Re: Issue 6344: Ada Mapping of Sequences Too Heavy
Date: Fri, 1 Aug 2008 21:52:28 -0400

Oliver,

Thanks for the proposal. I have to admit it is a more drastic cut-down than I had the guts to propose.
That said, I am not opposed to it. I am looking over whether we use any of the other subprograms in our ORB implementation to see if there is sufficient pain to suggest other subprograms.

I am a little confused by the statement: "Advanced operations such as searching, insertion and deletion at selected indexes, slicing etc., would be done on the Element_Array and not on the Sequence." This would seem to require invoking To_Element_Array on the sequence, doing the operation, and then invoking To_Sequence on the result. But To_Element_Array and To_Sequence both must make copies of the contents of the array, which could be expensive for complex element types. Is this your intent, or do you wish to do something more "C-style", like passing an access to the Element_Array to allow updates "in place"? Again, I am not yet advocating something different, just trying to understand your intent.

Two other suggestions for discussion:

1) Retain Get_Element and Replace_Element with 1-based indexing for reverse compatibility (possibly in a section marked as deprecated). Implementation of Get_Element could call get with index-1, etc., so I don't think this would necessarily add a lot of footprint.

2) Change Null_Sequence to a function. I think this would remove the last need to access the instantiated package by its name. Null_Sequence would now be primitive on the type and so would be derived by the required new type derivation in the defining package.

Comments?

Victor Giddings
Objective Interface Systems
victor.giddings@mail.ois.com

From: Oliver Kellogg
To: Victor Giddings
Subject: Re: Issue 6344: Ada Mapping of Sequences Too Heavy
Date: Mon, 4 Aug 2008 16:21:56
User-Agent: KMail/1.8
Cc: ada-rtf@omg.org, Heiko

On Saturday 02 August 2008 03:52, Victor Giddings wrote:
> Thanks for the proposal. I have to admit it is a more drastic cut-down
> than I had the guts to propose. That said I am not opposed it, I am
> looking over whether we use any of the other subprograms in our ORB
> implementation to see if there is sufficient pain to suggest other
> subprograms.

Yes. In fact, I myself noticed an omission of two important subprograms in my proposal, namely the function Slice and the procedure Replace_Slice.
It occurred to me that, when dealing with large sequences, it would be cumbersome to always have to retrieve the entire sequence and re-set the entire sequence using To_Element_Array/To_Sequence, even just to change a slice of two or three elements. So I put these operations back in - however, I simplified the procedure Replace_Slice to take a Start_Index (instead of Low and High indexes) which means that Replace_Slice does not move around elements internally to accommodate a (possibly smaller or larger than original) slice, i.e. it does not support size changes in the slice.

A further addition that I'm pondering is a separate setter for the Length:

   procedure Length (Source : in out Sequence; New_Length : Length_Range);

This would be advantageous for preallocating the sequence buffer (useful in particular with unbounded sequences.)

What do you think? Please make suggestions as to which other subprograms you deem essential.

> I am a little confused by the statement: "Advanced operations such as
> searching, insertion and deletion at selected indexes, slicing etc.,
> would be done on the Element_Array and not on the Sequence." This
> would seem to require invoking To_Element_Array on the sequence, doing
> the operation, and then invoking To_Sequence on the result. But
> To_Element_Array and To_Sequence both must make copies of the contents
> of the array, which could be expensive for complex element types. Is
> this your intent, or do you wish to do something more "Cstyle" like
> passing an access to the element_array to allow updates "in place"?

That was my intent. I shied away from using access values to avoid questions of ownership/lifetime that would need to be addressed. If this is a major concern then please do propose your reference semantics.

> Two other suggestions for discussion:
>
> 1) Retain Get_Element and Replace_Element with 1-based indexing for
> reverse compatibility (possibly in a section marked as deprecated.)
> Implementation of Get_element could call get with index-1, etc. so I
> don't think this would necessarily add a lot of footprint.

Done as suggested, see new attachment. However, be aware that function To_Element_Array returns the Element_Array according to the new convention, i.e. with start index 0. Thus backward compatibility is not 100%.

> 2) Change Null_Sequence to a function. I think this would remove the
> last need to access the instantiated package by its name.
> Null_Sequence would now be primitive on the type and so would be
> derived by the required new type derivation in the defining package.

Done as suggested, see new attachment.

> > I tried compiling an instantiation of the original
> > CORBA.Sequences.Bounded with GNATGPL2005 "gcc -c -O" on Linux and it
> > yielded an object file of 72416 bytes. Compiling an instantiation of
> > the Bounded_Simple package yields a 4224 byte object file.

With these changes, the object file size is 7616 bytes.
[ My example used is:
  package P is new CORBA.Sequences.Bounded_Simple (Integer, 10); ]

Regards,

Oliver M. Kellogg

corba-sequences-bounded_simple1.ads

Cc: ada-rtf@omg.org, Heiko
From: "bill.beckwith@ois.com"
To: Oliver Kellogg, Victor Giddings
Subject: Re: Issue 6344: Ada Mapping of Sequences Too Heavy
Date: Mon, 4 Aug 2008 10:51:09 -0400

Hi Oliver and Vic,

I am an advocate of reducing the size of instantiated sequences. Some comments:

1. It seems like this is being optimized for a particular compiler. GNAT is particularly heavy on generic instantiations.
Most other Ada 95 compilers produce less than 10K of object code for CORBA.Sequences.Bounded instantiations.

2. 7.6K or even 4.2K still seems much too large for an instantiation of the cut-down version of the sequence mapping.

3. Please make sure that you end up with a mapping that is both efficient and has clear memory ownership semantics.

Bill

Cc: ada-rtf@omg.org, Heiko
From: Victor Giddings
To: bill.beckwith@ois.com, Oliver Kellogg
Subject: Re: Issue 6344: Ada Mapping of Sequences Too Heavy
Date: Mon, 4 Aug 2008 13:22:47 -0400

All,

Bill makes a good point. I will try to get numbers for some of the other compilers I have access to.

(On review, I realize that the following is probably more relevant to unbounded rather than bounded sequences.)

Re: the clean memory ownership semantics. I'd hesitate to provide complete access to the "buffer pointer" like is available in C++. But I might be OK with the "constructor" that takes a pointer to a buffer and provides no memory management (within the sequence itself). A complete departure would be to allow a sequence to "wrap" any collection that has an ordering iterator. But this would kill performance for simple ones like sequence of Octet.

I am on vacation this week, so won't be able to do any measurements until next week.
Victor Giddings
Objective Interface Systems
victor.giddings@mail.ois.com
Fax: +1 703 295 6501
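
For readers following the indexing discussion, here is a minimal sketch of how a thin bounded sequence could offer 0-based access while keeping a 1-based Get_Element that simply forwards with Index - 1, as suggested in the thread. The attachment corba-sequences-bounded_simple1.ads is not reproduced in this archive, so everything below is an assumption made for illustration: the names Thin_Bounded_Demo, Thin_Bounded, Int_Seq, Get, Count and Content are hypothetical, the choice of Constraint_Error for range violations is a guess, and this is not the proposed CORBA.Sequences.Bounded_Simple specification.

```ada
with Ada.Text_IO;

procedure Thin_Bounded_Demo is

   --  Hypothetical stand-in, NOT the Bounded_Simple spec from the thread.
   generic
      type Element is private;
      Max : Positive;
   package Thin_Bounded is
      subtype Length_Range is Natural range 0 .. Max;
      type Sequence is private;

      --  Null_Sequence as a function, so it is primitive on Sequence.
      function Null_Sequence return Sequence;
      function Length (Source : Sequence) return Length_Range;
      procedure Append (Source : in out Sequence; New_Item : Element);

      --  0-based access, as proposed for the alternative mapping.
      function Get (Source : Sequence; Index : Natural) return Element;

      --  1-based access kept for compatibility; forwards to Get.
      function Get_Element
        (Source : Sequence; Index : Positive) return Element;
   private
      type Element_Array is array (Natural range <>) of Element;
      type Sequence is record
         Count   : Length_Range := 0;
         Content : Element_Array (0 .. Max - 1);
      end record;
   end Thin_Bounded;

   package body Thin_Bounded is

      function Null_Sequence return Sequence is
         Result : Sequence;  --  Count defaults to 0
      begin
         return Result;
      end Null_Sequence;

      function Length (Source : Sequence) return Length_Range is
      begin
         return Source.Count;
      end Length;

      procedure Append (Source : in out Sequence; New_Item : Element) is
      begin
         if Source.Count = Max then
            raise Constraint_Error;  --  assumed choice of exception
         end if;
         Source.Content (Source.Count) := New_Item;
         Source.Count := Source.Count + 1;
      end Append;

      function Get (Source : Sequence; Index : Natural) return Element is
      begin
         if Index >= Source.Count then
            raise Constraint_Error;  --  assumed choice of exception
         end if;
         return Source.Content (Index);
      end Get;

      function Get_Element
        (Source : Sequence; Index : Positive) return Element is
      begin
         return Get (Source, Index - 1);
      end Get_Element;

   end Thin_Bounded;

   --  Same actual parameters as the instantiation quoted in the thread.
   package Int_Seq is new Thin_Bounded (Integer, 10);

   S : Int_Seq.Sequence := Int_Seq.Null_Sequence;

begin
   Int_Seq.Append (S, 11);
   Int_Seq.Append (S, 22);

   --  Index 0 and compatibility index 1 must name the same element.
   if Int_Seq.Get (S, 0) /= Int_Seq.Get_Element (S, 1) then
      raise Program_Error;
   end if;

   Ada.Text_IO.Put_Line
     ("Length:" & Natural'Image (Int_Seq.Length (S)));
end Thin_Bounded_Demo;
```

The forwarding wrapper adds only a subtraction and a call, which is in line with the expectation stated above that keeping the 1-based subprograms need not add much footprint. It does not address the other compatibility gap noted in the thread, namely that To_Element_Array would return an array starting at index 0.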