Issue 1128: String member initialization (cxx_revision)

Source:
Nature: Revision
Severity:
Summary: orbos/98-01-11 doesn't initialize string members if they are inside a sequence or array. For consistency, it would be better to adopt the following:
- A plain String_var is default-constructed to contain a null pointer (like all other _var types).
- If a structure, exception, sequence, or array contains a string, that string is initialized to the empty string when default-constructed. In case of a sequence of strings, this means that strings are default-constructed to the empty string when the sequence is extended.
Resolution:
Revised Text:
Actions taken:
April 1, 1998: received issue
February 19, 1999: closed issue, resolved
Discussion:

End of Annotations:=====

Date: Wed, 1 Apr 1998 08:40:34 +1000 (EST)
From: Michi Henning
To: cxx_revision@omg.org, issues@omg.org
Subject: string member initialization

Hi,

more on the change to initializing string members to the empty string...

orbos/98-01-11 doesn't initialize string members if they are inside a sequence or array.

For consistency, it would be better to adopt the following:

- A plain String_var is default-constructed to contain a null pointer (like all other _var types).

- If a structure, exception, sequence, or array contains a string, that string is initialized to the empty string when default-constructed. In case of a sequence of strings, this means that strings are default-constructed to the empty string when the sequence is extended.

No initialization is done for string members in unions - this is because the union mapping makes it impossible to activate a member without supplying a value anyway.

Cheers,

Michi.
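The two rules proposed in the message above can be sketched in a few lines of C++. This is only an illustration under stated assumptions: `StringVar`, `StringMember`, `Record`, `StringSeq`, and `dup_string` are hypothetical stand-ins invented here, not types or functions from the C++ mapping or from any ORB, and a `std::vector` plays the role of a sequence of strings.

```cpp
#include <cassert>
#include <cstring>
#include <vector>

// Hypothetical helper standing in for a string_dup-style allocator.
char* dup_string(const char* s) {
    char* p = new char[std::strlen(s) + 1];
    std::strcpy(p, s);
    return p;
}

// Hypothetical plain _var-like holder: default-constructed to a null pointer.
class StringVar {
public:
    StringVar() : p_(nullptr) {}
    explicit StringVar(char* p) : p_(p) {}   // adopts p
    ~StringVar() { delete[] p_; }
    StringVar(const StringVar&) = delete;
    StringVar& operator=(const StringVar&) = delete;
    const char* in() const { return p_; }
private:
    char* p_;
};

// Hypothetical string member type for use inside structs and sequences:
// default-constructed to the empty string.
class StringMember {
public:
    StringMember() : p_(dup_string("")) {}
    StringMember(const StringMember& o) : p_(dup_string(o.p_)) {}
    StringMember& operator=(const StringMember& o) {
        if (this != &o) {
            char* n = dup_string(o.p_);
            delete[] p_;
            p_ = n;
        }
        return *this;
    }
    ~StringMember() { delete[] p_; }
    const char* in() const { return p_; }
private:
    char* p_;
};

// A struct containing strings (hypothetical example type).
struct Record {
    StringMember name;
    StringMember comment;
};

// A sequence of strings, modeled here with std::vector.
using StringSeq = std::vector<StringMember>;

// Checks the two default-construction rules on the stand-in types.
void check_defaults() {
    StringVar v;
    assert(v.in() == nullptr);                    // plain holder: null
    Record r;
    assert(std::strcmp(r.name.in(), "") == 0);    // nested string: empty
    StringSeq seq;
    seq.resize(2);                                // extending the sequence
    assert(std::strcmp(seq[1].in(), "") == 0);    // new elements: empty
}
```

With these stand-ins, a default-constructed `Record` or an extended `StringSeq` contains only valid (empty) strings, while a default-constructed `StringVar` still holds a null pointer.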
-- Michi Henning +61 7 33654310 DSTC Pty Ltd +61 7 33654311 (fax) University of Qld 4072 michi@dstc.edu.au AUSTRALIA http://www.dstc.edu.au/BDU/staff/michi-henning.html Return-Path: Date: Tue, 31 Mar 1998 23:10:04 -0800 From: Jon Goldberg To: Michi Henning CC: cxx_revision@omg.org, issues@omg.org Subject: Re: string member initialization References: Hi- With the goal of simplicity and ease of use, why not just say String_var's by default contruction always contain an empty string. The C++ mapping is confusing enough. If marshaling NULL strings is an error, lets do whatever we can to make the programmer's life easier. It's not like doing this makes implementations any harder or slower, and it means there is less "special case" wording in the spec. take care, Jon Michi Henning wrote: > > Hi, > > more on the change to initializing string members to the empty > string... > > orbos/98-01-11 doesn't initialize string members if they are inside > a sequence or array. > > For consistency, it would be better to adopt the following: > > - A plain String_var is default-constructed to contain a > null pointer (like all other _var types). > > - If a structure, exception, sequence, or array contains > a string, that string is initialized to the empty string > when default-constructed. In case of a sequence of > strings, > this means that strings are default-constructed to the > empty string when the sequence is extended. > > No initialization is done for string members in unions - this is > because > the union mapping makes it impossible to activate a member without > supplying a value anyway. > Return-Path: X-Sender: vinoski@mail.boston.iona.ie Date: Wed, 01 Apr 1998 07:43:33 -0500 To: Jon Goldberg From: Steve Vinoski Subject: Re: string member initialization Cc: Michi Henning , cxx_revision@omg.org, issues@omg.org References: Hi Jon, Making String_vars initialize to the empty string would be a terrible mistake. 
Their entire reason for being is to adopt char* and ensure that they are eventually string_free'd. It would be very strange to have all _var types initialize themselves to the null pointer *except* String_var. Michi's suggestions actually make the mapping simpler, not more complex. If you consider all constructed types -- structs, sequences, arrays, but not unions -- the current mapping allows all of them to be default constructed. The result of such default construction can be passed over the wire in all cases except if there are any strings contained therein (the values of any contained integers might be nonsensical, but they're still legal values). Michi's proposal merely removes the special cases involving strings. In fact, what he's proposing for strings is exactly the way things work for object references today. Finally, I disagree that the C++ mapping is confusing -- it actually has a very small set of rules that are consistently applied. Learn those rules, and it's a piece of cake. --steve At 11:10 PM 3/31/98 -0800, Jon Goldberg wrote: >With the goal of simplicity and ease of use, why not >just say String_var's by default contruction always >contain an empty string. > >The C++ mapping is confusing enough. If marshaling NULL >strings is an error, lets do whatever we can to make >the programmer's life easier. It's not like doing this >makes implementations any harder or slower, and it means >there is less "special case" wording in the spec. > >take care, > Jon > >Michi Henning wrote: >> >> Hi, >> >> more on the change to initializing string members to the empty >> string... >> >> orbos/98-01-11 doesn't initialize string members if they are inside >> a sequence or array. >> >> For consistency, it would be better to adopt the following: >> >> - A plain String_var is default-constructed to contain a >> null pointer (like all other _var types). 
>> >> - If a structure, exception, sequence, or array contains >> a string, that string is initialized to the empty string >> when default-constructed. In case of a sequence of strings, >> this means that strings are default-constructed to the >> empty string when the sequence is extended. >> >> No initialization is done for string members in unions - this is >> because >> the union mapping makes it impossible to activate a member without >> supplying a value anyway. >> > > Return-Path: Date: Wed, 01 Apr 1998 08:57:37 -0500 From: Paul H Kyzivat Organization: NobleNet To: Jon Goldberg CC: Michi Henning , cxx_revision@omg.org, issues@omg.org Subject: Re: string member initialization References: <3521E84C.BBC78794@corp.borland.com> Jon Goldberg wrote: > > Hi- > > With the goal of simplicity and ease of use, why not > just say String_var's by default contruction always > contain an empty string. > > The C++ mapping is confusing enough. If marshaling NULL > strings is an error, lets do whatever we can to make > the programmer's life easier. It's not like doing this > makes implementations any harder or slower, and it means > there is less "special case" wording in the spec. > > take care, > Jon While this change wouldn't necessarily make implementations harder, it will certainly make them slower. Consider the following: char* test() { CORBA::String_var s1; // (1) // other computations s1 = CORBA::string_dup("abc"); // (2) // other computations return s1._retn(); // (3) } // (4) Under your suggestion, at (1) there must be a dynamic allocation of a null string - effectively a CORBA::string_dup(""). At (2) this is freed. At (3), when the old value is extracted by _retn, it must be replaced by a valid null string, so there must be another allocation. Then at (4), when the destructor for s1 is called, that value must be freed. 
So, in this very simple and common example there are two unnecessary allocations and two unnecessary frees; this will certainly make implementations slower. There are a lot of things wrong with the current string mapping, but these kinds of patches only seem to make it worse. I think the only useful solution is to replace the mapping with a real string class. Return-Path: Sender: jon@floorboard.com Date: Wed, 01 Apr 1998 08:27:09 -0800 From: Jonathan Biggar To: Steve Vinoski CC: Jon Goldberg , Michi Henning , cxx_revision@omg.org, issues@omg.org Subject: Re: string member initialization References: <3.0.5.32.19980401074333.007c55d0@mail.boston.iona.ie> Steve Vinoski wrote: > > Hi Jon, > > Making String_vars initialize to the empty string would be a > terrible > mistake. Their entire reason for being is to adopt char* and ensure > that they are eventually string_free'd. It would be very strange to > have all _var types initialize themselves to the null pointer > *except* String_var. > > Michi's suggestions actually make the mapping simpler, not more > complex. If you consider all constructed types -- structs, > sequences, > arrays, but not unions -- the current mapping allows all of them to > be default constructed. The result of such default construction can > be passed over the wire in all cases except if there are any strings > contained therein (the values of any contained integers might be > nonsensical, but they're still legal values). Michi's proposal > merely > removes the special cases involving strings. In fact, what he's > proposing for strings is exactly the way things work for object > references today. > > Finally, I disagree that the C++ mapping is confusing -- it actually > has a very small set of rules that are consistently applied. Learn > those rules, and it's a piece of cake. 
Also, there is probably already a large body of code that depends on the fact that String_vars are initialized to 0, for example, using the null pointer as a test of whether the String_var has been set to a value yet or not. This change would break all of that code. It is much less likely that struct or sequence string members are used in the same way. -- Jon Biggar Floorboard Software jon@floorboard.com jon@biggar.org Return-Path: Sender: jon@floorboard.com Date: Wed, 01 Apr 1998 08:38:22 -0800 From: Jonathan Biggar To: Paul H Kyzivat CC: Jon Goldberg , Michi Henning , cxx_revision@omg.org, issues@omg.org Subject: Re: string member initialization References: <3521E84C.BBC78794@corp.borland.com> <352247D1.A2A0B238@noblenet.com> Paul H Kyzivat wrote: > While this change wouldn't necessarily make implementations harder, > it will certainly make them slower. Consider the following: > > char* test() > { > CORBA::String_var s1; // (1) > // other computations > s1 = CORBA::string_dup("abc"); // (2) > // other computations > return s1._retn(); // (3) > } // (4) > > Under your suggestion, at (1) there must be a dynamic allocation of > a > null string - effectively a CORBA::string_dup(""). At (2) this is > freed. At (3), when the old value is extracted by _retn, it must > be replaced by a valid null string, so there must be another > allocation. > Then at (4), when the destructor for s1 is called, that value must > be freed. So, in this very simple and common example there are two > unnecessary allocations and two unnecessary frees; this will > certainly > make implementations slower. It doesn't have to make them slower. The implementor can use a special pointer to an empty string (such as a static member of the String_var class) to initialize the value, and test in the destructor (and string_free) for that value and avoid the allocations and deallocations. 
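A minimal sketch of the sentinel idea described above, assuming a hypothetical holder class: `EmptyDefaultStringVar`, `dup_string`, and the `g_allocs` counter are invented for this illustration and do not come from any ORB. The default constructor points at a shared static empty string, and the destructor and assignment skip the free when they see that sentinel.

```cpp
#include <cassert>
#include <cstring>

static int g_allocs = 0;   // counts heap allocations made by this sketch

// Hypothetical helper standing in for a string_dup-style allocator.
char* dup_string(const char* s) {
    ++g_allocs;
    char* p = new char[std::strlen(s) + 1];
    std::strcpy(p, s);
    return p;
}

// Hypothetical String_var-like holder that defaults to a shared empty string.
class EmptyDefaultStringVar {
public:
    EmptyDefaultStringVar() : p_(sentinel()) {}        // no allocation
    ~EmptyDefaultStringVar() { release(); }
    EmptyDefaultStringVar(const EmptyDefaultStringVar&) = delete;
    EmptyDefaultStringVar& operator=(const EmptyDefaultStringVar&) = delete;

    // Adopts p, which must come from dup_string.
    EmptyDefaultStringVar& operator=(char* p) {
        release();
        p_ = p;
        return *this;
    }

    const char* in() const { return p_; }

    // Hands ownership to the caller. The caller will free the result, so
    // the sentinel itself must never escape; it is duplicated instead.
    char* _retn() {
        char* result = (p_ == sentinel()) ? dup_string("") : p_;
        p_ = sentinel();
        return result;
    }

private:
    static char* sentinel() {
        static char empty[1] = {'\0'};
        return empty;
    }
    void release() {
        if (p_ != sentinel()) delete[] p_;             // the extra test
    }
    char* p_;
};

// Same shape as the test() function quoted earlier in this message.
char* test() {
    EmptyDefaultStringVar s1;          // (1) points at the sentinel
    assert(s1.in()[0] == '\0');
    s1 = dup_string("abc");            // (2) nothing to free
    return s1._retn();                 // (3) returns "abc", resets to sentinel
}                                      // (4) destructor frees nothing
```

In this sketch the quoted `test()` function performs one allocation rather than three. The sentinel still has to be special-cased wherever ownership leaves the holder, as `_retn()` shows for a holder that was never assigned.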
> There are a lot of things wrong with the current string mapping, but > these kinds of patches only seem to make it worse. I think the only > useful solution is to replace the mapping with a real string class. Yup, it would be nice if one of these years the C++ mapping used the STL string for strings, the STL vector for sequences, and perhaps auto_ptr instead of _var. However the C++ mapping was not done this way for two reasons: 1. When the mapping was first developed, STL wasn't accepted as part of the C++ standard, in fact there was almost nothing in the C++ standard library. 2. The C++ mapping is designed to make it possible (although not easy) for an ORB to support the C and the C++ mapping in the same address space, with binary compatibility between the data structures. It is arguable whether it was worth it or not, but given the constraint, the C++ mapping had to turn out pretty much this way. -- Jon Biggar Floorboard Software jon@floorboard.com jon@biggar.org Return-Path: Date: Wed, 01 Apr 1998 18:44:42 -0500 From: Paul H Kyzivat Organization: NobleNet To: Jonathan Biggar CC: Jon Goldberg , Michi Henning , cxx_revision@omg.org, issues@omg.org Subject: Re: string member initialization References: <3521E84C.BBC78794@corp.borland.com> <352247D1.A2A0B238@noblenet.com> <35226D7E.1AF315EE@floorboard.com> Jonathan Biggar wrote: > > It doesn't have to make them slower. The implementor can use a > special > pointer to an empty string (such as a static member of the > String_var > class) to initialize the value, and test in the destructor (and > string_free) for that value and avoid the allocations and > deallocations. Yes, that could be done, but not without catching a lot of other special cases as well, each of which would have to dup the special string. E.g. _retn(), inout(). And, each of these special cases does make things marginally slower. More importantly, they may well prevent something from being inlined that ought to have been. 
If this must be done, it would probably be cleaner to allow the pointer to be null but force the accessors and conversions to yield up a reference to the null string. > > > There are a lot of things wrong with the current string mapping, > but > > these kinds of patches only seem to make it worse. I think the > only > > useful solution is to replace the mapping with a real string > class. > > Yup, it would be nice if one of these years the C++ mapping used the > STL > string for strings, the STL vector for sequences, and perhaps > auto_ptr > instead of _var. > > However the C++ mapping was not done this way for two reasons: > > 1. When the mapping was first developed, STL wasn't accepted as > part > of > the C++ standard, in fact there was almost nothing in the C++ > standard > library. > > 2. The C++ mapping is designed to make it possible (although not > easy) > for an ORB to support the C and the C++ mapping in the same address > space, with binary compatibility between the data structures. It is > arguable whether it was worth it or not, but given the constraint, > the > C++ mapping had to turn out pretty much this way. Yes, I know that story. but it is history now. The C/C++ coexistence would be more interesting if there was any evidence that the goal had been achieved. (Does any such implementation exist? Does anybody care?) If it was possible to put out the POA specification and deprecate the BOA, it should also be possible to put out an entirely new C++ binding and deprecate the old. Return-Path: X-Authentication-Warning: tigger.dstc.edu.au: michi owned process doing -bs Date: Thu, 2 Apr 1998 09:56:07 +1000 (EST) From: Michi Henning To: Paul H Kyzivat cc: Jonathan Biggar , Jon Goldberg , cxx_revision@omg.org Subject: Re: string member initialization Content-ID: On Wed, 1 Apr 1998, Paul H Kyzivat wrote: > And, each of these special cases does make things marginally slower. 
> More importantly, they may well prevent something from being inlined
> that ought to have been.

The cost is one extra if-statement, that is, so close to zero as to be unmeasurable. Besides, if you argue for an eventual mapping to STL strings, then that same cost will be incurred (because STL strings are initialized to empty).

The issue here is not the cost at run time, in my opinion. Instead, the issue is to provide a good programming model. Something like an extra four or five clock ticks per initialization is nothing compared to the cost of addressing a higher defect rate caused by an arcane API...

> If this must be done, it would probably be cleaner to allow the pointer
> to be null but force the accessors and conversions to yield up a
> reference to the null string.

This is an implementation detail. Initialization to the empty string would be mandated at a logical level - if the implementation wants to use lazy evaluation, that's its business and not visible from the outside.

> The C/C++ coexistence would be more interesting if there was any
> evidence that the goal had been achieved. (Does any such implementation
> exist? Does anybody care?)

That would be interesting to know. I have a sneaking suspicion that even though the spec permits the coexistence, actually implementing it would be quite difficult. I am not aware of any ORB that currently has binary C/C++ compatibility (anybody know of such a beast?)

> If it was possible to put out the POA specification and deprecate the
> BOA, it should also be possible to put out an entirely new C++ binding
> and deprecate the old.

Not in a hurry, but I think migration to a different mapping would be possible over time, by allowing both mappings to co-exist for a while (not necessarily in the same address space though). However, I suspect you will meet with strong vendor resistance - it's hard enough to maintain one C++ mapping, let alone two...

Cheers,

Michi.
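The lazy-evaluation point in the message above (empty at the logical level, null internally) can be sketched as follows. `LazyStringMember` and its member functions are hypothetical names invented for this illustration, not part of the C++ mapping; the sketch only shows that the visible difference is a single branch in the read accessor.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical string member that stores a null pointer internally but
// always presents an empty string to callers.
class LazyStringMember {
public:
    LazyStringMember() : p_(nullptr) {}                // no allocation
    ~LazyStringMember() { delete[] p_; }
    LazyStringMember(const LazyStringMember&) = delete;
    LazyStringMember& operator=(const LazyStringMember&) = delete;

    // Stores a private copy of s.
    void assign(const char* s) {
        char* n = new char[std::strlen(s) + 1];
        std::strcpy(n, s);
        delete[] p_;
        p_ = n;
    }

    // Read accessor: the one extra if-statement.
    const char* in() const { return p_ ? p_ : ""; }

private:
    char* p_;
};

// Checks that a default-constructed member reads as the empty string.
void check_lazy_default() {
    LazyStringMember m;
    assert(std::strcmp(m.in(), "") == 0);
    m.assign("abc");
    assert(std::strcmp(m.in(), "abc") == 0);
}
```

From the outside, a default-constructed `LazyStringMember` is indistinguishable from one that eagerly allocated an empty string.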
-- Michi Henning +61 7 33654310 DSTC Pty Ltd +61 7 33654311 (fax) University of Qld 4072 michi@dstc.edu.au AUSTRALIA http://www.dstc.edu.au/BDU/staff/michi-henning.html Return-Path: Date: Thu, 02 Apr 1998 04:41:57 -0800 From: Jon Goldberg To: Steve Vinoski CC: goldberg@visigenic.com, michi@dstc.edu.au, cxx_revision@omg.org, issues@omg.org Subject: Re: string member initialization References: Steve Vinoski wrote: > > Hi Jon, > > Making String_vars initialize to the empty string would be a > terrible > mistake. Their entire reason for being is to adopt char* and ensure > that they are eventually string_free'd. It would be very strange to > have all _var types initialize themselves to the null pointer > *except* String_var. I don't really see how these two ideas go together. Clearly the _var is for memory management. Its initial value (NULL or otherwise) seems to have little to do with that. My recommended change makes strings exactly the same as object references with respect to _vars, since both of these types have slightly different semantics with respect to "no value" as it maps to C++. The default for object reference _vars is to a nil value that is marshalable. The default for struct _vars, sequence _vars, etc. is a non-marshalable NULL pointer. Currently the default value for String_var is non-marshalable. I propose that if we make it ever default to a marshalable state, we should consistently make that change just like with object reference _vars. > Michi's suggestions actually make the mapping simpler, not more > complex. If you consider all constructed types -- structs, > sequences, > arrays, but not unions -- the current mapping allows all of them to > be default constructed. The result of such default construction can > be passed over the wire in all cases except if there are any strings > contained therein (the values of any contained integers might be > nonsensical, but they're still legal values). 
Michi's proposal > merely > removes the special cases involving strings. In fact, what he's > proposing for strings is exactly the way things work for object > references today. Except that for object references today, we CAN marshal the default value of the _var (nil). In the proposal, attempts to marshal the default String_var still raise MARSHAL exceptions, yet the default _var for an object reference is properly marshalable. I believe that my extension of his proposal makes String_vars behave exactly as will happen for default object reference _vars today. I *do* however, agree with the argument that changing this initialization rule could break today's code that tests to see if the contents of a string_var are NULL (as opposed to of length 0). I don't really see why breaking code that expects this behavior when the string is a member of a struct is particularly different from the behavior when the String_var. -Jon Return-Path: X-Authentication-Warning: tigger.dstc.edu.au: michi owned process doing -bs Date: Fri, 3 Apr 1998 06:07:57 +1000 (EST) From: Michi Henning To: Jon Goldberg cc: Steve Vinoski , goldberg@visigenic.com, cxx_revision@omg.org Subject: Re: string member initialization On Thu, 2 Apr 1998, Jon Goldberg wrote: > I *do* however, agree with the argument that changing > this initialization rule could break today's code that > tests to see if the contents of a string_var are NULL > (as opposed to of length 0). I don't really see why > breaking code that expects this behavior when the string > is a member of a struct is particularly different from > the behavior when the String_var. Apparently, there is quite a bit of code out there that uses tests like (s != 0) if s is a String_var -- developers started using String_var not only for IDL types, but also as a general-purpose string auto_ptr. Initializing String_vars to empty would break such code. For nested strings, this is much less likely to be a problem. It also makes some things easier to use. 
For example, an inout sequence of strings wouldn't have to be explicitly initialized for all elements, and things like Naming Service name components become a lot easier to use if I want to ignore the "kind" field.

My motivation for suggesting this initialization was to make life easier for the programmer. However, we can't just go and gratuitously break existing code. That's why a plain String_var is still initialized to null.

Cheers,

Michi.
--
Michi Henning               +61 7 33654310
DSTC Pty Ltd                +61 7 33654311 (fax)
University of Qld 4072      michi@dstc.edu.au
AUSTRALIA                   http://www.dstc.edu.au/BDU/staff/michi-henning.html

Date: Mon, 2 Nov 1998 15:05:31 -0330
From: Matthew Newhook
To: cxx_revision@omg.org
Subject: Text for issue# 1128

Hi,

In ptc/98-09-03 (09 Oct, 1998) the text for issue #1128 reads:

  * Issue 1128: String member initialization. Changed default construction
    semantics for string members of exceptions and arrays to match those
    of string members of structs, unions, and sequences. String and wide
    string members of all of these types are now default constructed to the
    empty string.

Later in the Mapping for Union section:

    The default union constructor performs no application-visible
    initialization of the union. It does not initialize the discriminator, nor
    does it initialize any union members to a state useful to an application.

My question is: what does the issue text mean, then, when it says that the string and wide string members are default constructed to the empty string?

Best Regards, Matthew
--
Matthew Newhook                 E-Mail: mailto:matthew@ooc.com
Software Designer               WWW: http://www.ooc.com
Object Oriented Concepts, Inc.
Phone: (978) 439 9285 x 246 Return-Path: Sender: jon@floorboard.com Date: Mon, 02 Nov 1998 11:23:59 -0800 From: Jonathan Biggar X-Accept-Language: en To: Matthew Newhook CC: cxx_revision@omg.org Subject: Re: Text for issue# 1128 References: <19981102150531.A26630@wiley242h106.roadrunner.nf.net> Matthew Newhook wrote: > > Hi, > In ptc/98-09-03 (09 Oct, 1998) the text for issue #1128 reads: > > * Issue 1128: String member initialization. Changed default > construction > semantics for string members of exceptions and arrays to match those > of string members of structs, unions, and sequences. String and wide > string members of all of these types are now default constructed to > the > empty string. > > Later in the Mapping for Union section: > > The default union constructor performs no application-visible > initialization of the union. It does not initialize the > discriminator, nor > does it initialize any union members to a state useful to an > application. > > My question is what does the issue text mean then when it says that > the > string and wide string members are default constructed to the empty > string? It's not entirely clear what the context of the question is, but I'll give it a shot: ignore unions in the issue text. Unions are initialized to have no value, so the mention of unions in issue 1128 is superfluous. I suspect that unions got thrown into the mix for issue 1128 by an unchecked enthusiasm for completeness. :-) -- Jon Biggar Floorboard Software jon@floorboard.com jon@biggar.org Return-Path: X-Authentication-Warning: azure.dstc.edu.au: michi owned process doing -bs Date: Tue, 3 Nov 1998 05:38:24 +1000 (EST) From: Michi Henning To: Jonathan Biggar cc: Matthew Newhook , cxx_revision@omg.org Subject: Re: Text for issue# 1128 On Mon, 2 Nov 1998, Jonathan Biggar wrote: > > My question is what does the issue text mean then when it says that the > > string and wide string members are default constructed to the empty > > string? 
> > It's not entirely clear what the context of the question is, but I'll > give it a shot: ignore unions in the issue text. Unions are > initialized to have no value, so the mention of unions in issue 1128 is > superfluous. > > I suspect that unions got thrown into the mix for issue 1128 by an > unchecked enthusiasm for completeness. :-) Yep. It doesn't make sense to say anything about default initialization of union members. Which union member should be initialized when the union is constructed? Because you cannot activate a union member other than by assigning a value to it, there is no time at which it would ever be necessary to default construct a union member. Cheers, Michi. -- Michi Henning +61 7 33654310 DSTC Pty Ltd +61 7 33654311 (fax) University of Qld 4072 michi@dstc.edu.au AUSTRALIA http://www.dstc.edu.au/BDU/staff/michi-henning.html Return-Path: X-Sender: vinoski@mail.boston.iona.ie Date: Mon, 02 Nov 1998 15:43:01 -0500 To: Michi Henning From: Steve Vinoski Subject: Re: Text for issue# 1128 Cc: cxx_revision@omg.org References: <363E06CF.92E242B1@floorboard.com> At 05:38 AM 11/3/98 +1000, Michi Henning wrote: >On Mon, 2 Nov 1998, Jonathan Biggar wrote: > >> > My question is what does the issue text mean then when it says that the >> > string and wide string members are default constructed to the empty >> > string? >> >> It's not entirely clear what the context of the question is, but I'll >> give it a shot: ignore unions in the issue text. Unions are >> initialized to have no value, so the mention of unions in issue 1128 is >> superfluous. >> >> I suspect that unions got thrown into the mix for issue 1128 by an >> unchecked enthusiasm for completeness. :-) > >Yep. It doesn't make sense to say anything about default initialization >of union members. Which union member should be initialized when the >union is constructed? 
Because you cannot activate a union member other >than by assigning a value to it, there is no time at which it would >ever be necessary to default construct a union member. I have fixed this editorially. It never fails -- as soon as I publish a new version, even more editorial issues come flying in. I will publish a revised version of ptc/98-09-03 on Friday, November 6, so please get any other editorial issues to me ASAP. --steve