Issue 3574: unclear semantics for valuetype insertion into Any (cxx_revision)

Source: Cisco Systems (Mr. Paul Kyzivat, pkyzivat(at)cisco.com)
Nature: Uncategorized Issue
Severity:
Summary:
The semantics for insertion of a valuetype into an Any are unclear. (Note, this is related to issue 2531 in the IDL-to-Java RTF. It is also related to orb_revision issue 3205.)

In section 1.16.2 of ptc/2000-01-02, two forms of insertion are defined: copying and non-copying. The non-copying form is described as:

"The noncopying valuetype insertion consumes the valuetype pointed to by the pointer that T** points to. After insertion, the caller may not access the valuetype instance pointed to by the pointer that T* points to. The caller maintains ownership of the storage for the pointed-to-T* itself."

There is no specific description of the copying form specific to valuetypes, so the generic description must apply:

"For the copying version of operator<<=, the lifetime of the value in the any is independent of the lifetime of the value passed to operator<<=. The implementation of the any may not store its value as a reference or pointer to the value passed to operator<<=."

One possible interpretation (1) is that the copying form should be implemented via a call to the _copy_value virtual function, while the non-copying form should simply retain the provided pointer (without calling _add_ref) and eventually call _remove_ref when done with it.

If so, what is the significance of the rule about the caller not continuing to use the pointer? Is it only that it has lost a reference count, and may continue using the pointer if it has another reference count? Or does this imply that continued access to the value is forbidden regardless of reference count?

Another possible interpretation (2) is that the description is nonsense, and that the non-copying form should use _add_ref and the copying form should use _copy_value.
In this interpretation the caller would be free to continue using the original pointer and would be obligated to _remove_ref it eventually. This seems like a more practical interpretation, but is inconsistent with usage for other non-copying insertions.

Suggested Resolution:

Replace the paragraph on non-copying insertion of valuetypes (quoted above) with:

"The noncopying valuetype insertion takes ownership of one reference count to the valuetype pointed to by the pointer that T** points to. After insertion, the caller should treat the pointer as if _remove_ref had been called on it. The caller maintains ownership of the storage for the pointed-to-T* itself."

"For copying valuetype insertion, the lifetime of the value in the any is independent of the lifetime of the value provided. The implementation of the any shall duplicate the value using the virtual function _copy_value or an equivalent mechanism. The caller retains ownership of the T* pointer and remains obliged to call _remove_ref on it."

Resolution:
Revised Text:
Actions taken:
April 20, 2000: received issue

Discussion:
deferred in June 2011 to the next RTF

End of Annotations:=====

From: Paul Kyzivat
To: "'issues@omg.org'", "'cxx_revision@omg.org'"
Subject: unclear semantics for valuetype insertion into Any
Date: Thu, 20 Apr 2000 17:46:09 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2448.0)
Content-Type: text/plain; charset="iso-8859-1"

The semantics for insertion of a valuetype into an Any are unclear. (Note, this is related to issue 2531 in the IDL-to-Java RTF. It is also related to orb_revision issue 3205.)

In section 1.16.2 of ptc/2000-01-02, two forms of insertion are defined: copying and non-copying. The non-copying form is described as:

"The noncopying valuetype insertion consumes the valuetype pointed to by the pointer that T** points to. After insertion, the caller may not access the valuetype instance pointed to by the pointer that T* points to.
The caller maintains ownership of the storage for the pointed-to-T* itself."

There is no specific description of the copying form specific to valuetypes, so the generic description must apply:

"For the copying version of operator<<=, the lifetime of the value in the any is independent of the lifetime of the value passed to operator<<=. The implementation of the any may not store its value as a reference or pointer to the value passed to operator<<=."

One possible interpretation (1) is that the copying form should be implemented via a call to the _copy_value virtual function, while the non-copying form should simply retain the provided pointer (without calling _add_ref) and eventually call _remove_ref when done with it.

If so, what is the significance of the rule about the caller not continuing to use the pointer? Is it only that it has lost a reference count, and may continue using the pointer if it has another reference count? Or does this imply that continued access to the value is forbidden regardless of reference count?

Another possible interpretation (2) is that the description is nonsense, and that the non-copying form should use _add_ref and the copying form should use _copy_value. In this interpretation the caller would be free to continue using the original pointer and would be obligated to _remove_ref it eventually. This seems like a more practical interpretation, but is inconsistent with usage for other non-copying insertions.

Suggested Resolution:

Replace the paragraph on non-copying insertion of valuetypes (quoted above) with:

"The noncopying valuetype insertion takes ownership of one reference count to the valuetype pointed to by the pointer that T** points to. After insertion, the caller should treat the pointer as if _remove_ref had been called on it. The caller maintains ownership of the storage for the pointed-to-T* itself."

"For copying valuetype insertion, the lifetime of the value in the any is independent of the lifetime of the value provided.
The implementation of the any shall duplicate the value using the virtual function _copy_value or an equivalent mechanism. The caller retains ownership of the T* pointer and remains obliged to call _remove_ref on it."

Sender: jbiggar@corvette.floorboard.com
Message-ID: <38FF83F3.BDC350C7@floorboard.com>
Date: Thu, 20 Apr 2000 15:25:55 -0700
From: Jonathan Biggar
X-Mailer: Mozilla 4.7 [en] (X11; U; SunOS 5.6 sun4u)
X-Accept-Language: en
MIME-Version: 1.0
To: Paul Kyzivat
CC: "'cxx_revision@omg.org'"
Subject: Re: unclear semantics for valuetype insertion into Any
References: <9B164B713EE9D211B6DC0090273CEEA926BE1D@bos1.noblenet.com>
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset=us-ascii

Paul Kyzivat wrote:
>
> The semantics for insertion of a valuetype into an Any are unclear.

[See comments interspersed below.]

> In section 1.16.2 of ptc/2000-01-02, two forms of insertion are defined:
> copying and non-copying. The non-copying form is described as:
>
> "The noncopying valuetype insertion consumes the valuetype pointed to by the
> pointer that T** points to. After insertion, the caller may not access the
> valuetype instance pointed to by the pointer that T* points to. The caller
> maintains ownership of the storage for the pointed-to-T* itself."
>
> There is no specific description of the copying form specific to valuetypes,
> so the generic description must apply:
>
> "For the copying version of operator<<=, the lifetime of the value in the
> any is independent of the lifetime of the value passed to operator<<=. The
> implementation of the any may not store its value as a reference or pointer
> to the value passed to operator<<=."
>
> One possible interpretation (1) is that the copying form should be
> implemented via a call to the _copy_value virtual function, while the
> non-copying form should simply retain the provided pointer (without calling
> _add_ref) and eventually call _remove_ref when done with it.
>
> If so, what is the significance of the rule about the caller not continuing
> to use the pointer? Is it only that it has lost a reference count, and may
> continue using the pointer if it has another reference count? Or does this
> imply that continued access to the value is forbidden regardless of
> reference count?
>
> Another possible interpretation (2) is that the description is nonsense, and
> that the non-copying form should use _add_ref and the copying form should
> use _copy_value. In this interpretation the caller would be free to continue
> using the original pointer and would be obligated to _remove_ref it
> eventually. This seems like a more practical interpretation, but is
> inconsistent with usage for other non-copying insertions.
>
> Suggested Resolution:
>
> Replace the paragraph on non-copying insertion of valuetypes (quoted above)
> with:
>
> "The noncopying valuetype insertion takes ownership of one reference count
> to the valuetype pointed to by the pointer that T** points to. After
> insertion, the caller should treat the pointer as if _remove_ref had been
> called on it. The caller maintains ownership of the storage for the
> pointed-to-T* itself."
>
> "For copying valuetype insertion, the lifetime of the value in the any is
> independent of the lifetime of the value provided. The implementation of the
> any shall duplicate the value using the virtual function _copy_value or an
> equivalent mechanism. The caller retains ownership of the T* pointer and
> remains obliged to call _remove_ref on it."

I agree that the text is not clear, but I disagree with your proposed solution. What should be stored in the any is a pointer to the original valuetype, not a copy of the valuetype. By my analysis, the non-copying form consumes a reference to the valuetype (will call _remove_ref() without calling _add_ref()) and the copying form creates its own reference (will call _add_ref() and then _remove_ref()).
Besides, anyone who needs the _copy_value() solution you posed can simply do:

    V *new_v = myv->_copy_value();
    Any a;

    a <<= &new_v;

[As an aside, I raised a different issue about the semantics of _copy_value. Some ORB vendors seem to think that _copy_value should do a deep copy, and not a shallow copy. Since _copy_value must be provided by the valuetype implementor, a deep copy is rather difficult to accomplish, since there isn't any place for storing context to avoid graph cycles and infinite recursion. If _copy_value only needs to be shallow, its usefulness is rather limited, since the ORB must implement its own copy mechanism anyway. I think we ought to deprecate _copy_value.]

Counter-Suggested Resolution:

"The noncopying valuetype insertion takes ownership of one reference count to the valuetype pointed to by the pointer that T** points to. After insertion, the caller should treat the pointer as if _remove_ref had been called on it. The caller maintains ownership of the storage for the pointed-to-T* itself."

"Copying valuetype insertion calls _add_ref on its argument, and otherwise behaves the same way as noncopying insertion."

-- 
Jon Biggar
Floorboard Software
jon@floorboard.com
jon@biggar.org

From: Paul Kyzivat
To: "'cxx_revision@omg.org'"
Subject: RE: unclear semantics for valuetype insertion into Any
Date: Thu, 20 Apr 2000 19:42:53 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2448.0)
Content-Type: text/plain; charset="iso-8859-1"

> I agree that the text is not clear, but I disagree with your proposed
> solution. What should be stored in the any is a pointer to the original
> valuetype, not a copy of the valuetype. By my analysis, the non-copying
> form consumes a reference to the valuetype (will call _remove_ref()
> without calling _add_ref()) and the copying form creates its own
> reference (will call _add_ref() and then _remove_ref()).

Just goes to show how unclear it is.
I am not especially wed to the proposal I made. I made it because it seemed self consistent and philosophically consistent with all the other types.

I see no point in having two forms if they are as similar as in your proposal. Would be better to eliminate the non-copying form altogether if the copying form is simply an _add_ref. (We don't need another confusing way to do _remove_ref.)

> Besides, anyone who needs the _copy_value() solution you posed can
> simply do:
>
> V *new_v = myv->_copy_value();
> Any a;
>
> a <<= &new_v;

I can't argue with that.

>
> [As an aside, I raised a different issue about the semantics of
> _copy_value. Some ORB vendors seem to think that _copy_value
> should do a deep copy, and not a shallow copy.

Why shouldn't vendors believe that, considering that is what the description in table 1-2 says?

> Since _copy_value must be provided
> by the valuetype implementor, a deep copy is rather difficult to
> accomplish, since there isn't any place for storing context to avoid
> graph cycles and infinite recursion. If _copy_value only needs to be
> shallow, its usefulness is rather limited, since the ORB
> must implement its own copy mechanism anyway.

Why do you say that? In principle I don't see how the orb can do that. The actual valuetype class is written by the user. The orb cannot know how to copy it.

> I think we ought to deprecate _copy_value.]
>
> Counter-Suggested Resolution:
>
> "The noncopying valuetype insertion takes ownership of one reference count
> to the valuetype pointed to by the pointer that T** points to. After
> insertion, the caller should treat the pointer as if _remove_ref had been
> called on it. The caller maintains ownership of the storage for the
> pointed-to-T* itself."
>
> "Copying valuetype insertion calls _add_ref on its argument, and
> otherwise behaves the same way as noncopying insertion."

Counter-Counter-Resolution:

"Copying valuetype insertion calls _add_ref on its argument.
The caller remains obligated to call _remove_ref on the pointer, and the Any assumes an obligation to call _remove_ref on its copy of the pointer at destruction or when a new value is inserted. Valuetypes have no non-copying insertion operator."

Paul

Sender: jbiggar@corvette.floorboard.com
Message-ID: <38FF9F42.49A3DB81@floorboard.com>
Date: Thu, 20 Apr 2000 17:22:26 -0700
From: Jonathan Biggar
X-Mailer: Mozilla 4.7 [en] (X11; U; SunOS 5.6 sun4u)
X-Accept-Language: en
MIME-Version: 1.0
To: Paul Kyzivat
CC: "'cxx_revision@omg.org'"
Subject: Re: unclear semantics for valuetype insertion into Any
References: <9B164B713EE9D211B6DC0090273CEEA926BE22@bos1.noblenet.com>
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset=us-ascii

Paul Kyzivat wrote:
>
> > I agree that the text is not clear, but I disagree with your proposed
> > solution. What should be stored in the any is a pointer to the original
> > valuetype, not a copy of the valuetype. By my analysis, the non-copying
> > form consumes a reference to the valuetype (will call _remove_ref()
> > without calling _add_ref()) and the copying form creates its own
> > reference (will call _add_ref() and then _remove_ref()).
>
> Just goes to show how unclear it is.
> I am not especially wed to the proposal I made.
> I made it because it seemed self consistent and philosophically
> consistent with all the other types.
>
> I see no point in having two forms if they are as similar as in your
> proposal. Would be better to eliminate the non-copying form altogether if
> the copying form is simply an _add_ref.
> (We don't need another confusing way to do _remove_ref.)

We've already got object reference insertion as an example. :-(

> > [As an aside, I raised a different issue about the semantics of
> > _copy_value. Some ORB vendors seem to think that _copy_value
> > should do a deep copy, and not a shallow copy.
>
> Why shouldn't vendors believe that, considering that is what the description
> in table 1-2 says?

Well, there's deep copy and there's deeper copy. :-) My interpretation of deep copy here is that it copies each component of the valuetype, but not valuetypes that are nested in those components. Otherwise, you end up with _copy_value() breaking for cyclic graphs.

> > Since _copy_value must be provided
> > by the valuetype implementor, a deep copy is rather difficult to
> > accomplish, since there isn't any place for storing context to avoid
> > graph cycles and infinite recursion. If _copy_value only needs to be
> > shallow, its usefulness is rather limited, since the ORB
> > must implement its own copy mechanism anyway.
>
> Why do you say that? In principle I don't see how the orb can do that.
> The actual valuetype class is written by the user. The orb cannot know
> how to copy it.

Sure it can, by brute force marshalling and unmarshalling, if it must, although there are better ways that I can think of.

> > I think we ought to deprecate _copy_value.]
> >
> > Counter-Suggested Resolution:
> >
> > "The noncopying valuetype insertion takes ownership of one reference count
> > to the valuetype pointed to by the pointer that T** points to. After
> > insertion, the caller should treat the pointer as if _remove_ref had been
> > called on it. The caller maintains ownership of the storage for the
> > pointed-to-T* itself."
> >
> > "Copying valuetype insertion calls _add_ref on its argument, and
> > otherwise behaves the same way as noncopying insertion."
>
> Counter-Counter-Resolution:
>
> "Copying valuetype insertion calls _add_ref on its argument. The caller
> remains obligated to call _remove_ref on the pointer, and the Any assumes an
> obligation to call _remove_ref on its copy of the pointer at destruction or
> when a new value is inserted. Valuetypes have no non-copying insertion
> operator."
We might as well keep non-copying insertion because the precedent is already there with object references.

-- 
Jon Biggar
Floorboard Software
jon@floorboard.com
jon@biggar.org

From: Paul Kyzivat
To: "'cxx_revision@omg.org'"
Subject: RE: unclear semantics for valuetype insertion into Any
Date: Fri, 21 Apr 2000 18:57:33 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2448.0)
Content-Type: text/plain; charset="iso-8859-1"

> From: Jonathan Biggar [mailto:jon@floorboard.com]

> > I see no point in having two forms if they are as similar as in your
> > proposal. Would be better to eliminate the non-copying form altogether if
> > the copying form is simply an _add_ref.
> > (We don't need another confusing way to do _remove_ref.)
>
> We've already got object reference insertion as an example. :-(

Your point is that we should provide the same for valuetypes in order to be consistent, even though you agree that the precedent being followed is a bad one?

I think I would like to hear from someone who can explain the reasoning for providing both kinds of insertion for Object References.

> > Why shouldn't vendors believe that, considering that
> > is what the description in table 1-2 says?
>
> Well, there's deep copy and there's deeper copy. :-)
> My interpretation of deep copy here is that it copies
> each component of the valuetype, but
> not valuetypes that are nested in those components.

I think the existing words are pretty explicit. They say "the copy has no connections with the original instance and has a lifetime independent of that original". To me that says the people who wrote those words meant a really-deep copy. (Whether that intention was valid is another matter.)

> Otherwise, you end
> up with _copy_value() breaking for cyclic graphs.

A simple recursive implementation will certainly break.
I agree that it is asking a lot to require the implementor of a valuetype to manage the bookkeeping required to detect shared values and cycles, but it is not impossible.

So first I think we need to decide if the intent was wrong. If so, we can make the change you suggest, and also change all the words that imply a true deep copy.

> > The actual valuetype class is written by the user.
> > The orb cannot know how to copy it.
>
> Sure it can, by brute force marshalling and unmarshalling, if it must,
> although there are better ways that I can think of.

I'm still thinking about this. I think you are right that the same techniques used to marshal can be used to make a copy.

On the other hand, I am not convinced that there are currently sufficient standardized features to permit a correct unmarshalling in all cases. In particular, there is the question of how to get the reference counts "right" in cyclic graphs. The problem of course is even defining what "right" is. In general, if every reference is counted, then cyclic structures will never be destroyed. When using reference counting, you pretty much have to declare some reference in each cycle as weak and uncounted to prevent this. For the moment, I think the current spec effectively mandates that every cyclic structure is a memory leak. But I think that is a separate problem - using the same technique for unmarshalling and copying doesn't make it worse.
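
The counted-cycle leak described above can be shown with a minimal sketch. Everything in it is an assumption made for illustration: Node, its strong and weak members, and the live counter are hypothetical and are not part of the C++ mapping or of any ORB.

```cpp
#include <cassert>

// Hypothetical reference-counted node; not part of any ORB API.
struct Node {
    static int live;   // nodes created and not yet destroyed
    int refs;
    Node* strong;      // counted link: holds one reference on its target
    Node* weak;        // uncounted link: holds no reference on its target
    Node() : refs(1), strong(nullptr), weak(nullptr) { ++live; }
    void add_ref() { ++refs; }
    void remove_ref() {
        if (--refs == 0) {
            if (strong) strong->remove_ref();  // release the counted link
            --live;
            delete this;
        }
    }
};
int Node::live = 0;

// Two nodes that reference each other through counted links.
int leak_with_counted_cycle() {
    Node::live = 0;
    Node* a = new Node;
    Node* b = new Node;
    a->strong = b; b->add_ref();   // a -> b, counted
    b->strong = a; a->add_ref();   // b -> a, counted, closing the cycle
    a->remove_ref();               // the caller drops both of its references,
    b->remove_ref();               // but each node still holds the other
    return Node::live;             // neither node was destroyed
}

// Same shape, but the back link is declared weak and uncounted.
int no_leak_with_weak_back_link() {
    Node::live = 0;
    Node* a = new Node;
    Node* b = new Node;
    a->strong = b; b->add_ref();   // a -> b, counted
    b->weak = a;                   // b -> a, uncounted
    b->remove_ref();               // b survives, a still references it
    a->remove_ref();               // a reaches zero and releases b as well
    return Node::live;             // both nodes were destroyed
}
```

In the first function both nodes stay at a count of one after the caller lets go, so nothing is ever destroyed. In the second, dropping the last outside reference to a tears down the whole structure.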
Paul

Sender: jon@corvette.floorboard.com
Message-ID: <39010286.7C3B8605@floorboard.com>
Date: Fri, 21 Apr 2000 18:38:14 -0700
From: Jonathan Biggar
X-Mailer: Mozilla 4.7 [en] (X11; U; SunOS 5.5.1 sun4m)
X-Accept-Language: en
MIME-Version: 1.0
To: Paul Kyzivat
CC: "'cxx_revision@omg.org'"
Subject: Re: unclear semantics for valuetype insertion into Any
References: <9B164B713EE9D211B6DC0090273CEEA926BE2A@bos1.noblenet.com>
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset=us-ascii

Paul Kyzivat wrote:
>
> > From: Jonathan Biggar [mailto:jon@floorboard.com]
>
> > > I see no point in having two forms if they are as similar as in your
> > > proposal. Would be better to eliminate the non-copying form altogether if
> > > the copying form is simply an _add_ref.
> > > (We don't need another confusing way to do _remove_ref.)
> >
> > We've already got object reference insertion as an example. :-(
>
> Your point is that we should provide the same for valuetypes in order to be
> consistent, even though you agree that the precedent being followed is a bad
> one?
>
> I think I would like to hear from someone who can explain the reasoning for
> providing both kinds of insertion for Object References.

Beats me. I'd have to go back and look in the archives for that one. My only point is that we be consistent.

> > > Why shouldn't vendors believe that, considering that
> > > is what the description in table 1-2 says?
> >
> > Well, there's deep copy and there's deeper copy. :-)
> > My interpretation of deep copy here is that it copies
> > each component of the valuetype, but
> > not valuetypes that are nested in those components.
>
> I think the existing words are pretty explicit. They say "the copy has no
> connections with the original instance and has a lifetime independent of
> that original". To me that says the people who wrote those words meant a
> really-deep copy.
> (Whether that intention was valid is another matter.)

Yes, but that doesn't in any way indicate what is to be done with valuetypes that are referenced by the valuetype you passed to _copy_value(). So I say that it is still ambiguous. I already raised an issue about this a month or two back.

> > Otherwise, you end
> > up with _copy_value() breaking for cyclic graphs.
>
> A simple recursive implementation will certainly break. I agree that it is
> asking a lot to require the implementor of a valuetype to manage the
> bookkeeping required to detect shared values and cycles, but it is not
> impossible.

Not easy either. In fact, effectively impossible if you include valuetypes in the graph whose implementation is not under control of the implementor of _copy_value(). This can easily happen when we start deploying component libraries.

> So first I think we need to decide if the intent was wrong. If so, we can
> make the change you suggest, and also change all the words that imply a true
> deep copy.

That's my point in the other issue.

> > > The actual valuetype class is written by the user.
> > > The orb cannot know how to copy it.
> >
> > Sure it can, by brute force marshalling and unmarshalling, if it must,
> > although there are better ways that I can think of.
>
> I'm still thinking about this. I think you are right that the same
> techniques used to marshal can be used to make a copy.
>
> On the other hand, I am not convinced that there are currently sufficient
> standardized features to permit a correct unmarshalling in all cases. In
> particular, there is the question of how to get the reference counts "right"
> in cyclic graphs. The problem of course is even defining what "right" is. In
> general, if every reference is counted, then cyclic structures will never be
> destroyed.
> When using reference counting, you pretty much have to declare
> some reference in each cycle as weak and uncounted to prevent this. For the
> moment, I think the current spec effectively mandates that every cyclic
> structure is a memory leak.

Yes, the reference count/memory leak problem has already been identified. I believe that it implies that ORBs must implement valuetype garbage collecting, since there is no strong/weak link semantic in the C++ binding. Implementing a garbage collector is feasible, even if it is a pain in the butt. Some others have argued that it is up to the programmer to break the cycles, but that can't work, since the programmer has no control over valuetype graphs passed as out or return parameters from a server.

-- 
Jon Biggar
Floorboard Software
jon@floorboard.com
jon@biggar.org
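
The reference-count reading of the two insertion forms (the counter-suggested wording: the noncopying form consumes one of the caller's references, the copying form calls _add_ref) can be sketched as follows. This is only an illustration under stated assumptions: MockValue and MockAny are hypothetical stand-ins, not CORBA::ValueBase or CORBA::Any, and the member operator<<= signatures are chosen for brevity rather than taken from the mapping.

```cpp
#include <cassert>

// Hypothetical stand-in for a reference-counted valuetype.
class MockValue {
public:
    static int live;                      // instances not yet destroyed
    MockValue() : refs_(1) { ++live; }    // the creator owns the first reference
    void _add_ref() { ++refs_; }
    void _remove_ref() { if (--refs_ == 0) delete this; }
    int refs() const { return refs_; }
private:
    ~MockValue() { --live; }              // destroyed only through _remove_ref
    int refs_;
};
int MockValue::live = 0;

// Hypothetical stand-in for Any, holding at most one MockValue.
class MockAny {
public:
    MockAny() : held_(nullptr) {}
    ~MockAny() { reset(nullptr); }
    MockAny(const MockAny&) = delete;
    MockAny& operator=(const MockAny&) = delete;
    // Copying form: the holder acquires its own reference.
    void operator<<=(MockValue* v) { if (v) v->_add_ref(); reset(v); }
    // Noncopying form: the holder takes over a reference the caller owned.
    void operator<<=(MockValue** v) { reset(*v); }
    MockValue* held() const { return held_; }
private:
    void reset(MockValue* v) {
        if (held_) held_->_remove_ref();  // release the previous value, if any
        held_ = v;
    }
    MockValue* held_;
};

int demo_copying() {
    MockValue* v = new MockValue;         // caller holds one reference
    {
        MockAny a;
        a <<= v;                          // caller and holder each hold one
        assert(v->refs() == 2);
    }                                     // holder releases its reference
    assert(v->refs() == 1);
    v->_remove_ref();                     // the caller's obligation remains
    return MockValue::live;
}

int demo_noncopying() {
    MockValue* v = new MockValue;         // caller holds one reference
    {
        MockAny a;
        a <<= &v;                         // that reference now belongs to the holder
        assert(a.held()->refs() == 1);
    }                                     // holder releases it, value is destroyed
    // From here on v is treated as if _remove_ref had been called on it.
    return MockValue::live;
}
```

Under this reading the only difference between the two forms is who ends up owing the final _remove_ref, which is the similarity that prompted the suggestion to drop the noncopying form.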