Issue 663: Implementation problem with policy objects using root POA (cxx_revision)

Source: (, )
Nature: Uncategorized
Severity:
Summary: It appears to be impossible to implement the policy objects using the root POA due to race conditions on the Policy::destroy() operation. There appears to be no safe time to delete the servant.
Resolution: Close no change. This issue is already addressed in the CORBA 2.3 specification.
Revised Text:
Actions taken:
August 9, 1997: received issue
February 25, 1999: moved to cxx_revision
June 13, 2000: closed issue

Discussion:

End of Annotations:=====

Return-Path:
Date: Sat, 9 Aug 1997 16:27:39 -0700
From: jon@sems.com (Jonathan Biggar)
To: issues@omg.org, port-rtf@omg.org
Subject: Two more portability issues

2. It appears to be impossible to implement the policy objects using the root POA due to race conditions on the Policy::destroy() operation. One would normally implement the destroy() operation by calling POA::deactivate_object(), followed by deleting the servant. However, there appears to be no safe time to delete the servant, because there is no way to guarantee that no other requests are currently active for the servant. The description of deactivate_object(), when the POA does not have the USE_SERVANT_MANAGER policy, does not state that it will delay completion until all requests for that object have completed. Even worse, in this case, deactivate_object() is being called inside a request, so there is a potential deadlock.

It isn't feasible to create a child POA with a different policy set to handle this, because any name given to the POA would potentially conflict with a POA name assigned by the user.

Here are some possible ways to solve this problem:

A. Redefine deactivate_object() in this case to remove the object from the active object map, and then wait for all requests other than the one calling deactivate_object() to complete. Then it would be safe to destroy the servant.

B.
Add another policy "DestructionPolicy" that specifies whether the POA or the user code will destroy the servant once the last activation for that servant is gone. The policy values would be USER_DESTROY and SYSTEM_DESTROY.

Does anyone out there see any way to work around this problem without one of the above changes? Anyone have a better proposed change?

Jon Biggar
jon@sems.com

Return-Path:
Sender: jon@floorboard.com
Date: Sat, 11 Apr 1998 22:54:04 -0700
From: Jonathan Biggar
To: port-rtf@omg.org
Subject: Re: Issue 663 proposal

Problem:

For a POA with RETAIN, but not USE_SERVANT_MANAGER, there is a race condition when calling deactivate_object(), because nothing in the specification indicates what to do with other concurrent calls in the object.

Proposal:

Add the following text to the description of POA::deactivate_object() in Section 9.3.8, just before the note:

If there is no servant manager associated with the POA, then the call to deactivate_object() will block until all concurrent operations on the target object have completed, except for any operations that have been invoked in the context of the thread that called deactivate_object().

Discussion:

There is no safe way to destroy the servant for a POA without USE_SERVANT_MANAGER (and RETAIN) because the spec provides no way to guarantee that all concurrent invocations on the target servant have completed. By making deactivate_object() wait until all other invocations have completed, the programmer can safely destroy the servant once deactivate_object() has returned.

The tricky part is that deactivate_object() can be called by the implementation of an operation in the servant itself, which would lead to a deadlock if deactivate_object() were to block in this case.
This proposal allows the obvious implementation of the Lifecycle service remove() operation as follows:

// C++
PortableServer::Current_var current = ...;
PortableServer::POA_var my_poa = ...;

void MyServant::remove()
    throw(CORBA::SystemException, CosLifeCycle::NotRemovable)
{
    PortableServer::ObjectId_var oid = current->get_object_id();
    try {
        // Under the proposal, this returns only after every other
        // request on the object has completed.
        my_poa->deactivate_object(oid);
        delete this;
    } catch (PortableServer::POA::ObjectNotActive &) {
        throw CosLifeCycle::NotRemovable("Doh! Object not active");
    } catch (PortableServer::POA::WrongPolicy &) {
        throw CosLifeCycle::NotRemovable("Doh! Wrong Policy");
    }
}

I suppose that a more involved solution would be possible that wouldn't require the special case for the current thread, but this solution seemed simplest.

--
Jon Biggar
Floorboard Software
jon@floorboard.com
jon@biggar.org

Return-Path:
Date: Sun, 12 Apr 1998 12:24:20 -0400
From: Paul H Kyzivat
Organization: NobleNet
To: Jonathan Biggar
CC: port-rtf@omg.org
Subject: Re: Issue 663 proposal
References: <353056FC.B2CD1D27@floorboard.com>

Jonathan Biggar wrote:
>
> Problem:
>
> For a POA with RETAIN, but not USE_SERVANT_MANAGER, there is a race
> condition when calling deactivate_object(), because nothing in the
> specification indicates what to do with other concurrent calls in the
> object.
>
> Proposal:
>
> Add the following text to the description of POA::deactive_object() in
> Section 9.3.8, just before the note:
>
> If there is no servant manager associated with the POA, then the call
> to deactivate_object() will block until all concurrent operations on the
> target operation have completed, except for any operations that have
> been invoked in the context of the thread that called
> deactivate_object().
>
> Discussion:
>
> There is no safe way to destroy the servant for a POA without
> USE_SERVANT_MANAGER (and RETAIN) because the spec provides no way to
> guarantee that all concurrent invocations on the target servant have
> completed.
> By making deactivate_object() wait until all other
> invocations have completed, then the programmer can safely destroy the
> servant once deactivate_object() has returned.
>
> The tricky part is that deactivate_object() can be called by the
> implementation of an operation in the servant itself, which would lead
> to a deadlock if deactivate_object() were to block in this case. This
> proposal allows the obvious implementation of the Lifecycle service
> remove() operation as follows:

Your proposal seems to assume that by making some small changes in the POA model it will be possible for a servant to know when it can be deleted, without extra bookkeeping in the implementation itself.

This proposal complicates the implementation of a POA, by requiring complicated special checks for use within the current thread, yet doesn't really deliver the goods.

In general, with or without the use of a servant manager, the implementation of a servant needs additional information to know whether a particular servant can be deleted. It will, at best, require a lot more patches before this decision can be made based on info available from the POA. I think it is essentially impossible, so the POA should just quit trying.

For instance, the following are issues that complicate deletion of a servant:

- The same servant can be activated in more than one POA.
- A servant can be default servant for more than one POA.
- A servant can be activated in one POA, and be default servant in that POA or in another POA.
- The remaining_activations parameter to etherealize may be true at the time the POA begins to call it, but become false before that call is complete, or vice versa. (Calls to incarnate and etherealize are serialized for a given object_id, not for a given servant.)

In a particular server the developer will generally know that most of these conditions are impossible, but a given POA cannot reasonably expect to know this, or be able to figure it out.
Having given the developer lots of rope to use as he sees fit, I think it is best to give up trying to re-engineer the rope so that it cannot be made into a noose. Return-Path: Sender: jon@floorboard.com Date: Sun, 12 Apr 1998 19:50:45 -0700 From: Jonathan Biggar To: Paul H Kyzivat CC: port-rtf@omg.org Subject: Re: Issue 663 proposal References: <353056FC.B2CD1D27@floorboard.com> <3530EAB4.F88275B0@noblenet.com> Paul H Kyzivat wrote: > > Jonathan Biggar wrote: > > > > Problem: > > > > For a POA with RETAIN, but not USE_SERVANT_MANAGER, there is a > race > > condition when calling deactivate_object(), because nothing in the > > specification indicates what to do with other concurrent calls in > the > > object. > > > > Proposal: > > > > Add the following text to the description of > POA::deactive_object() in > > Section 9.3.8, just before the note: > > > > If there is no servant manager associated with the POA, then the > call > > to > > deactivate_object() will block until all concurrent operations on > the > > target operation have completed, except for any operations that > have > > been invoked in the context of the thread that called > > deactivate_object(). > > > > Discussion: > > > > There is no safe way to destroy the servant for a POA without > > USE_SERVANT_MANAGER (and RETAIN) because the spec provides no way > to > > guarantee that all concurrent invocations on the target servant > have > > completed. By making deactivate_object() wait until all other > > invocations have completed, then the programmer can safely destroy > the > > servant once deactivate_object() has returned. > > Your proposal seems to assume that by making some small changes in > the > POA model it will be possible for a servant to know when it can be > deleted, without extra bookkeeping in the implementation itself. 
> This proposal complicates the implementation of a POA, by requiring > complicated special checks for use within the current thread, yet > doesn't really deliver the goods. You have missed the point of my proposal. In the particular case I describe, it is impossible for even an implementation that guarantees that a servant is used for only one active object to make sure that it can safely delete the servant. > In general, with or without the use of a servant manager, the > implementation of a servant needs additional information to know > whether > a particular servant can be deleted. It will, at best, require a lot > more patches before this decision can be made based on info > available > from the POA. I think it is essentially impossible, so the POA > should > just quit trying. > > For instance, the following are issues that complicate deletion of a > servant: > > - The same servant can can be activated in more than one POA. > - A servant can be default servant for more than one POA. > - A servant can be activated in one POA, and be default servant > in that POA or in another POA. > - The remaining_activations parameter to etherialize may be > true at the time the POA begins to call it, but become false > before that call is complete, or visa versa. (Calls to incarnate > and etherialize are serialized for a give object_id, not for > a given servant.) In most of the cases you describe, the servant programmer can use a mutex protected reference counter inside the servant to make sure that the servant is not deleted before all active objects using the servant have been activated. For the cases where the servant is used as the default servant in the POA, the programmer can simply wait until the corresponding POA has been inactivated or destroyed before destroying the servant. Here is the scenario that causes the race condition I describe: 1. Create a POA with RETAIN and without USE_SERVANT_MANAGER. 2. Create and activate a servant using POA::activate_object(). 3. 
A request for the object is invoked on thread A. Before the request completes (perhaps even before the servant implementation code is called), thread A is swapped out of the CPU. 4. A second request is invoked on thread B. The implementation of this operation calls POA::deactivate_object() to deactivate the object, and then destroys the servant. 5. Thread A wakes up and tries to finish the invocation by accessing information in the now destroyed servant. Instant core dump.

There is no way for the programmer to protect against this race condition, given the current POA specification, because there is no way for the programmer to guarantee that all other invocations have completed before destroying the servant, and there is no callback (since there is no servant manager) that will let the implementation know that it is safe to destroy the servant.

I agree that it is a pain to require the POA to figure out if the active requests on the object are in the context of the thread that calls deactivate_object(), but I haven't been able to come up with a better solution that doesn't require major reengineering of the POA interface.

The only other way to fix the problem is either to forbid deletion of servants used by a POA that has no servant manager, or to require that all POAs have a servant manager. The latter doesn't work because there is a bootstrap problem with the Root POA, which is currently defined to have no servant manager, and the former doesn't seem to be palatable either.
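As a rough illustration of the blocking behavior proposed above, here is a minimal standard C++ sketch of the bookkeeping a POA might keep for one active object. It is not taken from any ORB or from the specification: ActiveEntry, begin_request(), end_request(), deactivate_and_wait() and in_flight() are hypothetical names, and a real POA would hold one such record per ObjectId in its active object map.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <unordered_map>

// Hypothetical record for one active object (assumption, not a CORBA type).
class ActiveEntry {
public:
    // Called by an imagined dispatcher before each upcall into the servant.
    // Returns false once the object has been deactivated.
    bool begin_request() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (!active_) return false;
        ++in_flight_;
        ++own_depth()[this];
        return true;
    }

    // Called by the dispatcher after the upcall returns.
    void end_request() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            assert(in_flight_ > 0);
            --in_flight_;
            auto it = own_depth().find(this);
            if (it != own_depth().end() && --it->second == 0)
                own_depth().erase(it);
        }
        drained_.notify_all();  // wake a thread waiting in deactivate_and_wait()
    }

    // Models the proposed deactivate_object(): refuse new requests, then block
    // until the only requests still running are those of the calling thread.
    void deactivate_and_wait() {
        std::unique_lock<std::mutex> lock(mutex_);
        active_ = false;
        auto it = own_depth().find(this);
        const int mine = (it == own_depth().end()) ? 0 : it->second;
        drained_.wait(lock, [&] { return in_flight_ == mine; });
    }

    int in_flight() {
        std::lock_guard<std::mutex> lock(mutex_);
        return in_flight_;
    }

private:
    // Per-thread count of requests this thread is running on each entry.
    static std::unordered_map<const ActiveEntry*, int>& own_depth() {
        thread_local std::unordered_map<const ActiveEntry*, int> depth;
        return depth;
    }

    std::mutex mutex_;
    std::condition_variable drained_;
    bool active_ = true;
    int in_flight_ = 0;
};
```

With something like this in place, thread B in step 4 would not return from deactivation until thread A's request had finished, so step 5 could no longer touch a destroyed servant, while a remove() running on the deactivating thread itself would not wait on its own request.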
-- Jon Biggar Floorboard Software jon@floorboard.com jon@biggar.org Return-Path: X-Sender: vinoski@mail.boston.iona.ie Date: Sun, 12 Apr 1998 21:05:08 -0400 To: Paul H Kyzivat From: Steve Vinoski Subject: Re: Issue 663 proposal Cc: Jonathan Biggar , port-rtf@omg.org References: <353056FC.B2CD1D27@floorboard.com> At 12:24 PM 4/12/98 -0400, Paul H Kyzivat wrote: >Jonathan Biggar wrote: >> Problem: >> >> For a POA with RETAIN, but not USE_SERVANT_MANAGER, there is a race >> condition when calling deactivate_object(), because nothing in the >> specification indicates what to do with other concurrent calls in the >> object. >> >> Proposal: >> >> Add the following text to the description of POA::deactive_object() in >> Section 9.3.8, just before the note: >> >> If there is no servant manager associated with the POA, then the call >> to >> deactivate_object() will block until all concurrent operations on the >> target operation have completed, except for any operations that have >> been invoked in the context of the thread that called >> deactivate_object(). >> >> Discussion: >> >> There is no safe way to destroy the servant for a POA without >> USE_SERVANT_MANAGER (and RETAIN) because the spec provides no way to >> guarantee that all concurrent invocations on the target servant have >> completed. By making deactivate_object() wait until all other >> invocations have completed, then the programmer can safely destroy the >> servant once deactivate_object() has returned. >> >> The tricky part is that deactivate_object() can be called by the >> implementation of an operation in the servant itself, which would lead >> to a deadlock if deactivate_object() were to block in this case. 
This >> proposal allows the obvious implementation of the Lifecycle service >> remove() operation as follows: > >Your proposal seems to assume that by making some small changes in the >POA model it will be possible for a servant to know when it can be >deleted, without extra bookkeeping in the implementation itself. > >This proposal complicates the implementation of a POA, by requiring >complicated special checks for use within the current thread, yet >doesn't really deliver the goods. > >In general, with or without the use of a servant manager, the >implementation of a servant needs additional information to know whether >a particular servant can be deleted. It will, at best, require a lot >more patches before this decision can be made based on info available >from the POA. I think it is essentially impossible, so the POA should >just quit trying. I strongly disagree, Paul. Jon is striving to make a common case workable, and the POA spec should have definitely covered this in the first place. As it is now, an application can't safely destroy its own servants under the conditions that Jon has specified, and there is absolutely nothing the application developer can do differently to make such destruction safe. The POA must therefore provide guarantees to make servant destruction under these conditions safe. A proprietary object adapter that I helped develop a few years ago, the HP Simplified Object Adapter (HPSOA) for HP ORB Plus, provided guarantees that are almost exactly what Jon is proposing. It worked extremely well and was heavily counted on by ORB Plus users to allow them to safely clean up their servants. I also know that Orbix users have had problems in this area in the past because the Orbix BOA did not supply these types of safety guarantees. >For instance, the following are issues that complicate deletion of a >servant: > >- The same servant can can be activated in more than one POA. >- A servant can be default servant for more than one POA. 
>- A servant can be activated in one POA, and be default servant > in that POA or in another POA. >- The remaining_activations parameter to etherialize may be > true at the time the POA begins to call it, but become false > before that call is complete, or visa versa. (Calls to incarnate > and etherialize are serialized for a give object_id, not for > a given servant.) The scenarios you raise are not all that common, in my opinion. Just because a particular uncommon case appears difficult to handle properly does not justify ignoring common simple cases as well. What Jon is proposing is critical for a very common case and has no application-level workaround, so the POA needs to be fixed to handle it. --steve Return-Path: Date: Tue, 14 Apr 1998 09:25:41 -0400 From: Paul H Kyzivat Organization: NobleNet To: Jonathan Biggar , Steve Vinoski CC: port-rtf@omg.org Subject: Re: Issue 663 proposal References: <353056FC.B2CD1D27@floorboard.com> <3530EAB4.F88275B0@noblenet.com> <35317D85.6CFD713C@floorboard.com> I am going to reply to two at once... Jonathan Biggar wrote: [snip] > You have missed the point of my proposal. In the particular case I > describe, it is impossible for even an implementation that guarantees > that a servant is used for only one active object to make sure that it > can safely delete the servant. [snip] > In most of the cases you describe, the servant programmer can use a > mutex protected reference counter inside the servant to make sure that > the servant is not deleted before all active objects using the servant > have been activated. > For the cases where the servant is used as the default servant in the > POA, the programmer can simply wait until the corresponding POA has > been > inactivated or destroyed before destroying the servant. 
> > Here is the scenario that causes the race condition I describe: [snip] > There is no way for the programmer to protect from this race > condition, > given the current POA specification, because there is no way for the > programmer to guarantee that all other invocations have completed > before > destroying the servant, and there is no callback (since there is no > servant manager) that will > let the implementation know that it is safe to destroy the servant. > > I agree that is it a pain to require the POA to figure out if the > active > requests on the object are in the context of the thread that calls > deactivate_object(), but I haven't been able to come up with a better > solution that doesn't require major reengineering of the POA > interface. Steve Vinoski wrote: [snip] > I strongly disagree, Paul. Jon is striving to make a common case > workable, and the POA spec should have definitely covered this in the > first place. As it is now, an application can't safely destroy its > own servants under the conditions that Jon has specified, and there > is absolutely nothing the application developer can do differently to > make such destruction safe. The POA must therefore provide guarantees > to make servant destruction under these conditions safe. [snip] > The scenarios you raise are not all that common, in my opinion. Just > because a particular uncommon case appears difficult to handle > properly does not justify ignoring common simple cases as well. What > Jon is proposing is critical for a very common case and has no > application-level workaround, so the POA needs to be fixed to handle > it. I didn't miss Jon's point, nor do I disagree with Steve that the spec should be workable. But I am not convinced that the proposed patch plugs the last remaining impediment to simple and safe servant lifetime management. The other cases I listed may be somewhat less likely to be used often. 
But as long as they are permitted it should be discernable from the spec how lifetime management can be done in each of them. In that regard, Jon's case is just one more scenario to be covered. Jon makes a case that the problem he poses is different in kind from all the others - that it alone cannot be handled via other techniques. I have not had time to ponder that enough to be convinced. Perhaps it is so, in which case I would not object. But it is not entirely obvious that this is the situation. The existing serialization rules already impose a high bookkeeping overhead on the POA implementation. I haven't considered Jon's proposal deeply enough to know for certain if it adds more of that or not. I don't think so; I think it "only" adds extra computation and a possible stall to deactivate_object. But I would at least like to know that adding this extra complexity solves the problem once and for all. It isn't entirely clear that this is true. There are really two issues here: 1) is it *possible* for the programmer to handle the lifetime of the servant. 2) Are there sufficient tools to make the job convenient It would be nice to know if the goal is to solve (1), (2) or both. For (2) there are a lot of other problems. For instance: EX1) calling destroy on a POA will destroy all the subordinate POAs. If they happen to have ServantActivators then they each get a chance to clean up their activated servants in this case. But a POA with a default servant gets no chance to clean that up. So the programmer that uses default servants had better never destroy a POA with child POAs. EX2) There is not (as far as I can see from the spec) any guarantee of the order of events when a hierarchy of POAs is destroyed. Because ServantActivators are themselves activated in some POA, there is no guarantee that when etherialization is performed for one POA that the servant activator itself won't already have been deactivated. 
In each of these cases, the answer can easily be: don't do that. But this merely says that features have been provided, apparently to help people, but that only work when used in certain combinations, and that fail in other combinations. Yet these are things that I think people would expect to work. For instance, ORB::shutdown implies a destroy of the root POA and hence all its descendent POAs. I suspect that most people would like this to be sufficient to get everything cleaned up, leak free, without having to tear down everything individually first.

Paul

Return-Path:
Sender: jon@floorboard.com
Date: Tue, 14 Apr 1998 11:45:06 -0700
From: Jonathan Biggar
To: Paul H Kyzivat
CC: Steve Vinoski, port-rtf@omg.org
Subject: Re: Issue 663 proposal
References: <353056FC.B2CD1D27@floorboard.com> <3530EAB4.F88275B0@noblenet.com> <35317D85.6CFD713C@floorboard.com> <353363D5.9AFBEDE6@noblenet.com>

Paul H Kyzivat wrote:
> I didn't miss Jon's point, nor do I disagree with Steve that the spec
> should be workable. But I am not convinced that the proposed patch plugs
> the last remaining impediment to simple and safe servant lifetime
> management.

We never said it fixed everything, just that it fixed an obvious problem that the programmer has no workaround for. If we wait until every possible problem is fixed, we won't ever get anywhere.

> The other cases I listed may be somewhat less likely to be used often.
> But as long as they are permitted it should be discernable from the spec
> how lifetime management can be done in each of them. In that regard,
> Jon's case is just one more scenario to be covered.

For my scenario, you can't give the programmer guidelines on how to avoid the problem, since it is unavoidable, other than to tell the programmer to never destroy his servants. I consider this unacceptable.

> Jon makes a case that the problem he poses is different in kind from all
> the others - that it alone cannot be handled via other techniques.
I > have not had time to ponder that enough to be convinced. Perhaps it is > so, in which case I would not object. But it is not entirely obvious > that this is the situation. Steve & I both believe that we understand the problem fully. We can wait until you are convinced, but only until May 18! :-) > The existing serialization rules already impose a high bookkeeping > overhead on the POA implementation. I haven't considered Jon's > proposal > deeply enough to know for certain if it adds more of that or not. I > don't think so; I think it "only" adds extra computation and a > possible > stall to deactivate_object. But I would at least like to know that > adding this extra complexity solves the problem once and for all. It > isn't entirely clear that this is true. I have looked at how it would be implemented. Right now, the POA must keep thread specific data in order to handle the PortableServer::Current interface. It is not difficult to add information to that thread specific data that would allow the POA to walk up the chain of invocations and get a count of how many of those invocations refer to the target object. > There are really two issues here: > 1) is it *possible* for the programmer to handle the lifetime > of the servant. > 2) Are there sufficient tools to make the job convenient > > It would be nice to know if the goal is to solve (1), (2) or > both. For > (2) there are a lot of other problems. For instance: > > EX1) calling destroy on a POA will destroy all the subordinate > POAs. If > they happen to have ServantActivators then they each get a chance to > clean up their activated servants in this case. But a POA with a > default > servant gets no chance to clean that up. So the programmer that uses > default servants had better never destroy a POA with child POAs. This is true, and is a valid defect. You should post an issue for this one. 
Perhaps something could be added to the AdapterActivator interface to allow the programmer to get a callback when a POA is being destroyed. > EX2) There is not (as far as I can see from the spec) any guarantee of > the order of events when a hierarchy of POAs is destroyed. Because > ServantActivators are themselves activated in some POA, there is no > guarantee that when etherialization is performed for one POA that the > servant activator itself won't already have been deactivated. This is another good thing to bring up as an issue. We could fix this by declaring that POA children are always destroyed before their parents, and if the programmer always uses the Root POA to create servant managers his code would be safe, since the Root POA doesn't have a servant manager. > In each of these cases, the answer can easily be: don't do that. But > this merely says that features have been provided, apparently to > help > people, but that only work when used in certain combinations, and > that > fail in other combinations. Yet these are things that I think people > would expect to work. For instance, ORB::shutdown implies a destroy > of > the root POA and hence all its descendent POAs. I suspect that most > people would like this to be sufficient to get everything cleaned > up, > leak free, without having to tear down everything individually > first. We should have as a goal the ability for the programmer to totally shut down the ORB and reclaim all memory used by the ORB and any servants, regardless of the language binding in use. It may take a few iterations of the RTF to plug all of the holes and add all the hooks necessary to make this possible, but we can get closer over time. 
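For what it is worth, the thread-specific bookkeeping mentioned above (walking up the chain of invocations to count those that refer to the target object) might look roughly like this in standard C++. This is only a sketch under assumptions: InvocationFrame, InvocationScope, current_frame() and invocations_on_this_thread() are made-up names, object ids are modeled as plain strings rather than PortableServer::ObjectId, and a real POA would store this next to whatever it already keeps for PortableServer::Current.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical per-invocation record, linked to the invocation that was
// already running on this thread when the new one was dispatched.
struct InvocationFrame {
    std::string object_id;
    const InvocationFrame* caller;
};

// Top of the current thread's invocation chain (nullptr when idle).
inline const InvocationFrame*& current_frame() {
    thread_local const InvocationFrame* top = nullptr;
    return top;
}

// RAII helper an imagined dispatcher would create around each upcall.
class InvocationScope {
public:
    explicit InvocationScope(std::string object_id)
        : frame_{std::move(object_id), current_frame()} {
        current_frame() = &frame_;
    }
    ~InvocationScope() {
        assert(current_frame() == &frame_);  // scopes must nest properly
        current_frame() = frame_.caller;
    }
    InvocationScope(const InvocationScope&) = delete;
    InvocationScope& operator=(const InvocationScope&) = delete;

private:
    InvocationFrame frame_;
};

// Walk up the chain and count invocations that target the given object.
inline int invocations_on_this_thread(const std::string& object_id) {
    int count = 0;
    for (const InvocationFrame* f = current_frame(); f != nullptr; f = f->caller) {
        if (f->object_id == object_id) ++count;
    }
    return count;
}
```

A blocking deactivate_object() could then compare this per-thread count with the total number of requests in progress on the object, and wait only while the two differ.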
-- Jon Biggar Floorboard Software jon@floorboard.com jon@biggar.org Return-Path: Date: Tue, 14 Apr 1998 17:46:20 -0400 From: Paul H Kyzivat Organization: NobleNet To: Jonathan Biggar CC: Steve Vinoski , port-rtf@omg.org Subject: Re: Issue 663 proposal References: <353056FC.B2CD1D27@floorboard.com> <3530EAB4.F88275B0@noblenet.com> <35317D85.6CFD713C@floorboard.com> <353363D5.9AFBEDE6@noblenet.com> <3533AEB2.59202CA@floorboard.com> Jonathan Biggar wrote: [snip] > We never said if fixed everything, just that it fixed an obvious > problem > that the programmer has no workaround for. If we wait until every > possible problem is fixed, we won't ever get anywhere. [snip] > Steve & I both believe that we understand the problem fully. We can > wait until you are convinced, but only until May 18! :-) [snip] > > EX1) calling destroy on a POA will destroy all the subordinate POAs. > If > > they happen to have ServantActivators then they each get a chance to > > clean up their activated servants in this case. But a POA with a > default > > servant gets no chance to clean that up. So the programmer that uses > > default servants had better never destroy a POA with child POAs. > > This is true, and is a valid defect. You should post an issue for > this > one. > Perhaps something could be added to the AdapterActivator interface to > allow the programmer to get a callback when a POA is being destroyed. > > > EX2) There is not (as far as I can see from the spec) any guarantee > of > > the order of events when a hierarchy of POAs is destroyed. Because > > ServantActivators are themselves activated in some POA, there is no > > guarantee that when etherialization is performed for one POA that > the > > servant activator itself won't already have been deactivated. > > This is another good thing to bring up as an issue. 
We could fix this > by declaring that POA children are always destroyed before their > parents, and if the programmer always uses the Root POA to create > servant managers his code would be safe, since the Root POA doesn't > have > a servant manager. [snip] > We should have as a goal the ability for the programmer to totally > shut > down the ORB and reclaim all memory used by the ORB and any servants, > regardless of the language binding in use. It may take a few > iterations > of the RTF to plug all of the holes and add all the hooks necessary to > make this possible, but we can get closer over time. I certainly concur with this last goal. My concern is that problems not be patched piecemeal. Your solution may be just fine for the problem it attempts to solve, but not solve other problems of similar nature. Subsequent fixes can then result in much more of a mess than if a solution had been sought for the complete collection of problems. As things stand, it is difficult for someone like myself who didn't participate in writing the POA spec, to understand what is in-scope as an issue. Is anything which a programmer *can* (with a lot of trouble) deal with without a change in the spec out of scope? (Certainly the problems with destroying POAs are of that form.) What is special about May 18? ( Return-Path: Sender: jon@floorboard.com Date: Tue, 14 Apr 1998 15:51:33 -0700 From: Jonathan Biggar To: Paul H Kyzivat CC: Steve Vinoski , port-rtf@omg.org Subject: Re: Issue 663 proposal References: <353056FC.B2CD1D27@floorboard.com> <3530EAB4.F88275B0@noblenet.com> <35317D85.6CFD713C@floorboard.com> <353363D5.9AFBEDE6@noblenet.com> <3533AEB2.59202CA@floorboard.com> <3533D92C.B6B3935D@noblenet.com> Paul H Kyzivat wrote: > I certainly concur with this last goal. > > My concern is that problems not be patched piecemeal. > Your solution may be just fine for the problem it attempts to solve, > but > not solve other problems of similar nature. 
> Subsequent fixes can then
> result in much more of a mess than if a solution had been sought for the
> complete collection of problems.

We do always take that risk. In the end, we just have to do the best we can with what we have.

> As things stand, it is difficult for someone like myself who didn't
> participate in writing the POA spec, to understand what is in-scope as
> an issue. Is anything which a programmer *can* (with a lot of trouble)
> deal with without a change in the spec out of scope? (Certainly the
> problems with destroying POAs are of that form.)

There don't appear to be any hard and fast rules. I guess we just have to weigh the relative risk/reward of each change and decide what is best. It is really up to the RTF to take a first pass at fixing the issues. We do have to get the concurrence of the ORBOS, PTC, AB, and then a final membership vote, so ultimately that is the constituency that we have to satisfy.

> What is special about May 18? (

That's when the current RTF report must be complete and turned in.

--
Jon Biggar
Floorboard Software
jon@floorboard.com
jon@biggar.org

Return-Path:
Date: Wed, 01 Jul 1998 18:25:34 -0400
From: Jonathan Biggar
Organization: Floorboard Software
To: port-rtf@omg.org
Subject: Proposal for Issues 663 and 675

Problem:

When using a POA without a Servant Manager, there is no safe time to destroy a servant after calling deactivate_object() because the POA does not guarantee that there are no concurrent operations still in process on the object.
Here is an example of attempting to implement the LifeCycle service remove() operation in C++:

    // IDL
    interface A : CosLifeCycle::LifeCycleObject { };

    // C++
    PortableServer::Current_var current; // set somewhere

    class MyA : public POA_A {
    public:
        void remove() throw(CORBA::SystemException, CosLifeCycle::NotRemovable) {
            PortableServer::POA_var poa = current->get_poa();
            // servant_to_id() takes the servant whose ObjectId is wanted
            PortableServer::ObjectId_var oid = poa->servant_to_id(this);
            poa->deactivate_object(oid);
            delete this; // unsafe: other requests may still be using this servant
        }
    };

This won't work, because when the servant is deleted, there is no guarantee that another concurrent operation isn't still using the object. So the thread that calls remove() can end up deleting the servant out from under another thread, causing memory corruption and a probable core dump.

Proposal: Add the following text to the description of POA::deactivate_object() in section 9.3.8:

    If the POA does not have the USE_SERVANT_MANAGER policy, the call to deactivate_object() will not return until all current invocations on the object have completed, except for any invocations that are running in the context of the thread that called deactivate_object(). This provides the application a safe point where it can destroy the servant after it is deactivated while avoiding a deadlock.

--
Jon Biggar
Floorboard Software
jon@floorboard.com
jon@biggar.org

Return-Path:
Sender: "George Scott"
Date: Wed, 01 Jul 1998 20:22:34 -0700
From: "George M. Scott"
Organization: Inprise Corporation
To: port-rtf@omg.org
Subject: Counter proposal for Issue 663 and 675

One of the goals Inprise has as part of our participation in this RTF is to eliminate as many of the current deadlock situations in the POA as possible. The current proposal on the table for issues 663 and 675 would introduce blocking behavior to POA::deactivate_object which has the potential to cause deadlock.

For example, imagine a case where object A calls object B which calls deactivate_object() on A.
Now object B simultaneously calls object A which also calls deactivate_object() on B. This will deadlock under the recently proposed revision (see Jonathan Biggar's email). When you include multiple remote objects into the mix the chances of deadlock only increase, and it is not even necessary to have co-recursive calls to deactivate_object for this to occur. For example consider object A on one machine and object B on a remote machine. A calls B which then calls a method on A which attempts to call deactivate_object on A. Under the current proposed revision this will always deadlock. These are not corner cases, and should clearly be supported without deadlocking.

We feel the only change required to deactivate_object is to state what the behavior of POA calls will be for requests that are currently executing in an object which has been deactivated. For example, if there are two requests executing in the same object and the first request calls deactivate_object, and afterwards the second request calls servant_to_reference(), what should happen? Should it throw ServantNotActive or return successfully? It seems that it should complete successfully. Another situation to consider is what happens when either implicit or explicit (via activate_object) activation occurs after deactivate_object is called on an oid. Is that oid immediately available? Or must all the current requests complete processing before an object with the same oid can be activated? We will open this up as a new issue and provide a proposal.

The issues raised in 663 and 675 are really getting at problems with the C++ language mapping and not the behavior of deactivate_object. We don't believe we should be changing the behavior of the POA to address language mapping specific behavior such as memory management.

Therefore, we propose the following resolutions to 663 and 675.
---------------------------------------------------------------
Issue 663: Implementation problem with policy objects using root POA
Nature: Revision
Resolution: Transfer to C++ RTF (unless Dan thinks the port-rtf still has jurisdiction, I'm personally confused about who owns what in this case, I know the java-rtf owns the Java POA mapping, not sure about C++)

(We will be proposing a solution to this problem in that context)

----------------------------------------------------------------
Issue 675: deactivate_object() operation
Nature: Revision
Summary: deactivate_object() cannot be implemented in a single threaded environment because etherealization is delayed.
Resolution: Closed with no action. This is an implementation detail which we and others have managed to work around in our single-threaded ORBs. The simplest way, for example, is to have a queue of objects to be etherealized, which the ORB can process as part of its event loop.

George

Return-Path:
Date: Thu, 02 Jul 1998 11:57:55 -0400
From: Jonathan Biggar
Organization: Floorboard Software
To: "George M. Scott"
CC: port-rtf@omg.org
Subject: Re: Counter proposal for Issue 663 and 675
References: <359AFCFA.5C211F3E@inprise.com>

Ok, I'll be big about it and admit you have found a serious problem with my proposal! :-) It looks like some kind of refcount is going to be necessary for the C++ mapping. I will respond to your proposal (in another message) with my comments.

One question about issue 675. I have always wondered why the spec required that deactivate_object() return before the etherealize() call is made. Isn't this specifying implementation detail that is better left to the ORB implementor? Is there a deadlock or race condition that I haven't thought of that makes it a bad idea to just dispatch the etherealize() call directly from deactivate_object()?

[Never mind, I just thought of the reason. I've left the question in this message for expository purposes.]
The reason is that since concurrent operations may still be running on the object, you can't call etherealize() until those operations complete, so you can end up with the same deadlock problems again if another thread calls deactivate_object() on the same object.

Jon

George M. Scott wrote:
> One of the goals Inprise has as part of our participation in this RTF is to eliminate as many of the current deadlock situations in the POA as possible. The current proposal on the table for issues 663 and 675 would introduce blocking behavior to POA::deactivate_object which has the potential to cause deadlock.
>
> For example, imagine a case where object A calls object B which calls deactivate_object() on A. Now object B simultaneously calls object A which also calls deactivate_object() on B. This will deadlock under the recently proposed revision (see Jonathan Biggar's email). When you include multiple remote objects into the mix the chances of deadlock only increase, and it is not even necessary to have co-recursive calls to deactivate_object for this to occur. For example consider object A on one machine and object B on a remote machine. A calls B which then calls a method on A which attempts to call deactivate_object on A. Under the current proposed revision this will always deadlock. These are not corner cases, and should clearly be supported without deadlocking.
>
> We feel the only change required to deactivate_object is to state what the behavior of POA calls will be for requests that are currently executing in an object which has been deactivated. For example, if there are two requests executing in the same object and the first request calls deactivate_object, and afterwards the second request calls servant_to_reference(), what should happen? Should it throw ServantNotActive or return successfully? It seems that it should complete successfully.
> Another situation to consider is what happens when either implicit or explicit (via activate_object) activation occurs after deactivate_object is called on an oid. Is that oid immediately available? Or must all the current requests complete processing before an object with the same oid can be activated? We will open this up as a new issue and provide a proposal.
>
> The issues raised in 663 and 675 are really getting at problems with the C++ language mapping and not the behavior of deactivate_object. We don't believe we should be changing the behavior of the POA to address language mapping specific behavior such as memory management.
>
> Therefore, we propose the following resolutions to 663 and 675.
>
> ---------------------------------------------------------------
> Issue 663: Implementation problem with policy objects using root POA
> Nature: Revision
> Resolution: Transfer to C++ RTF (unless Dan thinks the port-rtf still has jurisdiction, I'm personally confused about who owns what in this case, I know the java-rtf owns the Java POA mapping, not sure about C++)
>
> (We will be proposing a solution to this problem in that context)
>
> ----------------------------------------------------------------
> Issue 675: deactivate_object() operation
> Nature: Revision
> Summary: deactivate_object() cannot be implemented in a single threaded environment because etherealization is delayed.
> Resolution: Closed with no action. This is an implementation detail which we and others have managed to work around in our single-threaded ORBs. The simplest way, for example, is to have a queue of objects to be etherealized, which the ORB can process as part of its event loop.
>
> George

--
Jon Biggar
Floorboard Software
jon@floorboard.com
jon@biggar.org
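
The "some kind of refcount" idea raised in this exchange can be illustrated with a minimal sketch. It assumes a toy active object map in which every in-flight request holds its own reference to the servant, so deactivation never blocks and the servant is destroyed only when the last holder lets go. All names here (Servant, ActiveObjectMap, begin_request, demo_deferred_destruction) are hypothetical, chosen for illustration only; none of them belong to the PortableServer API, and this is not the mechanism adopted by any RTF resolution.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <mutex>
#include <string>

// Hypothetical stand-in for a servant; not a PortableServer type.
struct Servant {
    explicit Servant(bool* destroyed_flag) : destroyed(destroyed_flag) {}
    ~Servant() { *destroyed = true; }  // record destruction for the demo
    bool* destroyed;
};

// Hypothetical miniature of an active object map (not the POA interface).
class ActiveObjectMap {
public:
    void activate(const std::string& oid, std::shared_ptr<Servant> servant) {
        std::lock_guard<std::mutex> lock(mutex_);
        map_[oid] = std::move(servant);
    }

    // Called when a request is dispatched. The returned pointer is the
    // request's own reference; it is null if the object is not active.
    std::shared_ptr<Servant> begin_request(const std::string& oid) {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = map_.find(oid);
        if (it == map_.end()) {
            return std::shared_ptr<Servant>();
        }
        return it->second;
    }

    // Never waits for other requests: only the map's reference is dropped.
    void deactivate(const std::string& oid) {
        std::shared_ptr<Servant> dropped;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            auto it = map_.find(oid);
            if (it == map_.end()) {
                return;
            }
            dropped = std::move(it->second);
            map_.erase(it);
        }
        // 'dropped' goes out of scope here, outside the lock. The servant is
        // destroyed now only if no request still holds a reference.
    }

private:
    std::mutex mutex_;
    std::map<std::string, std::shared_ptr<Servant>> map_;
};

// Returns true if the servant survived deactivate() while a request still
// held it, refused new requests, and was destroyed once that request ended.
bool demo_deferred_destruction() {
    bool destroyed = false;
    ActiveObjectMap aom;
    aom.activate("A", std::make_shared<Servant>(&destroyed));

    std::shared_ptr<Servant> in_flight = aom.begin_request("A");  // request running
    aom.deactivate("A");                                          // e.g. from remove()
    const bool alive_during_request = !destroyed;
    const bool no_new_requests = (aom.begin_request("A") == nullptr);

    in_flight.reset();  // the running request completes
    return alive_during_request && no_new_requests && destroyed;
}
```

Calling demo_deferred_destruction() walks through the scenario from the remove() example: deactivation returns immediately, the request that is still running keeps a valid servant, and destruction happens when that request releases its reference. Because nothing waits on other requests, the A-calls-B-calls-A pattern described above has nothing to deadlock on in this sketch.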