Issue 1428: Blocking POA operations (port-rtf)

Source: (, )
Nature: Uncategorized Issue
Severity:
Summary: Several operations on the POAManager and the POA itself have a boolean parameter wait_for_completion which is intended to block the thread until the operation has completed. Unfortunately, this is not trivial to implement in an arbitrary multi-threaded implementation of the POA (or even a single-threaded one, for that matter). The semantics of wait_for_completion are also quite muddy in this situation, and we prefer not to add complexity to a system when the only gain is some very murky semantics. We would like to propose that the wait_for_completion parameter be ignored and always treated as FALSE when the operation is invoked from a thread which is executing in a POA context (i.e. if Current::get_POA() returns a POA). Or put another way, wait_for_completion is only meaningful when invoked outside the context of the POA, by either a user-created thread or a user event loop (in single-threaded environments).
Resolution:
Revised Text:
Actions taken:
June 2, 1998: received issue
March 1, 1999: closed issue
Discussion:

End of Annotations:=====

Return-Path:
Sender: gscott@inprise.com
Date: Tue, 02 Jun 1998 11:51:32 -0700
From: "George M. Scott"
Organization: Borland International, Inc.
To: issues@omg.org
CC: port-rtf@omg.org
Subject: blocking POA operations

More POA fun...

Several operations on the POAManager and the POA itself have a boolean parameter wait_for_completion which is intended to block the thread until the operation has completed. Unfortunately, this is not trivial to implement in an arbitrary multi-threaded implementation of the POA (or even a single-threaded one, for that matter). The semantics of wait_for_completion are also quite muddy in this situation, and we prefer not to add complexity to a system when the only gain is some very murky semantics.
We would like to propose that the wait_for_completion parameter be ignored and always treated as FALSE when the operation is invoked from a thread which is executing in a POA context (i.e. if Current::get_POA() returns a POA). Or put another way, wait_for_completion is only meaningful when invoked outside the context of the POA, by either a user-created thread or a user event loop (in single-threaded environments).

The operations that would need to be changed are:

POAManager::hold_requests
POAManager::discard_requests
POAManager::deactivate
POA::destroy

Discussion:

In a multi-threaded POA implementation an arbitrary number of threads may be executing in a given POA context at any one time. If one thread calls destroy() with wait_for_completion set to TRUE, it is somewhat logical to think that the thread should block until all other threads have exited, and then return. Now consider the scenario where two or more threads call destroy() on the same POA with wait_for_completion set to TRUE. What should happen? Should all the threads that called destroy() block until only those threads are the ones that are currently running? If so, what does wait_for_completion even mean in this context? Since all the threads can continue to execute indefinitely in user code after they return, the wait_for_completion semantics are not even being provided.

This case is even more complicated for POA destruction when it involves multiple descendant POAs, or for a POAManager which may be controlling multiple POAs. In these cases many different threads running in many different POAs may invoke the same operation, such as destroy() on the rootPOA or deactivate() on a shared POAManager. If they all set wait_for_completion to TRUE, what should happen? If they are all to block until they are the only threads executing, then the code to implement this is considerably complex, since coordination must be done across multiple POAs and multiple threads.
Even in the single-threaded case, I'm not sure how one could correctly implement the wait_for_completion semantics. Imagine a simple recursive program where two servants in two different POAs call each other recursively in the same process. At one point the code in one servant indicates that the other servant's POA should be destroyed. At this point, from the POA's perspective, it appears there are thousands of requests which have not completed if there were thousands of recursions. What should wait_for_completion do? The thread that called destroy() does not appear to be executing in the current POA, but in fact it is. The only way to support this is to keep a POA stack frame and keep track of all POAs a thread has invoked during the processing of a request.

In summary, wait_for_completion cannot be supported as defined today without adding considerable complexity to the POA implementation and adding a considerable performance overhead to user applications.

George

Return-Path:
Date: Tue, 02 Jun 1998 19:50:55 -0400
From: Paul H Kyzivat
Organization: NobleNet
To: "George M. Scott"
CC: issues@omg.org, port-rtf@omg.org
Subject: Re: blocking POA operations
References: <357449B4.B407B0@inprise.com>

George M. Scott wrote:
>
> More POA fun...
>
> Several operations on the POAManager and the POA itself have
> a boolean parameter wait_for_completion which is intended to
> block the thread until the operation has completed. Unfortunately,
> this is not trivial to implement in an arbitrary multi-threaded
> implementation of the POA (or even a single-threaded for that matter).
> The semantics of wait_for_completion are also quite muddy in this
> situation, and we prefer not to add complexity to a system when the
> only gain is some very murky semantics.
>
> We would like to propose that wait_for_completion parameter is ignored
> and always set to FALSE, when the operation is invoked from a thread
> which is executing in a POA context (i.e.
if Current::get_POA() > returns > a POA). Or put another way, wait_for_completion is only meaningful > when invoked outside the context of the POA by either a user-created > thread, or in user-event loop (in single-thread environments). > > The operations that would need to be changed are: > > POAManager::hold_requests > POAManager::discard_requests > POAManager::deactivate > POA::destroy > > Discussion: [snip] I agree with the basic point, and would add that ORB::shutdown needs to be added to the list. But I don't think parameters should ever be ignored - that leads to incomprehensible APIs. Rather, I would propose that requesting to wait for completion from inside an invocation context should cause an exception, such as BAD_INV_ORDER, to be thrown. (We can later discuss whether the requested action should or should not be initiated prior to throwing the exception.) Without some such interpretation I believe the spec to be unimplementable rather than just a complex implementation. Some of the cases George points out, and others, lead to deadlocks. At the least these would have to be detected, and then there would still need to be some action to break them - probably an exception. Not only is detecting such a deadlock very expensive; but it is undesirable to allow it to occur - it would make debugging servers a nightmare. I already posted a similar note a few weeks ago w/r/t shutdown. Return-Path: Sender: gscott@inprise.com Date: Tue, 02 Jun 1998 17:49:46 -0700 From: "George M. Scott" Organization: Borland International, Inc. To: Paul H Kyzivat CC: issues@omg.org, port-rtf@omg.org Subject: Re: blocking POA operations References: <357449B4.B407B0@inprise.com> <35748FDF.7BC95942@noblenet.com> Paul H Kyzivat wrote: > I agree with the basic point, and would add that ORB::shutdown needs to > be added to the list. Agreed. > But I don't think parameters should ever be ignored - that leads to > incomprehensible APIs. 
Rather, I would propose that requesting to wait
> for completion from inside an invocation context should cause an
> exception, such as BAD_INV_ORDER, to be thrown. (We can later discuss
> whether the requested action should or should not be initiated prior to
> throwing the exception.)

I have no problem with throwing an exception. It is probably best to not do anything if the exception is thrown, so users are forced to call the operations with wait_for_completion set to FALSE.

George

Return-Path:
X-Authentication-Warning: tigger.dstc.edu.au: michi owned process doing -bs
Date: Wed, 3 Jun 1998 11:39:14 +1000 (EST)
From: Michi Henning
To: "George M. Scott"
cc: Paul H Kyzivat , issues@omg.org, port-rtf@omg.org
Subject: Re: blocking POA operations

On Tue, 2 Jun 1998, George M. Scott wrote:

> I have no problem with throwing an exception. It is probably best to not
> do anything if the exception is thrown, so users are forced to call the
> operations with wait_for_completion set to FALSE.

Hmmm... I'm doubtful about that. We have an operation with a boolean parameter. Whenever I set the parameter to TRUE, I unconditionally and always get an exception. In other words, the parameter should not be there. Please, let's bite the bullet and *remove* the parameter if we really decide that it shouldn't be there. Ignoring a parameter or unconditionally throwing an exception if a parameter has the wrong value is *really* ugly. Let's not do that please.

Cheers,

Michi.
--
Michi Henning +61 7 33654310
DSTC Pty Ltd +61 7 33654311 (fax)
University of Qld 4072 michi@dstc.edu.au
AUSTRALIA http://www.dstc.edu.au/BDU/staff/michi-henning.html

Return-Path:
Sender: jon@floorboard.com
Date: Tue, 02 Jun 1998 19:09:36 -0700
From: Jonathan Biggar
To: Michi Henning
CC: "George M. Scott" , Paul H Kyzivat , issues@omg.org, port-rtf@omg.org
Subject: Re: blocking POA operations
References:

Michi Henning wrote:
>
> On Tue, 2 Jun 1998, George M.
Scott wrote: > > > I have no problem with throwing an exception. It is probably best > tonot do anything if the exception is thrown, so users are > > forced to call the > > operations with wait_for_completion set to FALSE. > > Hmmm... I'm doubtful about that. We have an operation with a boolean > parameter. > Whenever I set the parameter to TRUE, I unconditionally and always > get > an exception. In other words, the parameter should not be > there. Please, > let's bite the bullet and *remove* the parameter if we really decide > that it shouldn't be there. Ignoring a parameter or unconditionally > throwing an exception if a parameter has the wrong value is *really* > ugly. > Let's not do that please. I don't think they are suggesting that the parameter value be ignored, or always throw an exception if it is true. I think they are saying that if it is called in a situation where it would cause a deadlock to block (such as setting the POAManager to holding state inside an invocation mediated by that POAManager, then that is the time to throw the exception. -- Jon Biggar Floorboard Software jon@floorboard.com jon@biggar.org Return-Path: Date: Wed, 03 Jun 1998 08:25:20 -0400 From: Paul H Kyzivat Organization: NobleNet To: Jonathan Biggar CC: Michi Henning , "George M. Scott" , issues@omg.org, port-rtf@omg.org Subject: Re: blocking POA operations References: <3574B060.3E00C092@floorboard.com> Jonathan Biggar wrote: > > Michi Henning wrote: > > > > On Tue, 2 Jun 1998, George M. Scott wrote: > > > > > I have no problem with throwing an exception. It is probably > best > tonot do anything if the exception is thrown, so users are > > > forced to call the > > > operations with wait_for_completion set to FALSE. > > > > Hmmm... I'm doubtful about that. We have an operation with a > boolean > parameter. > > Whenever I set the parameter to TRUE, I unconditionally and always > get > > an exception. In other words, the parameter should not be there. 
> > Please, let's bite the bullet and *remove* the parameter if we really decide
> > that it shouldn't be there. Ignoring a parameter or unconditionally
> > throwing an exception if a parameter has the wrong value is *really* ugly.
> > Let's not do that please.
>
> I don't think they are suggesting that the parameter value be ignored,
> or always throw an exception if it is true. I think they are saying
> that if it is called in a situation where it would cause a deadlock to
> block (such as setting the POAManager to holding state inside an
> invocation mediated by that POAManager), then that is the time to throw
> the exception.

I am suggesting that the exception be thrown if wait_for_completion is TRUE and the call is made from within an invocation context. Such a situation does not guarantee that a deadlock would occur, so someone might claim that this is overenthusiastic deadlock prevention, but I think the alternatives are worse.

Detecting just those situations where a deadlock will occur would be exceedingly complex and probably expensive as well. Also, documenting the behavior would be nearly impossible, and without doing so every ORB will probably do it differently and portability will be out the window.

The most reasonable alternative is to explicitly push the responsibility back on the caller not to make a request that might deadlock. So for instance, if an operation invocation tries to destroy the POA it is activated in and wait for the result, it would just hang forever. The choice is a philosophical one: how much rope do you want to give people to hang themselves with?

The ORB::shutdown case is quite clearcut however: if it is called with wait_for_completion TRUE from an invocation context, and no error is reported, then it is guaranteed to hang. I can think of no reason to allow this.
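The rule argued for in this message (reject a blocking call made from inside an invocation, rather than trying to detect actual deadlocks) can be sketched in a few lines. This is only an illustrative toy in Python, not ORB code: ToyPOA, BadInvOrder, dispatch and in_invocation_context are hypothetical names, and the thread-local depth counter merely stands in for the per-thread bookkeeping an ORB would need anyway to answer Current::get_POA(). It also assumes, as suggested earlier in the thread, that nothing is done when the exception is raised.

```python
import threading


class BadInvOrder(Exception):
    """Hypothetical stand-in for the CORBA BAD_INV_ORDER system exception."""


# Per-thread record of how many dispatched invocations are on the stack.
_ctx = threading.local()


def in_invocation_context():
    """True if the calling thread is currently inside a dispatched request."""
    return getattr(_ctx, "depth", 0) > 0


class ToyPOA:
    """Toy object adapter: it only models the wait_for_completion check."""

    def __init__(self):
        self.destroyed = False

    def dispatch(self, servant_method):
        # Mark the thread as being inside an invocation while the servant runs.
        _ctx.depth = getattr(_ctx, "depth", 0) + 1
        try:
            return servant_method()
        finally:
            _ctx.depth -= 1

    def destroy(self, wait_for_completion):
        # Cheap check: no deadlock analysis, just "am I inside a request?"
        if wait_for_completion and in_invocation_context():
            raise BadInvOrder("blocking destroy called from an invocation")
        # A real adapter would block here when wait_for_completion is true.
        self.destroyed = True


poa = ToyPOA()


def servant():
    # A servant trying to destroy its own adapter and wait for completion.
    try:
        poa.destroy(True)
    except BadInvOrder:
        return "rejected"
    return "destroyed"


result_inside = poa.dispatch(servant)   # call made from inside an invocation
state_after_reject = poa.destroyed      # state must be unchanged
poa.destroy(True)                       # same call from outside is allowed
```

The check is deliberately conservative: it rejects some calls that would not actually deadlock, which is the trade-off discussed above.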
Return-Path:
X-Authentication-Warning: tigger.dstc.edu.au: michi owned process doing -bs
Date: Thu, 4 Jun 1998 03:30:05 +1000 (EST)
From: Michi Henning
To: Jonathan Biggar
cc: "George M. Scott" , Paul H Kyzivat , issues@omg.org, port-rtf@omg.org
Subject: Re: blocking POA operations

On Tue, 2 Jun 1998, Jonathan Biggar wrote:

> I don't think they are suggesting that the parameter value be ignored,
> or always throw an exception if it is true. I think they are saying
> that if it is called in a situation where it would cause a deadlock to
> block (such as setting the POAManager to holding state inside an
> invocation mediated by that POAManager), then that is the time to throw
> the exception.

Maybe I misunderstood something, but wasn't the argument that detecting deadlock would be too expensive? If so, I would hang on to my original argument. Of course, if I misunderstood, the point is moot...

Cheers,

Michi.
--
Michi Henning +61 7 33654310
DSTC Pty Ltd +61 7 33654311 (fax)
University of Qld 4072 michi@dstc.edu.au
AUSTRALIA http://www.dstc.edu.au/BDU/staff/michi-henning.html

Return-Path:
Sender: jon@floorboard.com
Date: Wed, 03 Jun 1998 10:44:21 -0700
From: Jonathan Biggar
To: Michi Henning
CC: "George M. Scott" , Paul H Kyzivat , issues@omg.org, port-rtf@omg.org
Subject: Re: blocking POA operations
References:

Michi Henning wrote:
>
> On Tue, 2 Jun 1998, Jonathan Biggar wrote:
>
> > I don't think they are suggesting that the parameter value be ignored,
> > or always throw an exception if it is true. I think they are saying
> > that if it is called in a situation where it would cause a deadlock to
> > block (such as setting the POAManager to holding state inside an
> > invocation mediated by that POAManager), then that is the time to throw
> > the exception.
>
> Maybe I misunderstood something, but wasn't the argument that detecting
> deadlock would be too expensive? If so, I would hang on to my original
> argument.
> Of course, if I misunderstood, the point is moot...

Yes, you are right. The argument is that detecting the deadlock may be too expensive, so if you are in an invocation context and you call these operations in a blocking mode, then an exception should be raised. However, that still does not argue that the blocking boolean argument should be removed, because there is still the case of calling the operation when not in an invocation context, where waiting is useful or required.

My personal leaning is just to document the dangers and leave it as it is. There are already enough nooses around when writing multithreaded programs that the programmer can hang himself with for us to get overly concerned about these.

--
Jon Biggar
Floorboard Software
jon@floorboard.com
jon@biggar.org

Return-Path:
Sender: gscott@inprise.com
Date: Wed, 03 Jun 1998 10:50:45 -0700
From: "George M. Scott"
Organization: Borland International, Inc.
To: Michi Henning
CC: Jonathan Biggar , Paul H Kyzivat , issues@omg.org, port-rtf@omg.org
Subject: Re: blocking POA operations
References:

Michi Henning wrote:

> On Tue, 2 Jun 1998, Jonathan Biggar wrote:
>
> > I don't think they are suggesting that the parameter value be ignored,
> > or always throw an exception if it is true. I think they are saying
> > that if it is called in a situation where it would cause a deadlock to
> > block (such as setting the POAManager to holding state inside an
> > invocation mediated by that POAManager), then that is the time to throw
> > the exception.
>
> Maybe I misunderstood something, but wasn't the argument that detecting
> deadlock would be too expensive? If so, I would hang on to my original
> argument. Of course, if I misunderstood, the point is moot...

Hi Michi,

The deadlock cannot occur if the operation is called from a thread which is not in the context of a POA invocation.
Checking whether or not a thread is in the context of a POA execution is already required to implement the POACurrent, so this is a trivial operation for ORB vendors to implement. So, in those cases it still makes sense to leave the wait_for_completion parameter.

George

Return-Path:
X-Authentication-Warning: tigger.dstc.edu.au: michi owned process doing -bs
Date: Thu, 4 Jun 1998 03:59:55 +1000 (EST)
From: Michi Henning
To: "George M. Scott"
cc: Jonathan Biggar , Paul H Kyzivat , issues@omg.org, port-rtf@omg.org
Subject: Re: blocking POA operations

On Wed, 3 Jun 1998, George M. Scott wrote:

> The deadlock cannot occur if the operation is called from a thread which
> is not in the context of a POA invocation. Checking whether or not a thread
> is in the context of a POA execution is already required to implement the
> POACurrent, so this is a trivial operation for ORB vendors to implement.
> So, in those cases it still makes sense to leave the wait_for_completion
> parameter.

I agree, and I think that is what Jon explained also. Sorry for the confusion on my part. I would agree then that your original suggestion makes sense: raise an exception if called from within an invocation.

Cheers,

Michi.
--
Michi Henning +61 7 33654310
DSTC Pty Ltd +61 7 33654311 (fax)
University of Qld 4072 michi@dstc.edu.au
AUSTRALIA http://www.dstc.edu.au/BDU/staff/michi-henning.html

Return-Path:
Sender: "George Scott"
Date: Wed, 01 Jul 1998 19:01:35 -0700
From: "George M. Scott"
Organization: Inprise Corporation
To: port-rtf@omg.org
Subject: Proposed resolutions for Issues 1407-1410, 1428

The following are the proposals to address issues 1407-1410, and 1428. The proposals are based on our discussions from the previous POA conference call, and should hopefully be acceptable.

I'm attempting to follow Dan's format for consistency.
George

-----------------------------------------------------------
Issue 1428: Blocking POA Operations
Nature: Revision

Summary: Several operations added to CORBA as part of the Portability submission provide blocking behavior which can result in deadlock in a large number of cases. These calls include POA::destroy, ORB::shutdown, POAManager::deactivate, POAManager::hold_requests, POAManager::discard_requests.

Resolution: Accepted for CORBA 2.3 RTF

Revision: The following changes are proposed:

Replace the second sentence in the paragraph of section 4.9.4, page 4-20, which begins with "If the wait_for_completion ..." with the following:

"If the wait_for_completion parameter is TRUE and the current thread is not in an ORB-dispatched invocation context, this operation blocks until all ORB processing (including request processing and object deactivation or other operations associated with object adapters) has completed. If the wait_for_completion parameter is TRUE and the current thread is in an ORB-dispatched invocation context, then the BAD_INV_ORDER exception is thrown."

Replace the phrase "If the parameter is TRUE" in the 2nd paragraph, 2nd sentence of the hold_requests description in Section 9.3.2, page 9-18, with the phrase "If the parameter is TRUE and the current thread is not in a POA-dispatched invocation context". Add the sentence "If the parameter is TRUE and the current thread is in a POA-dispatched invocation context then the BAD_INV_ORDER exception is raised and the state is not changed." to the end of the paragraph.

Replace the phrase "If the parameter is TRUE" in the 2nd paragraph, 2nd sentence of the discard_requests description in Section 9.3.2, page 9-18, with the phrase "If the parameter is TRUE and the current thread is not in a POA-dispatched invocation context". Add the sentence "If the parameter is TRUE and the current thread is in a POA-dispatched invocation context then the BAD_INV_ORDER exception is raised and the state is not changed." to the end of the paragraph.
Replace the phrase "If the parameter is TRUE" in the 3rd paragraph, 2nd sentence of the deactivate description in Section 9.3.2, page 9-18, with the phrase "If the parameter is TRUE and the current thread is not in a POA-dispatched invocation context". Add the sentence "If the parameter is TRUE and the current thread is in a POA-dispatched invocation context then the BAD_INV_ORDER exception is raised and the state is not changed." to the end of the paragraph.

(editorial note: I changed the text for POA::destroy in the resolution of issues 1408 and 1409 above, so it is not shown here, but a similar change to the above is necessary. I also felt that 1409 raised an issue which also affects POAManager::deactivate, so the following change is also proposed.)

Add the following paragraph to the end of the description of deactivate in section 9.3.2, page 9-19.

"If deactivate is called multiple times before destruction is complete (because there are active requests), the etherealize_objects parameter will only apply to the first call of deactivate; subsequent calls with conflicting etherealize_objects settings will use the value of etherealize_objects from the first call. The wait_for_completion parameter will be handled as defined above for each individual call (some callers may choose to block, while others may not)."

----------------------------------------------------------------

Return-Path:
Date: Mon, 06 Jul 1998 13:55:46 -0400
From: Paul H Kyzivat
Organization: NobleNet
To: "George M. Scott"
CC: port-rtf@omg.org
Subject: Re: Proposed resolutions for Issues 1407-1410, 1428
References: <359AE9FF.441D15F7@inprise.com>

George M. Scott wrote:
>
> The following are the proposals to address issues 1407-1410,
> and 1428. The proposals are based on our discussions from
> the previous POA conference call, and should hopefully be
> acceptable.
>
> I'm attempting to follow Dan's format for consistency.
> > George > > ----------------------------------------------------------- > Issue 1407: Interaction of find_POA(), AdapterActivators and > create_POA() > Nature: Clarification > > Summary: There is a race condition in the POA when an > AdapterActivator is invoked as a result of a > call to find_POA() or an over-the-wire request, and > simultaneously another thread calls create_POA() > attempting to create the same POA manually. > > Resolution: Accepted for Corba RTF 2.3 > > Revision: Section 9.3.3, Page 9-20. After first paragraph > in page beginning with "if unknown_adapter raises..." > add the following Note: > > Note - It is possible for another thread to create the same > POA the AdapterActivator is being asked to create if > AdapterActivators are used in conjunction with other threads > calling create_POA with the same POA name. To avoid potential > race conditions, it is recommended that AdapterActivators and > manual creation of POAs (via the create_POA call) not be used > in conjunction to create a particular POA. > > ------------------------------------------------------------- > Issue 1408: POA destory() is ill-defined > Issue 1409: Multiple threads calling destroy() once destroy() > has begun > Nature: Revision > > Summary: POA destroy is not defined sufficiently enough to > prevent multiple activations of the same POA name in > the same process. POA::destory semantics are not defined > for multiple threads calling destroy. > > Resolution: Accepted for Corba RTF 2.3 > > Revision: The following text will replace the current describing > the destroy operation in Section 9.3.8, Page 9-31. > > This operation destroys the POA and all descendant POAs. > Descendant POAs are destroyed (recursively) before the destruction > of the containing POA. The POA so destroyed (that is, the POA > with its name) may be re-created later in the same process. 
> (This differs from the POAManager::deactivate operation that does > not allow a re-creation of its associated POA in the same process. > > When a POA is destroyed, any requests that have started execution > continue to completion. Any requests that have not started > execution are processed as if the POA were in the holding state, > until execution of all active requests has completed. Once > all active requests have completed, all queued requests (if > any) will behave as if they were newly arrived, that is, the POA > will attempt to cause recreation of the POA by invoking one or > more adapter activators. > > If the etherealize_objects parameter is TRUE, the POA has the > RETAIN policy, and a servant manager is registered with the POA, > the etherealize operation on the servant manager will be called > for each active object in the Active Object Map. Etherealization > will not occur until all active requests have completed execution. > The apparent destruction of the POA occurs before any calls > to etherealize are made. Thus, for example, an etherealize > method that attempts to invoke operations on the POA will > receive the OBJECT_NOT_EXIST exception. > > {editorial note - below I have merged text from these two issues > and issue 1428 because they change the same text. If > one of these resolutions fails, we will need to edit the following > text) > > The wait_for_completion parameter is handled as follows: > > - If wait_for_completion is TRUE and the current > thread is not in a POA-dispatched invocation context, the > destroy operation will return only after all active requests > have completed and all invocations of etherealize have > completed. > > - If wait_for_completion is TRUE and the current thread is in a > POA-dispatched invocation context, then the BAD_INV_ORDER > exception is thrown and POA destruction does not occur. 
> > - If wait_for_completion is FALSE, the destroy operation > destroys all POAs but does not wait for active requests to > complete nor for etherealization to occur. > > If destroy is called multiple times before destruction is > complete (because there are active requests), the > etherealize_objects parameter will only apply to the first > call of destroy, subsequent calls with conflicting > etherealize_objects settings will use the value of the > etherealize_objects from the first call. The wait_for_completion > parameter will be handled as defined above for each individual > call (some callers may choose to block, while others may not). > This seems reasonable. > -------------------------------------------------------------- > Issue 1410: Changing default servant, etc. after POA activation > Nature: Revision > > Summary: Changing default servants, servant managers, etc. > after POA activation can be problematic. After some > discussion it was agreed that this is only truly > problematic for ServantLocators which need to have > matching preinvoke/postinvoke operations. Failing > to do so could result in memory leaks (due to the > Cookie parameter) or other strange side effects, > including inproper transaction management. > > Resolution: Accepted for Corba 2.3 RTF > > Revision: There are two current proposed revisions, which are > outlined below. Inprise will vote for proposal (a) > over proposal (b). > > (a) > Change the second paragraph describing the set_servant_manager > operation on page 9-33, Section 9.3.8 to the following: > > "This operation sets the default servant manager associated > with the POA. This operation may only be invoked once after > a POA has been created. Attempting to set the servant manager > after one has already been set will result in the BAD_INV_ORDER > exception being thrown." I find this overly restrictive. 
> > (b) > Add a third paragraph to the description of set_servant_manager > on page 9-33, Section 9.3.8: > > "Changing a ServantLocator will result in newly received requests > calling preinvoke and postinvoke on the new ServantLocator. > However, currently active requests will call postinvoke on the > ServantLocator on which they called preinvoke, which may not be > the same as the current ServantLocator." I also find this overly restrictive. It imposes a particular implementation discipline to cover an obscure case that will rarely arise in practice and which a developer can work around if it does arise. I would assert: - in most cases there will be no need to change the locator, - when it is necessary, the developer will probably know that the change is safe, - when the developer doesn't know it is safe, the POAManager state can be set to holding before making the change to guarantee safety. So, I don't think any change is needed, except for some words of warning, replacing the ones above: "If a ServantLocator is replaced while operations are outstanding, then the old ServantLocator may be called for preinvoke and the replacement for postinvoke of the same request. If this is unacceptable, then the POA's POAManager may be placed in the HOLDING state prior to making the change." > > ---------------------------------------------------------------- > Issue 1428: Blocking POA Operations > Nature: Revision > > Summary: Several operations added to CORBA as part of the > Portability submission provide blocking behavior which > can result in deadlock in a large number of cases. > These calls include POA::destroy, ORB::shutdown, > POAManager::deactivate, POAManager::hold_requests, > POAManager::discard_requests. > > Resolution: Accepted for Corba 2.3 RTF > > Revision: The following changes are proposed: > > Replace the second sentence in the paragraph of section 4.9.4, > page 4-20, which begins with "If the wait_for_completion ..." 
> with the following: > > "If the wait_for_completion parameter is TRUE and the current > thread is not in an ORB-dispatched invocation context, this > operation blocks until all ORB processing (including request > processing and object deactivation or other operations associated > with object adapters) has completed. If the wait_for_completion > parameter is TRUE and the current thread is in an ORB-dispatched > invocation context, then the BAD_INV_ORDER exception is thrown." > > Replace the phrase "If the parameter is TRUE" in the 2nd > paragraph, 2nd sentence of the hold_requests description in > Section 9.3.2, page 9-18, with the phrase "If the parameter > is TRUE and the current thread is not in a POA-dispatched > invocation context". Add the sentenence "If the parameter > is TRUE and the current thread is in a POA-dispatched invocation > context then the BAD_INV_ORDER exception is raised and the > state is not changed." > > Replace the phrase "If the parameter is TRUE" in the 2nd > paragraph, 2nd sentence of the discard_requests description in > Section 9.3.2, page 9-18, with the phrase "If the parameter > is TRUE and the current thread is not in a POA-dispatched > invocation context". Add the sentenence "If the parameter > is TRUE and the current thread is in a POA-dispatched invocation > context then the BAD_INV_ORDER exception is raised and the > state is not changed." to the end of the paragraph. > > Replace the phrase "If the parameter is TRUE" in the 3rd > paragraph, 2nd sentence of the deactivate description in > Section 9.3.2, page 9-18, with the phrase "If the parameter > is TRUE and the current thread is not in a POA-dispatched > invocation context". Add the sentenence "If the parameter > is TRUE and the current thread is in a POA-dispatched invocation > context then the BAD_INV_ORDER exception is raised and the > state is not changed." to the end of the paragraph. 
> > (editorial note- I changed the text for POA::destory in the > resolution of issues 1408 and 1409 above, so it is not shown > here but a similar change to the above is necessary. > I also felt that 1409 raised an issue which also affects > POAManager::deactivate, so the following change is also > proposed) > > Add the following paragraph to the end of the description of > deactivate in section 9.3.2, page 9-19. > > "If deactivate is called multiple times before destruction is > complete (because there are active requests), the > etherealize_objects parameter will only apply to the first > call of deactivate, subsequent calls with conflicting > etherealize_objects settings will use the value of the > etherealize_objects from the first call. The wait_for_completion > parameter will be handled as defined above for each individual > call (some callers may choose to block, while others may not)." This seems reasonable Return-Path: Date: Wed, 08 Jul 1998 00:57:46 -0700 From: "Jon Goldberg" To: port-rtf@omg.org Subject: Re: Proposed resolutions for Issues 1407-1410, 1428 References: <359AE9FF.441D15F7@inprise.com> Hi Folks- I've modified and am now resubmitting George's original proposal for issues 1408-9, 1428 to take into account the conference call feedback from July 7 (tuesday). Based on my notes, 1428 actually was not changed from the conference call but I've included it in this message since they are all inter-dependent. My assessment of the changes is: 1. description of destroy changed to explicitly talk about setting the POA and its descendents as if they have already been destroyed prior to etherealization beginning for any active objects of any of those POAs. This ensures that all of the POAs pending destruction will reject/queue (see #3 below) requests before any etherealization is started. In the initial proposal, each child POA was independently destroyed before destruction of the parent POA. 
This allows a possibility of infinite looping in the destruction if etherealization for two child POAs causes each other to be re-created on each POA destruction.

2. the current description compares the POA's temporary behavior as if it is in the "holding state". We agreed to change this terminology to describe it as if the POA's POAManager(s) are in a state, since POAs themselves do not have state.

3. We also concluded that the state in question should be "inactive", not "holding". The implication of this change is that for the duration of the destruction (which may include etherealization), any requests for that POA will now result in an exception. Once the destruction is complete, any new requests might cause the POA to be recreated. Although this avoids deadlocks where etherealization attempts to access another POA which is pending destruction, it means clients will see inconsistent results depending on when destruction occurs for a POA. If the state is "holding", these requests will be queued and the client will always see a consistent result, although the request may take longer to complete depending on when POA destruction occurs. In addition, with change #1 above, if the state is "holding" deadlocks can occur if one POA's etherealization depends on another POA being destructed. I'm stating this implication here to make sure that the RTF voters are all aware of the issue.

4. We decided to mandate bottom-up etherealization. This means that a POA always attempts etherealization on its descendant POAs before proceeding with etherealization of its own active objects.

take care,
Jon

Issue 1428: Blocking POA Operations
Nature: Revision
Summary: Several operations added to CORBA as part of the Portability submission provide blocking behavior which can result in deadlock in a large number of cases. These calls include POA::destroy, ORB::shutdown, POAManager::deactivate, POAManager::hold_requests, POAManager::discard_requests.
Resolution: Accepted for Corba 2.3 RTF
Revision: The following changes are proposed:

Replace the second sentence in the paragraph of section 4.9.4, page 4-20, which begins with "If the wait_for_completion ..." with the following: "If the wait_for_completion parameter is TRUE and the current thread is not in an ORB-dispatched invocation context, this operation blocks until all ORB processing (including request processing and object deactivation or other operations associated with object adapters) has completed. If the wait_for_completion parameter is TRUE and the current thread is in an ORB-dispatched invocation context, then the BAD_INV_ORDER exception is thrown."

Replace the phrase "If the parameter is TRUE" in the 2nd paragraph, 2nd sentence of the hold_requests description in Section 9.3.2, page 9-18, with the phrase "If the parameter is TRUE and the current thread is not in a POA-dispatched invocation context". Add the sentence "If the parameter is TRUE and the current thread is in a POA-dispatched invocation context then the BAD_INV_ORDER exception is raised and the state is not changed."

Replace the phrase "If the parameter is TRUE" in the 2nd paragraph, 2nd sentence of the discard_requests description in Section 9.3.2, page 9-18, with the phrase "If the parameter is TRUE and the current thread is not in a POA-dispatched invocation context". Add the sentence "If the parameter is TRUE and the current thread is in a POA-dispatched invocation context then the BAD_INV_ORDER exception is raised and the state is not changed." to the end of the paragraph.

Replace the phrase "If the parameter is TRUE" in the 3rd paragraph, 2nd sentence of the deactivate description in Section 9.3.2, page 9-18, with the phrase "If the parameter is TRUE and the current thread is not in a POA-dispatched invocation context".
Add the sentence "If the parameter is TRUE and the current thread is in a POA-dispatched invocation context then the BAD_INV_ORDER exception is raised and the state is not changed." to the end of the paragraph.

(editorial note- I changed the text for POA::destroy in the resolution of issues 1408 and 1409 above, so it is not shown here but a similar change to the above is necessary. I also felt that 1409 raised an issue which also affects POAManager::deactivate, so the following change is also proposed)

Add the following paragraph to the end of the description of deactivate in section 9.3.2, page 9-19.

"If deactivate is called multiple times before destruction is complete (because there are active requests), the etherealize_objects parameter will only apply to the first call of deactivate; subsequent calls with conflicting etherealize_objects settings will use the value of the etherealize_objects from the first call. The wait_for_completion parameter will be handled as defined above for each individual call (some callers may choose to block, while others may not)."

----------------------------------------------------------------
Return-Path:
Sender: jon@floorboard.com
Date: Wed, 08 Jul 1998 08:13:06 -0700
From: Jonathan Biggar
To: Jon Goldberg
CC: port-rtf@omg.org
Subject: Re: Proposed resolutions for Issues 1407-1410, 1428
References: <359AE9FF.441D15F7@inprise.com> <35A3267A.181DCAD8@inprise.com>

Jon Goldberg wrote:
>
> Hi Folks-
>
> I've modified and am now resubmitting George's original proposal for
> issues 1408-9, 1428 to take into account the conference call
> feedback
> from July 7 (tuesday). Based on my notes, 1428 actually was not
> changed from the conference call but I've included it in this
> message
> since they are all inter-dependent. My assessment of the changes
> is:
>
> 1.
description of destroy changed to explicitly talk about setting > the POA and its descendents as if they have already been destroyed > prior to etherealization beginning for any active objects of any > of those POAs. This ensures that all of the POAs pending > destruction > will reject/queue (see #3 below) requests before any etherealization > is started. In the initial proposal, each child POA was > independently > destroyed before destruction of the parent POA. This allows a > possibility of infinite looping in the destruction if > etherealization > for two child POAs cause each other to be re-created on each > POA destruction. > > 2. the current description compares the POA's temporary behavior as > if > it is in the "holding state". We agreed to change this terminology > to > describe it as if the POA's POAManager(s) are in a state, since POAs > themselves do not have state. > > 3. We also concluded that the state in question should be > "inactive", > not "holding". The implication of this change is that for the > duration of the destruction (which may include etherealization), any > requests for that POA will now result in an exception. Once the > destruction is complete, any new requests might cause the POA to be > recreated. Although this avoids deadlocks where etherealization > attempts to access another POA which is pending destruction, it > means > clients will see inconsistent results depending on when destruction > occurs for a POA. If the state is "holding", these requests will be > queued and the client will always see a consistent result, although > the request may take longer to complete depending on when POA > destruction occurs. In addition, with change #1 above, if the state > is > "holding" deadlocks can occur if one POA's etherealization depends > on > another POA being destructed. I'm stating this implication here to > make sure that the RTF voters are all aware of the issue. 
I don't like this much because it makes the destroy process visible to the clients, particularly if the destroy comes from a call to ORB::shutdown(). However, I don't see a good way around this. Perhaps it would be a good idea to also state that, when ORB::shutdown() is called, the input side of the protocols (GIOP) is blocked before the POAs are destroyed, in order to avoid client interaction while the shutdown is processing? Then the process would look like this:

1. Someone calls ORB::shutdown().
2. All server side protocol connections are blocked from processing further input.
3. All pending requests are drained from the ORB.
4. Once the last outstanding reply is sent on a server connection, the connection is shut down (via CloseConnection if GIOP).
5. All POAs are destroyed.
6. Any other ORB level cleanup is done.

Of course this could be left as a quality of ORB implementation issue, but that is how I would implement this.

> 4. We decided to mandate bottom-up etherealization. This means that a
> POA always attempts etherealization on its descendant POAs before
> proceeding with etherealization of its own active objects.

Seems reasonable, since it is likely that child POAs are using servant managers or adapter activators that are servants of parent POAs, and this avoids race conditions for those.

--
Jon Biggar
Floorboard Software
jon@floorboard.com
jon@biggar.org

Return-Path:
Sender: "Jon Goldberg"
Date: Wed, 08 Jul 1998 10:56:26 -0700
From: Jon Goldberg
To: Jonathan Biggar
CC: port-rtf@omg.org
Subject: Re: Proposed resolutions for Issues 1407-1410, 1428
References: <359AE9FF.441D15F7@inprise.com> <35A3267A.181DCAD8@inprise.com> <35A38C82.5C41F600@floorboard.com>

Jonathan Biggar wrote:
>
> Jon Goldberg wrote:
> >
> > Hi Folks-
> >
> > I've modified and am now resubmitting George's original proposal
> for
> > issues 1408-9, 1428 to take into account the conference call
> feedback
> > from July 7 (tuesday).
Based on my notes, 1428 actually was not > > changed from the conference call but I've included it in this > message > > since they are all inter-dependent. My assessment of the changes > > is: > > > > 1. description of destroy changed to explicitly talk about setting > > the POA and its descendents as if they have already been destroyed > > prior to etherealization beginning for any active objects of any > > of those POAs. This ensures that all of the POAs pending > destruction > > will reject/queue (see #3 below) requests before any > etherealization > > is started. In the initial proposal, each child POA was > independently > > destroyed before destruction of the parent POA. This allows a > > possibility of infinite looping in the destruction if > etherealization > > for two child POAs cause each other to be re-created on each > > POA destruction. > > > > 2. the current description compares the POA's temporary behavior > as if > > it is in the "holding state". We agreed to change this > terminology to > > describe it as if the POA's POAManager(s) are in a state, since > POAs > > themselves do not have state. > > > > 3. We also concluded that the state in question should be > "inactive", > > not "holding". The implication of this change is that for the > > duration of the destruction (which may include etherealization), > any > > requests for that POA will now result in an exception. Once the > > destruction is complete, any new requests might cause the POA to > be > > recreated. Although this avoids deadlocks where etherealization > > attempts to access another POA which is pending destruction, it > means > > clients will see inconsistent results depending on when > destruction > > occurs for a POA. If the state is "holding", these requests will > be > > queued and the client will always see a consistent result, > although > > the request may take longer to complete depending on when POA > > destruction occurs. 
In addition, with change #1 above, if the > state is > > "holding" deadlocks can occur if one POA's etherealization depends > on > > another POA being destructed. I'm stating this implication here > to > > make sure that the RTF voters are all aware of the issue. > > I don't like this much because it does cause visibility of the > destroy > process to the clients, particularly if the destroy comes from a > call to > ORB::shutdown(). However, I don't see a good way around this. > Perhaps > it would be a good idea to also state that when ORB::shutdown() is > called that the input side of the protocols (GIOP) are blocked > before > the POAs are destroyed in order to avoid client interaction while > the > shutdown is processing? Then the process would look like this: > > 1. Someone calls ORB::shutdown(). > > 2. All server side protocol connections are blocked from processing > further input. > > 3. All pending requests are drained from the ORB. > > 4. Once the last outstanding reply is sent on a server connection, > the > connection is shutdown (via CloseConnection if GIOP). > > 4. All POAs are destroyed. In the case of ORB::shutdown this may work because all new requests will be rejected and therefore POAs can be destroyed without deadlock or infinite looping. If the ORB isn't being shutdown, however (we are just destroying a POA and its descendants), new requests can still come in and we're back in the same hole as before. > 5. Any other ORB level cleanup is done. > > Of course this could be left as a quality of ORB implementation > issue, > but that is how I would implement this. After further consideration, I no longer think it is tolerable for the POAs to behave as if INACTIVE. The common case is POA destruction without mutual reference to sibling POAs, and this should not be penalized by opening up a window for client requests to be rejected. Furthermore, I no longer think it is possible to solve this deadlock problem. I think the best we can do is as follows: 1. 
use the scheme outlined in my revised proposal which first marks the POA and its descendants as pending destruction. 2. keep the original meaning of 'pending destruction', which means they behave as if their POAManagers are in the HOLDING state. The POAs will not behave as if their POAManagers are in the INACTIVE state because this surfaces the destruction to clients which is the wrong behavior. 3. declare that POA etherealization which depends on access to other POAs that are pending destruction is dangerous and should be avoided since it can cause deadlocks. I think #1 and #2 are necessary to maintain atomicity (a POA should not be recreated if its parent POA is pending destruction) and to protect clients. any thoughts? -Jon Return-Path: Date: Thu, 09 Jul 1998 12:40:46 -0400 From: Paul H Kyzivat Organization: NobleNet To: Jon Goldberg CC: Jonathan Biggar , port-rtf@omg.org Subject: Re: Proposed resolutions for Issues 1407-1410, 1428 References: <359AE9FF.441D15F7@inprise.com> <35A3267A.181DCAD8@inprise.com> <35A38C82.5C41F600@floorboard.com> <35A3B2CA.694FC370@inprise.com> Jon Goldberg wrote: > After further consideration, I no longer think it is tolerable for > the POAs to behave as if INACTIVE. The common case is POA > destruction > without mutual reference to sibling POAs, and this should not be > penalized by opening up a window for client requests to be rejected. I agree. > > Furthermore, I no longer think it is possible to solve this deadlock > problem. I think the best we can do is as follows: > > 1. use the scheme outlined in my revised proposal which first marks > the POA and its descendants as pending destruction. > > 2. keep the original meaning of 'pending destruction', which means > they > behave as if their POAManagers are in the HOLDING state. The > POAs will not behave as if their POAManagers are in the INACTIVE > state > because this surfaces the destruction to clients which is the > wrong behavior. > > 3. 
declare that POA etherealization which depends on access to other
> POAs that are pending destruction is dangerous and should be avoided
> since it can cause deadlocks.

I am not sure I fully understand how you want this to work, but I think there is a problem:

It should be possible to destroy all the POAs by destroying the root POA, and have something reasonable happen. This is implicitly what happens during shutdown, and it ought to be possible to do the same thing explicitly. But doing something reasonable in this case means that etherealization ought to take place (except perhaps in pathological cases).

If destroying a parent POA first marks its entire tree as destroyed, and then starts etherealizing things, the POAs at the bottom of the tree can't use servant managers that are nearer to the root.

The best way I can see to make this work reasonably is to have destroy on a POA first work bottom-up, destroying the most derived children first while leaving the higher up POAs free to service calls to any servant managers they may contain. Once a POA has given its children a chance this way, it can finally make itself unusable and etherealize its own children. This process eventually percolates all the way to the POA that the first destroy request was sent to.

There are still failure modes - when a POA has a ServantActivator in a peer or child POA, or an etherealize method that activates new objects in another POA. In the worst cases etherealization of some servants fails. I would just like to get the common cases working reasonably.

>
> I think #1 and #2 are necessary to maintain atomicity (a POA should
> not be recreated if its parent POA is pending destruction) and
> to protect clients.
>
> any thoughts?

Yes - above.
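The bottom-up ordering described in the message above can be illustrated with a small model. This is only a sketch in Python and not the PortableServer API: the `Poa` class, its `destroy` method, the `usable` flag, and the `log` list are all hypothetical names invented for illustration. It assumes that a POA destroys its children first, stays usable while they etherealize, and only then makes itself unusable and etherealizes its own active objects.

```python
class Poa:
    """Toy stand-in for a POA; hypothetical, not the PortableServer API."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        self.active_objects = []
        self.usable = True
        if parent is not None:
            parent.children.append(self)

    def destroy(self, log):
        # Children are destroyed first, so this POA can still service
        # calls to any servant managers it holds while they etherealize.
        for child in list(self.children):
            child.destroy(log)
        self.children.clear()

        # Only now does this POA make itself unusable.
        self.usable = False

        # Etherealize this POA's own active objects, recording whether
        # the parent was still usable at that moment.
        for obj in self.active_objects:
            parent_usable = self.parent is None or self.parent.usable
            log.append((self.name, obj, parent_usable))
        self.active_objects.clear()


# Usage: a three-level tree, destroyed from the root.
root = Poa("root")
middle = Poa("middle", root)
leaf = Poa("leaf", middle)
root.active_objects.append("manager")
middle.active_objects.append("helper")
leaf.active_objects.append("worker")

order = []
root.destroy(order)
```

In this model the leaf's objects are etherealized first and the root's last, and every etherealization below the root happens while the parent is still usable. The failure modes mentioned in the message (a ServantActivator living in a peer or child POA, or an etherealize method that activates objects elsewhere) are deliberately not modeled.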
Return-Path: Date: Fri, 10 Jul 1998 19:18:36 -0700 From: "Jon Goldberg" To: port-rtf@omg.org Subject: Revision of 1408-9, 1428 Hi Folks- We seemed to have reached some consensus in the last conference call on the behavior of destroy (1408-9) and wait_for_completion (1428). The following proposal is assumed to withdraw any previous proposals and to invalidate previous votes. Please consider the new proposal and cast your vote. Even if you have already voted on a previous proposal, you need to vote again because Dan is wiping the slate clean. 1. The new proposal is mostly George's original which had the following characteristics: a. the atomicity for a destroy() is a single POA. If you destroy a parent POA, it will *first* destroy its children and then destroy itself. While the children are being destroyed, there is no indication that the parent POA is pending destruction. b. the behavior of a POA pending destruction is as if its POAManager is in the holding state. New requests will be queued. 2. That proposal is now modified to indicate that the parent POA destroys its children recursively, and then destroys itself once there are no more children. This covers the case where a new child is created (or recreated) during destruction. 3. I realized the following new problem while writing this message: a. Object A calls (remote) object B which calls back into Object A. b. If destroy is called on A's POA before it calls over to object B, the rules we discussed previously will cause deadlock since destruction is defined as pending until all active requests have completed. This call pattern is too common to have this deadlock allowed. Therefore, I've modified the proposal such that the POA pending destruction does *not* wait for all currently outstanding requests to complete. Instead, it sets itself as if its POAManager is in the holding state and then starts etherealization. 
If any currently executing requests then recursively call back into that POA, they will block until the etherealization is complete. This should be fine even in multi-threaded environments, assuming etherealization no longer explicitly destroys the Servant (which is a language-mapping issue). If the etherealization happens to cause a call back into that same POA, there will be a deadlock. The new proposal calls this out as a warning about complicated processing during etherealization.

The proposal for 1428 is very stable at this point and I'm hoping it will be adopted quickly. I only modified the phrase "POA-dispatched execution context" to "execution context dispatched by some POA" for clarity.

fire away,
Jon

----------------------------------------------------------------
Issue 1408: POA destroy() is ill-defined
Issue 1409: Multiple threads calling destroy() once destroy() has begun
Nature: Revision
Summary: POA destroy is not sufficiently defined to prevent multiple activations of the same POA name in the same process. POA::destroy semantics are not defined for multiple threads calling destroy.
Resolution: Accepted for Corba RTF 2.3
Revision: The following text will replace the current text describing the destroy operation in Section 9.3.8, Page 9-31.

This operation destroys the POA and all descendant POAs. All descendant POAs are destroyed (recursively) before the destruction of the containing POA. The POA so destroyed (that is, the POA with its name) may be re-created later in the same process. (This differs from the POAManager::deactivate operation that does not allow a re-creation of its associated POA in the same process. After a deactivate, re-creation is allowed only if the POA is later destroyed.)

When a POA is destroyed, any requests that have started execution continue to completion. Any requests that have not started execution are processed as if the POA's POAManager were in the holding state, until POA destruction is complete.
Once POA destruction is complete, all queued requests (if any) will behave as if they were newly arrived, that is, the POA will attempt to cause recreation of the POA by invoking one or more adapter activators. POA destruction does not block until all active requests complete execution, as this can cause deadlock.

If the etherealize_objects parameter is TRUE, the POA has the RETAIN policy, and a servant manager is registered with the POA, the etherealize operation on the servant manager will be called for each active object in the Active Object Map. Etherealization can occur while active requests are still executing on those servants. The POA behaves as if its POAManager were in the holding state while all calls to etherealize are made. Therefore, an etherealize method that attempts to invoke operations on the POA will deadlock. POA destruction is considered complete once all active objects have been etherealized.

(editorial note - below I have merged text from these two issues and the relevant text from issue 1428 because they change the same text. If one of these resolutions fails, we will need to edit the following text)

The wait_for_completion parameter is handled as follows:

- If wait_for_completion is TRUE and the current thread is not in an invocation context dispatched from any POA, the destroy operation will return only after all active requests have completed and all invocations of etherealize have completed.
- If wait_for_completion is TRUE and the current thread is in an invocation context dispatched from any POA, then the BAD_INV_ORDER exception is thrown and POA destruction does not occur.
- If wait_for_completion is FALSE, the destroy operation destroys the POA but does not wait for active requests to complete nor for etherealization to occur.

If destroy is called multiple times before destruction is complete (because there are active requests), the etherealize_objects parameter will only apply to the first call of destroy.
Subsequent calls with conflicting etherealize_objects settings will use the value of the etherealize_objects from the first call. The wait_for_completion parameter will be handled as defined above for each individual call (some callers may choose to block, while others may not). ---------------------------------------------------------------- Issue 1428: Blocking POA Operations Nature: Revision Summary: Several operations added to CORBA as part of the Portability submission provide blocking behavior which can result in deadlock in a large number of cases. These calls include POA::destroy, ORB::shutdown, POAManager::deactivate, POAManager::hold_requests, POAManager::discard_requests. Resolution: Accepted for Corba 2.3 RTF Revision: The following changes are proposed: Replace the second sentence in the paragraph of section 4.9.4, page 4-20, which begins with "If the wait_for_completion ..." with the following: "If the wait_for_completion parameter is TRUE and the current thread is not in an invocation context dispatched by some ORB, this operation blocks until all ORB processing (including request processing and object deactivation or other operations associated with object adapters) has completed. If the wait_for_completion parameter is TRUE and the current thread is in an invocation context dispatched by some ORB, then the BAD_INV_ORDER exception is thrown." Replace the phrase "If the parameter is TRUE" in the 2nd paragraph, 2nd sentence of the hold_requests description in Section 9.3.2, page 9-18, with the phrase "If the parameter is TRUE and the current thread is not in an invocation context dispatched by some POA". Add the sentence "If the parameter is TRUE and the current thread is in an invocation context dispatched by some POA then the BAD_INV_ORDER exception is raised and the state is not changed." 
Replace the phrase "If the parameter is TRUE" in the 2nd paragraph, 2nd sentence of the discard_requests description in Section 9.3.2, page 9-18, with the phrase "If the parameter is TRUE and the current thread is not in an invocation context dispatched by some POA". Add the sentence "If the parameter is TRUE and the current thread is in an invocation context dispatched by some POA then the BAD_INV_ORDER exception is raised and the state is not changed." to the end of the paragraph.

Replace the phrase "If the parameter is TRUE" in the 3rd paragraph, 2nd sentence of the deactivate description in Section 9.3.2, page 9-18, with the phrase "If the parameter is TRUE and the current thread is not in an invocation context dispatched by some POA". Add the sentence "If the parameter is TRUE and the current thread is in an invocation context dispatched by some POA then the BAD_INV_ORDER exception is raised and the state is not changed." to the end of the paragraph.

(editorial note- I changed the text for POA::destroy in the resolution of issues 1408 and 1409 above, so it is not shown here but a similar change to the above is necessary. I also felt that 1409 raised an issue which also affects POAManager::deactivate, so the following change is also proposed)

Add the following paragraph to the end of the description of deactivate in section 9.3.2, page 9-19.

"If deactivate is called multiple times before destruction is complete (because there are active requests), the etherealize_objects parameter will only apply to the first call of deactivate; subsequent calls with conflicting etherealize_objects settings will use the value of the etherealize_objects from the first call. The wait_for_completion parameter will be handled as defined above for each individual call (some callers may choose to block, while others may not)."
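The rules proposed above for hold_requests, discard_requests and deactivate can be summarized in a small model. This is a sketch in Python under stated assumptions, not CORBA code: `BadInvOrder`, `dispatched`, `in_poa_dispatch` and `PoaManager` are hypothetical names, the "invocation context dispatched by some POA" is approximated by a per-thread counter, and the actual blocking on outstanding requests is left out entirely. It shows only the two checks in the proposal: a TRUE wait_for_completion inside a dispatched context raises the exception without changing state, and the first deactivate call fixes etherealize_objects.

```python
import threading


class BadInvOrder(Exception):
    """Stands in for the CORBA BAD_INV_ORDER system exception."""


_context = threading.local()  # per-thread nesting depth of POA upcalls


def in_poa_dispatch():
    """True if the current thread is inside a (simulated) POA upcall."""
    return getattr(_context, "depth", 0) > 0


class dispatched:
    """Context manager marking the current thread as dispatched by a POA."""

    def __enter__(self):
        _context.depth = getattr(_context, "depth", 0) + 1
        return self

    def __exit__(self, *exc_info):
        _context.depth -= 1
        return False


class PoaManager:
    """Toy POAManager; hypothetical, and it never actually blocks."""

    def __init__(self):
        self.state = "ACTIVE"
        self.etherealize_objects = None  # fixed by the first deactivate

    def _check(self, wait_for_completion):
        # The check runs before any state change, so a rejected call
        # leaves the state untouched.
        if wait_for_completion and in_poa_dispatch():
            raise BadInvOrder()

    def hold_requests(self, wait_for_completion):
        self._check(wait_for_completion)
        self.state = "HOLDING"

    def discard_requests(self, wait_for_completion):
        self._check(wait_for_completion)
        self.state = "DISCARDING"

    def deactivate(self, etherealize_objects, wait_for_completion):
        self._check(wait_for_completion)
        if self.etherealize_objects is None:
            # Later calls with a conflicting value reuse the first one.
            self.etherealize_objects = etherealize_objects
        self.state = "INACTIVE"
```

A caller inside a `with dispatched():` block that passes `True` for wait_for_completion gets `BadInvOrder` and the manager stays in its previous state; the same call with `False`, or from outside any dispatched context, proceeds. Other state-transition rules of the real POAManager are not modeled here.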
---------------------------------------------------------------- Return-Path: Sender: jon@floorboard.com Date: Fri, 10 Jul 1998 20:13:34 -0700 From: Jonathan Biggar To: Jon Goldberg CC: port-rtf@omg.org Subject: Re: Revision of 1408-9, 1428 References: <35A6CB7C.DCD123B@inprise.com> Jon Goldberg wrote: > > Hi Folks- > > We seemed to have reached some consensus in the last conference call > on > the behavior of destroy (1408-9) and wait_for_completion (1428). > The > following proposal is assumed to withdraw any previous proposals and > to invalidate previous votes. Please consider the new proposal and > cast your vote. Even if you have already voted on a previous > proposal, you need to vote again because Dan is wiping the slate > clean. I vote yes on both proposals. > 1. The new proposal is mostly George's original which had > the following characteristics: > a. the atomicity for a destroy() is a single POA. If you > destroy a parent POA, it will *first* destroy its children > and then destroy itself. While the children are being destroyed, > there is no indication that the parent POA is pending destruction. > b. the behavior of a POA pending destruction is as if its > POAManager > is in the holding state. New requests will be queued. > > 2. That proposal is now modified to indicate that the parent POA > destroys its children recursively, and then destroys itself once > there are no more children. This covers the case where a new child > is created (or recreated) during destruction. > > 3. I realized the following new problem while writing this message: > > a. Object A calls (remote) object B which calls back into > Object A. > b. If destroy is called on A's POA before it calls over to > object B, the rules we discussed previously will cause > deadlock since destruction is defined as pending until > all active requests have completed. > > This call pattern is too common to have this deadlock allowed. 
> Therefore, I've modified the proposal such that the POA pending > destruction does *not* wait for all currently outstanding requests > to > complete. Instead, it sets itself as if its POAManager is in the > holding state and then starts etherealization. If any currently > executing requests then recursively call back into that POA, they > will > block until the etherealization is complete. This should be fine > even > in multi-threaded environments, assuming etherealization no longer > explicitly destroys the Servant (which is a language-mapping issue). > If the etherealization happens to cause a call back into that same > POA, there will be a deadlock. The new proposal calls this out as a > warning about complicated processing during etherealization. I think we are going to end up chasing our tails forever if we try to work around every possible deadlock condition. I think we have this tuned about as good as it's going to get. -- Jon Biggar Floorboard Software jon@floorboard.com jon@biggar.org Return-Path: Date: Sun, 12 Jul 1998 22:08:09 -0400 From: Bob Kukura Organization: IONA Technologies To: Jon Goldberg CC: port-rtf@omg.org Subject: Re: Revision of 1408-9, 1428 References: <35A6CB7C.DCD123B@inprise.com> I want to vote YES on these, but can't. I have to vote NO (for Martin) on 140[89] because, unless I am completely misinterpreting Jon's last-minute deadlock workaround, it seems to be allowing a POA to call etherealize() on a Servant while that Servant is still processing a request dispatched by that POA. If I am completely misinterpreting the text, then I still have to vote NO because the proposed text is ambiguous. Assuming I understand the intent, this seems like a fundamental change to the "serialization rules" for incarnation and etherealization, and to the role etherealize() plays in a server application. My initial reaction is that, with this change, etherealization becomes totally meaningless, and might as well be deprecated. 
The etherealize() call was intended to be used by applications as an indication of when a Servant is no longer used by a particular POA - allowing the application to save the Servant's state, free the Servant's storage, or do whatever else the application needed to do when that POA was done with that Servant. But I don't think any of these uses are possible any longer with this change, since the Servant may still be processing requests when etherealize() is called.

We voted YES on the reference counting proposal on the basis that use of reference counting remains optional and the existing programs are not invalidated. But this additional change seems to force whatever work used to be done in etherealize() to be moved into _remove_ref(). Not only would this break existing programs, but the POA would lose significant functionality, since there would then be no way to share a Servant among several POAs and know for certain when a particular POA is done with that Servant.

I have to vote NO on the latest proposal for 1428 because it introduces cross-ORB dependencies that are not necessary. It describes ORB::shutdown()'s wait_for_completion flag as being valid when "the current thread is not in an invocation context dispatched by some ORB". I see no reason why any invocation context of any ORB instance other than the one being shut down should matter. Similarly, the wait_for_completion flag should only be invalid in POA operations when called from invocation contexts dispatched by some POA belonging to the same ORB as the POA on which the operation was invoked.

-Bob

Jon Goldberg wrote:
>
> Hi Folks-
>
> We seemed to have reached some consensus in the last conference call on the behavior of destroy (1408-9) and wait_for_completion (1428). The following proposal is assumed to withdraw any previous proposals and to invalidate previous votes. Please consider the new proposal and cast your vote.
Even if you have already voted on a previous > proposal, you need to vote again because Dan is wiping the slate > clean. > > 1. The new proposal is mostly George's original which had > the following characteristics: > a. the atomicity for a destroy() is a single POA. If you > destroy a parent POA, it will *first* destroy its children > and then destroy itself. While the children are being destroyed, > there is no indication that the parent POA is pending destruction. > b. the behavior of a POA pending destruction is as if its > POAManager > is in the holding state. New requests will be queued. > > 2. That proposal is now modified to indicate that the parent POA > destroys its children recursively, and then destroys itself once > there are no more children. This covers the case where a new child > is created (or recreated) during destruction. > > 3. I realized the following new problem while writing this message: > > a. Object A calls (remote) object B which calls back into > Object A. > b. If destroy is called on A's POA before it calls over to > object B, the rules we discussed previously will cause > deadlock since destruction is defined as pending until > all active requests have completed. > > This call pattern is too common to have this deadlock allowed. > Therefore, I've modified the proposal such that the POA pending > destruction does *not* wait for all currently outstanding requests > to > complete. Instead, it sets itself as if its POAManager is in the > holding state and then starts etherealization. If any currently > executing requests then recursively call back into that POA, they > will > block until the etherealization is complete. This should be fine > even > in multi-threaded environments, assuming etherealization no longer > explicitly destroys the Servant (which is a language-mapping issue). > If the etherealization happens to cause a call back into that same > POA, there will be a deadlock. 
The new proposal calls this out as a > warning about complicated processing during etherealization. > > The proposal for 1428 is very stable at this point and I'm > hoping it will be adopted quickly. I only modified the > phrase "POA-dispatched execution context" to "execution context > dispatched by some POA" for clarity. > > fire away, > Jon > > ---------------------------------------------------------------- > Issue 1408: POA destroy() is ill-defined > Issue 1409: Multiple threads calling destroy() once destroy() > has begun > Nature: Revision > > Summary: POA destroy is not defined sufficiently enough to > prevent multiple activations of the same POA name in > the same process. POA::destroy semantics are not defined > for multiple threads calling destroy. > > Resolution: Accepted for Corba RTF 2.3 > > Revision: The following text will replace the current describing > the destroy operation in Section 9.3.8, Page 9-31. > > This operation destroys the POA and all descendant POAs. All > descendant POAs are destroyed (recursively) before the destruction > of > the containing POA. The POA so destroyed (that is, the POA with its > name) may be re-created later in the same process. (This differs > from > the POAManager::deactivate operation that does not allow a > re-creation > of its associated POA in the same process. After a deactivate, > re-creation is allowed only if the POA is later destroyed.) > > When a POA is destroyed, any requests that have started execution > continue to completion. Any requests that have not started > execution are processed as if the POA's POAManager were in the > holding state, until POA destruction is complete. Once > POA destruction is complete, all queued requests (if > any) will behave as if they were newly arrived, that is, the POA > will attempt to cause recreation of the POA by invoking one or > more adapter activators. POA destruction does not block > until all active requests complete execution, as this can > cause deadlock. 
> > If the etherealize_objects parameter is TRUE, the POA has the RETAIN > policy, and a servant manager is registered with the POA, the > etherealize operation on the servant manager will be called for each > active object in the Active Object Map. Etherealization can occur > while active requests are still executing on those servants. The > POA > behaves as if its POAManager were in the holding state while all > calls > to etherealize are made. Therefore, an etherealize method that > attempts to invoke operations on the POA will deadlock. POA > destruction is considered complete once all active objects > have been etherealized. > > {editorial note - below I have merged text from these two issues and > the relevant text from issue 1428 because they change the same text. > If one of these resolutions fails, we will need to edit the > following > text) > > The wait_for_completion parameter is handled as follows: > > - If wait_for_completion is TRUE and the current > thread is not in an invocation context dispatched from any POA, > the destroy operation will return only after all active requests > have completed and all invocations of etherealize have > completed. > > - If wait_for_completion is TRUE and the current thread is in an > invocation context dispatched from any POA, then the > BAD_INV_ORDER > exception is thrown and POA destruction does not occur. > > - If wait_for_completion is FALSE, the destroy operation > destroys the POA but does not wait for active requests to > complete nor for etherealization to occur. > > If destroy is called multiple times before destruction is > complete (because there are active requests), the > etherealize_objects parameter will only apply to the first > call of destroy. Subsequent calls with conflicting > etherealize_objects settings will use the value of the > etherealize_objects from the first call. 
The wait_for_completion > parameter will be handled as defined above for each individual > call (some callers may choose to block, while others may not). > > ---------------------------------------------------------------- > Issue 1428: Blocking POA Operations > Nature: Revision > > Summary: Several operations added to CORBA as part of the > Portability submission provide blocking behavior which > can result in deadlock in a large number of cases. > These calls include POA::destroy, ORB::shutdown, > POAManager::deactivate, POAManager::hold_requests, > POAManager::discard_requests. > > Resolution: Accepted for Corba 2.3 RTF > > Revision: The following changes are proposed: > > Replace the second sentence in the paragraph of section 4.9.4, > page 4-20, which begins with "If the wait_for_completion ..." > with the following: > > "If the wait_for_completion parameter is TRUE and the current > thread is not in an invocation context dispatched by some ORB, this > operation blocks until all ORB processing (including request > processing and object deactivation or other operations associated > with object adapters) has completed. If the wait_for_completion > parameter is TRUE and the current thread is in an invocation context > dispatched by some ORB, then the BAD_INV_ORDER exception is thrown." > > Replace the phrase "If the parameter is TRUE" in the 2nd > paragraph, 2nd sentence of the hold_requests description in > Section 9.3.2, page 9-18, with the phrase "If the parameter > is TRUE and the current thread is not in an invocation context > dispatched by some POA". Add the sentence "If the parameter > is TRUE and the current thread is in an invocation context > dispatched by some POA then the BAD_INV_ORDER exception is raised > and the state is not changed." 
>
> Replace the phrase "If the parameter is TRUE" in the 2nd paragraph, 2nd sentence of the discard_requests description in Section 9.3.2, page 9-18, with the phrase "If the parameter is TRUE and the current thread is not in an invocation context dispatched by some POA". Add the sentence "If the parameter is TRUE and the current thread is in an invocation context dispatched by some POA then the BAD_INV_ORDER exception is raised and the state is not changed." to the end of the paragraph.
>
> Replace the phrase "If the parameter is TRUE" in the 3rd paragraph, 2nd sentence of the deactivate description in Section 9.3.2, page 9-18, with the phrase "If the parameter is TRUE and the current thread is not in an invocation context dispatched by some POA". Add the sentence "If the parameter is TRUE and the current thread is in an invocation context dispatched by some POA then the BAD_INV_ORDER exception is raised and the state is not changed." to the end of the paragraph.
>
> (editorial note- I changed the text for POA::destroy in the resolution of issues 1408 and 1409 above, so it is not shown here but a similar change to the above is necessary. I also felt that 1409 raised an issue which also affects POAManager::deactivate, so the following change is also proposed)
>
> Add the following paragraph to the end of the description of deactivate in section 9.3.2, page 9-19.
>
> "If deactivate is called multiple times before destruction is complete (because there are active requests), the etherealize_objects parameter will only apply to the first call of deactivate. Subsequent calls with conflicting etherealize_objects settings will use the value of the etherealize_objects from the first call. The wait_for_completion parameter will be handled as defined above for each individual call (some callers may choose to block, while others may not)."
> > ---------------------------------------------------------------- Return-Path: Sender: jon@floorboard.com Date: Sun, 12 Jul 1998 20:01:53 -0700 From: Jonathan Biggar To: Bob Kukura CC: Jon Goldberg , port-rtf@omg.org Subject: Re: Revision of 1408-9, 1428 References: <35A6CB7C.DCD123B@inprise.com> <35A96C09.C54EBC6@iona.com> Bob Kukura wrote: > > I want to vote YES on these, but can't. > > I have to vote NO (for Martin) on 140[89] because, unless I am > completely misinterpreting Jon's last-minute deadlock workaround, it > seems to be allowing a POA to call etherealize() on a Servant while > that > Servant is still processing a request dispatched by that POA. If I > am > completely misinterpreting the text, then I still have to vote NO > because the proposed text is ambiguous. > > Assuming I understand the intent, this seems like a fundamental > change > to the "serialization rules" for incarnation and etherealization, > and to > the role etherealize() plays in a server application. My initial > reaction is that, with this change, etherealization becomes totally > meaningless, and might as well be deprecated. The etherealize() > call > was intended to be used by applications as an indication of when a > Servant is no longer used by a particular POA - allowing the > application > to save the Servant's state, free the Servant's storage, or do > whatever > else the application needed to do when that POA was done with that > Servant. But I don't think any of these uses are possible any > longer > with this change, since the Servant may still be processing requests > when etherealize() is called. > > We voted YES on the reference counting proposal on the basis that > use of > reference counting remains optional and the existing programs are > not > invalidated. But this additional change seems to force whatever > work > used to be done in etherialize() to be moved into _remove_ref(). 
> Not only would this break existing programs, but the POA would lose significant functionality, since there would then be no way to share a Servant among several POAs and know for certain when a particular POA is done with that Servant.

I have to withdraw my yes vote, because Bob's argument is convincing. The POA must wait until all requests on an object are completed before it can etherealize it. If that causes deadlocks for poorly written applications, so be it.

> I have to vote NO on the latest proposal for 1428 because it introduces cross-ORB dependencies that are not necessary. It describes ORB::shutdown()'s wait_for_completion flag as being valid when "the current thread is not in an invocation context dispatched by some ORB". I see no reason why any invocation context of any ORB instance other than the one being shutdown should matter. Similarly, the wait_for_completion flag should only be invalid in POA operations when called from invocation contexts dispatched by some POA belonging to the same ORB as the POA on which the operation was invoked.

But you can get the situation where process 1 calls process 2, which calls back to process 1, which then calls back to process 2 with an operation that calls shutdown on process 2. This will deadlock. Perhaps this is too complicated and should just be warned about, but it isn't just local calls that can cause a deadlock. It is just much easier from a documentation and implementation point of view to simply disallow shutdown(TRUE) inside any POA-dispatched request.
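[Annotation] The rule argued for above (reject wait_for_completion=TRUE from inside any POA-dispatched request, and leave the state unchanged) can be sketched as a small model. This is only an illustration in Python, not ORB code: ToyManager, BadInvOrder, dispatch and in_poa_dispatch are hypothetical names, and the single process-wide thread-local stands in for "dispatched by some POA" in the broad sense Jon Biggar prefers, rather than Bob Kukura's narrower same-ORB reading.

```python
import threading


class BadInvOrder(Exception):
    """Stand-in for the CORBA BAD_INV_ORDER system exception (hypothetical)."""


# Per-thread dispatch depth: > 0 means some POA dispatched the current call.
_context = threading.local()


def in_poa_dispatch():
    return getattr(_context, "depth", 0) > 0


class ToyManager:
    """Hypothetical model of a POAManager; not a real ORB API."""

    def __init__(self):
        self._cond = threading.Condition()
        self._active = 0
        self.state = "ACTIVE"

    def dispatch(self, request):
        """Run a request the way a POA would, marking the thread as dispatched."""
        with self._cond:
            self._active += 1
        _context.depth = getattr(_context, "depth", 0) + 1
        try:
            return request()
        finally:
            _context.depth -= 1
            with self._cond:
                self._active -= 1
                self._cond.notify_all()

    def deactivate(self, wait_for_completion):
        # Proposed rule: refuse to block from inside a dispatched request,
        # and do not change the state in that case.
        if wait_for_completion and in_poa_dispatch():
            raise BadInvOrder("deactivate(TRUE) inside a dispatched request")
        with self._cond:
            self.state = "INACTIVE"
            if wait_for_completion:
                # Block until every dispatched request has returned.
                while self._active > 0:
                    self._cond.wait()


manager = ToyManager()


def servant_method():
    # A servant trying to deactivate its own manager while blocking.
    try:
        manager.deactivate(wait_for_completion=True)
    except BadInvOrder:
        return "rejected"
    return "deactivated"


result_inside = manager.dispatch(servant_method)
state_after_reject = manager.state
manager.deactivate(wait_for_completion=True)  # outside any dispatch: allowed
```

Because the check looks only at the calling thread, it needs no coordination across POAs, which is the implementation simplicity the message appeals to. It does not detect the cross-process callback chain described above; it merely forbids the one call that would block.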
-- Jon Biggar Floorboard Software jon@floorboard.com jon@biggar.org Return-Path: Sender: jon@floorboard.com Date: Sun, 12 Jul 1998 20:19:03 -0700 From: Jonathan Biggar To: Bob Kukura , Jon Goldberg , port-rtf@omg.org Subject: Re: Revision of 1408-9, 1428 References: <35A6CB7C.DCD123B@inprise.com> <35A96C09.C54EBC6@iona.com> <35A978A1.104341A7@floorboard.com> Jonathan Biggar wrote: > I have to withdraw my yes vote, because Bob's argument is > convincing. > The POA must wait until all requests on an object are completed > before > it can etherealize it. If that causes deadlocks for poorly written > applications, so be it. To clarify my vote, if the proposal is changed back to require requests to complete before the etherealization of the object, then I will vote yes. -- Jon Biggar Floorboard Software jon@floorboard.com jon@biggar.org Return-Path: From: "Daniel R. Frantz" To: "'Bob Kukura'" Cc: Subject: RE: Revision of 1408-9, 1428 Date: Mon, 13 Jul 1998 08:19:04 -0400 X-MSMail-Priority: Normal Importance: Normal X-MimeOLE: Produced By Microsoft MimeOLE V4.72.2106.4 X-MIME-Autoconverted: from 8bit to quoted-printable by beasys.com id FAA07096 >-----Original Message----- >From: Bob Kukura [mailto:kukura@iona.com] >Sent: Sunday, July 12, 1998 10:08 PM >To: Jon Goldberg >Cc: port-rtf@omg.org >Subject: Re: Revision of 1408-9, 1428 > > >I want to vote YES on these, but can't. > >I have to vote NO (for Martin) on 140[89] because, unless I am >completely misinterpreting Jon's last-minute deadlock workaround, it >seems to be allowing a POA to call etherealize() on a Servant >while that >Servant is still processing a request dispatched by that POA. ... >The etherealize() call >was intended to be used by applications as an indication of when a >Servant is no longer used by a particular POA - allowing the application >to save the Servant's state, free the Servant's storage, or do whatever >else the application needed to do when that POA was done with that >Servant. Hmmm... 
The current spec already does what you don't like. I think your understanding of etherealize is incorrect. A POA calls etherealize only as the result of taking a single ObjectId from the AOM, not strictly for the purpose of deleting servants. That means that etherealize can indeed save the Servant's state for this particular object. It doesn't necessarily mean that the POA is done with the Servant. The current spec makes a distinction between a Servant executing multiple requests on the same object and a Servant executing requests on other objects, so that may be the source of confusion. (see below) So, if a POA can already call etherealize when a Servant is still processing, the proposals for 1408-9 and 1428 don't change etherealize at all. They are only trying to clean up some problems regarding POA::destroy(). If 1408-9,28 aren't good enough for destroy(), we should wait for the next round, but if they clean up destroy() without changing etherealize, it is useful to put them in now. I think they don't change etherealize and they're good enough. I vote YES. Now, what does etherealize really mean? Under deactivate, p. 9-35 This operation causes the association of the Object Id specified by the oid parameter and its servant to be removed from the Active Object Map. If a servant manager is associated with the POA, ServantLocator::etherealize will be invoked with the oid and the servant. . . Note cator::etherealize may be invoked multiple times with the same servant when the other objects are deactivated. It is the responsibility of the object implementation to refrain from destroying the servant while it is active with any Id. The very signature of etherealize clearly shows the intent to allow etherealization while other request processing is still going on. Etherealize is on pages 9-22/23. 
    void etherealize (
        in ObjectId oid,
        in POA adapter,
        in Servant serv,
        in boolean cleanup_in_progress,
        in boolean remaining_activations );

and then the description of "remaining_activations" says (last paragraph in the section):

    In a multi-threaded environment, the POA makes certain guarantees that allow servant managers to safely destroy servants. Specifically, the servant's entry in the Active Object Map corresponding to the target object is removed before etherealize() is called. Because calls to incarnate() and etherealize() are serialized, this prevents new requests for the target object from being invoked on the servant during etherealization. After removing the entry from the Active Object Map, if the POA determines before invoking etherealize() that other requests for the same target object are already in progress on the servant, it delays the call to etherealize() until all active methods for the target object have completed. Therefore, when etherealize() is called, the servant manager can safely destroy the servant if it wants to, unless the remaining_activations argument is TRUE.

The last two sentences seem pretty clear: the POA will wait till all processing is done for the single object that was deactivated, but won't wait till processing is done on all objects. If it is the case that processing is done on all objects, then remaining_activations is FALSE and etherealize can delete the Servant, but not otherwise.

In the case of POA::destroy(), the POA will deactivate each object, leading to an etherealize call for each object in the AOM. Eventually it will call etherealize when there are no more objects in the AOM for that servant, so that etherealize can indeed delete the servant.
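[Annotation] The reading given above (etherealize is delayed only for the ObjectId being deactivated, and remaining_activations reports whether the servant is still in use under other ids) can be sketched as a toy model. This is a single-threaded Python illustration, not an ORB implementation: ToyPOA, begin_request and end_request are hypothetical names, and treating a servant that is still awaiting etherealization under another id as a "remaining activation" is an assumption of this sketch, not something the quoted spec text settles.

```python
class ToyPOA:
    """Hypothetical single-threaded model of a RETAIN POA with a servant manager."""

    def __init__(self, etherealize):
        self._aom = {}          # Active Object Map: ObjectId -> servant
        self._in_progress = {}  # ObjectId -> number of executing requests
        self._pending = {}      # ids removed from the AOM, awaiting etherealize
        self._etherealize = etherealize

    def activate_object_with_id(self, oid, servant):
        self._aom[oid] = servant

    def begin_request(self, oid):
        self._in_progress[oid] = self._in_progress.get(oid, 0) + 1

    def end_request(self, oid):
        self._in_progress[oid] -= 1
        # The last request on a deactivated id releases the delayed call.
        if self._in_progress[oid] == 0 and oid in self._pending:
            self._finish(oid, self._pending.pop(oid))

    def deactivate_object(self, oid):
        servant = self._aom.pop(oid)  # the entry leaves the AOM first
        if self._in_progress.get(oid, 0) > 0:
            # Requests for this same id are still running: delay etherealize.
            self._pending[oid] = servant
        else:
            self._finish(oid, servant)

    def _finish(self, oid, servant):
        # Assumption: ids still pending count as remaining activations.
        others = list(self._aom.values()) + list(self._pending.values())
        remaining = any(s is servant for s in others)
        self._etherealize(oid, servant, remaining)


calls = []
poa = ToyPOA(lambda oid, servant, remaining: calls.append((oid, remaining)))
shared = object()                 # one servant serving two ObjectIds
poa.activate_object_with_id("a", shared)
poa.activate_object_with_id("b", shared)
poa.begin_request("a")
poa.deactivate_object("a")        # delayed: a request on "a" is executing
poa.deactivate_object("b")        # runs now, servant still in use for "a"
poa.end_request("a")              # releases the delayed etherealize for "a"
```

In this run the servant is etherealized for "b" with remaining_activations TRUE and only later for "a" with FALSE, so a servant manager following the quoted rule would free the servant on the second call only.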
Dan Return-Path: Date: Mon, 13 Jul 1998 09:51:23 -0400 From: Paul H Kyzivat Organization: NobleNet To: Jon Goldberg CC: port-rtf@omg.org Subject: Re: Revision of 1408-9, 1428 References: <35A6CB7C.DCD123B@inprise.com> I agree that calling etherealize with operations outstanding is a bad idea. In addition, the following doesn't seem possible: > - If wait_for_completion is FALSE, the destroy operation > destroys the POA but does not wait for active requests to > complete nor for etherealization to occur. The new proposal has recursive destruction of child POAs happening first, during which time the parent seems to be fully functional. I believe that in effect, the destruction of the child POAs must be done with wait_for_completion TRUE, to guarantee that they are done using the parent POAs for etherealization. For that to work, it means that if destroy is called on a POA with wait_for_completion FALSE, the call may return with the POA still appearing to be functional, undestroyed. However a sequence of events has been initiated that will eventually lead to its being destroyed, once all of its children are destroyed. This only requires a change to the wording in the section quoted above, something like: - If wait_for_completion is FALSE, the destroy operation initiates destruction of the POA but does not wait for active requests to complete nor for etherealization to occur. Upon return from the operation the POA may not yet be destroyed, but will become destroyed after the destruction and etherealization of child POAs is complete. Return-Path: Date: Mon, 13 Jul 1998 11:04:09 -0400 From: Bob Kukura Organization: IONA Technologies To: "Daniel R. Frantz" CC: port-rtf@omg.org Subject: Re: Revision of 1408-9, 1428 References: <008601bdae58$6e26e600$3fc5bdce@idler.beasys.com> Daniel R. 
Frantz wrote:
>
> >-----Original Message-----
> >From: Bob Kukura [mailto:kukura@iona.com]
> >Sent: Sunday, July 12, 1998 10:08 PM
> >To: Jon Goldberg
> >Cc: port-rtf@omg.org
> >Subject: Re: Revision of 1408-9, 1428
> >
> >I want to vote YES on these, but can't.
> >
> >I have to vote NO (for Martin) on 140[89] because, unless I am completely misinterpreting Jon's last-minute deadlock workaround, it seems to be allowing a POA to call etherealize() on a Servant while that Servant is still processing a request dispatched by that POA.
> ...
> >The etherealize() call was intended to be used by applications as an indication of when a Servant is no longer used by a particular POA - allowing the application to save the Servant's state, free the Servant's storage, or do whatever else the application needed to do when that POA was done with that Servant.
>
> Hmmm... The current spec already does what you don't like. I think your understanding of etherealize is incorrect. A POA calls etherealize only as the result of taking a single ObjectId from the AOM, not strictly for the purpose of deleting servants. That means that etherealize can indeed save the Servant's state for this particular object. It doesn't necessarily mean that the POA is done with the Servant.

I apologize for not being clear enough in my message. I do understand that a Servant can be etherealized for one OID while it is still serving requests for other OIDs. My reading of Jon's proposal and his associated discussion was that he wanted to change the rules to allow a Servant to be etherealized for an OID while it is still serving requests for that same OID. It is that change that I object to. Jon, if I am misinterpreting you here, please speak up.

>
> The current spec makes a distinction between a Servant executing multiple requests on the same object and a Servant executing requests on other objects, so that may be the source of confusion.
(see below) > > So, if a POA can already call etherealize when a Servant is still > processing, the proposals for 1408-9 and 1428 don't change > etherealize > at all. They are only trying to clean up some problems regarding > POA::destroy(). If 1408-9,28 aren't good enough for destroy(), we > should > wait for the next round, but if they clean up destroy() without > changing > etherealize, it is useful to put them in now. I think they don't > change > etherealize and they're good enough. I vote YES. > > Now, what does etherealize really mean? Under deactivate, p. 9-35 > > This operation causes the association of the Object Id > specified by the oid parameter and its servant to be > removed from the Active Object Map. If a servant manager > is associated with the POA, ServantLocator::etherealize > will be invoked with the oid and the servant. > . > . > Note or::etherealize > may be invoked multiple times with the same servant when > the other objects are deactivated. It is the > responsibility of the object implementation to refrain > from destroying the servant while it is active with any > Id. > > The very signature of etherealize clearly shows the intent to allow > etherealization while other request processing is still going on. > Etherealize is on pages 9-22/23. > > void etherealize ( > in ObjectId oid, > in POA adapter, > in Servant serv, > in boolean cleanup_in_progress, > in boolean remaining_activations ); > > and then the description of "remaining_activation" says (last > paragraph > in the section). > > In a multi-threaded environment, the POA makes > certain guarantees that allow servant managers > to safely destroy servants. Specifically, the > servant's entry in the Active Object Map > corresponding to the target object is removed > before etherealize() is called. Because calls > to incarnate() and etherealize() are serialized, > this prevents new requests for the target object > from being invoked on the servant during > etherealization. 
After removing the entry from > the Active Object Map, if the POA determines > before invoking etherealize() that other > requests for the same target object are already > in progress on the servant, it delays the call > to etherealize() until all active methods for > the target object have completed. Therefore, > when etherealize() is called, the servant > manager can safely destroy the servant if it > wants to, unless the remaining_activations > argument is TRUE. This text pretty clearly requires that the POA not call etherealize() for a particular Servant/OID combination until all dispatched invocations on that Servant for that OID have completed. If Jon had intended to change this behaviour, he would have had to change this paragraph as well. Either he missed this, or I am completely misinterpreting his proposal. > > The last two sentence seem pretty clear: the POA will wait till all > processing is done for the single object that was deactivated, but > won't > wait till processing is done on all objects. If it is the case the > processing is done on all objects, then remaining_activation is > FALSE > and the etherealize can delete the Servant, but not otherwise. > > In the case of POA::destroy(), the POA will deactivate each object, > leading to an etherealize call for each object in the > AOM. Eventually > will call etherealize when there are no more objects in the AOM for > that > servant, so that etherealize can indeed delete the servant. If this remains the specified behaviour, I don't understand what Jon means by "the POA pending destruction does *not* wait for all currently outstanding requests to complete" in the discussion and by "POA destruction does not block until all active requests complete execution, as this can cause deadlock" in the proposed text. 
If this can be explained to me without invalidating the above "remaining_activations" paragraph, and the proposed text can be clarified if necessary (if it's not just me), then I'd be happy to vote YES.

-Bob

>
> Dan

Return-Path:
Date: Mon, 13 Jul 1998 08:30:47 -0700
From: "Jon Goldberg"
To: Jonathan Biggar
CC: kukura@iona.com, jgoldberg@inprise.com, port-rtf@omg.org
Subject: Re: Revision of 1408-9, 1428
References: <35A6CB7C.DCD123B@inprise.com> <35A96C09.C54EBC6@iona.com>

On 1428, I think we *should* prevent shutdown(TRUE) from being called in any dispatched thread since we should be preventing even that complex deadlock scenario from being possible. (In case it isn't recorded, Jeff M. has voted YES on both of these proposals).

As far as 1408-9, if we change it to only allow etherealization after all active requests have completed, we will allow deadlock in the very simple recursive scenario. I think the RTF is just stuck on this one and I won't bother amending the proposal further since we're out of time.

-Jon G.

Return-Path:
Sender: "George Scott"
Date: Tue, 14 Jul 1998 20:37:00 -0700
From: "George M. Scott"
Organization: Inprise Corporation
To: port-rtf@omg.org
CC: gscott@inprise.com
Subject: urgent POA issues

Unfortunately, I missed all of the exciting debate on the port-rtf list last week, but I did notice a number of the most significant problems (IMHO) did not reach resolution. I think it is fairly important that we reach resolution on these for CORBA 2.3, as they fix a number of potentially serious problems that affect portability of multi-threaded or recursive POA programs. It seemed to me that we were very close to resolution and that if we can get agreement among the members of the POA RTF, we could then try and find a way to get the changes into CORBA 2.3 through some means (It's a shame that July is such a popular vacation month. ;-) ).

The issues of concern are 1408, 1409, 1428, and 1627 (which was mistakenly transferred to the C++ RTF IMHO).
I intend to send out concrete proposals for these issues in separate messages. If necessary I will reopen 1627 as a new issue, since it is not a language mapping issue, but a general POA issue. First let me summarize the problems and then provide some discussion: 1428 - blocking POA operations. This issue relates to potential deadlock situations in various ORB/POA calls which take a wait_for_completion parameter. It was not clear to me why this did not pass since it looked like there was consensus last Friday, and it should have been included in the RTF report. 1408/1409 - problems with destroy(). The behavior of destroy is not defined explicitly enough to guarantee consistency of programs. 1627 - problems with deactivate_object(). The behavior of deactivate_object is not defined explicitly enough to guarantee consistency of programs. Note the solution to this issue is parallel to the issue of 1408/1409. Also, a very important note, this issue has nothing to do with C++ memory management of servants, which is a separate language mapping specific issue and should be addressed in the C++ RTF (IMHO). I don't think we need much further discussion on 1428, so I mostly want to discuss destroy and deactivate_object which are essentially the same problem. The issue is when can an object be safely deactivated and when can a POA be safely destroyed. More importantly, when does the apparent destruction of the POA occur and when does the apparent deactivation of the object occur? Let's consider deactivate_object first because that is an easy case to illustrate the problems in the current spec. Imagine a server which represents millions of database records by encapsulating them in CORBA Objects. A servant activator is installed to incarnate and etherealize the objects as necessary. The CORBA server essentially serves as a cache for the database. 
Objects are periodically deactivated to conserve memory, which results in etherealize being called and their current state being written back to the database. For improved startup time, a number of objects are pre-activated when the server is started by explicitly calling activate_object rather than waiting for the ServantManager to create them.

Now let's look at the problems in the current spec and what happens to database consistency, because deactivate_object semantics are not strong enough. Let's say there is an active object, A, which was explicitly activated by calling activate_object. The servant associated with object A is currently processing several requests for A, and the server decides it needs to persist A to the database because resources are running low. A thread in the server will call deactivate_object passing the id for A as an argument. According to the spec, A will immediately be removed from the active object map, but will not be etherealized until all current requests for A have completed. So we now have requests executing in an object which is not in the active object map and has not been etherealized, meaning its state is not in the database. There are many nasty scenarios which can now occur:

- A new request arrives for A. The object is not in the active object map, so incarnate() is called on the ServantActivator. (Note, this does not violate the rules for serialization of ServantActivators which state that incarnation may not overlap for objects which were incarnated by a ServantActivator because the original object A was activated explicitly using activate_object.) Because A is still in the server and has not been etherealized, a "stale" version of A will be incarnated from the database, which we will call A'. Eventually the requests executing in A will complete and A will be etherealized to the database. When A' is etherealized it will overwrite A. The database is now most likely completely inconsistent.
- Now consider what happens to requests that are executing in A after A has been deactivated. If they call any POA operation which requires the use of the active object map (i.e. the RETAIN policy), they will get the incorrect result. For instance, if they call id_to_servant they may get ObjectNotActive; however, if the A' in the above scenario has already been created, they may get the Servant for A', which will most likely be different from the Servant for A. The results could be disastrous for the application. There are many variations on the above scenarios which can all occur because the apparent deactivation of an object occurred before it should have. With the current model users will have to always be prepared to handle this strange behavior in all user code. This will make development of CORBA components which can be dynamically managed in memory by intelligent application servers very difficult if not impossible. Users will have to modify their code to handle the inconsistent behavior, and such modification will most likely be dependent on the particular server in which a CORBA component is being deployed. We may as well just give the market to Microsoft now, and not waste our time. All dramatics aside, this is a serious problem which needs to be addressed, and very soon in my opinion. Now if it isn't clear already, the POA destroy operation has the exact same problem. If the server were to manage objects by destroying entire POAs instead of individual objects then it is possible to have a POA B and a POA B' in existence at the same time. And not only have a single object inconsistent but an entire set of objects managed by that POA totally inconsistent. I'm sure that will sell a lot of POA implementations.... So here are the requirements as we see them: 1. Etherealize may only be called for an object which has no currently executing requests. 2.
Apparent destruction of a POA or deactivation of an object does not occur until all active requests in that POA or object have completed. 3. After destruction or deactivation has commenced no new requests can begin processing until destruction or deactivation has completed. 4. The system should not deadlock even in the presence of recursive calls across multiple servers. Our proposal satisfies all of the above requirements without changing any POA APIs or drastically changing behavior. In all cases the POA behavior is merely clarified and made more explicit so programs may be written in a portable fashion. Let's look at each in more detail: 1. Etherealization may only occur after all requests have completed. This is what the current spec states and we do not intend to change this behavior. (We take back what we said in our earlier proposal.) 2. Apparent destruction/deactivation occurs after all requests have completed. Today the spec only states that apparent destruction occurs before etherealization is called. Since etherealization is called after all requests have completed, this change is consistent with the current spec, but adds a stronger requirement that apparent destruction must not occur until all active requests have completed. So this change strengthens the semantics of the spec, but does not change or weaken the current semantics. 3. After destruction or deactivation no new requests can begin. This is also consistent with the current spec, though it does result in a creation which is delayed because the apparent destruction (#2 above) may be delayed due to active requests. 4. Deadlocks should not occur. Now, I will be the first to admit that this is nearly impossible to prove in a complex distributed system. But I would like to eliminate the obvious ones. Our previous proposals could deadlock because two objects in two different processes could have a mutual recursion which could result in deadlock.
For example, object A could call object B in another server, and object B then attempts to call object A again. However, right before object B calls object A, another thread in object A's server attempts to deactivate A or destroy A's POA. This will result in deadlock because our previous proposal would act as if the POA's POAManager was in the holding state, which will queue requests and hence block, resulting in deadlock. Our new proposal states that the object will behave as if it is in the discarding state, which means it will not queue the request but instead throw the TRANSIENT exception. What a client ORB or application does with TRANSIENT is not currently defined in any CORBA spec, but the spec does say the request should be reissued. So an ORB may simply repeatedly reissue the request, resulting in a livelock, or it may be intelligent and realize that after five minutes of receiving transients it may as well give up. This is implementation dependent, but the important thing is that the ORB/application is deadlock free, and whether or not it will make progress or livelock is an implementation decision that vendors may choose to make. Before anybody screams about this please remember that whether or not applications deadlock in CORBA today (even with the POA) is dependent on the server's thread model. A single-threaded ORB will always deadlock in the above scenario because it can't handle distributed mutual recursion. A multi-threaded ORB with a fixed-size thread pool will deadlock when it runs out of threads. All we are trying to accomplish is to allow a well-written ORB/POA to not deadlock even in the most extreme cases. We believe this is possible, and the spec should allow for such implementations. Now there are a few improvements we can also make to help this new model out a little bit.
For example, if an object is being deactivated but it has the policy USE_ACTIVE_OBJECT_MAP_ONLY (the default policy) then it is possible to actually behave as if the POA were in the inactive state and immediately throw OBJECT_NOT_EXIST. Similarly, if a POA is destroyed and its parent does not have an adapter activator then it is also possible to immediately return an OBJECT_NOT_EXIST because the POA will not be automatically created after it is destroyed and OBJECT_NOT_EXIST will be the result. This will handle a lot of the common cases and improve performance because the requests will not need to be reissued. As I said previously, I will send out our proposals in separate messages. We would greatly appreciate feedback as soon as possible and would like to build consensus behind a solution to this problem this week, be it our solution or any other proposed solution. Thanks. George Return-Path: Sender: "George Scott" Date: Tue, 14 Jul 1998 20:38:41 -0700 From: "George M. Scott" Organization: Inprise Corporation To: port-rtf@omg.org CC: gscott@inprise.com Subject: Proposal for issue 1428 Fire away.... Issue 1428: Blocking POA Operations Nature: Revision Summary: Several operations added to CORBA as part of the Portability submission provide blocking behavior which can result in deadlock in a large number of cases. These calls include POA::destroy, ORB::shutdown, POAManager::deactivate, POAManager::hold_requests, POAManager::discard_requests. Resolution: Accepted for CORBA 2.3 RTF Revision: The following changes are proposed: Replace the second sentence in the paragraph of section 4.9.4, page 4-20, which begins with "If the wait_for_completion ..." with the following: "If the wait_for_completion parameter is TRUE and the current thread is not in an invocation context dispatched by some ORB, this operation blocks until all ORB processing (including request processing and object deactivation or other operations associated with object adapters) has completed.
If the wait_for_completion parameter is TRUE and the current thread is in an invocation context dispatched by some ORB, then the BAD_INV_ORDER exception is thrown." Replace the phrase "If the parameter is TRUE" in the 2nd paragraph, 2nd sentence of the hold_requests description in Section 9.3.2, page 9-18, with the phrase "If the parameter is TRUE and the current thread is not in an invocation context dispatched by some POA". Add the sentence "If the parameter is TRUE and the current thread is in an invocation context dispatched by some POA then the BAD_INV_ORDER exception is raised and the state is not changed." to the end of the paragraph. Replace the phrase "If the parameter is TRUE" in the 2nd paragraph, 2nd sentence of the discard_requests description in Section 9.3.2, page 9-18, with the phrase "If the parameter is TRUE and the current thread is not in an invocation context dispatched by some POA". Add the sentence "If the parameter is TRUE and the current thread is in an invocation context dispatched by some POA then the BAD_INV_ORDER exception is raised and the state is not changed." to the end of the paragraph. Replace the phrase "If the parameter is TRUE" in the 3rd paragraph, 2nd sentence of the deactivate description in Section 9.3.2, page 9-18, with the phrase "If the parameter is TRUE and the current thread is not in an invocation context dispatched by some POA". Add the sentence "If the parameter is TRUE and the current thread is in an invocation context dispatched by some POA then the BAD_INV_ORDER exception is raised and the state is not changed." to the end of the paragraph. (editorial note: I changed the text for POA::destroy in the resolution of issues 1408 and 1409 above, so it is not shown here, but a similar change to the above is necessary. I also felt that 1409 raised an issue which also affects POAManager::deactivate, so the following change is also proposed.) Add the following paragraph to the end of the description of deactivate in section 9.3.2, page 9-19.
"If deactivate is called multiple times before destruction is complete (because there are active requests), the etherealize_objects parameter will only apply to the first call of deactivate, subsequent calls with conflicting etherealize_objects settings will use the value of the etherealize_objects from the first call. The wait_for_completion parameter will be handled as defined above for each individual call (some callers may choose to block, while others may not)." Return-Path: Sender: jon@floorboard.com Date: Wed, 15 Jul 1998 14:07:39 -0700 From: Jonathan Biggar To: "George M. Scott" CC: port-rtf@omg.org Subject: Re: Proposal for issue 1428 References: <35AC2441.76DBF058@inprise.com> George M. Scott wrote: > > Fire away.... > > Issue 1428: Blocking POA Operations > Nature: Revision > > Summary: Several operations added to CORBA as part of the > Portability submission provide blocking behavior which > can result in deadlock in a large number of cases. > These calls include POA::destroy, ORB::shutdown, > POAManager::deactivate, POAManager::hold_requests, > POAManager::discard_requests. > > Resolution: Accepted for Corba 2.3 RTF > > Revision: The following changes are proposed: [ Text snipped for brevity.] I agree with this proposal as written. -- Jon Biggar Floorboard Software jon@floorboard.com jon@biggar.org Return-Path: Sender: jon@floorboard.com Date: Wed, 15 Jul 1998 14:24:29 -0700 From: Jonathan Biggar To: "George M. Scott" CC: port-rtf@omg.org Subject: Re: urgent POA issues References: <35AC23DC.7B2F1ECF@inprise.com> George M. Scott wrote: > > Unfortunately, I missed all of the exciting debate on the port-rtf > list last week, but I did notice a number of the most significant > problems (IMHO) did not reach resolution. > > I think it is fairly important that we reach resolution on these for > CORBA 2.3, as they fix a number of potentially serious problems that > affect portability of multi-threaded or recursive POA programs. 
It > seemed to me that we were very close to resolution and that if we can > get agreement among the members of the POA RTF, we could then try and > find a way to get the changes into CORBA 2.3 through some means (It's > a shame that July is such a popular vacation month. ;-) ). > > The issues of concern are 1408, 1409, 1428, and 1627 (which was > mistakenly transferred to the C++ RTF IMHO). I intend to send out > concrete proposals for these issues in separate messages. If > necessary I will reopen 1627 as a new issue, since it is not a > language mapping issue, but a general POA issue. > > First let me summarize the problems and then provide some discussion: > > 1428 - blocking POA operations. This issue relates to potential > deadlock situations in various ORB/POA calls which take a > wait_for_completion parameter. It was not clear to me why this did > not pass since it looked like there was consensus last Friday, and it > should have been included in the RTF report. > > 1408/1409 - problems with destroy(). The behavior of destroy is not > defined explicitly enough to guarantee consistency of programs. There was no consensus on 1428, 1408/1409 due to the requirement that the POA act as if it were in the discarding state. Many RTF members want to find a solution that is more transparent to clients. > - A new request arrives for A. The object is not in the active object map, > so incarnate() is called on the ServantActivator. (Note, this does not > violate the rules for serialization of ServantActivators which state that > incarnation may not overlap for objects which were incarnated by a > ServantActivator because the original object A was activated explicitly > using activate_object.) Because A is still in the server and has not > been etherealized, a "stale" version of A will be incarnated from the > database, which we will call A'. Eventually the requests executing in > A will complete and A will be etherealized to the database.
When A' > is etherealized it will overwrite A. The database is now most > likely completely inconsistent. It would be better to explicitly modify the text to state that, when the POA uses a ServantActivator, explicit calls to activate_object() and deactivate_object() are also serialized in the same way as incarnate() and etherealize() are. > - Now consider what happens to requests that are executing in A after > A has been deactivated. If they call any POA operation which requires > the use of the active object map (i.e. the RETAIN policy), they will > get the incorrect result. For instance, if they call id_to_servant > they may get ObjectNotActive, however if the A' in the above scenario > has already been created they may get the Servant for A' which will > most likely be different than the Servant for A. The results could > be disastrous for the application. No A' should exist due to my above comment. It is questionable whether this is a practical scenario. Since the request is operating in the context of the servant already, why should the operation implementation need to call id_to_servant()? > There are many variations on the above scenarios which can all occur > because the apparent deactivation of an object occurred before it > should have. With the current model users will have to always be > prepared to handle this strange behavior in all user code. This will > make development of CORBA components which can be dynamically managed > in memory by intelligent application servers very difficult if not > impossible. Users will have to modify their code to handle the > inconsistent behavior and such modification will most likely be > dependent on the particular server in which a CORBA component is being > deployed. We may as well just give the market to Microsoft now, and > not waste our time. All dramatics aside, this is a serious problem which > needs to be addressed and very soon in my opinion. A bit heavy on the rhetoric, don't you think?
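As an illustration of the behavior being debated in this thread, here is a minimal single-threaded sketch in Python of an active object map that delays apparent deactivation until outstanding requests drain, and refuses new requests in the meantime as if discarding. It is a toy model under stated assumptions, not the POA API: ToyObjectMap, its method names, and the Transient class are all hypothetical stand-ins, and no locking is shown.

```python
class Transient(Exception):
    """Hypothetical stand-in for the CORBA TRANSIENT system exception."""


class ToyObjectMap:
    """Toy, single-threaded model of an active object map (not a real POA)."""

    def __init__(self, etherealize):
        self._etherealize = etherealize  # callback taking (oid, servant)
        self._entries = {}               # oid -> per-object bookkeeping

    def activate_object(self, oid, servant):
        if oid in self._entries:
            raise KeyError(f"{oid!r} is already active")
        self._entries[oid] = {"servant": servant, "active": 0, "deactivating": False}

    def begin_request(self, oid):
        entry = self._entries[oid]
        if entry["deactivating"]:
            # Behave as if discarding: refuse rather than queue the request.
            raise Transient(oid)
        entry["active"] += 1
        return entry["servant"]

    def end_request(self, oid):
        self._entries[oid]["active"] -= 1
        self._finish_if_idle(oid)

    def deactivate_object(self, oid):
        # The entry stays visible until the last in-flight request completes.
        self._entries[oid]["deactivating"] = True
        self._finish_if_idle(oid)

    def id_to_servant(self, oid):
        return self._entries[oid]["servant"]

    def _finish_if_idle(self, oid):
        entry = self._entries[oid]
        if entry["deactivating"] and entry["active"] == 0:
            self._etherealize(oid, entry["servant"])
            del self._entries[oid]
```

In this model a request already executing in A still sees A through id_to_servant after deactivate_object has been called, a new request for A is refused with Transient, and etherealize runs exactly once, when the last request ends. No second incarnation A' can appear while A is draining, because the entry for A is still occupied.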
> Let's look at each in more detail: > > 1. Etherealization may only occur after all requests have completed. This is > what the current spec states and we do not intend to change this behavior. > (We take back what we said in our earlier proposal) Fine. > 2. Apparent destruction/deactivation occurs after all requests have completed. > Today the spec only states that apparent destruction occurs before > etherealization is called. Since etherealization is called after all > requests have completed, this change is consistent with the current > spec, but adds a stronger requirement that apparent destruction must not > occur until all active requests have completed. So this change strengthens > the semantics of the spec, but does not change or weaken the current > semantics. Right. > 3. After destruction or deactivation no new requests can begin. This is > also consistent with the current spec, though it does result in a creation > which is delayed because the apparent destruction (#2 above) may be > delayed due to active requests. Right. > 4. Deadlocks should not occur. Now, I will be the first to admit that this > is nearly impossible to prove in a complex distributed system. But I would > like to eliminate the obvious ones. Our previous proposals could deadlock > because two objects in two different processes could have a mutual > recursion which could result in deadlock. For example object A could call > object B in another server, object B then attempts to call object A again. > However, right before object B calls object A another thread in object > A's server attempts to deactivate A or destroy A's POA. This will result > in deadlock because our previous proposal would act as if the POA's > POAManager was in the holding state which will queue requests and hence > block, resulting in deadlock.
Our new proposal states that the object > will behave as if it is in the discarding state which means it will not > queue the request but instead throw the TRANSIENT exception. > > What a client ORB or application does with TRANSIENT is not currently > defined in any CORBA spec, but it does say the request should be reissued. > So an ORB may simply repeatedly reissue the request resulting in a > livelock, or it may be intelligent and realize that after five minutes > of receiving transients it may as well give up. True, it is not explicitly stated, but the weight of the evidence suggests that an ORB is supposed to make the TRANSIENT exception visible to client code. A more general question is whether this tightly coupled design (A calls B which calls A) is desirable in the first place. I think most designers would see this as a trouble spot right away, and redesign the system to use an event channel or oneway call to resolve the deadlock. > This is implementation dependent but the important thing is that the > ORB/application is deadlock free and whether or not it will make progress > or livelock is an implementation decision that vendors may choose to make. > Before anybody screams about this please remember that whether or not > applications deadlock in CORBA today (even with the POA) is dependent on > the server's thread model. A single threaded ORB will always deadlock in > the above scenario because it can't handle distributed mutual recursion. This is not necessarily true for single threaded servers. Some can handle recursive dispatch of requests while blocked waiting for a remote invocation. > A multi-threaded ORB with a fixed size thread pool will deadlock when it > runs out of threads. All we are trying to accomplish is to allow a well > written ORB/POA to not deadlock even in the most extreme cases. We > believe this is possible, and the spec should allow for such implementations. A laudable goal.
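The choice between reissuing forever (livelock) and eventually giving up, discussed in the quoted text above, can be sketched as a bounded retry wrapper. This is only a Python sketch under stated assumptions: Transient stands in for the CORBA TRANSIENT exception, and invoke_with_retry, max_attempts, and delay are hypothetical names that belong to no ORB. Whether such a loop lives in the ORB or in client code is exactly the point in dispute here; the sketch takes no side.

```python
import time


class Transient(Exception):
    """Hypothetical stand-in for the CORBA TRANSIENT system exception."""


def invoke_with_retry(request, max_attempts=5, delay=0.0):
    """Call request(); reissue it on Transient, at most max_attempts times in total."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return request()
        except Transient:
            if attempt == max_attempts:
                # Give up and surface TRANSIENT instead of looping forever.
                raise
            time.sleep(delay)


def make_flaky(failures):
    """Build a request that raises Transient `failures` times, then returns 'ok'."""
    remaining = [failures]

    def request():
        if remaining[0] > 0:
            remaining[0] -= 1
            raise Transient()
        return "ok"

    return request
```

With these definitions, invoke_with_retry(make_flaky(2)) succeeds on the third attempt, while invoke_with_retry(make_flaky(10), max_attempts=3) re-raises Transient after the third failure, so the caller makes progress or fails visibly rather than livelocking.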
> Now there are a few improvements we can also make to help this new model > out a little bit. For example, if an object is being deactivated but > it has the policy USE_ACTIVE_OBJECT_MAP_ONLY (the default policy) then > it is possible to actually behave as if the POA were in the inactive > state and immediately throw OBJECT_NOT_EXIST. True, this would be a useful deadlock avoidance technique. > Similarly if a POA is destroyed and its parent does not have an > adapter activator then it is also possible to immediately return an > OBJECT_NOT_EXIST because the POA will not be automatically created after > it is destroyed and OBJECT_NOT_EXIST will be the result. Also valid. > This will handle a lot of the common cases and improve performance because > the requests will not need to be reissued. Again, the weight of the evidence does not suggest that an ORB implementation should be free to intercept TRANSIENT exceptions and reissue the requests transparently for the client. -- Jon Biggar Floorboard Software jon@floorboard.com jon@biggar.org Return-Path: Date: Wed, 15 Jul 1998 18:54:35 -0400 From: Paul H Kyzivat Organization: NobleNet To: "George M. Scott" CC: port-rtf@omg.org Subject: Re: urgent POA issues References: <35AC23DC.7B2F1ECF@inprise.com> George M. Scott wrote: > Imagine a server which represents millions of database records by > encapsulating them in CORBA Objects. A servant activator is installed > to incarnate and etherealize the objects as necessary. The CORBA > server essentially serves as a cache for the database. Objects are > periodically deactivated to conserve memory which results in > etherealize being called and their current state being written back to > the database. For improved startup time, a number of objects are > pre-activated when the server is started by explicitly calling > activate_object rather than wait for the ServantManager to create them.
I presume you are assuming a UNIQUE_ID policy so that a given servant is associated with at most one object. > > Now let's look at the problems in the current spec and what happens to > database consistency, because deactivate_object semantics are not > strong enough. Let's say there is an active object, A, which was > explicitly activated by calling activate_object. The servant > associated with object A is currently processing several requests for > A, and the server decides it needs to persist A to the database > because resources are running low. A thread in the server will call > deactivate_object passing the id for A as an argument. According to > the spec, A will immediately be removed from the active object map, > but will not be etherealized until all current requests for A have > completed. So we now have requests executing in an object which is > not in the active object map and has not been etherealized meaning, > its state is not in the database. There are many nasty scenarios > which can now occur: > > - A new request arrives for A. The object is not in the active object map, > so incarnate() is called on the ServantActivator. (Note, this does not > violate the rules for serialization of ServantActivators which state that > incarnation may not overlap for objects which were incarnated by a > ServantActivator because the original object A was activated explicitly > using activate_object.) Because A is still in the server and has not > been etherealized, a "stale" version of A will be incarnated from the > database, which we will call A'. Eventually the requests executing in > A will complete and A will be etherealized to the database. When A' > is etherealized it will overwrite A. The database is now most > likely completely inconsistent. OK, I think I see your problem.
The spec implied to me (although it isn't explicit) that the serialization of incarnate and etherealize must extend to stalling a subsequent incarnation while an etherealization is pending for the same objectId. This is different than requiring the object to remain active until the etherealization begins - tests for the presence of the object can still fail (because it has been deactivated) and an explicit activation is OK. > > - Now consider what happens to requests that are executing in A after > A has been deactivated. If they call any POA operation which requires > the use of the active object map (i.e. the RETAIN policy), they will > get the incorrect result. For instance, if they call id_to_servant > they may get ObjectNotActive, however if the A' in the above scenario > has already been created they may get the Servant for A' which will > most likely be different than the Servant for A. The results could > be disastrous for the application. Given the interpretation I gave above, there would be no A' while there are requests executing in A. It indeed might be the case that id_to_servant would fail. This has nothing to do with the timing of etherealization - it only has to do with explicit multithreading code written by the developer. Why should it succeed? It is unwise to be writing code that depends on activations in a POA while at the same time writing concurrent code that removes them. And this isn't a hard thing to avoid in this case since the servant ought to already know who it is, or it can find out from POACurrent. > > There are many variations on the above scenarios which can all occur > because the apparent deactivation of an object occurred before it > should have. With the current model users will have to always be > prepared to handle this strange behavior in all user code.
This will > make development of CORBA components which can be dynamically managed > in memory by intelligent application servers very difficult if not > impossible. There are (at least) three things going on here: - removal of the oid:servant association from the map - etherealization of the oid:servant association - deletion of the servant No matter what we do, all of these need to be dealt with, and these may all occur at (more-or-less) the same time, or at widely spaced times, or not at all, in various combinations. The existing policies make some combinations easier to use than others. You are not going to get agreement about when these things "should" occur because there is no one answer. You seem to think that one particular combination of interest to you is not currently supported (or at least isn't easily used), and are also proposing that another combination that is currently allowed is bad or useless and should be replaced by the one you want.