Issue 3675: Heuristic exception (ots-rtf)

Source: Hewlett-Packard (Dr. Malik Saheb)
Nature: Uncategorized Issue
Severity:
Summary: 1) The OTS specification indicates that the Resource::prepare operation can raise a heuristic exception. This can appear when the Resource acts as a sub-coordinator and at least one of its resources takes a heuristic decision. However, I didn't find clear text explaining the behavior or "protocol" of a coordinator that receives such an exception. What should this coordinator decide regarding the other resources that have replied with VoteCommit?
Resolution: Roll back the transaction (see 3600 above)
Revised Text:
Actions taken:
June 8, 2000: received issue
July 1, 2003: closed issue
Discussion:

End of Annotations:=====

From: "Peter Furniss"
To: "OTS-RTF"
Subject: ots - resource & recoverycoordinator no longer there
Date: Thu, 1 Jun 2000 15:00:42 +0100

Apologies if this is already on the issues list - the OMG site seems to be inaccessible to me today (again).

I had originally thought the recovery mechanisms were unambiguous, but then met others who read them equally unambiguously, but not quite the same.

Background: Failures during the commit exchanges can lead to recovery attempts trying to reach either Resources or RecoveryCoordinators that no longer exist (depending on exactly when the failure occurred). They can also lead to attempts to access one or the other when the object instance isn't available. If a target Resource really does not exist, the coordinator can infer that an earlier Commit got through and the response was lost, so the coordinator can stop trying (and forget its own logs).
If a RecoveryCoordinator does not exist, the Resource can infer that the transaction rolled back. This behaviour seems to be summarised in the Failures and Recovery section. The section "If No Heuristic Decision is Made", describing Resource behaviour, explicitly states that if the Resource gets OBJECT_NOT_EXIST in response to replay_completion, it will know the transaction rolled back, whereas COMM_FAILURE means it must try again.

Questions:

1) There is no corresponding statement for a Coordinator (or recovered coordinator - strictly a client of Resource) getting exceptions on attempting to access the Resource. Should there be?

2) Is it only OBJECT_NOT_EXIST that will definitively mean the object does not now and never will again exist? What about INV_OBJREF? Can all ORBs be trusted not to throw these exceptions if the object might still be recreated by some recovering server?

3) replay_completion supplies a Resource parameter, which (since there is, implicitly, a separate RecoveryCoordinator for each Resource) can be a replacement for the original Resource reference. Should this be explained more fully?

4) If the (original) commit did get through, is the Resource perhaps expected to remain available for some time (how long?), rather than become non-existent?
(The protocol would work if any request targeted on an extinct Resource caused the temporary creation of an instance that just replied to the commit, rather than the coordinator treating OBJECT_NOT_EXIST as "gone away".)

Peter Furniss
--------------------------------------
Associate
Open-IT Limited
58 Alexandra Crescent, Bromley, Kent BR1 4EX, UK
Phone : +44 (0) 20 7729 9012
Email : P.Furniss@mailbox.ulcc.ac.uk

From: "Mark Little"
To: "Peter Furniss", "OTS-RTF"
Subject: Re: ots - resource & recoverycoordinator no longer there
Date: Thu, 1 Jun 2000 15:38:17 +0100

----- Original Message -----

> 2) Is it only OBJECT_NOT_EXIST that will definitively mean the object does
> not now and never will again exist? What about INV_OBJREF? Can all ORBs
> be trusted not to throw these exceptions if the object might still be
> recreated by some recovering server?

I could name a certain ORB that does not even support OBJECT_NOT_EXIST, and uses INV_OBJREF instead.

> 4) If the (original) commit did get through, is the Resource perhaps
> expected to remain available for some time (how long?), rather than become
> non-existent? (The protocol would work if any request targeted on an
> extinct Resource caused the temporary creation of an instance that just
> replied to the commit, rather than the coordinator treating OBJECT_NOT_EXIST
> as "gone away".)

But this then leads to the question: how long does the "temporary" remain around? Since failures don't happen that often, it seems wrong to modify the protocol such that extra work is always being done (in the form of creating these temporaries) even if a failure doesn't happen. Granted, you don't know when a failure will happen.
Creating "dummy" CORBA objects isn't a no-op in terms of performance either.

Why not simply rely upon the ORB and OBJECT_NOT_EXIST (or INV_OBJREF if you're on another ORB)? If the application is correctly implemented, the ORB should eventually return such an exception, and the (recovering) coordinator can complete the transaction. That way the same exception may be raised (and therefore checked for) both in the event of a "failure" during normal commit processing and during recovery of a transaction coordinator. (A coordinator could crash just after issuing commit and upon recovery would have to re-issue commit: this "temporary" would potentially have to stay around for a long time.)

As for updating the text, I would agree.

Cheers,

Mark.

-----------------------------------------------------------------------
SENDER : Dr. Mark Little, Arjuna Project, Distributed Systems Research.
PHONE : +44 191 222 8066, FAX : +44 191 222 8232
POST : Department of Computing Science, University of Newcastle upon Tyne, UK, NE1 7RU
EMAIL : M.C.Little@newcastle.ac.uk
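
Editorial illustration (not part of the original messages): the inference rules discussed in this thread can be sketched as two small decision functions, one for a recovering coordinator re-issuing commit and one for a Resource calling replay_completion. This is a minimal sketch under stated assumptions, not an OTS implementation: the exception classes are local stand-ins for the CORBA system exceptions, `Outcome`, `recommit_on_recovery`, `replay_on_recovery` and the `inv_objref_means_gone` flag are hypothetical names, and whether INV_OBJREF may be treated as "gone for good" is exactly the point the thread leaves open.

```python
from enum import Enum


# Local stand-ins for CORBA system exceptions (assumption: a real ORB
# binding would supply its own classes for these).
class ObjectNotExist(Exception):
    pass


class InvObjref(Exception):
    pass


class CommFailure(Exception):
    pass


class Outcome(Enum):
    DONE = "done"                      # coordinator may stop and forget its log
    RETRY = "retry"                    # transient failure, try again later
    ROLLED_BACK = "rolled_back"        # Resource infers the transaction rolled back
    AWAIT_DECISION = "await_decision"  # coordinator exists; wait for its outcome


def recommit_on_recovery(resource, inv_objref_means_gone=False):
    """Coordinator side: re-issue commit to a Resource after recovery."""
    try:
        resource.commit()
        return Outcome.DONE
    except ObjectNotExist:
        # Resource is gone: infer that an earlier commit got through
        # and only the response was lost.
        return Outcome.DONE
    except InvObjref:
        # Hypothetical policy switch: some ORBs are said to raise
        # INV_OBJREF where others raise OBJECT_NOT_EXIST.
        return Outcome.DONE if inv_objref_means_gone else Outcome.RETRY
    except CommFailure:
        return Outcome.RETRY


def replay_on_recovery(recovery_coordinator, resource):
    """Resource side: ask the RecoveryCoordinator to replay completion."""
    try:
        recovery_coordinator.replay_completion(resource)
        return Outcome.AWAIT_DECISION
    except ObjectNotExist:
        # No RecoveryCoordinator: infer that the transaction rolled back.
        return Outcome.ROLLED_BACK
    except CommFailure:
        return Outcome.RETRY
```

The two functions are deliberately asymmetric: a missing Resource is read as "commit already happened", while a missing RecoveryCoordinator is read as "rolled back". COMM_FAILURE is inconclusive on both sides, so the caller retries.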