Issue 3789: "RequiredTraits" exception issue (pids-rtf2)
Source: Care Data Systems (Mr. Jon Farmer, jfarmer18@nc.rr.com)
Nature: Uncategorized Issue
Severity:
Summary: The "RequiredTraits" exception returns a sequence of traits, but make_ids_permanent takes an ARRAY of input IDs. So this exception, if thrown, won't provide any useful information as to which ID had the problem.

Our implementation, rather than throwing the exception, just creates the ID in a temporary state. However, as we recently considered supporting the exception in conjunction with a probabilistic matching algorithm, we found the following ambiguity:

The RequiredTraits exception includes a list of required traits, but the spec doesn't clarify that this is the complete list of required traits - assuming that it's the same set for any input profile - so that the client can figure out which IDs need more traits.

The problem with that semantic is as follows. Some probabilistic matching algorithms can only determine the adequacy of the incoming trait set by inspecting the VALUES of the trait types supplied. For example, if the incoming request tries to register two IDs, one with First=John, Last=Smith and the second with First=Mergetroid, Last=Pepperdyne, a probabilistic algorithm will likely accept the second but reject the first; YET BOTH supplied the same traits. How could the client figure out what's missing? How could the server even figure out what's missing? Neither can. ONLY THE ALGORITHM KNOWS... AND IT ONLY KNOWS ONCE IT HAS SEEN THE INPUTS.

Therefore we need to define clearly what the exception semantics are, and in a way that doesn't preclude the use of probabilistic matchers that cannot assess adequacy in a "one trait list fits all" manner.

One possible solution is to have the exception return a SEQUENCE OF offending IDs, so the client can know which ones are offending. Instead of also including for each the set of traits needed, the return structure could show the confidence deficit for each.
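To make the ambiguity in the summary concrete, here is a minimal sketch of a value-dependent adequacy check. Everything in it is an assumption made for illustration: the frequency table, the scoring rule and the threshold are invented and are not taken from the PIDS spec or from any implementation.

```python
# Toy illustration only: the frequencies and the threshold below are made up.
# Hypothetical share of the population carrying each trait value.
ASSUMED_FREQUENCY = {
    ("First", "John"): 0.03,
    ("Last", "Smith"): 0.01,
    ("First", "Mergetroid"): 0.000001,
    ("Last", "Pepperdyne"): 0.00001,
}

# Hypothetical cut-off: a profile is adequate only if fewer than one person
# in a million is expected to share all of its trait values.
MAX_EXPECTED_SHARE = 1e-6


def expected_share(profile):
    """Multiply per-value frequencies, treating the traits as independent."""
    share = 1.0
    for trait_name, value in profile.items():
        share *= ASSUMED_FREQUENCY[(trait_name, value)]
    return share


def is_adequate(profile):
    """Adequacy depends on the trait VALUES, not on which traits are present."""
    return expected_share(profile) < MAX_EXPECTED_SHARE


common = {"First": "John", "Last": "Smith"}
rare = {"First": "Mergetroid", "Last": "Pepperdyne"}

# Both profiles supply exactly the same trait names...
same_trait_names = set(common) == set(rare)
# ...yet only one of them clears the (assumed) threshold.
verdicts = {"common": is_adequate(common), "rare": is_adequate(rare)}
```

In this sketch a list of "required trait names" cannot explain the rejection, because the rejected and accepted profiles carry identical trait names.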
Resolution: see below
Revised Text: The RequiredTraits exception will be modified to return a MultipleFailureSeq structure, which supplies the following return info for each problematic input ID:

the_index        identifies the entry that this structure responds to
ExceptionReason  indicates the specific reason for failure of this entry
TraitNameSeq     if the input failed due to lack of some required traits, this structure names the specific traits that were missing for this entry

This solution would enable the client to efficiently submit a batch of temporary IDs to the operation, get the exception if appropriate, easily segregate the inputs and then resubmit only the "keepers". While this solution completely and robustly resolves the issue, it does require that we modify the IDL for the RequiredTraits exception.

(Rejected) IDL-Saving Alternative

The following narrative, if added to the spec, would render the existing IDL usable, but would still prevent the use of the operation with multiple input IDs. In other words, the spec would remain broken, but usable.

"If the server determines that the profile for any of the IDs in question is inadequate for permanent status, then it shall raise the RequiredTraits exception. Furthermore:
1. If the server defines a set of required (mandatory) traits, then the exception shall return the sequence of trait names for the required traits.
2. In the case where the server either does not define a set of required traits, or does define a set of required traits but has also disqualified one or more IDs based on other rules, it shall return an empty sequence of trait names."

This solution leaves the client unable to discover which of the inputs failed for what reason. For example, even if there were multiple input IDs specified, if the exception were thrown with the return structure empty, the client would not be able to determine which ones were bad except by submitting each input in its own separate invocation of the operation.
This could be prohibitively difficult, to the point where the operation is never used except with only one input entry per invocation.

Disposition: Resolved
Revised Text:
1. In the description of the RequiredTraits exception, insert the following text:
"Note that since a "trait" includes both a trait name and trait value, the RequiredTraits exception denotes the fact that the operation's inputs may have been lacking not only in "which traits had values", but in the values themselves. In both cases, the RequiredTraits exception would be thrown. However, in the former case the reason would be "RequiredTraits" and in the latter case the reason would be "InsufficientConfidence". These specific reasons for failure are represented in the second element of the MultipleFailureSeq structure (show the table here):
2. Add InsufficientConfidence to the enum of failure reasons, and add supportive text explaining:
"The specification of InsufficientConfidence as a failure reason in a make_ids_permanent operation indicates that the underlying logic has rejected the profile's confidence by virtue of trait values."

Actions taken:
August 10, 2000: received issue
February 27, 2001: closed issue

Discussion:

End of Annotations:=====
Reply-To: "Jon Farmer"
From: "Jon Farmer"
To:
Subject: issue for PIDS RTF3
Date: Thu, 10 Aug 2000 13:57:58 -0400
Organization: Care Data Systems

Issue: The "RequiredTraits" exception returns a sequence of traits, but make_ids_permanent takes an ARRAY of input IDs. So this exception, if thrown, won't provide any useful information as to which ID had the problem.

Our implementation, rather than throwing the exception, just creates the ID in a temporary state.
However, as we recently considered supporting the exception in conjunction with a probabilistic matching algorithm, we found the following ambiguity:

The RequiredTraits exception includes a list of required traits, but the spec doesn't clarify that this is the complete list of required traits - assuming that it's the same set for any input profile - so that the client can figure out which IDs need more traits.

The problem with that semantic is as follows. Some probabilistic matching algorithms can only determine the adequacy of the incoming trait set by inspecting the VALUES of the trait types supplied. For example, if the incoming request tries to register two IDs, one with First=John, Last=Smith and the second with First=Mergetroid, Last=Pepperdyne, a probabilistic algorithm will likely accept the second but reject the first; YET BOTH supplied the same traits. How could the client figure out what's missing? How could the server even figure out what's missing? Neither can. ONLY THE ALGORITHM KNOWS... AND IT ONLY KNOWS ONCE IT HAS SEEN THE INPUTS.

Therefore we need to define clearly what the exception semantics are, and in a way that doesn't preclude the use of probabilistic matchers that cannot assess adequacy in a "one trait list fits all" manner.

One possible solution is to have the exception return a SEQUENCE OF offending IDs, so the client can know which ones are offending. Instead of also including for each the set of traits needed, the return structure could show the confidence deficit for each.

Reply-To: "Jon Farmer"
From: "Jon Farmer"
To: , "Juergen Boldt"
Cc: "Tim Brinson"
References: <4.2.0.58.20000824172128.00c6ceb0@emerald.omg.org>
Subject: Re: issue 3789 -- PIDS RTF 3 issue
Date: Mon, 30 Oct 2000 12:08:56 -0500
Organization: Care Data Systems

I can't see anything in the Adopted or Available spec that describes how to interpret the trait sequence that comes back in this exception. Does anybody recall what the semantics are on that sequence? I can imagine that it simply contains the set of traits that are required, but can anybody confirm this? If that's the case, then we can resolve it simply with explanatory text.

Jon

----- Original Message -----
From: Juergen Boldt
To: ;
Sent: Thursday, August 24, 2000 4:22 PM
Subject: issue 3789 -- PIDS RTF 3 issue

> This is issue # 3789 (Jon Farmer)
>
> "RequiredTraits" exception issue
>
> The "RequiredTraits" exception returns a sequence of traits, but make_ids_permanent takes an ARRAY of input IDs. So this exception, if thrown, won't provide any useful information as to which ID had the problem.
>
> Our implementation, rather than throwing the exception, just creates the ID in a temporary state. However, as we recently considered supporting the exception in conjunction with a probabilistic matching algorithm, we found the following ambiguity:
>
> The RequiredTraits exception includes a list of required traits, but the spec doesn't clarify that this is the complete list of required traits - assuming that it's the same set for any input profile - so that the client can figure out which IDs need more traits.
>
> The problem with that semantic is as follows.
> Some probabilistic matching algorithms can only determine the adequacy of the incoming trait set by inspecting the VALUES of the trait types supplied. For example, if the incoming request tries to register two IDs, one with First=John, Last=Smith and the second with First=Mergetroid, Last=Pepperdyne, a probabilistic algorithm will likely accept the second but reject the first; YET BOTH supplied the same traits. How could the client figure out what's missing? How could the server even figure out what's missing? Neither can. ONLY THE ALGORITHM KNOWS... AND IT ONLY KNOWS ONCE IT HAS SEEN THE INPUTS.
>
> Therefore we need to define clearly what the exception semantics are, and in a way that doesn't preclude the use of probabilistic matchers that cannot assess adequacy in a "one trait list fits all" manner.
>
> One possible solution is to have the exception return a SEQUENCE OF offending IDs, so the client can know which ones are offending. Instead of also including for each the set of traits needed, the return structure could show the confidence deficit for each.
>
> ================================================================
>
> Juergen Boldt
> Senior Member of Technical Staff
>
> Object Management Group          Tel. +1-781 444 0404 ext. 132
> 250 First Avenue, Suite 201      Fax: +1-781 444 0320
> Needham, MA 02494, USA           Email: juergen@omg.org
>
> ================================================================

Reply-To: "Jon Farmer"
From: "Jon Farmer"
To: , "David Forslund"
References: <4.2.0.58.20000928133426.00c71780@emerald.omg.org> <5.0.0.25.2.20001030092221.02906e80@cic-mail.lanl.gov>
Subject: Re: issue 3859 -- PIDS RTF issue
Date: Mon, 30 Oct 2000 11:59:02 -0500
Organization: Care Data Systems

I like yours best. I will put it into the draft report and disseminate it.

Jon

----- Original Message -----
From: David Forslund
To: Jon Farmer ;
Sent: Monday, October 30, 2000 11:25 AM
Subject: Re: issue 3859 -- PIDS RTF issue

> What we have done, which was recommended to us by others as a "generic" solution to illegal IDL of this type, is to add the PersonIdService:: string to the beginning of each of the offending types as indicated below. This is simple, elegant and not arbitrary.
>
> struct TraitSelector {
>     PersonIdService::Trait trait;
>     float weight;
> };
> typedef sequence<TraitSelector> TraitSelectorSeq;
>
> struct Candidate {
>     PersonId id;
>     float confidence;
>     PersonIdService::Profile profile;
> };
> ..
> struct TaggedProfile {
>     PersonId id;
>     PersonIdService::Profile profile;
> };
> typedef sequence<TaggedProfile> TaggedProfileSeq;
>
> struct QualifiedTaggedProfile {
>     QualifiedPersonId id;
>     PersonIdService::Profile profile;
> };
>
> At 10:05 AM 10/30/2000 -0500, Jon Farmer wrote:
> > Hello folks,
> > the PIDS RTF3 comment deadline has expired (October 9) and our Final Report is due November 20.
> > We only have two issues to discuss and resolve.
> > I will very soon set up a conference call to resolve these, but first I'll start with some discussion on 3859:
> >
> > = 3859 ======= this one simply needs to fix the constructs "Profile profile" and "Trait trait", which are not legal in 2.3.
> >
> > The "fix" to this is highly arbitrary (unless anyone can find an OMG precedent), but here is what we did, to get the discussion rolling.
> >
> > struct TaggedProfile {
> >     PersonId id;
> >     Profile __profile;
> > };
> > typedef sequence<TaggedProfile> TaggedProfileSeq;
> >
> > struct QualifiedTaggedProfile {
> >     QualifiedPersonId id;
> >     Profile __profile;
> > };
> > typedef sequence<QualifiedTaggedProfile> QualifiedTaggedProfileSeq;
> >
> > struct ProfileUpdate {
> >     PersonId id;
> >     TraitNameSeq del_list;
> >     TraitSeq modify_list;
> > };
> > typedef sequence< ProfileUpdate > ProfileUpdateSeq;
> >
> > struct MergeStruct {
> >     PersonId id;
> >     PersonId preferred_id;
> > };
> > typedef sequence< MergeStruct > MergeStructSeq;
> >
> > struct TraitSelector {
> >     Trait __trait;
> >     float weight;
> > };
> > typedef sequence<TraitSelector> TraitSelectorSeq;
> >
> > struct Candidate {
> >     PersonId id;
> >     float confidence;
> >     Profile __profile;
> > };
> >
> > ----- Original Message -----
> > From: Juergen Boldt
> > To: ;
> > Sent: Thursday, September 28, 2000 12:35 PM
> > Subject: issue 3859 -- PIDS RTF issue
> >
> > > This is issue # 3859 (David Ellis, dellis@esitechnology.com )
> > >
> > > The IDL contained in the PIDS specification is not CORBA compliant
> > >
> > > The IDL contained in the PIDS specification is not CORBA compliant.
> > >
> > > Specifically, the declarations
> > >
> > > struct TaggedProfile { PersonId id; Profile profile; };
> > >
> > > and
> > >
> > > struct TraitSelector { Trait trait; float weight; };
> > >
> > > fail to comply with the rule that the semantics of OMG IDL forbids identifiers in the same scope to differ only in case.
> > >
> > > The PIDS IDL will compile under the OrbixWeb ORB, but it fails to compile under ORBacus unless the --case-sensitive option (which is not CORBA compliant) is used.
> > >
> > > ================================================================
> > >
> > > Juergen Boldt
> > > Senior Member of Technical Staff
> > >
> > > Object Management Group          Tel. +1-781 444 0404 ext. 132
> > > 250 First Avenue, Suite 201      Fax: +1-781 444 0320
> > > Needham, MA 02494, USA           Email: juergen@omg.org
> > >
> > > ================================================================
>
> David W. Forslund                      dwf@lanl.gov
> Computer and Computational Sciences    http://www.acl.lanl.gov/~dwf
> Los Alamos National Laboratory         Los Alamos, NM 87545
> 505-665-1907                           FAX: 505-665-4939

Reply-To: "Jon Farmer"
From: "Jon Farmer"
To: "pids-rtf2"
Subject: PIDS RTF3 conference call and email poll
Date: Tue, 7 Nov 2000 10:59:42 -0500
Organization: Care Data Systems

Hi Folks,

This note is to schedule a conference call and to put our only two issues conveniently in front of you - all in one email! Please look over the issues and gather your thoughts for the conference call and the email poll.

******* conf call info *****

In the conference call we will establish our draft proposed resolutions to these two issues.

Monday, November 13, 12:00 noon Eastern time, for 90 minutes.

The call-in number is 1-203-748-8964. When you dial that number you will be prompted to enter the last 4 numbers (8964) again and then enter the passcode ID, which is 24715. The first person to call won't hear anything until the next person calls in.

********** email issues poll info *********

The email poll will begin when the conference call ends. the conference call info is: Your email poll responses will be due (please respond to the whole list) 9am Nov 20, so I can prepare the final report the same day.

We have only two issues to deal with (3859 and . I pulled their text into the tail of this email for your convenience. In each case I put the issue description for 3859 under a double line (=========) and the proposed fix under a single line (---------).

Issue 3859: ==============================================

the issue itself:

> This is issue # 3859 (David Ellis, dellis@esitechnology.com )
>
> The IDL contained in the PIDS specification is not CORBA compliant
>
> The IDL contained in the PIDS specification is not CORBA compliant.
>
> Specifically, the declarations
>
> struct TaggedProfile { PersonId id; Profile profile; };
>
> and
>
> struct TraitSelector { Trait trait; float weight; };
>
> fail to comply with the rule that the semantics of OMG IDL forbids identifiers in the same scope to differ only in case.
>
> The PIDS IDL will compile under the OrbixWeb ORB, but it fails to compile under ORBacus unless the --case-sensitive option (which is not CORBA compliant) is used.

--------------------------------- Dave Forslund's proposed fix ------------------------

What we have done, which was recommended to us by others as a "generic" solution to illegal IDL of this type, is to add the PersonIdService:: string to the beginning of each of the offending types as indicated below. This is simple, elegant and not arbitrary.

struct TraitSelector {
    PersonIdService::Trait trait;
    float weight;
};
typedef sequence<TraitSelector> TraitSelectorSeq;

struct Candidate {
    PersonId id;
    float confidence;
    PersonIdService::Profile profile;
};
..
struct TaggedProfile {
    PersonId id;
    PersonIdService::Profile profile;
};
typedef sequence<TaggedProfile> TaggedProfileSeq;

struct QualifiedTaggedProfile {
    QualifiedPersonId id;
    PersonIdService::Profile profile;
};

At 10:05 AM 10/30/2000 -0500, Jon Farmer wrote:

--------- my proposed fix (I like DWF's above fix better) ------------------

The "fix" to this is highly arbitrary (unless anyone can find an OMG precedent), but here is what we did, to get the discussion rolling.

struct TaggedProfile {
    PersonId id;
    Profile __profile;
};
typedef sequence<TaggedProfile> TaggedProfileSeq;

struct QualifiedTaggedProfile {
    QualifiedPersonId id;
    Profile __profile;
};
typedef sequence<QualifiedTaggedProfile> QualifiedTaggedProfileSeq;

struct ProfileUpdate {
    PersonId id;
    TraitNameSeq del_list;
    TraitSeq modify_list;
};
typedef sequence< ProfileUpdate > ProfileUpdateSeq;

struct MergeStruct {
    PersonId id;
    PersonId preferred_id;
};
typedef sequence< MergeStruct > MergeStructSeq;

struct TraitSelector {
    Trait __trait;
    float weight;
};
typedef sequence<TraitSelector> TraitSelectorSeq;

struct Candidate {
    PersonId id;
    float confidence;
    Profile __profile;
};

issue 3789 =============================

This is issue # 3789 (Jon Farmer)

"RequiredTraits" exception issue

The "RequiredTraits" exception returns a sequence of traits, but make_ids_permanent takes an ARRAY of input IDs. So this exception, if thrown, won't provide any useful information as to which ID had the problem.

Our implementation, rather than throwing the exception, just creates the ID in a temporary state. However, as we recently considered supporting the exception in conjunction with a probabilistic matching algorithm, we found the following ambiguity:

The RequiredTraits exception includes a list of required traits, but the spec doesn't clarify that this is the complete list of required traits - assuming that it's the same set for any input profile - so that the client can figure out which IDs need more traits.

The problem with that semantic is as follows.
Some probabilistic matching algorithms can only determine the adequacy of the incoming trait set by inspecting the VALUES of the trait types supplied. For example, if the incoming request tries to register two IDs, one with First=John, Last=Smith and the second with First=Mergetroid, Last=Pepperdyne, a probabilistic algorithm will likely accept the second but reject the first; YET BOTH supplied the same traits. How could the client figure out what's missing? How could the server even figure out what's missing? Neither can. ONLY THE ALGORITHM KNOWS... AND IT ONLY KNOWS ONCE IT HAS SEEN THE INPUTS.

Therefore we need to define clearly what the exception semantics are, and in a way that doesn't preclude the use of probabilistic matchers that cannot assess adequacy in a "one trait list fits all" manner.

------------------------------- solution A -----------------------------

One possible solution is to have the exception return a SEQUENCE OF offending IDs, so the client can know which ones are offending. Instead of also including for each the set of traits needed, the return structure could show the confidence deficit for each.

This fix would constitute an IDL change, justified by a classification of the issue as an IDL error - that is, the IDL doesn't square with the semantics of a probabilistic implementation.

---------------------------------- solution B -----------------------------

A weaselly but somewhat effective solution would be to simply return the list of "required traits", understanding that a given implementation, if it uses a probabilistic matcher, may not have any "required traits"! However, if the implementation qualifies profiles BEFORE matching, based on an extremely bare minimum set of mandatory trait types, then it could return these.
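
As a rough sketch of how a client might use a per-entry failure report of the kind solution A suggests: the names below (`EntryFailure`, `split_batch`, the sample IDs) are hypothetical and are not PIDS IDL. The sketch only assumes that each reported failure carries the position of the offending input, a reason, and optionally the missing trait names; the two reason strings echo those in the revised text of this issue record.

```python
from dataclasses import dataclass, field


@dataclass
class EntryFailure:
    """Hypothetical per-entry report: which input failed, and why."""
    index: int    # position of the offending ID in the submitted batch
    reason: str   # e.g. "RequiredTraits" or "InsufficientConfidence"
    missing_traits: list = field(default_factory=list)


def split_batch(ids, failures):
    """Split a submitted batch into keepers and rejected entries.

    Returns (keepers, rejected), where rejected pairs each failed ID
    with the failure reported for it.
    """
    failed = {f.index: f for f in failures}
    keepers = [pid for i, pid in enumerate(ids) if i not in failed]
    rejected = [(pid, failed[i]) for i, pid in enumerate(ids) if i in failed]
    return keepers, rejected


# Made-up batch of three temporary IDs; the second one is reported as failing.
batch = ["id-1", "id-2", "id-3"]
reported = [EntryFailure(index=1, reason="InsufficientConfidence")]

keepers, rejected = split_batch(batch, reported)
# The client could now resubmit only `keepers` and deal with `rejected`
# separately, without one invocation per input.
```

With solution B, by contrast, the client would receive no index at all, so no such split is possible from a single invocation.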