Issue 17495: Section B6 - identification algorithm isn't terribly clear (canonical-xmi-ftf)
Source: Oracle (Mr. Dave Hawkins, dave.hawkins(at)oracle.com)
Nature: Uncategorized Issue
Severity: 
Summary: The identification algorithm isn't terribly clear and has
several variants that could be simplified into a single
approach. It references UML qualified names, which doesn't
make sense as this algorithm should work for a general MOF
metamodel. Overloading '-' for separating objects and
numeric discriminators is unwise as it means uniqueness
checks must go beyond the peer elements.


Alternative algorithm
1) The base name for an object is its name*. If there is no
   name, the base name is the name of the owning property or
   "_" for a top level object.
2) Any base name characters that are not (NCNameChar - '-')
   should be replaced with '_'. If a top level object does
   not start with (Letter | '_') a '_' should be prefixed.
3) If the object has no name or the base name is a duplicate
   of an earlier (by export order) peer base name append a
   numeric qualifier:
   a) append '_' if the last character is not already one
   b) append a sequence number, starting with 1 when the
      object has no name and 2 if it does. It is possible
      that an earlier peer name contains an '_n' suffix
      that creates a name collision. In this case increment
      the sequence number until no collision exists.
4) The id for a root object is the base name.
   The id for a nested object is parent id + '-' + base name.


*There is no universal name property. So a tag should be
 introduced to identify valid name properties.


Resolution: Adopt the suggestion, but make use of the existing isID property and make the syntax a bit less cryptic.
Revised Text: In Section B6. Identification, replace the 3 bullets giving the xmi:id rules by the following:
1)	The identifier of an object is the value of the first property, ordered according to section B5.2, that has isID = true and a non-empty value. If this gives no identifier, the value of a property called "name" is used if one exists.
2)	The base name for an object is its identifier. If there is no identifier, the base name is "_" for a top level object, otherwise the name of the property containing the object (e.g. packagedElement) [in other words the unprefixed name of the XML element which has the xmi:id attribute].
3)	Any base name characters that are not valid XML id characters (as defined by the production NCNameChar in http://www.w3.org/TR/REC-xml-names/) should be replaced with underscore '_'. Hyphen '-' characters should also be replaced with '_'. If the base name of a top level object does not start with a Letter or underscore '_', then an underscore '_' should be prefixed.
4)	If the object has no identifier, or the base name (after character replacement) is a duplicate of an earlier (by export order) sibling base name, then:
a.	append underscore '_' if the last character is not already underscore '_';
b.	append a sequence number, starting with 1 when the object has no name, and 2 if it does. It is possible that an earlier sibling name contains a '_n' suffix that creates a name collision. In this case increment the sequence number until no collision exists.
5)	The xmi:id for a root object is the base name. The xmi:id for a nested object is the xmi:id of its parent followed by hyphen '-' followed by its base name.
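
The five rules above can be sketched in Python as follows. This is only an illustrative reading of the rules, not text from the specification. The Obj class and the functions identifier, sanitize, base_name and assign_ids are hypothetical names introduced here. The sketch assumes that the isID property values are already supplied in section B5.2 order, that "no name" in rule 4b means "no identifier", and that NCNameChar can be approximated as letters, digits, '.' and '_'. It follows the literal wording of rule 4, so an object without an identifier always receives a numeric suffix.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Obj:
    """Hypothetical stand-in for a MOF object (not an API of any XMI library)."""
    containing_property: Optional[str] = None  # None for a top level object
    id_props: list = field(default_factory=list)  # isID values, assumed in B5.2 order
    name: Optional[str] = None
    children: list = field(default_factory=list)  # contained objects in export order


def identifier(obj):
    # Rule 1: first non-empty isID value, otherwise the "name" property if set.
    for value in obj.id_props:
        if value:
            return value
    return obj.name or None


def sanitize(base, top_level):
    # Rule 3, approximated: anything other than a letter, digit, '.' or '_'
    # (so also '-') becomes '_'. The real NCNameChar production is broader.
    out = "".join(ch if (ch.isalnum() or ch in "._") else "_" for ch in base)
    if top_level and not (out[0].isalpha() or out[0] == "_"):
        out = "_" + out
    return out


def base_name(obj, used):
    # 'used' holds the final base names of earlier siblings (export order).
    ident = identifier(obj)
    top_level = obj.containing_property is None
    # Rule 2: identifier, else "_" (top level) or the containing property name.
    raw = ident if ident is not None else ("_" if top_level else obj.containing_property)
    base = sanitize(raw, top_level)
    # Rule 4: add a numeric qualifier when unidentified or duplicated.
    if ident is None or base in used:
        stem = base if base.endswith("_") else base + "_"
        n = 1 if ident is None else 2
        while stem + str(n) in used:
            n += 1
        base = stem + str(n)
    used.add(base)
    return base


def assign_ids(roots):
    # Rule 5: a root id is its base name; a nested id is parent id + '-' + base name.
    result = []

    def visit(obj, parent_id, used):
        base = base_name(obj, used)
        xmi_id = base if parent_id is None else f"{parent_id}-{base}"
        result.append(xmi_id)
        child_used = set()
        for child in obj.children:
            visit(child, xmi_id, child_used)

    top_used = set()
    for root in roots:
        visit(root, None, top_used)
    return result


pkg = Obj(name="My-Pkg", children=[
    Obj("packagedElement", name="A"),
    Obj("packagedElement", name="A"),
    Obj("packagedElement"),
])
print(assign_ids([pkg]))
# ['My_Pkg', 'My_Pkg-A', 'My_Pkg-A_2', 'My_Pkg-packagedElement_1']
```

Because '-' never survives inside a base name, uniqueness only has to be checked among siblings, which is the concern raised in the issue summary.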


Replace the xmi:ids in example B7 with the following. Note that this is an update to the revised example contained in the resolution to issue 17496, already accepted in Ballot 1:

<uml:Operation xmi:id="op1" xmi:uuid="DCE:1234" xmi:type="uml:Operation">
  <name>op1</name>
  <ownedRule xmi:id="op1-c01" xmi:uuid="DCE:abcd" xmi:type="uml:Constraint">
    <name>co1</name>
    <specification xmi:id="op1-c01-specification" xmi:uuid="DCE:abcde1"
                   xmi:type="uml:OpaqueExpression">
      <body>First Constraint definition</body>
    </specification>
    <constrainedElement xmi:idref="op1"/>
  </ownedRule>
  <ownedRule xmi:id="op1-co2" xmi:uuid="DCE:efgh" xmi:type="uml:Constraint">
    <name>co2</name>
    <specification xmi:id="op1-co2-specification" xmi:uuid="DCE:abcde2"
                   xmi:type="uml:OpaqueExpression">
      <body>Second Constraint definition</body>
    </specification>
    <constrainedElement xmi:idref="op1"/>
  </ownedRule>
  <ownedRule xmi:id="op1-co3" xmi:uuid="DCE:ijkl" xmi:type="uml:Constraint">
    <name>co3</name>
    <specification xmi:id="op1-co3-specification" xmi:uuid="DCE:abcde3"
                   xmi:type="uml:OpaqueExpression">
      <body>Third Constraint definition</body>
    </specification>
    <constrainedElement xmi:idref="op1"/>
  </ownedRule>
  <ownedRule href="doc2.xml#co4"/>
</uml:Operation>


Actions taken:
July 13, 2012: received issue
December 23, 2013: closed issue
Discussion: 

End of Annotations:=====
This is issue # 17495
From: Dave Hawkins <dave.hawkins@oracle.com>
