Issue 16573: Please provide a complete set of production rules
Issue 16574: XMI schema and document production rules in both XMI2.4 and in Canonical XMI are particularly difficult to follow
Issue 16575: xmi:uuid in Canonical XMI
Issue 16576: B.6 Identification
Issue 16888: Grammar for XML Schema - XMI Canonical
Issue 17260: Section B5.1
Issue 17261: The clause in B5.2 is a bit opaque
Issue 17262: In B5.2 should clarify that redefining a property does not affect its order within its class
Issue 17487: What order should be used for the namespaces?
Issue 17488: Section B2 (Point 5)
Issue 17489: Section B2 (Point 8)
Issue 17490: Section B2
Issue 17491: Section B5.1
Issue 17492: Section B5.2
Issue 17493: Section B5.3
Issue 17494: Section B6
Issue 17495: Section B6 - identification algorithm isn't terribly clear
Issue 17496: Section B7 - The example has a number of constraints that are exported as top level elements
Issue 17497: Section B7 - href="#xpointer(...)" is wrong
Issue 17552: opposite properties need to be serialized
Issue 18288: Canonical XMI: Problems with B6 Identification rules - better strategy for handling special characters needed
Issue 18289: DDTV spec had named element whose qualified name matched name generated by procedure described in section B6
Issue 18786: Canonical XMI should mandate use of Canonical Lexical Representation for XML
Issue 16573: Please provide a complete set of production rules (canonical-xmi-ftf)
Source: NASA (Dr. Nicolas F. Rouquette, nicolas.f.rouquette(at)jpl.nasa.gov)
Nature: Uncategorized Issue
Severity:
Summary:
In B.3.2 and B.4, it would be easier to see the complete set of rules instead of just those that Canonical XMI supersedes from XMI 2.4. Perhaps the rules that are modified in Canonical XMI could be highlighted in boldface. Their organization is already non-trivial, jumping back & forth makes reading the rules a lot more difficult.
2) The XMI schema and document production rules in both XMI2.4 and in Canonical XMI are particularly difficult to follow because it is unclear at which level one should interpret references to "Class", "DataType" and "Property". For example, B3.2 in Canonical XMI reads:
4. ClassTypeDef ::= "<xsd:complexType name=’" //Name of Class// "’>" ...
4a. ClassTypeName ::= 1h:Namespace //Name of Class//
4d. ClassAttributes ::= ( "<xsd:element name=’" //Name of DataType-typed Property// "’"
Does "Class" here refer to a Class in a user model (sometimes called an M1 model) or a metaclass (i.e., sometimes called an M2 model) or something else? The fact that the rules only mention "Class", "DataType" and "Property" is rather perplexing because there is no reference as to where these concepts are defined. Are these terms references to instances of MOF metaclasses or something else? It would help to provide a fully worked out example of a CMOF metamodel, produce the XSD for that metamodel from the rules (assuming that one can follow how the rules are applied to produce the resulting schema) and apply the document production rules to an instance of that CMOF metamodel to obtain the corresponding XML document that should be valid w.r.t. the XSD.
This attribute is used for ordering elements as described in B.5. Since Canonical XMI makes xmi:uuid mandatory, it seems reasonable to expect that some tools may change the value of xmi:ids on import/export. It seems that Canonical XMI should require a scheme such as option 3 in XMI 2.4, section 7.10.2.2 for linking across XMI documents. Currently, Canonical XMI places no such restriction on the scheme for linking across documents. Since the xmi:id based scheme (option 1 in XMI 2.4, section 7.10.2.2) is the most common, Canonical XMI serialization implicitly requires tools to preserve xmi:id values on import/export. If the requirement to preserve xmi:id values is too strong, then Canonical XMI needs to require a cross-document linking scheme that is based on xmi:uuid values and that specifically prohibits using xmi:id values except for references within the same document.
The rule in the second bullet is insufficient to guarantee stable generation of xmi:id values.
Consider the following excerpt from UML2.4.1's XMI (http://www.omg.org/spec/UML/20101101/UML.xmi)
<packagedElement xmi:type="uml:Class" xmi:id="Namespace" name="Namespace" isAbstract="true">
...
<ownedAttribute xmi:type="uml:Property" xmi:id="Namespace-importedMember" name="importedMember" visibility="public" type="PackageableElement" isReadOnly="true" isDerived="true" subsettedProperty="Namespace-member" association="A_importedMember_namespace">
<ownedComment xmi:type="uml:Comment" xmi:id="Namespace-importedMember-_ownedComment.0" annotatedElement="Namespace-importedMember">
<body>References the PackageableElements that are members of this Namespace as a result of either PackageImports or ElementImports.</body>
</ownedComment>
<upperValue xmi:type="uml:LiteralUnlimitedNatural" xmi:id="Namespace-importedMember-_upperValue" value="*"/>
<lowerValue xmi:type="uml:LiteralInteger" xmi:id="Namespace-importedMember-_lowerValue"/>
</ownedAttribute>
...
<ownedAttribute xmi:type="uml:Property" xmi:id="Namespace-ownedMember" name="ownedMember" visibility="public" type="NamedElement" isReadOnly="true" isDerived="true" isDerivedUnion="true" aggregation="composite" subsettedProperty="Namespace-member Element-ownedElement" association="A_ownedMember_namespace">
<ownedComment xmi:type="uml:Comment" xmi:id="Namespace-ownedMember-_ownedComment.0" annotatedElement="Namespace-ownedMember">
<body>A collection of NamedElements owned by the Namespace.</body>
</ownedComment>
<upperValue xmi:type="uml:LiteralUnlimitedNatural" xmi:id="Namespace-ownedMember-_upperValue" value="*"/>
<lowerValue xmi:type="uml:LiteralInteger" xmi:id="Namespace-ownedMember-_lowerValue"/>
</ownedAttribute>
...
<ownedOperation xmi:type="uml:Operation" xmi:id="Namespace-importedMember.1" name="importedMember" visibility="public" isQuery="true" bodyCondition="Namespace-importedMember.1-spec">
<ownedComment xmi:type="uml:Comment" xmi:id="Namespace-importedMember.1-_ownedComment.0" annotatedElement="Namespace-importedMember.1">
<body>The importedMember property is derived from the ElementImports and the PackageImports. References the PackageableElements that are members of this Namespace as a result of either PackageImports or ElementImports.</body>
</ownedComment>
<ownedRule xmi:type="uml:Constraint" xmi:id="Namespace-importedMember.1-spec" name="spec" constrainedElement="Namespace-importedMember.1 Namespace-importedMember">
<specification xmi:type="uml:OpaqueExpression" xmi:id="Namespace-importedMember.1-spec-_specification">
<language>OCL</language>
<body>result = self.importMembers(self.elementImport.importedElement.asSet()->union(self.packageImport.importedPackage->collect(p | p.visibleMembers())))</body>
</specification>
</ownedRule>
<ownedParameter xmi:type="uml:Parameter" xmi:id="Namespace-importedMember.1-result" name="result" visibility="public" type="PackageableElement" direction="return">
<upperValue xmi:type="uml:LiteralUnlimitedNatural" xmi:id="Namespace-importedMember.1-result-_upperValue" value="*"/>
<lowerValue xmi:type="uml:LiteralInteger" xmi:id="Namespace-importedMember.1-result-_lowerValue"/>
</ownedParameter>
</ownedOperation>
...
<ownedOperation xmi:type="uml:Operation" xmi:id="Namespace-ownedMember.1" name="ownedMember" visibility="public" isQuery="true" bodyCondition="Namespace-ownedMember.1-spec">
<ownedComment xmi:type="uml:Comment" xmi:id="Namespace-ownedMember.1-_ownedComment.0" annotatedElement="Namespace-ownedMember.1">
<body>Missing derivation for Namespace::/ownedMember : NamedElement</body>
</ownedComment>
<ownedRule xmi:type="uml:Constraint" xmi:id="Namespace-ownedMember.1-spec" name="spec" constrainedElement="Namespace-ownedMember.1 Namespace-ownedMember">
<specification xmi:type="uml:OpaqueExpression" xmi:id="Namespace-ownedMember.1-spec-_specification">
<language>OCL</language>
<body>true</body>
</specification>
</ownedRule>
<ownedParameter xmi:type="uml:Parameter" xmi:id="Namespace-ownedMember.1-result" name="result" visibility="public" type="NamedElement" direction="return">
<upperValue xmi:type="uml:LiteralUnlimitedNatural" xmi:id="Namespace-ownedMember.1-result-_upperValue" value="*"/>
<lowerValue xmi:type="uml:LiteralInteger" xmi:id="Namespace-ownedMember.1-result-_lowerValue"/>
</ownedParameter>
</ownedOperation>
</packagedElement>
UML::Operation is an instance of the metaclass: UML::Class, which has several properties including:
- ownedAttribute: importedMember, ownedMember
- ownedOperation: importedMember, ownedMember
The rule in B5.1 seems incomplete.
It should specify that for a given element (e.g., UML::Operation as above), then nested elements are ordered alphabetically by the name of the meta-property.
In the above example, UML::Operation has metaclass UML::Class; whose meta-properties include, in alphabetical order, UML::Class::ownedAttribute and UML::Class::ownedOperation.
The alphabetical ordering of these meta-properties is used for serializing the values of these metaproperties but also for generating their xmi:ids as well.
That's why the ownedAttribute UML::Operation::importedMember has xmi:id="Namespace-importedMember" whereas the ownedOperation UML::Operation::importedMember has xmi:id="Namespace-importedMember.1"
In particular, the last sentence in the 2nd bullet of B6 is incorrect:
Note that named elements (which satisfy the first rule) are still included in this count.
That is, the "-<N>" suffix starts with N=1 when the generated xmi:id would otherwise conflict with a previously generated xmi:id;
that is, N=2 corresponds to the second element that has the same qualified name; N=3 corresponds to the third, etc...
As part of our UML work for the National Information Exchange Model (NIEM) we have encountered what appear to be errors in the grammars for XML Schema in several XMI specs. Please let us know if we have misinterpreted these issues.
Section B5.1 should reference the section in the main XMI spec that determines the element name rather than vaguely saying “based on the metamodel classifier”.
The clause in B5.2 is a bit opaque – it could usefully be spelled out a bit more e.g. “Properties of an element are ordered by the class in which they are defined. Properties defined by a superclass appear before those of its subclasses. Where a class inherits from more than one direct superclass, properties from the class with the alphabetically earlier class name appear before those of an alphabetically later class name.”
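The wording proposed above can be read as a small recursive procedure. The following is only a sketch of that reading, not text from the specification: the `MetaClass` record and its fields are hypothetical, properties of a class are kept in declaration order, and a superclass reached through more than one path is emitted only once (an assumption, since the proposal does not say).

```python
from dataclasses import dataclass, field


@dataclass
class MetaClass:
    # Hypothetical stand-in for a metamodel class; not an XMI or MOF API.
    name: str
    supers: list = field(default_factory=list)
    own_props: list = field(default_factory=list)


def ordered_properties(cls, _seen=None):
    """Superclass properties first; direct superclasses visited alphabetically by name."""
    seen = set() if _seen is None else _seen
    if id(cls) in seen:
        # Assumption: a shared ancestor contributes its properties only once.
        return []
    seen.add(id(cls))
    result = []
    for sup in sorted(cls.supers, key=lambda c: c.name):
        result.extend(ordered_properties(sup, seen))
    result.extend(cls.own_props)
    return result


# Hypothetical diamond: Leaf inherits from Zeta and Alpha, which share Base.
base = MetaClass("Base", own_props=["id"])
zeta = MetaClass("Zeta", supers=[base], own_props=["z"])
alpha = MetaClass("Alpha", supers=[base], own_props=["a"])
leaf = MetaClass("Leaf", supers=[zeta, alpha], own_props=["leaf"])
print(ordered_properties(leaf))
```

Alpha is visited before Zeta even though Leaf lists Zeta first, because the proposed rule orders direct superclasses by class name rather than by declaration order.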
B5.2 should clarify that redefining a property does not affect its order within its class.
Section B2 (Point 3) What order should be used for the namespaces?
The ordering is slightly inconsistent with the way root objects are serialised, e.g. <uml:Class .../>, with the type first. This has made exporting from our tool more complex than would be necessary if xmi:type was first.
How do you serialise a null value where there is a default? This is probably a general issue with the XMI specification rather than canonical
The bullet "org.omg.xmi.ordered = true (forces ordering of properties)" should probably be described as "(forces ordering of values)", particularly given that org.omg.xmi.superClassFirst is described in an identical manner, but means something completely different.
Why does this use a different order to nested elements as given in section B5.3?
This imposes an alphabetic order on the superclasses that isn't mentioned in the main specification for the superClassFirst option. It isn't clear to me what this is ordered on. It can't just be the name of the superclass, because there could be multiple superclasses with the same name. It's also a very complicated way to order the properties that will almost certainly be inconsistent with the way the metamodel is defined. Why not just sort the properties alphabetically by name?
Nested elements are ordered by uuid. While there's nothing wrong with that it seems a shame to lose alphabetic ordering by name, which makes the XMI easier to read. Perhaps sorting could be by name if there is one and then by uuid?
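The suggestion of sorting by name first and falling back to uuid could look like the sketch below. The dictionary layout, the `nested_sort_key` helper and the `urn:example:` values are made up for illustration, and placing unnamed elements after named ones is an assumption the comment above does not settle.

```python
def nested_sort_key(element):
    # 'element' is a hypothetical dict with an optional "name" and a mandatory "uuid".
    name = element.get("name")
    # Named elements come first, ordered by name; the uuid breaks ties between
    # equal names and orders the unnamed elements among themselves.
    return (name is None, name or "", element["uuid"])


elements = [
    {"uuid": "urn:example:3"},
    {"name": "beta", "uuid": "urn:example:2"},
    {"name": "alpha", "uuid": "urn:example:9"},
    {"name": "alpha", "uuid": "urn:example:1"},
]
ordered = sorted(elements, key=nested_sort_key)
print([e["uuid"] for e in ordered])
```

Python compares strings by code point, so a canonical form would still need to fix the collation used for names.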
I think it would be helpful if uuids were limited to valid URIs. An href is typed as xsd:anyURI, so if a uuid is a URI it is straightforward to use in an href. If the uuid is not a URI the href is more complex, probably requiring the document URL, which, given an id must also be present, would make the uuid pointless. (This comment applies for uuids in general. In JDeveloper we've taken uuids to be URIs.) Additionally, it is impossible for an importing tool to ensure arbitrary string uuids generated by different exporting tools are unique. A standard URI scheme gives some hope that if two uuids are the same, they are the same object.
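As an illustration of what restricting uuids to URIs might mean in practice, the sketch below builds a URN from a UUID with Python's standard `uuid` module and checks whether a string starts with a URI scheme. The `looks_like_uri` helper is hypothetical and only tests for a scheme prefix; it is not a full URI validator, and nothing here is mandated by XMI.

```python
import re
import uuid

# A scheme is a letter followed by letters, digits, "+", "." or "-", then ":".
_SCHEME = re.compile(r"[A-Za-z][A-Za-z0-9+.\-]*:")


def looks_like_uri(value):
    """Rough check: does the value start with a URI scheme?"""
    return _SCHEME.match(value) is not None


fixed = uuid.UUID("12345678-1234-5678-1234-567812345678")
as_urn = fixed.urn  # "urn:uuid:" followed by the canonical UUID text
print(as_urn, looks_like_uri(as_urn), looks_like_uri("object-42"))
```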
The identification algorithm isn't terribly clear and has
several variants that could be simplified into a single
approach. It references UML qualified names, which doesn't
make sense as this algorithm should work for a general MOF
metamodel. Overloading '-' for separating objects and
numeric discriminators is unwise as it means uniqueness
checks must go beyond the peer elements.
Alternative algorithm
1) The base name for an object is its name*. If there is no
name, the base name is the name of the owning property or
"_" for a top level object.
2) Any base name characters that are not (NCNameChar - '-')
should be replaced with '_'. If a top level object does
not start with (Letter | '_') a '_' should be prefixed.
3) If the object has no name or the base name is a duplicate
of an earlier (by export order) peer base name append a
numeric qualifier:
a) append '_' if the last character is not already one
b) append a sequence number, starting with 1 when the
object has no name and 2 if it does. It is possible
that an earlier peer name contains an '_n' suffix
that creates a name collision. In this case increment
the sequence number until no collision exists.
4) The id for a root object is the base name.
The id for a nested object is parent id + '-' + base name.
*There is no universal name property. So a tag should be
introduced to identify valid name properties.
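The alternative algorithm above can be sketched in a few lines. This is one reading of steps 1 to 4, not a reference implementation: `base_name` and `assign_ids` are hypothetical helpers, the character class is an ASCII-only approximation of (NCNameChar - '-'), and a named element is treated as a duplicate if its base name matches either the base name or the final name of an earlier peer.

```python
import re

# ASCII-only approximation of (NCNameChar - '-'); the real production is wider.
_INVALID = re.compile(r"[^A-Za-z0-9._]")


def base_name(name, owning_property, is_root):
    # Step 1: use the name, else the owning property, else "_" for a root.
    base = name if name else ("_" if is_root else owning_property)
    # Step 2: replace disallowed characters; roots must start with a letter or "_".
    base = _INVALID.sub("_", base)
    if is_root and not re.match(r"[A-Za-z_]", base):
        base = "_" + base
    return base


def assign_ids(peers, parent_id=None):
    """peers: (name or None, owning property) pairs in export order."""
    is_root = parent_id is None
    seen_bases, used, ids = set(), set(), []
    for name, prop in peers:
        base = base_name(name, prop, is_root)
        final = base
        # Step 3: qualify unnamed objects and duplicates of earlier peers.
        if not name or base in seen_bases or base in used:
            stem = base if base.endswith("_") else base + "_"
            n = 1 if not name else 2
            while f"{stem}{n}" in used:
                n += 1  # skip numbers already taken by an earlier '_n' name
            final = f"{stem}{n}"
        seen_bases.add(base)
        used.add(final)
        # Step 4: a root id is the base name; nested ids are prefixed by the parent id.
        ids.append(final if is_root else f"{parent_id}-{final}")
    return ids


print(assign_ids([("importedMember", "ownedAttribute"),
                  ("ownedMember", "ownedAttribute"),
                  ("importedMember", "ownedOperation")], "Namespace"))
```

Because '-' only ever separates parent and child ids and '_' introduces the numeric qualifier, uniqueness has to be checked among peers only, which is the simplification the proposal is after.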
The example has a number of constraints that are exported as top level elements. Why is this the case? The canonical form needs to choose a single representation. Composite objects should always be exported as nested elements if possible
href="#xpointer(...)" is wrong. xmi:label is excluded from canonical form, in particular from the example. External references should use a canonical form that doesn't use xpointer. They should probably always use the uuid, assuming that it is a URI.
In XMI generally it’s fine and valid to serialize either end of a pair of opposite properties: it’s not required to serialize both. However for Canonical XMI this must be predictable and consistent so I guess both would need to be serialized. This needs to be made clear in the spec
The spec says "Where the above rules result in characters not permitted for identifiers in XML documents (for example space, ‘/’ or ‘:’) these must be replaced by ‘_’." DTV had elements named like this:
"DateTime-Time_Infrastructure-duration1_<_duration2"
"DateTime-Time_Infrastructure-duration1_=_duration2"
Obviously, if I change < and = to _, two elements will have the same name. We need a better strategy for handling special characters.
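The collision is easy to reproduce. In the sketch below, `sanitize` is a hypothetical helper that applies the replace-with-underscore rule using an ASCII approximation of the permitted characters. `escape_codepoint` shows one possible alternative, encoding each disallowed character as `_xHHHH_`; it is not something the spec or this issue proposes, and it is not collision-free by itself, since a name could already contain such a sequence literally.

```python
import re

# ASCII approximation of the characters permitted in an XML identifier.
_DISALLOWED = re.compile(r"[^A-Za-z0-9._\-]")


def sanitize(name):
    """Current rule: every disallowed character becomes '_'."""
    return _DISALLOWED.sub("_", name)


def escape_codepoint(name):
    """Hypothetical alternative: encode the code point of each disallowed character."""
    return _DISALLOWED.sub(lambda m: f"_x{ord(m.group()):04X}_", name)


less_than = "DateTime-Time_Infrastructure-duration1_<_duration2"
equals = "DateTime-Time_Infrastructure-duration1_=_duration2"

print(sanitize(less_than) == sanitize(equals))                  # the two ids collide
print(escape_codepoint(less_than), escape_codepoint(equals))    # the two ids stay distinct
```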
The DTV spec had a named element whose qualified name matched a name generated by a procedure described in the spec section B6: "In other cases the xmi:id is the xmi:id of the parent XML element (or “_” for top level elements), followed by the separator ‘-‘, followed by the name of the property (XML element). If there is more than one value for the property this is further followed by ‘-‘ followed by the sequence number (from 1) within the parent element and the property. Note that named elements (which satisfy the first rule) are still included in this count." The named element was not a sibling, so the part "Note that named elements...are still included in this count" did not apply. One quasi-solution is to use numbering whenever there is not a qualified name: simply strike the phrase "If there is more than one value for the property this is further" in the above. The problem with this is that there could still be an element with a qualified name that matches the generated xmi:id (it could end with a number)! Perhaps we need to add "If the resulting name is a duplicate of a name generated using the procedure for qualified names described above, the first sequence number where duplication does not occur is used." I realize that these are pretty complicated rules.
Detail: XML allows a number of serialization options: for example, xs:boolean may be serialized as “0” and “1” in addition to “false” and “true”. However, the XSD spec does define a Canonical Lexical Representation that removes this variability: see http://www.w3.org/TR/xmlschema-2/#canonical-lexical-representation. This should be mandated by the Canonical XMI spec.
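To make the request concrete, the sketch below normalizes two XSD datatypes to their canonical lexical form. The function names are hypothetical and only xs:boolean and xs:integer are covered; a real exporter would need the same treatment for every datatype it writes.

```python
import re


def canonical_boolean(lexical):
    """Map any valid xs:boolean literal to its canonical form, "true" or "false"."""
    value = lexical.strip(" \t\r\n")  # xs:boolean collapses surrounding whitespace
    if value in ("true", "1"):
        return "true"
    if value in ("false", "0"):
        return "false"
    raise ValueError(f"not an xs:boolean literal: {lexical!r}")


def canonical_integer(lexical):
    """Drop the optional '+' sign and leading zeros from an xs:integer literal."""
    value = lexical.strip(" \t\r\n")
    if not re.fullmatch(r"[+-]?[0-9]+", value):
        raise ValueError(f"not an xs:integer literal: {lexical!r}")
    return str(int(value))


print(canonical_boolean("1"), canonical_boolean("false"), canonical_integer("+007"))
```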