Issue 19808: Identification rules are too weak and too unpredictable (canonical-xmi-ftf)

Source: NASA (Dr. Nicolas F. Rouquette, nicolas.f.rouquette(at)jpl.nasa.gov)
Nature: Enhancement
Severity: Critical
Summary:
The Canonical XMI rules for generating xmi:ids (section B.6) are too weak to ensure reproducible results. Consider the following procedure:
- Start a tool that supports Canonical XMI xmi:id generation
- Load an input model
- Compute the xmi:ids for all elements in the model

Repeated executions of this procedure with the same tool and the same input model should always result in the same xmi:ids. In practice, reproducibility depends on the model. For example, the xmi:ids of the comments owned by an element are distinguished by a unique "_n" suffix assigned according to the order of their xmi:uuids. However, if modeling tools do not support xmi:uuids or do not preserve them, then the results can vary.

The Canonical XMI rules for generating xmi:ids (section B.6) are also too unpredictable because of their dependency on XMI serialization. There are several cases where the xmi:id of an element depends on its serialization:
- multiple named elements that have the same name, the same owner and the same containing property
- multiple named elements that have the same owner and the same containing property but whose names differ only in characters that indistinguishably map to "_"
- multiple non-named elements that have the same owner and the same containing property

In all such cases, rule (4) appends a unique "_n" suffix according to the "export order", which ultimately reduces to the ordering of the xmi:uuid of an element amongst its siblings that have the same "base name". This means that changes somewhere in a model can cause unmodified elements elsewhere to have different xmi:ids than before the changes were made. If a tool implements Canonical XMI when saving/exporting models, then the behavior of the xmi:id rules effectively injects changes into a user's model (on save/export).
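The suffix instability described above can be illustrated with a minimal sketch. It is not the Canonical XMI algorithm itself: the `base_name` helper, the "-" separator, the character mapping, and the sample uuids are all assumptions made for illustration. It only models the behavior reported in this issue, where siblings sharing a "base name" receive "_n" suffixes in xmi:uuid order.

```python
import re
from collections import defaultdict


def base_name(owner_id, name):
    # Assumption: every character outside [A-Za-z0-9] maps to "_",
    # and "-" joins the owner id to the element name.
    safe = re.sub(r"[^A-Za-z0-9]", "_", name or "")
    return f"{owner_id}-{safe}"


def assign_ids(elements):
    """elements: list of (uuid, owner_id, name). Returns {uuid: xmi_id}."""
    groups = defaultdict(list)
    for uuid, owner, name in elements:
        groups[base_name(owner, name)].append(uuid)

    ids = {}
    for base, uuids in groups.items():
        if len(uuids) == 1:
            ids[uuids[0]] = base
        else:
            # Siblings with the same base name get "_n" in uuid order.
            for n, uuid in enumerate(sorted(uuids), start=1):
                ids[uuid] = f"{base}_{n}"
    return ids


# Two elements whose names differ only in characters that map to "_".
before = [("uuid-b", "Pkg", "a b"), ("uuid-c", "Pkg", "a-b")]
# A third sibling is added; the first two elements are not modified.
after = before + [("uuid-a", "Pkg", "a.b")]

before_ids = assign_ids(before)
after_ids = assign_ids(after)

# Unmodified elements whose generated id changed anyway.
changed = {u for u in before_ids if before_ids[u] != after_ids[u]}
print(sorted(changed))
```

In this sketch the new sibling sorts first by uuid, so both pre-existing elements are renumbered even though neither was edited, and any xmi:idref pointing at them would have to change as well.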
There could be pathological cases where adding or removing a single element in a model with N (very large) elements results in changes to most of the model, because a change in an xmi:id propagates into changes to every xmi:idref that refers to it. In practice, these weaknesses and this unpredictability severely undermine the utility of the Canonical XMI identification rules.

Resolution:
Revised Text:
Actions taken:
June 18, 2015: received issue

Discussion:

End of Annotations:=====
m: webmaster@omg.org
Date: 18 Jun 2015 12:53:03 -0400
To:
Subject: Issue/Bug Report
*******************************************************************************
Name: Nicolas Rouquette
Employer: JPL
mailFrom: nicolas.f.rouquette@jpl.nasa.gov
Terms_Agreement: I agree
Specification: Canonical XMI
Section: B.6
FormalNumber: ptc/13-08-28
Version: Beta 2
Doc_Year: 2013
Doc_Month: August
Doc_Day: 28
Page: 9-10
Title: Identification rules are too weak and too unpredictable
Nature: Enhancement
Severity: Critical
CODE: 3TMw8
B1: Report Issue
Remote Name: wildcard.jpl.nasa.gov
Remote User:
HTTP User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_3) AppleWebKit/600.6.3 (KHTML, like Gecko) Version/8.0.6 Safari/600.6.3
Time: 12:53 PM