Issue 8708: current XMI expressing the normative PIM is not too compatible
Issue 8709: translation rules should be better documented
Issue 8710: Discrepancies in Glossary section
Issue 8711: Attributes in a subtype
Issue 8712: Section 1.5
Issue 8713: Section 3.1.1
Issue 8714: Section 3.18 - Page 20
Issue 8715: Various - some associations are 'derived'
Issue 8716: Some typos in section 4.1 and elsewhere
Issue 8717: Section 2.1
Issue 8796: Added attribute 'url:string' into Db_xref.
Issue 8797: Added attribute 'version:string' into Db_xref
Issue 8798: Make it possible to Individual have more attributes
Issue 8799: type of all attributes of Geographic_location
Issue 8800: Panel has new attribute 'type'
Issue 8801: Abstract class Ordered_location has now attribute 'position:string'
Issue 8802: Value of 'strand' attribute of Reference_genomic_location
Issue 8803: Removed unnecessary multiple inheritances
Issue 8804: Added abstract class Population to have associations visible
Issue 8805: Added classes for describing bibliographic references
Issue 8708: current XMI expressing the normative PIM is not too compatible (snp-ftf)
Source: Japan Biological Informatics Consortium (Mr. Martin Senger, martin.senger(at)gmail.com)
Nature: Uncategorized Issue
Severity:
Summary:
The current XMI expressing the normative PIM is not too compatible. It can be read by the same UML tool that created it (Umbrello 1.2) but hardly by any other tool.
Discussion:
Because we are aware that there is a general problem with almost all UML tools regarding their interoperability, we will try to use the latest version of Umbrello (our UML tool of choice). That is all we can do or guarantee.
Proposed solution:
Replace file omgsnp-xmi12.xml in the accompanying document with its new version.
Discussion:
Because we are aware that there is a general problem with almost all UML tools regarding their interoperability, we will try to use the latest version of Umbrello (our UML tool of choice). That is all we can do or guarantee.
Resolution:
· Replace file omgsnp-xmi12.xml in the accompanying document with its new version, created by Umbrello 1.4. Note that the contents of this new document are the same as before, unless changes are mentioned elsewhere (in other raised issues).
· Change the compliance point section.
The use of UML to describe the conceptual model (i.e. the PIM) and of XML Schema to describe the data interchange model (i.e. the PSM) is good. It would be better if the translation rules were better documented (some of them are listed, but others are hidden in the tool implementation that is used to produce the schema).
Discussion:
At the moment we see two alternatives. Both of them, however, first require solving the issue of UML tool interoperability (see above, number 1). They are:
a) Keep the current way of transforming the XMI to the XML Schema, but:
i) Better document the rules used in the manual translation from the XMI to the Excel spreadsheet, and
ii) Make available the software tool used to transform the Excel spreadsheet to the final XML Schema.
b) Write (and make available, including reasonable documentation) a new tool that would transform directly from the XMI to XML Schema.
Actually, there is also a third way. We will use it if the resulting XMI (coming from the new version of Umbrello, 1.4) is too different from the current XMI. In that case, we will completely remove from the specification the description of how to get from the XMI to XML Schema, and we will keep only a statement that both XMI and XML Schema are normative.
Proposed solution: none yet
Discussion:
At the moment we see two alternatives. Both of them, however, first require solving the issue of UML tool interoperability (see above, number 1). They are:
a) Keep the current way of transforming the XMI to the XML Schema, but:
i) Better document the rules used in the manual translation from the XMI to the Excel spreadsheet, and
ii) Make available the software tool used to transform the Excel spreadsheet to the final XML Schema.
b) Write (and make available, including reasonable documentation) a new tool that would transform directly from the XMI to XML Schema.
Actually, there is also a third way. We will use it if the resulting XMI (coming from the new version of Umbrello, 1.4) is too different from the current XMI. In that case, we will completely remove from the specification the description of how to get from the XMI to XML Schema, and we will keep only a statement that both XMI and XML Schema are normative.
There is no need to add any statement from the author of this code regarding its legal status, because the author is an employee of a submitting company, which is already covered by the general copyright notice at the beginning of the specification.
Resolution:
There is a new tool that converts XMI files into an XSD schema (a schema that defines PML). The changes are:
· Add XSDMaker.java file into accompanying files.
· Add XSDMaker.txt file (a file documenting how to use the new tool) into accompanying files.
The XML Schema expressing this platform specific model was obtained by converting a normative XMI file into an XSD file using an SNP-specific tool, XSDMaker. The tool is freely available, and it is also included, together with its documentation, in the accompanying file of this specification.
Several XML samples were manually created. They have been validated against the XML Schema created in the previous step.
3. I like the way the Glossary is usually consistent with the model
and the XSD. There are some discrepancies though.
Discussion:
The definitions of terms are already in section 3.2. Therefore there
is no need to repeat them again in a separate section called
"Glossary". Having them in one place will guarantee consistency.
Proposed solution:
Replace the complete chapter "Glossary" by text: "The used terms are
defined in section 'Model classes, attributes and associations
(details)'."
Remove the third paragraph ("Conveniently, most of the used
domain-specific terms are also collected in the Glossary section of
this document.") from the beginning of the Section 3.
Discussion:
The definitions of terms are already in section 3.2. Therefore there is no need to repeat them again in a separate section called "Glossary". Having them in one place will guarantee consistency.
Because the glossary is generated, it can easily be provided for convenience as an accompanying file.
Resolution:
· Replace the complete contents of the chapter "Glossary" by text: "The used terms are defined in section 'Model classes, attributes, and associations (details)'."
· Remove the third paragraph ("Conveniently, most of the used domain-specific terms are also collected in the Glossary section of this document.") from the beginning of Section 3.
Disposition: Resolved
As I was reading the spec, I found out that attributes in a subtype are bold and the inherited ones are not. Would be nice if these conventions are spelled out somewhere in a 'how to read the spec' section.
Discussion:
It will probably also be solved by writing the new software conversion tool, and by re-generating the PSM part of the specification. But I would not rely on fonts indicating some meaning - they may be unintentionally changed by formatting the document. The inheritance is, however, already indicated by text like "defined at ..." that accompanies every attribute.
Proposed solution: Reject the issue
Discussion:
It will probably also be solved by writing the new software conversion tool, and by re-generating the PSM part of the specification. But I would not rely on fonts indicating some meaning - they may be unintentionally changed by formatting the document. The inheritance is, however, already indicated by text like "defined at ..." that accompanies every attribute.
Disposition: Rejected
Section 1.5 - Please refer to exact document numbers of related
specs this proposal is based on (Example : Whether the LSID used in
final adopted spec, available spec etc.). Add these references to the
'references' section.
Proposed solution:
Add the following to Section 1.5
- at the end of BQS section:
"The relevant specification is available as OMG documents
formal/02-05-03, formal/02-05-04 and dtc/02-02-01."
- at the end of LSID section:
"The relevant specification is available as OMG documents
formal/2004-12-01 and dtc/04-08-03."
Add the following to the Section 6 (References):
[9] Bibliographic Query Service, an OMG specification,
http://www.omg.org/technology/documents/formal/bibliographic_query.htm
[10] Life Sciences Identifier, an OMG specification,
http://www.omg.org/technology/documents/formal/life_sciences.htm
Discussion:
Some references are considered normative and some are not. Those that are normative should go to the section "Normative references", the others to "References". Our understanding is that there is only one normative reference: LSID. A special case may be the BQS reference. Normally, it should be considered normative as well. But because it was written for CORBA we cannot use it here "by reference"; we use it "by copying". Doing that puts the BQS reference into the normal (not normative) reference section.
Resolution:
· Add the following to Section 1.5
- at the end of BQS section:
"The relevant specification is available as OMG documents formal/02-05-03, formal/02-05-04, and doc/02-02-01."
- at the end of LSID section:
"The relevant specification is available as OMG documents formal/2004-12-01 and dtc/04-08-03."
· Add the following to Section 6 (References):
[9] Bibliographic Query Service, an OMG specification:
http://www.omg.org/technology/documents/formal/bibliographic_query.htm
[10] Life Sciences Identifier, an OMG specification:
http://www.omg.org/technology/documents/formal/life_sciences.htm
Disposition: Resolved
6. Section 3.1.1 - In looking at the model and the XML Schema, it is not clear how the model maps to the schema. For example, in looking at Assayed_genomic_genotype, the association cardinalities (multiplicities if you will) map differently. For example, the '* *' between Polymorphism_Assay and Molecular_Sample maps to minOccurs = 1, maxOccurs = 1, but the one between Polymorphism_Assay and Consensus_genomic_genotype maps to minOccurs = 0, maxOccurs = 1. So it is not clear if the semantics in the diagram and the XSD are consistent.
Discussion: will be solved by new conversion software tool
Proposed solution: not yet
Discussion:
Will be solved by a new conversion software tool.
Resolution:
The change solving this issue is described elsewhere, in another issue or issues. The general solution is that a new conversion tool, XSDMaker (see issue 8709), solves these inconsistencies.
Disposition: Resolved
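The inconsistency reported above is easier to see against an explicit rule. The following is a minimal sketch of the conventional reading of a UML multiplicity as XSD minOccurs/maxOccurs values. It is not the documented behaviour of XSDMaker; the function name multiplicity_to_occurs is hypothetical and introduced only for illustration.

```python
from typing import Tuple


def multiplicity_to_occurs(multiplicity: str) -> Tuple[str, str]:
    """Map a UML multiplicity string to (minOccurs, maxOccurs).

    Assumed convention (not taken from the specification):
    '*' means 0..*, a single number n means n..n, and an upper
    bound of '*' becomes the XSD keyword 'unbounded'.
    """
    text = multiplicity.strip()
    if ".." in text:
        lower, upper = (part.strip() for part in text.split("..", 1))
    elif text == "*":
        lower, upper = "0", "*"
    else:
        lower = upper = text

    # Reject anything that is not a non-negative integer or '*'.
    if not lower.isdigit() or (upper != "*" and not upper.isdigit()):
        raise ValueError(f"unsupported multiplicity: {multiplicity!r}")

    return lower, "unbounded" if upper == "*" else upper
```

Under this reading, an association end marked '*' would be expected to produce minOccurs="0" and maxOccurs="unbounded", which is why a 1/1 or 0/1 mapping for a '* *' association looks inconsistent.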
7. Section 3.18 - Page 20 - there is a discrepancy between the diagram and the XSD. The XSD has an attribute 'url' - this is not in the diagram. It is not in the model either for the DB-xref class (I checked the XMI file) - Is there some other mapping rule that creates this attribute?
Discussion: To modify PIM to add an attribute URL to the DB-xref class. The rest will be solved by new conversion software tool.
Proposed solution: not yet
Resolution:
The change is solved by other issues: 8709 and 8796
Discussion:
To modify PIM to add an attribute URL to the DB-xref class. The rest will be solved by new conversion software tool.
Disposition: Resolved
8. Various - some associations are 'derived' - is this just documentation only or is there some derivation algorithm? Proposed solution: Add a bullet to the beginning of Section 3 with the following text: "The words 'derived from' are used in the UML model as comments, they do not imply any specific derivation algorithm."
Resolution: Add a bullet to the beginning of Section 3 with the following text: "The words 'derived from' are used in the UML model as comments, they do not imply any specific derivation algorithm." Disposition: Resolved
- Section 4.1 - XLM instead of XML,
- Section 2.0 - 'researches' - should be 'researchers'.
- Section 3.1.1 - XSD and model refer to Anatomic_location, the text refers to 'Anatomical_location'.
- Section 3.14 - twice "information" in the phrase "contain information information".
Proposed solution: Change typos in sections 4.1, 2.0, 3.1.1 as reported.
Discussion:
The term "Anatomic_location" will be used to name a class, the term "anatomical location" could be used in descriptive text referring to this class.
Resolution:
Change typos in sections 4.1, 2.0, 3.1.1 as follows:
- Section 4.1 - XLM instead of XML,
- Section 2.0 - 'researches' - should be 'researchers'.
- Section 3.1.1 - XSD and model refer to Anatomic_location, the text
refers to 'Anatomical_location'.
- Section 3.14 - twice "information" in the phrase "contain information
information".
Disposition: Resolved
10. Section 2.1 - The UML diagram here and elsewhere should have a figure # for easier cross reference.
Proposed solution: Add numbers to all figures, in the order they appear in the document.
Resolution:
Add numbers to all figures, in the order they appear in the document.
Disposition: Resolved
Added attribute 'url:string' into Db_xref. Definition: Full URL to the cross-referenced entry
The "Proposed resolution" for all of them is the following:
-----------------------------------------------------------
* apply the change into the UML diagram of the PIM model (make a new figure)
* apply the change into the XMI file created from the UML diagram (update the accompanying file)
* apply the change in the section "3.2 Model classes, attributes and associations (details)" (re-generate the whole section from the XMI)
Added attribute 'version:string' into Db_xref Definition: Version of the database
The "Proposed resolution" for all of them is the following:
-----------------------------------------------------------
* apply the change into the UML diagram of the PIM model (make a new figure)
* apply the change into the XMI file created from the UML diagram (update the accompanying file)
* apply the change in the section "3.2 Model classes, attributes and associations (details)" (re-generate the whole section from the XMI)
- moved attributes 'size' and 'count_unit' from Population to Panel
- made Individual to inherit from Population
- removed association from Individual to Geographic_location and
Taxon as they are now inherited from Population
- removed associations from Molecular_sample to Panel and
Individual and replaced them with one association to Population
- Modified the association from Individual to Panel to allow more
than one Panel
The net result is that Individual can have new attributes
- race
- ethnicity
- primary_language
- language_family
and it can be associated to more than one Panel
The "Proposed resolution" for all of them is the following:
-----------------------------------------------------------
* apply the change into the UML diagram of the PIM model (make a new figure)
* apply the change into the XMI file created from the UML diagram (update the accompanying file)
* apply the change in the section "3.2 Model classes, attributes and associations (details)" (re-generate the whole section from the XMI)
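The restructuring listed above can be pictured with a small class sketch. It is only an illustration under stated assumptions: that the four new attributes live on Population (so Individual gains them by inheritance), that Panel also inherits from Population, and that 'size' is an integer and 'count_unit' a string. The field names panels and source are hypothetical and do not come from the specification.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Population:
    """Abstract in the model; assumed here to carry the shared attributes."""
    race: Optional[str] = None
    ethnicity: Optional[str] = None
    primary_language: Optional[str] = None
    language_family: Optional[str] = None


@dataclass
class Panel(Population):
    """'size' and 'count_unit' were moved here from Population."""
    size: Optional[int] = None          # assumed type
    count_unit: Optional[str] = None    # assumed type


@dataclass
class Individual(Population):
    """Inherits from Population and may belong to more than one Panel."""
    panels: List[Panel] = field(default_factory=list)  # hypothetical name


@dataclass
class MolecularSample:
    """One association to Population replaces those to Panel and Individual."""
    source: Optional[Population] = None  # hypothetical name
```

With this shape, a sample can point at either an Individual or a Panel through the single Population-typed association, and an Individual does not carry 'size' or 'count_unit'.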
The type of all attributes of Geographic_location is now 'double' to allow fractional degrees. Definition: Location of an individual or population in a geographic map. Locations are expressed in decimal degrees. Northern latitudes and eastern longitudes have positive values by convention.
The "Proposed resolution" for all of them is the following:
-----------------------------------------------------------
* apply the change into the UML diagram of the PIM model (make a new figure)
* apply the change into the XMI file created from the UML diagram (update the accompanying file)
* apply the change in the section "3.2 Model classes, attributes and associations (details)" (re-generate the whole section from the XMI)
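The sign convention in the Geographic_location definition can be illustrated with a short conversion from degrees, minutes and seconds to decimal degrees. This is a sketch only; the function name to_decimal_degrees and the hemisphere letters N/S/E/W are assumptions for illustration and are not part of the model.

```python
def to_decimal_degrees(degrees: float, minutes: float = 0.0,
                       seconds: float = 0.0, hemisphere: str = "N") -> float:
    """Convert a degrees/minutes/seconds reading to signed decimal degrees.

    Northern latitudes and eastern longitudes are positive;
    southern latitudes and western longitudes are negative.
    """
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError(f"unknown hemisphere: {hemisphere!r}")
    if degrees < 0 or minutes < 0 or seconds < 0:
        raise ValueError("pass magnitudes only; the sign comes from the hemisphere")

    magnitude = degrees + minutes / 60.0 + seconds / 3600.0
    return magnitude if hemisphere in ("N", "E") else -magnitude
```

For example, 35 degrees 30 minutes north becomes 35.5, and 58 degrees 15 minutes west becomes -58.25, values that need a 'double' rather than an integer type.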
Panel has new attribute 'type' that allows us to annotate how Panel is used. Among the possible uses are: plate, population sample, family.
The "Proposed resolution" for all of them is the following:
-----------------------------------------------------------
* apply the change into the UML diagram of the PIM model (make a new figure)
* apply the change into the XMI file created from the UML diagram (update the accompanying file)
* apply the change in the section "3.2 Model classes, attributes and associations (details)" (re-generate the whole section from the XMI)
Abstract class Ordered_location now has attribute 'position:string' that allows for giving a value to inheriting classes, e.g. Cytogenetic_location
The "Proposed resolution" for all of them is the following:
-----------------------------------------------------------
* apply the change into the UML diagram of the PIM model (make a new figure)
* apply the change into the XMI file created from the UML diagram (update the accompanying file)
* apply the change in the section "3.2 Model classes, attributes and associations (details)" (re-generate the whole section from the XMI)
- Value of 'strand' attribute of Reference_genomic_location is now expressed as a 'string' rather than an 'integer'. Valid values are 'forward', 'reverse' and 'unknown'. 'unknown' is default.
The "Proposed resolution" for all of them is the following:
-----------------------------------------------------------
* apply the change into the UML diagram of the PIM model (make a new figure)
* apply the change into the XMI file created from the UML diagram (update the accompanying file)
* apply the change in the section "3.2 Model classes, attributes and associations (details)" (re-generate the whole section from the XMI)
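The new 'strand' value space can be sketched as an enumeration with 'unknown' as the default. This is an illustrative reading of the change, not code from the specification; the names Strand and parse_strand are hypothetical. The earlier integer encoding is not described here, so the sketch makes no attempt to convert old values.

```python
from enum import Enum
from typing import Optional


class Strand(Enum):
    """The three valid string values of the 'strand' attribute."""
    FORWARD = "forward"
    REVERSE = "reverse"
    UNKNOWN = "unknown"


def parse_strand(value: Optional[str] = None) -> Strand:
    """Return the Strand for a string value, defaulting to UNKNOWN when absent."""
    if value is None:
        return Strand.UNKNOWN
    try:
        return Strand(value)
    except ValueError:
        raise ValueError(
            f"invalid strand {value!r}; expected 'forward', 'reverse' or 'unknown'"
        ) from None
```

A missing value falls back to Strand.UNKNOWN, while anything outside the three listed strings is rejected rather than silently accepted.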
Removed unnecessary multiple inheritances - Individual and Sequence were inherited explicitly from Identifiable.
The "Proposed resolution" for all of them is the following:
-----------------------------------------------------------
* apply the change into the UML diagram of the PIM model (make a new figure)
* apply the change into the XMI file created from the UML diagram (update the accompanying file)
* apply the change in the section "3.2 Model classes, attributes and associations (details)" (re-generate the whole section from the XMI)
Added abstract class Population to have associations visible
The "Proposed resolution" for all of them is the following:
-----------------------------------------------------------
* apply the change into the UML diagram of the PIM model (make a new figure)
* apply the change into the XMI file created from the UML diagram (update the accompanying file)
* apply the change in the section "3.2 Model classes, attributes and associations (details)" (re-generate the whole section from the XMI)
- The model closely follows the OMG Bibliographic Query Service data model. The only differences are the way bibref subject descriptors are handled (giving more flexibility to mix different vocabularies) and the addition of a boolean 'et_at' attribute to Person to indicate that the list of, for example, authors is not complete.
- All the added classes except Bibref_subject and Bibref_description inherit from Identifiable and can have additional properties added to them using Annotation and Db_xref. These two classes are mere groupings of attributes of Bibliographic_reference and cannot exist independently.
The "Proposed resolution" for all of them is the following:
-----------------------------------------------------------
* apply the change into the UML diagram of the PIM model (make a new figure)
* apply the change into the XMI file created from the UML diagram (update the accompanying file)
* apply the change in the section "3.2 Model classes, attributes and associations (details)" (re-generate the whole section from the XMI)