The PAGE-OM model (Brookes) captures information related to genotype and phenotype observations and their relationships. The core conceptual domain is the experiment part (see Figure 01 - Association study), which brings in data from the phenotype domain (Figure 04 - Phenotype overview) and the genotype domain (Figures 03 - Genotype overview, 05 - Genotype in detail and 06 - Frequency), along with experimental result information that elucidates how genetic variations influence phenotypic variation.

A variable site in a Reference_genomic_landmark sequence. Synonyms: polymorphic site, marker (Genomic_polymorphism in SNP-PML). Downstream flanking sequence (at least 25 residues, if possible). Upstream flanking sequence (at least 25 residues, if possible). Proven phenotype change causing mutation. If type is 'microsatellite', gives the repeat unit, e.g. "CA". The type of the polymorphism, e.g. SNP, microsatellite, indel, translocation, ... Validation status, e.g. "Proven", "Suspected". One of several alternative DNA sequences of a Reference_genomic_location as it appears in the population of organisms. Synonyms: variant, allele. Indicates the alphabet of the sequence molecule, e.g. 'DNA', 'RNA', 'protein'. The residue sequence string. Date of creation of the object. Date of deletion of the object. Life Science Identifier. Date of last modification of the object. Name can be non-unique. Display name. Size class for microsatellite alleles when exact size cannot be determined. Semantic name. If the Genomic_variation type is 'microsatellite', gives the number of repeat units as value, e.g. 7. If the Genomic_variation type is 'microsatellite', the value is true if the allele region consists of repeat units only. A single member of a species, where a species is an accessioned taxon defined by a public database, and the individual is accessioned in a public or private database. Synonym: "inbred strain" in homozygous lineages. Date of birth of the individual. 
May be better abbreviated to birth year to protect the privacy of the individual. Date of death of the individual. May be better abbreviated to plain year to protect the privacy of the individual. Id of the father to allow building of pedigrees. Recommended values are 'unknown', 'male' and 'female'. Additional values can be used to denote unusual karyotypes. Id of the mother to allow building of pedigrees. A set of samples from individuals drawn from the same species and used for genetic studies. A panel must be identifiable with a list of accessioned individuals, if possible. A panel can have subpanels. Synonyms: SampleSet, Sample from population(s), "Plate" in the Coriell sense. Values are 'chromosome' and 'individual'. Default is 'individual'. True if accessioned individuals are not available. The size of the sample. Note that the count_unit field affects how this value is interpreted. Optional identifier of the panel category, e.g. plate, family, population sample. An experimental lab protocol and set of reagents for detecting the Genomic_alleles of Genomic_variations carried by an individual or a panel of individuals. Synonym: Assay. This is the non-instrument part of the experiment: the same assay can be used in different instrument runs. Free text description of the assay protocol. The result of applying a variation assay to an individual, to reveal one or more of the genomic alleles carried by that individual. This term applies to the observed data rather than to the inferred state of the individual. Thus the same individual might have several different genotypes at the same site, where the variation might be due to differing assays, experimental error, dominant systems, missing data, and so forth. Synonym: Measurement. Failure of assay. Value is true if the assay has failed. Quality score of measurement. Depends on the instrument. This class represents the consensus from several experiments providing genotypes of the same sample on the same site. 
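The Individual and Panel descriptions above can be pictured as two small record types. The following is only a minimal sketch in Python, not part of the specification: apart from count_unit, which the text names, every class name, field name, and helper here (Individual, Panel, birth_year, father_id, mother_id, describe_size, ancestors) is a hypothetical choice made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Individual:
    # Field names are hypothetical; the text only describes their meaning.
    id: str
    sex: str = "unknown"               # recommended: 'unknown', 'male', 'female'
    birth_year: Optional[int] = None   # year only, to protect privacy
    father_id: Optional[str] = None    # pedigree link
    mother_id: Optional[str] = None    # pedigree link


@dataclass
class Panel:
    name: str
    size: int
    count_unit: str = "individual"     # 'chromosome' or 'individual'
    individuals: List[Individual] = field(default_factory=list)
    subpanels: List["Panel"] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.count_unit not in ("chromosome", "individual"):
            raise ValueError(f"unknown count_unit: {self.count_unit!r}")

    def describe_size(self) -> str:
        # The meaning of `size` depends on count_unit, as the text notes.
        return f"{self.size} {self.count_unit}s"


def ancestors(person_id: str, by_id: Dict[str, Individual]) -> List[str]:
    """Follow father_id/mother_id links upward and return the ancestor ids found."""
    found: List[str] = []
    stack = [person_id]
    while stack:
        current = by_id.get(stack.pop())
        if current is None:
            continue  # parent is referenced but not accessioned here
        for parent_id in (current.father_id, current.mother_id):
            if parent_id is not None and parent_id not in found:
                found.append(parent_id)
                stack.append(parent_id)
    return found
```

Storing parents as ids rather than object references mirrors the text, which describes the father and mother entries as ids used to build pedigrees; a pedigree can then be reconstructed even when some ancestors are missing from the data set.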
A set of Genomic_alleles across an equal number of Genomic_variations in a single chromosome and in a single individual. The Genomic_haplotype is derived from a set of Consensus_genomic_genotypes. For each Genomic_variation, the haplotype contains one and only one Genomic_allele. Furthermore, the Genomic_alleles are required to be in phase on the individual, meaning that they are located on the same contiguous strand of DNA. Synonym: Haplotype. The frequency with which a particular Genomic_allele is seen in a particular Panel. This frequency can be measured from pooled samples. Synonyms: Genomic_allele_panel_frequency, allele_frequency. A location within a Reference_genomic_landmark. Attributes of the location are the Reference_genomic_assembly and/or the Reference_genomic_landmark, the start and end range and strand of the feature relative to the Reference_genomic_landmark. End of the location in the reference sequence. Start of the location in the reference sequence. Orientation of the feature in the reference sequence. One of 'forward', 'reverse', 'unknown'. Defaults to 'unknown'. A structure of a gene expressed as location of the CDS and exons. Defines genic coordinate system from start of the CDS downstream. Gene symbol for the gene, e.g. as approved by the HUGO nomenclature committee. Genomic variation with location in genic coordinates. Synonym: mutation (when change from a common allele affects phenotype). Change in the quality or quantity of the mature RNA product. The new codon in the transcript, if applicable. The first affected nucleotide in the codon. Values are: 1, 2 or 3. The affected codon in the transcript. Change in the quality or quantity of (predicted) polypeptide chain (2D). A sample from an Individual or from a Panel defining the molecule and tissue/cell used (Anatomic_locations) in the Variation_assay. Synonym: Sample of individual. The molecule (RNA, DNA, protein) used in the assay. Change in the 3D structure of the polypeptide chain. 
Change in the function of the final gene product. Collection of variable nucleotides (Genomic_alleles in Genomic_variations) that define a gene. In older usage, a synonym of locus. Large (spanning a few kb to >100 kb) blocks of Genomic_alleles in linkage disequilibrium (LD) and a few haplotypes per block, separated by regions of recombination. Map of haplotypes. Features include: block length distribution, measures of block variability, relative proportions of common haplotypes, block coverage of chromosomes and/or genome. LD and other values between haplotypes, markers, alleles. Association class describing methods used to derive Genomic_haplotypes from Consensus_genomic_genotypes. Frequency of a Consensus_genomic_genotype in a Panel. OSAGE-OM Has many-to-one relationship to Latent_genotype (Consensus_genomic_genotype in SNP-PML). Heterozygosity (Heterozygosity) is a measure of observed variability of a polymorphic site (Genomic_variation) in a sub-population (Panel). Frequency of a Genomic_haplotype in a Panel. Another Genomic_variation close enough to affect the primer design. A location in one chromosome of a reference genomic assembly. Instead of the reference sequence being an accessioned sequence, it is a versioned assembly. Name of the chromosome in the assembly. Abstract class for frequencies, expressed in percentages. Alleles (Genomic_alleles), genotypes (Consensus_genomic_genotype) and haplotypes (Genomic_haplotype) can have measured frequencies in population samples (Panels). In addition, heterozygosity (Heterozygosity) is a measure of observed variability of a polymorphic site (Genomic_variation) in a sub-population (Panel). Total number. Value of frequency (%). An extension point for collections of haplotypes. An interbreeding set of individuals, from whom a Panel is drawn (Population in SNP-PML). Extends Abstract_observation_target, which is the abstract class for all entities from which one can make genotype or phenotype measurements or observations. 
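As a worked illustration of the frequency descriptions above, the sketch below computes allele frequencies in percent and an observed heterozygosity for one polymorphic site in a panel. It is a minimal sketch under stated assumptions, not the specification's method: it assumes diploid genotypes given as pairs of allele strings, it takes observed heterozygosity to be the share of heterozygous genotypes (one common definition; the text does not give a formula), and the function names are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Assumption: each genotype is the pair of alleles observed at one site
# in one diploid individual of the panel.
Genotype = Tuple[str, str]


def allele_frequencies(genotypes: List[Genotype]) -> Dict[str, float]:
    """Return each allele's frequency in percent; the count unit is chromosomes."""
    if not genotypes:
        raise ValueError("panel has no genotypes")
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())  # the "total number" the percentage refers to
    return {allele: 100.0 * n / total for allele, n in counts.items()}


def observed_heterozygosity(genotypes: List[Genotype]) -> float:
    """Return the percentage of genotypes whose two alleles differ."""
    if not genotypes:
        raise ValueError("panel has no genotypes")
    heterozygous = sum(1 for a, b in genotypes if a != b)
    return 100.0 * heterozygous / len(genotypes)


panel_genotypes = [("A", "A"), ("A", "G"), ("A", "A"), ("A", "G")]
print(allele_frequencies(panel_genotypes))       # A: 6 of 8 alleles, G: 2 of 8
print(observed_heterozygosity(panel_genotypes))  # 2 of 4 genotypes differ
```

Reporting the total alongside the percentage matters because the same percentage carries different weight for a panel of 4 individuals than for one of 4,000.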
Additional ethnic category of the population sample or "mixed". Language family name or code, e.g. as in Ethnologue. Language spoken (name or code), e.g. as in Ethnologue. Broad ethnic category of the population sample or "mixed". The class contains information on the measurement of samples, done on a physical device connected to a plate. This information includes time of execution, name of instrument, etc. Name of the instrument. Date of run. A sample holder, for example a microtiter plate used in one or many runs, represented by instances of Run. Samples, represented by instances of Molecular_sample, are positioned on the plate using instances of Location_on_plate. X,Y plate_positions (wells). Numbering starts from one. Each well can contain one or more observation targets (molecular samples) prepared for measurement using one or more variation assays (e.g. assay multiplexing, Assay_set). Note: These are optional laboratory-specific details (Sample and Assay information is in Assayed_genomic_genotype). x coordinate of plate. y coordinate of plate. An extension point for other kinds of runs. Potentially existing genotypes on a specific site that could be observed by Variation_assays. Application of a Variation_assay on one Molecular_sample generates a single Latent_genotype which has one or more Latent_genotype_specifications (this depends on ploidy level in the case of Genomic_allele). Latent_genotypes associated to one instance of a Variation_assay can have only one type of Latent_genotype_specifications, as defined by Defining_feature. This class is a holder for one or many observable variation objects (Latent_genotype_specification). Latent_genotype is used to attach possible variations to measurements (Assayed_genomic_genotype), variation assays (Variation_assay) and marker loci (Genomic_variation). Abstract super class of observable variation objects, like alleles, melting temperatures (Melting_temperature), band sizes (Band_size). 
The class is an extension point to other kinds of variations. DNA fragment length estimated from gel electrophoresis. The temperature at which DNA goes from a double-stranded to a single-stranded state. Unit of temperature is Celsius. Measurable feature of observable (e.g. size of nose). An extension point for kinds of values. It is an abstract class for all entities from which one can make genotype or phenotype measurements or observations. It deals with entities capable of being observed. An association class that has a list of values, which are used in defining the instance of Latent_genotype_specifications (for example intensity values used in allele calling). The class captures information on how alleles are called (observed) from raw measurement values like intensity values. Type of feature. Inclusive value range. Maximum value. Minimum value. Multi_variation_assay is a collection of assays which may be used simultaneously. Examples would be multiplex assays, micro-array based assays, or a panel of single-plex assays that share some common feature or purpose. Enumeration contains a list of Values. Family or case-control based association study experiment. Represents a set of experiment sub-sections that would normally be listed in the results section in manuscripts. Objective of experiment. A free text description summarizing the outcome of all experiment results in this correlation experiment. Identifier of original study. Can be used in cases where the experiment was originally done for a different study. Type of experiment. Abstract class. Extension point for Value implementations. The Value model is based on a concept developed in the Generation Challenge Program: http://pantheon.generationcp.org/demeter/Values.html. The contents of a Value can be limited by Constraints. Different types of Constraints allow various ways to limit or validate one or more Value instances. The Constraint superclass only stores a string description of the Constraint. 
The actual full semantics of a constraint are specified in various subclasses. However, PAGE-OM itself defines no such subclasses because they are out of its scope. Description. Value of type string. Actual value. Numeric value. Error value is a numeric value of accuracy. Quality score. Value of type integer. Integer value. Value of type float. Value. Observable features can be measured by different methods. This class specifies which method has been used. For example, a method can be usage of a ruler or filling in a questionnaire. An extension point for other kinds of observable features. Observable part of the structure, function or behavior of a living organism. Circumstances, objects, or conditions by which one is surrounded. Way of life of an individual or panel. All features considered by this model can be categorized by using this class. The category should be expressed by an ontology term. Specialized category of features representing diseases. Genomic observation. Unit of value. Unit is defined using an ontology term. Type of unit. Evidence can be an EvidenceCode (which is a controlled vocabulary term such as a GO evidence code or ICIS Method code) but can be a more fully documented Evidence object (inheriting from EvidenceCode) generally curated by a specified person, a curator modeled as a Contact. Its strength is expressed by the score (which is usually a numeric value between 0 and 1, but other types of Value are also allowed, e.g. an ontology term value). The core of an evidence is its supporting source, which can be anything (because it is identified by a SimpleIdentifier). Usual evidence sources are BiblioReferences, Studies and OntologyTerms. Reference (generationcp - http://pantheon.generationcp.org/demeter/Features.html). Evidence code as specified using an ontology term. Evidence indicates reliability of a feature or simply documents its authoritative origin. 
Curator of evidence. Score of value. A reasoned judgment of an experiment. Probability value. Accuracy code contains information on incompleteness of time of measurement or information on the reason why the time of measurement is unknown or incomplete. Accuracy code as defined in a specific ontology. Value of type boolean. Boolean value. Set of frequencies. Step-by-step procedure for solving a problem. Description of algorithm. Free text description of hypothesis of study. Description of hypothesis. Association study is the core concept of the specification. It captures relationships between phenotypes and genotypes. It is an examination of genetic variation across the genome, designed to identify genetic associations with observable phenotypes. Association studies are results of correlation experiments. An extension point for adding other kinds of studies in the future. Abstract. Acknowledgements. Background information. Summarizing conclusion for all experiments in this study. Key findings. Limitations. Summarizing objective for all experiments in this study. Possible source of bias. Study design. Power of study. Reason for study size. Submission date of study. Title of study. Date when study is updated. The experiment result (for example a single p-value) gathers the correlation between genomic observation and phenotypic observed values. A correlation experiment can consist of more than one experiment result. Observation done at a specific point in time. Time of observation. This class does not contain any scientific meaning. Its main purpose is to be the root element for situations where this specification is used for data exchange formats (e.g. xml-schema). Therefore, it has optional direct associations to all important classes so that implementations can exchange only relevant data.
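The containment described above (an association study made of correlation experiments, each holding one or more experiment results, all hanging off a root element for data exchange) might be sketched as follows. This is an illustrative Python sketch only: the class names, field names, and XML element names (exchange_root, association_study, correlation_experiment, experiment_result) are assumptions, not names taken from the PAGE-OM schema.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List


@dataclass
class ExperimentResult:
    p_value: float  # e.g. a single p-value, as mentioned in the text


@dataclass
class CorrelationExperiment:
    objective: str
    results: List[ExperimentResult] = field(default_factory=list)


@dataclass
class AssociationStudy:
    title: str
    experiments: List[CorrelationExperiment] = field(default_factory=list)


def to_exchange_xml(studies: List[AssociationStudy]) -> str:
    """Serialize studies under one root element (element names are hypothetical)."""
    root = ET.Element("exchange_root")
    for study in studies:
        study_el = ET.SubElement(root, "association_study", title=study.title)
        for experiment in study.experiments:
            exp_el = ET.SubElement(
                study_el, "correlation_experiment", objective=experiment.objective
            )
            for result in experiment.results:
                ET.SubElement(exp_el, "experiment_result", p_value=repr(result.p_value))
    return ET.tostring(root, encoding="unicode")


study = AssociationStudy(
    title="Example study",
    experiments=[
        CorrelationExperiment("Test one marker", [ExperimentResult(0.01), ExperimentResult(0.2)])
    ],
)
print(to_exchange_xml([study]))
```

Because every association from the root is optional, an exchange document that carries no studies is still a valid, empty root; the sketch reflects this by emitting only the children that are present.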