Chemical Information Working Group
Membership of the Cheminformatics group is quite a mixture: you don't have to be a programmer or systems architect to participate. Several members are practising cheminformaticians or domain scientists who want to make the lives of all end users in our community easier.
- If you find yourself spending hours (or days) reformatting data, or wrestling with incompatible structure databases or CADD applications, then this is the working group for you!
- If you've got a great idea on how to make data storage, searching or retrieval easier, don't be afraid to come along and share your thoughts!
- If you'd like to spend more of your time doing cool stuff and less time wrestling with what you think should be mundane tasks, then we can help make an impact together!
Chair
Richard K. Scott, Ph.D., De Novo Pharmaceuticals
Our mission is to drive the development and adoption of flexible and robust specifications for information system components that support chemical entities in the context of drug discovery research.
Strategy
- Focus on those tasks and topics that are central to our mission
- Wherever tasks or topics are peripheral to our mission, adopt or adapt the work of other OMG groups
- Focus on maximizing the synergy between the strengths and directions of software vendors and the most significant and broad issues facing the drug discovery organizations.
Scope
The ChemInformatics Working Group (CIWG) defines its scope by defining the types of data we are concerned with, and what operations are performed with those data. We are primarily concerned with information about the composition and properties of chemical entities.
Chemical Entity Definition
- Our scope is not defined by the size of the chemical entity; "small" drug-like compounds, basic reagents, macromolecules, and mixtures are to be considered by this group.
- Our scope is not restricted to "real" entities; information gathered from simulation, or "virtual" entities is also within our interest.
- Our scope is not restricted to pure, well-defined entities; an entity may be something like "ether extract of Lumifluorus madagascarens."
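The three scope statements above can be read as constraints on a data model. The following is a minimal sketch of one way to represent them, not a structure taken from any OMG specification: the class name `ChemicalEntity`, the `Origin` enum, and all field names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Origin(Enum):
    REAL = "real"        # a physical sample exists
    VIRTUAL = "virtual"  # enumerated or simulated only


@dataclass
class ChemicalEntity:
    """Hypothetical record; names are illustrative assumptions."""
    name: str
    origin: Origin = Origin.REAL
    structure: Optional[str] = None  # e.g. a SMILES string, when one is known
    components: List["ChemicalEntity"] = field(default_factory=list)

    @property
    def is_well_defined(self) -> bool:
        # An entity is well defined if it has a structure of its own, or if
        # it lists at least one component and every component is well defined.
        if self.structure is not None:
            return True
        return bool(self.components) and all(
            c.is_well_defined for c in self.components
        )


# A small drug-like compound, a virtual library member, and an ill-defined extract.
aspirin = ChemicalEntity("aspirin", structure="CC(=O)Oc1ccccc1C(=O)O")
virtual_hit = ChemicalEntity("library member 17", origin=Origin.VIRTUAL, structure="CCO")
extract = ChemicalEntity("ether extract of Lumifluorus madagascarens")
saline = ChemicalEntity(
    "saline",
    components=[
        ChemicalEntity("water", structure="O"),
        ChemicalEntity("sodium chloride", structure="[Na+].[Cl-]"),
    ],
)
```

Nothing in the record depends on molecular size, and the extract is a valid entity even though it carries neither a structure nor a list of components.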
Information of the following types:
- Chemical structure, including but not restricted to connectivity, stereochemistry, 3D coordinates, and preferred conformations; also Markush representations, R- and X-groups, ill-defined structure definitions, and salt type.
- Chemical and Physical Properties, including molecular weight, clogP, melting point, density, surface volume, surface charge distribution.
- Reaction information, including reactants, reaction conditions, bibliographic references, notebook references
- Analytical data, including spectroscopic data, purity measurements, composition of matter.
- Pharmacological properties, including activity measurements, % inhibition measurements, toxicological measurements, and ADME measurements.
- Protocols for acquiring measurements (analytical, pharmacological, physical)
- Inventory information
- Batch and purification information
- Bibliographic references, articles, laboratory notebooks and other written works
- Computed properties, including molecular mechanics force fields, modeled properties, predicted chemical and pharmacological properties.
- Formulations of mixtures
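As a small illustration of a computed property from the list above, molecular weight can be derived from a molecular formula by summing atomic weights. This is only a sketch under stated assumptions: the function name `molecular_weight` is hypothetical, the table covers just a handful of elements with rounded standard atomic weights, and formulas with parentheses, charges, or isotopes are not handled.

```python
import re

# Rounded standard atomic weights for a few common elements (deliberately incomplete).
ATOMIC_WEIGHTS = {
    "H": 1.008,
    "C": 12.011,
    "N": 14.007,
    "O": 15.999,
    "S": 32.06,
    "Cl": 35.45,
}


def molecular_weight(formula: str) -> float:
    """Sum atomic weights for a flat formula such as 'C9H8O4'."""
    # Accept only a sequence of element symbols, each with an optional count.
    if not re.fullmatch(r"(?:[A-Z][a-z]?\d*)+", formula):
        raise ValueError(f"unsupported formula: {formula!r}")
    total = 0.0
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        if symbol not in ATOMIC_WEIGHTS:
            raise ValueError(f"no atomic weight for element {symbol!r}")
        # A missing count means a single atom of that element.
        total += ATOMIC_WEIGHTS[symbol] * (int(count) if count else 1)
    return total


aspirin_mw = molecular_weight("C9H8O4")  # about 180.16 with these weights
```

A real component would more likely take a structure than a formula as input; the formula form is used here only to keep the example self-contained.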
Operations
- Registration of chemical entities and/or reactions
- Acquiring new analytical information (e.g., purity and composition)
- Archiving of lab notebook pages
- Creation of new information through testing for chemical, physical and pharmacological properties
- Selection of one or more chemical entities through structure-based searching, including substructure searching, Markush searching, similarity searching, dissimilarity searching.
- Searching of on-line documents for references to chemical entities by name or structure
- Combinatorial library design or high-throughput chemistry operations, including creation of virtual libraries, subset selection, reagent ordering, robot instruction
- Display of molecular structures (2D, 3D) and associated information
- Simulation or prediction of molecular and pharmacological properties
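Of the operations listed above, similarity and dissimilarity searching are the easiest to sketch in a few lines. The example below assumes each chemical entity has already been reduced to a fingerprint, represented here as a set of integer feature identifiers, and ranks a library by the Tanimoto coefficient (shared features divided by total distinct features). The names `tanimoto` and `similarity_search`, the threshold default, and the toy fingerprints are all illustrative assumptions; generating fingerprints from real structures is outside this sketch.

```python
from typing import Dict, FrozenSet, List, Tuple

Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient |A & B| / |A | B|; 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def similarity_search(
    query: Fingerprint,
    library: Dict[str, Fingerprint],
    threshold: float = 0.5,
) -> List[Tuple[str, float]]:
    """Return (name, score) pairs scoring at or above threshold, best first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Toy fingerprints; the feature identifiers carry no chemical meaning.
library = {
    "compound_a": frozenset({1, 2, 3, 4}),
    "compound_b": frozenset({1, 2}),
    "compound_c": frozenset({7, 8}),
}
hits = similarity_search(frozenset({1, 2, 3}), library, threshold=0.5)
```

Dissimilarity searching can reuse the same score as `1 - tanimoto(a, b)` and keep the entries that fall above a chosen distance instead.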
Business Case for Drug Discovery Companies
In today's pharmaceutical company, it is no longer sufficient to maintain databases of chemical structures and experimental outcomes. Rather, it is necessary to capitalize on the added value that comes from adopting cutting-edge data mining techniques, experimental design and analysis, and predictions of chemical properties, and from understanding the relationships between chemical compounds and genetic sequences. Implementing these techniques requires the ability to combine best-of-breed software components efficiently.
In today's climate of ready mergers and acquisitions, delivery of material and information and scheduling of experimental workflow are becoming complex issues that are rife with site-specific needs. Efficient solutions to these ever-changing needs are most easily achieved by combining best-of-breed software components.
Pharmaceutical companies should be involved in creating standard specifications for software components in order to:
- Reduce the amount of internal software development by delineating standard specifications for interoperability among software produced by vendors
- Cooperate with experts from throughout the pharmaceutical industry in establishing specifications, as opposed to consuming internal resources in that effort.
- Participate in building software around a growing, thriving standard (CORBA) as opposed to generating custom solutions that prove difficult to maintain
- Greatly simplify the exercise of knitting together various components (several OMG participant companies are positioning themselves to do this integration work with OMG compliant vendor supplied components).
- Allow for the efficient implementation of novel techniques in data mining, analysis, experimental design, property prediction, etc.
Business Case for Chemical Information Software Vendors
It is entirely possible that today's chemical information software marketplace will transform into two parallel markets in the next few years:
1) integrated production systems; and
2) flexible component based systems for exploration, data mining, experimental design, etc.
It is not necessary for the emergence of the second market to be to the exclusion of the first. The following advantages await those chemical information software vendors that participate in the OMG technology adoption process:
- Reduce programming effort on infrastructure details by applying a standard middleware architecture
- Reduce cost and time of development by making use of standardized component technologies internally.
- Reduce cost of supporting several representations of chemical entities espoused by other vendors.
- Allow development efforts to focus on core competencies for a more streamlined development plan with respect to internal organization and external collaborations.
- Open the field up for collaborations with software companies not focused on chemical information (e.g., database companies, knowledge management tool companies).
- Participate in a forum for frank discussion of needs that pharmaceutical companies are facing.
- Participate in forging the standards for interoperability that will shape the industry tomorrow.
The Chemical Sample Access and Representation RFP has been revised again and will be presented to the Architecture Board at the Boston meeting in June 2005.
Please contact Richard K. Scott (De Novo) for more information.
Please see the agenda for the LSR activities at the Athens, Greece meeting on April 13th-14th, 2005.
Please watch the OMG web pages and the Life Sciences Research web page for more meeting details and agendas.
Following the Burlingame meeting, the near-term goals are:
- Brainstorm any new ideas for Cheminformatics initiatives
- Ensure RFP-8 is supported at the Architecture Board meeting in Boston, June 2005
- Look to recruit new members
- Engage with vendors who have recently expressed an interest in restarting Compound Collections
- Support the Evaluation Task Force for the SNP submission
- Support forming of a charter for a Finalisation Task Force (FTF) for both LSAE and SNP submissions
Compound Collections
The Compound Collections RFP-19 has been retired, despite interest from several compound suppliers who had initially intended to submit a Letter of Intent (LOI).
Further information on the RFP issuance process can be found here.
A general guide on how to get started with the OMG is here.
New Proposals
The Cheminformatics working group would like to hear from interested parties who would like to get involved in generating new proposals that address the needs of the Cheminformatics community.
The Cheminformatics Working Group will not meet in Athens. The next meeting is scheduled for Boston in June.
Meeting minutes from the Washington meeting are available.
Please send comments to: Richard K. Scott
The ChemInformatics Working Group has an email list, [email protected].
To be added to this list, send a note to [email protected] with your complete mailing information and the name of the list you want to join (cheminfo).
Comments and questions: Richard K. Scott, PhD.

Last updated by Richard K. Scott on March 18th, 2005.