

A Molecular Hyperglossary: Organic Molecular information in Hypermedia Form

Chris Leach[a], Peter Murray-Rust[b] and Henry S. Rzepa[a]

[a]Department of Chemistry, Imperial College of Science, Technology and Medicine, London, UK and [b] Glaxo Research & Development, Stevenage, Herts, UK.


Contents: Introduction, Implementation, Features, Conclusions, Acknowledgements.

Introduction

The dissemination of molecular information through media such as scholarly journals or conference proceedings has hitherto relied largely on the technology of the printed page. In organic chemistry, which can be dominated by the need to convey accurately two- and three-dimensional information about molecular structure, connectivity and stereochemistry, the printed page has presented particular challenges. A host of conventions has to be assimilated to convey in an error-free way the stereochemistry of, say, a natural product. Indexing of such content has largely been based on converting the molecular structure to text-based nomenclature, sometimes following formal IUPAC conventions, more often using trivial names. Thus finding structural or stereochemical information on the printed pages of a journal can be a very hit-and-miss procedure.

At conferences, there is rarely the opportunity to index the abstracts, papers and posters in such a way that visitors can run structured searches of the conference whilst they are actually attending it. Furthermore, during a conference presentation using, say, 35 mm slides, structural information is often gained subliminally or in broad concept rather than in specific detail. Certainly few chemists would be entirely happy basing laboratory work on a half-remembered structure copied down from a conference slide they may have seen for only a few seconds.

The possibility of offering both conferences and journals via the mechanism of the World-Wide Web allows the limitations referred to above to be addressed in a radically new way. The World-Wide Web (or simply the "Web") is a mechanism that enables information from a diverse collection of sources to be viewed easily. Using a further protocol proposed by us and known as chemical-mime, it is possible to transfer organic chemical information from an information server to the reader in a manner which allows the reader to be more involved with the information content. It is also possible to reverse the direction of flow, and have the reader contribute new molecular information, creating two-way communication between the reader and the information source, and hence in effect between different readers all using a common information source. This is of course particularly important when considering the "conference" as a medium, although having this happen in a "journal" environment raises other interesting issues such as peer review and quality control.
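As a rough illustration of the chemical-mime idea, the sketch below maps the file extension of a molecule URL to a chemical media type, which is the label a server would attach so that the reader's browser can hand the file to a suitable viewer. The extension table and the function name `chemical_mime_type` are assumptions made for this example and are not taken from the hyperglossary itself.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Assumed extension-to-type table for illustration only; a real server
# would carry a fuller, site-specific list.
CHEMICAL_MIME_TYPES = {
    ".pdb": "chemical/x-pdb",
    ".mol": "chemical/x-mdl-molfile",
    ".xyz": "chemical/x-xyz",
}


def chemical_mime_type(url: str) -> str:
    """Return the media type to announce for a molecule file URL.

    Falls back to text/plain when the extension is not in the table.
    """
    # Look only at the path part of the URL, ignoring host and query.
    suffix = PurePosixPath(urlparse(url).path).suffix.lower()
    return CHEMICAL_MIME_TYPES.get(suffix, "text/plain")


if __name__ == "__main__":
    print(chemical_mime_type("http://yoursite/yourdirectory/yourmolecule.pdb"))
```

A browser configured with a helper application for the announced type can then open the structure in a 3D viewer rather than displaying raw coordinates.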

To illustrate these various themes, we have created what we have termed a "molecular hyperglossary" in electronic form, associated with the ECTOC conference of which this paper is a component.

Another example of a working hyperglossary on the Web is the one associated with the Internet Course on The Principles of Protein Structure. This glossary is a collection of definitions of terms and molecules that people, whether tutors or students, have added to the course. These definitions can be revised by anyone, with the aim of building up a wealth of relevant auxiliary information. There is also a facility that allows links to be added between definitions where people believe a common theme exists.

Implementation

This hyperglossary allows the reader to enter 2D or 3D structural information about organic molecules, together with comments, into a so-called electronic "form", and to have this information accepted by the remote server after appropriate self-consistency checks have been performed. By this, we mean that information that cannot be interpreted as a molecule, or that fails simple formula or valency checks, will be rejected. The sophistication of these checks is of course under the control of the designer of the hyperglossary. Once this stage has been successfully passed, the information can then be made instantly available to the rest of the world. Some of the technical features of this hyperglossary are illustrated below. The actual FORM used to input data is shown below:

ECTOC-1 Paper Number (e.g. 1-77 or general if your molecule has no paper association)

Molecule name/subject

Keywords for indexing (e.g. enzymatic reduction ketones)

Original URL of Molecule (if any, e.g. http://yoursite/yourdirectory/yourmolecule.pdb)

Comment on the molecule (will be truncated at 256 characters)

Type of molecule file

Please paste the molecule file into this area (as TEXT)
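The kind of server-side check described above can be sketched roughly as follows. This is a minimal illustration and not the code used by the hyperglossary: it assumes the pasted molecule file has already been parsed into a list of element symbols and a list of bonds, and the names `truncate_comment`, `valency_errors` and the `MAX_VALENCE` table are hypothetical. Only the 256-character comment limit comes from the form itself.

```python
# The 256-character cap is the figure stated on the form.
MAX_COMMENT_CHARS = 256

# Assumed maximum valences for a few common neutral atoms. This is a
# deliberate simplification: charged or hypervalent species (ammonium,
# nitro groups, sulfones) would need a richer model.
MAX_VALENCE = {"H": 1, "C": 4, "N": 3, "O": 2, "F": 1, "Cl": 1, "Br": 1}


def truncate_comment(comment):
    """Cut the free-text comment down to the length the form allows."""
    return comment[:MAX_COMMENT_CHARS]


def valency_errors(atoms, bonds):
    """Return a list of problems found; an empty list means the check passed.

    atoms: element symbols, e.g. ["C", "O", "H"]
    bonds: (index_a, index_b, bond_order) tuples, indices into atoms
    """
    errors = []
    totals = [0] * len(atoms)
    for a, b, order in bonds:
        # A bond that points at a missing atom cannot be interpreted.
        if not (0 <= a < len(atoms) and 0 <= b < len(atoms)):
            errors.append(f"bond ({a}, {b}) refers to a missing atom")
            continue
        totals[a] += order
        totals[b] += order
    for index, (symbol, total) in enumerate(zip(atoms, totals)):
        limit = MAX_VALENCE.get(symbol)
        if limit is None:
            errors.append(f"atom {index}: element {symbol!r} not in table")
        elif total > limit:
            errors.append(
                f"atom {index} ({symbol}): bond order sum {total} exceeds {limit}"
            )
    return errors


if __name__ == "__main__":
    # Methanol, CH3OH: carbon 0, oxygen 1, hydrogens 2 to 5.
    atoms = ["C", "O", "H", "H", "H", "H"]
    bonds = [(0, 1, 1), (0, 2, 1), (0, 3, 1), (0, 4, 1), (1, 5, 1)]
    print(valency_errors(atoms, bonds))
```

A submission would be rejected when the returned list is non-empty. How strict the table and the rules are remains, as noted above, a choice for the designer of the hyperglossary.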




Features of the Hyperglossary


Conclusions

In introducing the concept of an on-line molecular hyperglossary, it was our intention to create a collaborative environment where bibliographic and 2D/3D molecular data could be exchanged between the participants of a conference. Some degree of quality control can be automated, and such a database can serve to impart a degree of uniformity of presentation and structure to such an event. The scope of such a hyperglossary could range from local use by a research group or department, through a component of an electronic conference (as here), to an adjunct to an electronic journal. The entire collection of information is readily indexed and, via e.g. the SMILES string, is even searchable in a sub-structure context.

Future developments of the hyperglossary will include the introduction of TGF reaction files to allow organic chemists to draw a complete reaction scheme and submit it to the hyperglossary. The hyperglossary will store the reaction scheme and create a 2D picture of the scheme automatically. The structures within the reaction could be separated and used to create 3D pictures and their corresponding 3D molecular information for viewing in 3D viewers. In due course, we anticipate that commercial implementations of these concepts will become available.

Acknowledgements

We are grateful to Pat Walters, author of Babel, for helpful discussions, and to Daylight Chemical Information Systems for their generosity in allowing us access to their program libraries.
Revised 4/08/95