Paper presented at the 14th Eurographics Conference, Imperial College, March 26-28, 1996
The Molecular Object Toolkit: A New
Generation of VRML Visualisation tools for use in Electronic Journals.
Omer Casher and Henry S. Rzepa
Department of Chemistry,
Imperial College of Science Technology and Medicine,
London, SW7 2AY. E-mail: o.casher@ic.ac.uk and rzepa@ic.ac.uk
We describe here the thinking behind our development of what we have
termed a Molecular Object Toolkit (MOT), a collection of VRML (Virtual Reality
Modelling Language) authoring tools designed to accept as input popular
molecular file formats and to produce as output VRML files. These tools are
being integrated into MOzART 1.0, a molecular VRML editor, to allow for
user-created VRML files. These tools are also being integrated into server-side
cgi-bin programs to allow for dynamically-created VRML files from molecular
data residing either in the Web server or in an external database server. Our
objective is to produce a complete MOT library for teaching and research
purposes, and to integrate it into the HyperWave server to allow for
structured maintenance of the electronic documents and molecular data. The
implications for how electronic chemistry journals, virtual libraries and
electronic conferences might use these technologies are discussed.
Virtual Reality Modelling Language and Electronic Publishing in
Chemistry.
Almost all of the world's scientific and technical literature is still
published in primary form on paper, as indeed exemplified by these conference
proceedings. In molecular science subjects in particular, this can be a
particularly restricting medium. To cope, the subject has evolved an arcane and
often obscure typeset symbolism which can easily lead to isolationism and lack
of integration with other subject areas. Although experiments in electronic
publishing in chemistry go back a surprising thirty years, the advent of the
World-Wide Web (WWW) system in the last five years has introduced an
unparalleled opportunity to "re-invent" the scholarly journal and the means by
which advances in the subject are communicated.[2]
HTML was introduced around 1990 as a mark-up language associated with the WWW,
and provides support for text based information transmission and display. In
1992, features were added to HTML to support images and, via a
protocol known as MIME (Multipurpose Internet Mail Extensions), other media
types such as audio or video. Hitherto conspicuous by its absence was any
support for three dimensional model media types. Whilst the molecular sciences
now had a broad framework for mapping conventional printed journals to
electronic form, in essence little infra-structure existed to advance the
expression of the subject in this new medium.
Virtual Reality Modelling Language (VRML)[3] is a
relatively recent innovation on the World-Wide Web for expressing complex three
dimensional information on the Internet, and one which we believe holds much
potential for the molecular sciences. It is most simply understood as a three
dimensional analogue of the two dimensional ASCII character set. In the
latter, a single byte of information suffices to encode the quite complex shape
of a letter, numeral or other character in the standard ASCII set. A local
program (word processor, editor, World-Wide Web client) serves to convert this
byte of information into the shape rendered on the screen. Transmitting symbolic
representations of some very specific 2D objects (the ASCII characters) is
thus a particularly concise way of conveying
information. Encoding the actual shapes of the characters as a bit-mapped image
would result in far larger files. In VRML 1.0, a set of three dimensional
objects, such as spheres or cylinders, can be allocated a size, texture or
colour, and position, and represented in a 3D space using a visualisation
program. In the same way that a text file is a highly compact method of
transmitting information where the task of screen rendering is performed
locally, so VRML is a very efficient method for transmitting visually complex
3D information. A VRML file has the potential for being far more compact than a
bit mapped 2D image or even a bit-mapped 3D animation file in MPEG format.
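To make the size argument concrete, the following Python sketch (an illustration, not part of the toolkit described here) writes a single coloured sphere as a VRML 1.0 scene and compares its size in bytes with that of an uncompressed 24-bit image. The function name `sphere_scene` and the image dimensions are assumptions chosen purely for the comparison.

```python
# Illustrative sketch: size of a VRML 1.0 scene description versus a raw bitmap.

def sphere_scene(radius, rgb):
    """Return a VRML 1.0 scene containing one coloured sphere."""
    r, g, b = rgb
    return (
        "#VRML V1.0 ascii\n"
        "Separator {\n"
        f"  Material {{ diffuseColor {r} {g} {b} }}\n"
        f"  Sphere {{ radius {radius} }}\n"
        "}\n"
    )

scene = sphere_scene(1.5, (1.0, 0.0, 0.0))
vrml_bytes = len(scene.encode("ascii"))

# Assumed image: 320 x 240 pixels, 3 bytes per pixel, no compression.
bitmap_bytes = 320 * 240 * 3

print(scene)
print(f"VRML description: {vrml_bytes} bytes")
print(f"Raw bitmap:       {bitmap_bytes} bytes")
```

The scene description stays the same size whatever the resolution of the display, whereas the bitmap grows with the number of pixels and must be resent for every change of viewpoint.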
The adoption of VRML has been particularly prominent in those subject areas
which are dominated by three dimensional concepts, such as the molecular
sciences. However, whereas there exist a wide variety of tools for manipulating
the fundamental ASCII character set (e.g. ranging from simple text editors to
sophisticated page setting programs), far fewer tools exist for creating
molecular VRML models. In part at least, this is because such tools require a
reasonable knowledge of the Open Inventor[4] file
format, of which the VRML 1.0 specification is a self-consistent subset. The
Open Inventor Toolkit itself is a C++ class library for 3D modelling. Having
been ported to the Windows operating system and to all the major Unix
platforms, Open Inventor makes a logical starting point for our VRML authoring
development.
The EyeChem Module Suite
Our initial implementation of a VRML toolkit was based around our
EyeChem suite of modules[5] that run within the
IRIS Explorer visualisation system. Explorer's 3D rendering is based on Open
Inventor and several EyeChem modules implement it. Extending EyeChem to produce
VRML encoded representations therefore required little modification.
EyeChem was initially developed to visualise molecular models. These range from
ball and stick and molecular surface representations to quantum mechanical
calculations of molecular systems. Modules can be added whenever needed and
interfaces for appropriate visualisations can be rapidly assembled. Recent
additions to the EyeChem suite include programs to automatically generate VRML
files to represent the 3D scatter plot data from a structural database query[6].
Any toolkit also has to include a facility for generating hyperlinks within the
3D objects described, in an analogous manner to how HTML (hypertext mark-up
language) introduces the concept of hyperlinks within a collection of ASCII
characters. These different representations of molecular properties were
integrated together with the aid of additional EyeChem modules to generate VRML
files containing any necessary hyperlinks between the various rendered
properties.
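As a minimal sketch of how such a hyperlink can be attached to a 3D object, the Python fragment below wraps an existing VRML 1.0 node in a WWWAnchor node, whose name field holds the target URL. The helper `anchored` and the example URL are hypothetical and are not taken from EyeChem.

```python
# Illustrative sketch: attaching a hyperlink to a VRML 1.0 object.

def anchored(node_text, url, description=""):
    """Wrap a VRML 1.0 node in a WWWAnchor so that selecting it follows `url`."""
    body = "\n".join("  " + line for line in node_text.splitlines())
    lines = ["WWWAnchor {", f'  name "{url}"']
    if description:
        lines.append(f'  description "{description}"')
    lines.append(body)
    lines.append("}")
    return "\n".join(lines)

atom = "Separator {\n  Material { diffuseColor 1 1 0 }\n  Sphere { radius 1.0 }\n}"

# Hypothetical target URL, used purely for illustration.
linked = anchored(atom, "http://www.example.org/sulfur.html", "Sulfur atom")

print("#VRML V1.0 ascii")
print(linked)
```

Because the anchor is simply a grouping node, any rendered property (an atom, a surface, a data point in a scatter plot) can be linked in the same way.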
Taken together, these various tools have allowed us to construct a three
dimensional equivalent to the two dimensional hyperglossary we have previously
described[7], in which hyperlinks serve to
establish connections between various molecular data expressed as three
dimensional rendered objects on a computer screen. We used this concept to
complement a television program by illustrating in a popular manner how the
properties of the molecule dimethyl sulfate relate to the half-life of this
species in the bloodstream.[8] In this, we believe
we have now achieved a genuine advance in molecular visualisation over the more
conventional medium of print, and one which integrates well with other
communication media such as television.
The Limitations of VRML 1.0 for Molecules
Our work has exposed a number of limitations in the VRML 1.0 specification. For
example, the file format is often ill-suited for our needs. In a 3D ball and
stick molecular model, each sphere and cylinder needs to be explicitly defined
and cannot be grouped into sets. Where one has molecules with tens of thousands
of atoms and bonds, this becomes very unwieldy. Moreover, no VRML node exists
that is appropriate for a protein ribbon representation, which molecular
biologists use to represent higher order structures in very large molecules. We
have circumvented this problem by defining our own nodes in a VRML 1.0
extension. The ribbon in the DNA example that is viewed in a VRML client such
as Webspace is represented by an Open Inventor NURBS node. The main drawback of
this approach is that most of the existing generation of VRML clients do not
understand NURBS and therefore cannot view the ribbons.
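The explicit, per-object nature of a VRML 1.0 ball and stick model can be seen in the following Python sketch, which emits one sphere per atom and one cylinder per bond. It is an illustration under stated assumptions rather than the EyeChem or MOT implementation: the function names, the radii and the approximate water geometry are all chosen for the example. Since a VRML 1.0 Cylinder lies along the y axis, each bond also needs its own rotation and translation.

```python
import math

def atom_node(position, radius):
    """One explicitly positioned sphere."""
    x, y, z = position
    return (f"  Separator {{ Translation {{ translation {x:.3f} {y:.3f} {z:.3f} }} "
            f"Sphere {{ radius {radius:.3f} }} }}")

def bond_node(p, q, radius=0.1):
    """One explicitly positioned cylinder joining points p and q.

    A VRML 1.0 Cylinder is centred on the origin with its axis along y, so it
    is rotated onto the bond direction and translated to the bond midpoint.
    """
    d = [b - a for a, b in zip(p, q)]
    length = math.sqrt(sum(c * c for c in d))
    dx, dy, dz = (c / length for c in d)
    mx, my, mz = ((a + b) / 2 for a, b in zip(p, q))
    # Rotation axis is y x d = (dz, 0, -dx); rotation angle is acos(y . d).
    norm = math.hypot(dz, dx)
    if norm < 1e-9:
        # Bond parallel or antiparallel to y: any perpendicular axis will do.
        ax, az = 1.0, 0.0
    else:
        # Adding 0.0 turns a negative zero into a plain zero for tidy output.
        ax, az = dz / norm + 0.0, -dx / norm + 0.0
    angle = math.acos(max(-1.0, min(1.0, dy)))
    return (f"  Separator {{ Translation {{ translation {mx:.3f} {my:.3f} {mz:.3f} }} "
            f"Rotation {{ rotation {ax:.3f} 0 {az:.3f} {angle:.3f} }} "
            f"Cylinder {{ radius {radius:.3f} height {length:.3f} }} }}")

def molecule_to_vrml(atoms, bonds):
    """atoms: list of (position, radius); bonds: list of (i, j) index pairs."""
    lines = ["#VRML V1.0 ascii", "Separator {"]
    lines += [atom_node(pos, radius) for pos, radius in atoms]
    lines += [bond_node(atoms[i][0], atoms[j][0]) for i, j in bonds]
    lines.append("}")
    return "\n".join(lines) + "\n"

# Approximate water geometry in angstroms (illustrative values only).
water = [((0.0, 0.0, 0.0), 0.4),
         ((0.757, 0.586, 0.0), 0.25),
         ((-0.757, 0.586, 0.0), 0.25)]
vrml = molecule_to_vrml(water, [(0, 1), (0, 2)])
print(vrml)
print(f"{len(vrml.splitlines())} lines for {len(water)} atoms and 2 bonds")
```

The output grows by one full node description for every atom and every bond, which is the source of the unwieldiness noted above for molecules with tens of thousands of atoms.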
Another limitation is that although compact for small molecules, VRML file
sizes can be prohibitively large for larger molecules, especially when
additional molecular geometries such as surfaces are required. Partially in
response to the very large VRML files that we showed could be generated by
molecular models, gzip compression was introduced in order to reduce file
sizes, with the decompression having to be performed at the client end. Clearly
however, this does not represent a scaleable solution to the problem of
representing complex molecular data, and new solutions need to be
found.
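The effect of gzip compression on such files can be estimated with a short Python sketch using the standard gzip module. The synthetic scene below (a row of identical spheres, built by a hypothetical helper `many_spheres`) is an assumption made for illustration; it is far more regular than real molecular coordinates and will therefore compress better than a typical molecular model.

```python
import gzip

def many_spheres(n):
    """Build a synthetic VRML 1.0 scene with n explicitly defined spheres."""
    lines = ["#VRML V1.0 ascii", "Separator {"]
    for i in range(n):
        lines.append(
            f"  Separator {{ Translation {{ translation {i * 1.5:.3f} 0 0 }} "
            f"Sphere {{ radius 0.5 }} }}"
        )
    lines.append("}")
    return "\n".join(lines) + "\n"

raw = many_spheres(10000).encode("ascii")
packed = gzip.compress(raw)

# The client must decompress before parsing; the round trip is lossless.
restored = gzip.decompress(packed)

print(f"uncompressed: {len(raw)} bytes")
print(f"gzip:         {len(packed)} bytes")
print(f"ratio:        {len(raw) / len(packed):.1f}x")
```

Compression reduces the transfer size but not the number of nodes the client must parse and render, which is why it does not address the underlying scaling problem.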
The Molecular Object Toolkit (MOT) for VRML Authoring.
Molecular Inventor, under development at SGI by Mark Benzel, is an Open
Inventor node class specific for molecules. It includes nodes that hold atom
and bond data, and nodes to display molecules and atomic surfaces. The
MOT we are developing is a suite of Molecular Inventor-based programs that
transcribe molecular data of interest into VRML. In a broad sense the Molecular Objects (MOs) are
EyeChem-like modules that run as stand-alone programs without any Explorer
graphical interface. MOT file readers load molecular data, whilst other MOTs
can generate geometric representations such as surfaces or ribbons. The MOT
VRML writer will create optimised VRML files of the geometric representations.
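The division of labour between a file reader and a geometry generator can be sketched in Python as follows. This is not Molecular Inventor code: the XYZ input format, the function names, the approximate covalent radii and the distance tolerance used to infer bonds are all assumptions made for the purpose of illustration, and the writer stage is omitted.

```python
import math

# Approximate covalent radii in angstroms (illustrative values).
COVALENT_RADIUS = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66, "S": 1.05}

def read_xyz(text):
    """Reader stage: parse XYZ-format text into (symbol, (x, y, z)) tuples."""
    lines = text.strip().splitlines()
    count = int(lines[0])            # line 1: number of atoms
    atoms = []                       # line 2 is a free-text comment
    for line in lines[2:2 + count]:
        symbol, x, y, z = line.split()[:4]
        atoms.append((symbol, (float(x), float(y), float(z))))
    return atoms

def infer_bonds(atoms, tolerance=0.4):
    """Geometry stage: treat two atoms as bonded when their separation is
    below the sum of their covalent radii plus a tolerance."""
    bonds = []
    for i in range(len(atoms)):
        for j in range(i + 1, len(atoms)):
            (si, pi), (sj, pj) = atoms[i], atoms[j]
            limit = COVALENT_RADIUS[si] + COVALENT_RADIUS[sj] + tolerance
            if math.dist(pi, pj) <= limit:
                bonds.append((i, j))
    return bonds

WATER_XYZ = """3
water (approximate geometry)
O  0.000 0.000 0.000
H  0.757 0.586 0.000
H -0.757 0.586 0.000
"""

atoms = read_xyz(WATER_XYZ)
bonds = infer_bonds(atoms)
print(atoms)
print(bonds)
```

Keeping the stages separate means that a reader for a new file format, or a new geometric representation, can be added without touching the other stages.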
We are currently implementing MOTs as stand-alone programs that will run on Web
servers through the CGI interface. The advantage here is that molecular data can
reside in the Web Server in whatever format it was created. Only when it is
accessed will a VRML file of it be created dynamically. As VRML is still
evolving, this will preclude the need to manually prepare a new VRML file if
and when its format is modified. Moreover, the scientist need not know anything
about VRML to publish their molecular data on the Web in this form.
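A minimal sketch of such a server-side program is given below in Python. It is an illustration only: the query parameter `mol`, the in-memory `MOLECULES` table standing in for files held on the server, and the fixed sphere radius are all hypothetical. The response is labelled with the MIME type x-world/x-vrml, which is the type conventionally associated with VRML 1.0 files.

```python
import os
from urllib.parse import parse_qs

# Stand-in for molecular data held on the server (hypothetical content).
MOLECULES = {
    "water": [("O", (0.0, 0.0, 0.0)),
              ("H", (0.757, 0.586, 0.0)),
              ("H", (-0.757, 0.586, 0.0))],
}

def to_vrml(atoms):
    """Write each atom as an explicitly positioned sphere."""
    lines = ["#VRML V1.0 ascii", "Separator {"]
    for _symbol, (x, y, z) in atoms:
        lines.append(
            f"  Separator {{ Translation {{ translation {x} {y} {z} }} "
            f"Sphere {{ radius 0.3 }} }}"
        )
    lines.append("}")
    return "\n".join(lines) + "\n"

def respond(query_string):
    """Build the complete CGI response (headers, blank line, body)."""
    name = parse_qs(query_string).get("mol", [""])[0]
    atoms = MOLECULES.get(name)
    if atoms is None:
        return ("Status: 404 Not Found\r\n"
                "Content-Type: text/plain\r\n\r\n"
                "Unknown molecule\n")
    return "Content-Type: x-world/x-vrml\r\n\r\n" + to_vrml(atoms)

if __name__ == "__main__":
    # A Web server passes the query via the QUERY_STRING environment variable;
    # the default here only lets the sketch run outside a server.
    print(respond(os.environ.get("QUERY_STRING", "mol=water")), end="")
```

Since the VRML is generated at request time, a change to the VRML format requires only a change to the writer function, not to the stored molecular data.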
MOzART 1.0: An "Open" Molecular VRML Editor
We are developing MOzART as an extensible stand-alone VRML authoring
environment based on Molecular Inventor. It implements the MOTs to input the
various molecular file formats and display a 3D model of the molecule and its
associated properties. Various components of the 3D model can be selected and
hyperlinks for these can be entered. Using the MOT VRML Writer, the 3D model
can then be saved as VRML files.
Although MOzART will have most of the capabilities of EyeChem, without the
Explorer overhead its performance will be far superior. One of the prime
advantages of EyeChem, however, is its extensibility through the addition of
modules. As MOzART will be an open system, extending the environment through
the addition of programs will also be possible.
The Moving Worlds VRML 2.0 Proposal
The direction that VRML is heading in its second incarnation (V2.0) is
hotly contested as several major vendors, including SGI, Apple, Sun and
Microsoft, have submitted VRML 2.0 proposals. The proposed evolution that is
fully compatible with the developments we have outlined above is the Moving
Worlds Specification[9] by SGI in collaboration
with Sony and WorldMaker. Although it contains the existing VRML 1.0 nodes,
Moving Worlds does have several powerful features making it appropriate for our
needs. Object behaviour is described by script nodes. Script nodes can contain
Java[10] applets or any script type that the
browser can interpret. Extensions to the specification are accomplished by node
prototyping. An alternate representation node has a pointer to a more complex
representation if the browser cannot handle the extension nodes. Examples using
Moving Worlds specification are under preparation.[11]
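As an indication of how node prototyping might serve molecular models, the Python sketch below emits a VRML 2.0 file in which an atom is declared once as a prototype and then instanced with one short line per atom. The `Atom` prototype and its field names are hypothetical, and the PROTO syntax shown is that of the VRML 2.0 specification as later finalised, which may differ in detail from the draft cited here.

```python
# Illustrative sketch: a hypothetical Atom prototype in VRML 2.0 syntax.

ATOM_PROTO = """\
PROTO Atom [
  field SFVec3f position 0 0 0
  field SFFloat radius 1
  field SFColor colour 0.8 0.8 0.8
] {
  Transform {
    translation IS position
    children Shape {
      appearance Appearance { material Material { diffuseColor IS colour } }
      geometry Sphere { radius IS radius }
    }
  }
}
"""

def atoms_to_vrml2(atoms):
    """atoms: list of (position, radius, rgb) tuples."""
    lines = ["#VRML V2.0 utf8", ATOM_PROTO]
    for (x, y, z), radius, (r, g, b) in atoms:
        lines.append(
            f"Atom {{ position {x} {y} {z} radius {radius} colour {r} {g} {b} }}"
        )
    return "\n".join(lines) + "\n"

# Approximate water geometry in angstroms (illustrative values only).
water = [((0.0, 0.0, 0.0), 0.4, (1, 0, 0)),
         ((0.757, 0.586, 0.0), 0.25, (1, 1, 1)),
         ((-0.757, 0.586, 0.0), 0.25, (1, 1, 1))]

out = atoms_to_vrml2(water)
print(out)
```

Compared with the explicit VRML 1.0 description, the per-atom cost falls to a single instance line, with the shared structure stated once in the prototype.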
The HyperWave (Hyper-G) Server
We have described above a scenario for developing electronic libraries and
publishing using a three dimensional metaphor particularly appropriate for
molecular sciences. Creating a complex and extensively cross-hyperlinked
document collection has other implications that must be considered. For example
as the document collection increases in size, the indexing of collections and
maintenance of hyperlinks within text documents and VRML files can become a
major problem. First generation World-Wide Web servers had few intrinsic tools
to automate these processes. Within the molecular VRML project, we
have run a HyperWave (previously known as Hyper-G)[12] server since November 1995 as part of a pilot
project to create an on-line electronic journal. HyperWave has numerous
advantages over conventional WWW servers that make it particularly well-suited
for electronic publishing. Of particular significance is its ability to
communicate with all existing Hyper-G servers world-wide. If documents are
moved or deleted the changes are propagated throughout the server and to all
other servers.
We are implementing the Molecular Object VRML Authoring Toolkit as HyperWave
cgi-bin programs. By running the MOs in HyperWave, we will be able to take
advantage of the ability to add gateways in HyperWave to remote databases using
the stateful protocols that HyperWave can support. This in turn would in
principle at least allow access to the vast storehouse of information available
in existing molecular databases. We are investigating mechanisms whereby
HyperWave can retrieve a file from a molecular database and convert it
on-the-fly to VRML for local viewing whenever required.
Conclusions
Much of the technology described in this paper relates to how the
visually complex subject of chemistry can be integrated into new mechanisms for
exchanging information such as electronic journals, virtual libraries or
conferencing mechanisms. We have discussed here the development of a new
generation of visualisation tools appropriate for electronic publishing in
chemistry. An enormous amount of work still needs to be done, but a vision of
how we might be operating in the future is already emerging.
Acknowledgements: We thank in particular Peter Murray-Rust, Christopher
Page and Christopher Leach for many helpful discussions and contributions to
the work described here.
References
[1] This paper is on-line at
http://www.ch.ic.ac.uk/rzepa/eg/
[2] H. S. Rzepa, B. J. Whitaker and M. J. Winter, Chemical
Communications, 1994, 1907; O. Casher, G. Chandramohan, M. Hargreaves, C.
Leach, P. Murray-Rust, R. Sayle, H. S. Rzepa and B. J. Whitaker, J. Chem.
Soc., Perkin Transactions 2, 1995, 7; H. S. Rzepa, "The Future of
Electronic Journals in Chemistry", Trends in Analytical Chemistry, 1995,
14, 464; B. J. Whitaker and H. S. Rzepa, "Chemical Publishing on the Internet",
Conference on Chemical Information, Nimes, France, October, 1995; D.
James, B. J. Whitaker, C. Hildyard, H. S. Rzepa, O. Casher, J. M. Goodman, D.
Riddick, P. Murray-Rust, "The Case for Content Integrity in Electronic
Chemistry Journals: The CLIC Project", New Review of Information
Networking, 1996, in press.
[3] G. Bell, A. Parisi, M. Pesce, "The Virtual Reality Modeling
Language", November 1994. See http://www.eit.com/vrml/vrmlspec.html
[4] J. Wernecke, "The Inventor Mentor: Programming
Object-Oriented 3D Graphics with Open Inventor(TM)", Release 2, Reading,
Massachusetts: Addison-Wesley Publishing Company, 1994.
[5] O. Casher, H. S. Rzepa and S. Green, "EyeChem 1.0: A Modular
Chemistry Toolkit for Collaborative Molecular Visualisation", J. Mol.
Graphics, 1994, 12, 226. See http://www.ch.ic.ac.uk/jmg/CRG.html;
O. Casher and H. S. Rzepa, "Chemical Collaboratories using World-Wide Web
Servers and EyeChem Based Viewers", J. Mol. Graphics, 1995, 13, 268.
[6] O. Casher, C. S. Page and H. S. Rzepa, paper presented at the 2nd
Electronic Computational Chemistry Conference (ECCC2), November 1995; see
http://www.ch.ic.ac.uk/eccc2/ This paper is also due to be published in
Theochem, 1996.
[7] C. Leach, P. Murray-Rust and H. S. Rzepa, "Electronic Conference
on Trends in Organic Chemistry", (Eds H. S. Rzepa, J. M. Goodman and C. Leach),
June, 1995.
[8] O. Casher, H. S. Rzepa and D. A. Widdowson, supplemental material
to the television program Equinox, transmitted on Channel 4 (UK) on November
29, 1995; see http://www.ch.ic.ac.uk/equinox/
[9] C. Marrin et al., "Moving Worlds Specification", February 1996.
See http://webspace.sgi.com/moving-worlds/spec/spec.main.html
[10] "Java(TM): Programming for the Internet", Sun Microsystems, Inc.,
1995. See http://java.sun.com/ For chemical examples of Java, see
http://www.ch.ic.ac.uk/java/
[11] O. Casher, H. S. Rzepa, "Molecular Moving Worlds", 1996. See
http://www.ch.ic.ac.uk/VRML/mmw.html
[12] See K. Schmaranz, "Hyper-G and Electronic Publishing",
in "Hyper-G. The Next Generation Web Solution", H. Maurer (Ed), Addison-Wesley,
1996.