Chemo-informatics Activity. Department of Chemistry, Imperial College

We are an affiliate of the Imperial College Centre for Bioinformatics.

Group activities

The principal chemo-informatics activity resides with the group of Henry S. Rzepa. Although our work in the chemical information sciences predates 1994, it was in that year that we began fully exploring the potential of the Internet in this area, in close collaboration with Peter Murray-Rust. Our initial work used the MIME mechanism as a simple hierarchical means of identifying chemical content. This was extended to investigating much more finely grained content description frameworks using SGML-derived markup languages. By 1995 this had matured into formalising the chemical markup using an SGML DTD, and by 1996 the SGML/DTD framework had itself evolved into XML (eXtensible Markup Language) and the chemical markup (CML) was recast as a formal XML language (the first such science-based markup). This was first described in several review articles published in 1997, and more formally in 1999 in the Journal of Chemical Information and Computer Sciences. Exemplars of the application of CML appeared in 2000, around the Chimeral and other projects; these include the first chemical applications of extensible stylesheets (XSLT) and XML Schemas (XSD). Most recently, the focus has been on developing rule-based formal validation for CML using Schemas, and on agent/robot-based methods of harvesting Internet-resident molecular information and transforming it into XML representations for deposit in XML repositories.

Other avenues explored have included the creation of the genre of electronic conferences in organic chemical areas (the ECTOC series, of which there have been four), and major projects in collaboration with the Royal Society of Chemistry to create a new generation of chemical electronic journal (the CLIC project), the outcome of which has been several new e-only RSC journals. Most recently, our work in XML has been directed at illustrating how an XML-based chemistry journal could be used to create what we have termed the Chemical Semantic Web, in which both human perception and machine recognition and processing are used to create new forms of knowledge base.

Other Web Resources

In 1999, we established the www.xml-cml.org Web site as a repository of XML/CML-related information and code. This site has extensive documentation and hosts the definitive versions of the XML DTDs and Schemas. We also maintain a consultancy company for commercial and industrial applications of XML/CML.

Publications in Chemo-informatics. 1994 - 2005.

  1. A Proposal of a Primary MIME Content Type of Chemical, H. S. Rzepa and P. Murray-Rust, 1994, May-November.
  2. Chemical Applications of the World-Wide-Web, H. S. Rzepa, B. J. Whitaker and M. J. Winter, J. Chem. Soc., Chem. Commun., 1994, 1907. [3.2]
  3. Hyperactive Molecules and the World-Wide-Web Information System, O. Casher, G. Chandramohan, M. Hargreaves, C. Leach, P. Murray-Rust, R. Sayle, H. S. Rzepa and B. J. Whitaker, J. Chem. Soc., Perkin Trans 2, 1995, 7. [1.8]
  4. A Chemical Collaboratory using Explorer EyeChem and the Common Client Interface, O. Casher and H. S. Rzepa, Computer Graphics, 1995, 29, 52.
  5. Chemistry Electronic Conferences, H. S. Rzepa, Trends in Analytical Chemistry, 1995, 14, 240-242.
  6. Chemical Collaboratories using World-Wide Web Servers and EyeChem Based Viewers, O. Casher and H. S. Rzepa, J. Mol. Graphics, 1995, 13, 268. [2.2]
  7. The Future of Electronic Journals in Chemistry, H. S. Rzepa, Trends in Analytical Chemistry, 1995, 14, 464.
  8. Chemistry and the World-Wide-Web, H. S. Rzepa, chapter in "The Internet: A Guide for Chemists", Ed. S. Bachrach, American Chemical Society, 1995.
  9. Surfing the Chemical Net, M. J. Winter, H. S. Rzepa and B. J. Whitaker, Chem. Brit., 1995, 685. [1.1]
  10. The Case for Content Integrity in Electronic Chemistry Journals: The CLIC Project, D. James, B. J. Whitaker, C. Hildyard, H. S. Rzepa, O. Casher, J. M. Goodman, D. Riddick and P. Murray-Rust, New. Rev. Information Networking, 1996, 61.
  11. Proceedings of the First Electronic Conference on Trends in Organic Chemistry (ECTOC-1), H. S. Rzepa, J. M. Goodman and C. Leach (Editors), ISBN 0 85404 899 5, CD ROM Version, The Royal Society of Chemistry, 1996.
  12. A Molecular Hyperglossary: Organic Molecular information in Hypermedia Form, C. Leach, P. Murray-Rust and H. S. Rzepa, Electronic Conference on Trends in Organic Chemistry (ECTOC-1) ISBN 0 85404 899 5, Eds. H. S. Rzepa, J. M. Goodman and C. Leach (CD-ROM), Royal Society of Chemistry publications, 1996.
  13. Advanced VRML Based Chemistry Applications: A 3D Molecular Hyperglossary, O. Casher, C. Leach, C. S. Page and H. S. Rzepa, Theochem, 1996, 368, 49.
  14. H. S. Rzepa, Science and the Internet: The World-Wide Web, Science Progress, 1996, 79, 97.
  15. The Chemical MIME Project: A Working Example, H. S. Rzepa, W. Locke, P. Murray-Rust, B. Whitaker, 5th International Conference on Perspectives On Protein Engineering 1996, 1996, Montpellier, France. CD ROM: ISBN 0952901501.
  16. Proceedings of the Electronic Conference on Trends in Heterocyclic Chemistry (ECHET96), H. S. Rzepa, J. Snyder and C. Leach, (Eds), ISBN 0-85404-894-4, CD ROM Version, The Royal Society of Chemistry, 1997.
  17. Similarity analysis of chemical content within ECHET96 contributions, C. Leach and H. S. Rzepa, Electronic Conference on Trends in Heterocyclic Chemistry (ECHET96), H. S. Rzepa, J. Snyder and C. Leach, (Eds), ISBN 0-85404-894-4, CD ROM Version, The Royal Society of Chemistry, 1997.
  18. Conference access statistics and the browsing habits of ECHET96 readers, C. Leach and H. S. Rzepa, Electronic Conference on Trends in Heterocyclic Chemistry (ECHET96), H. S. Rzepa, J. Snyder and C. Leach, (Eds), ISBN 0-85404-894-4, CD ROM Version, The Royal Society of Chemistry, 1997.
  19. HyperWave: An Electronic Conference Manager, O. Casher and H. S. Rzepa, Electronic Conference on Trends in Heterocyclic Chemistry (ECHET96), H. S. Rzepa, J. Snyder and C. Leach, (Eds), ISBN 0-85404-894-4, CD ROM Version, The Royal Society of Chemistry, 1997.
  20. The Internet as a Chemical Information Tool, H. S. Rzepa, P. Murray-Rust and B. J. Whitaker, Chem. Soc. Revs, 1997, 1-10. [6.7]
  21. "The Chemical MIME Project", H. S. Rzepa, P. Murray-Rust and B. J. Whitaker, Chem. Intl., 1997, 19, 17.
  22. The Internet as a Computational Chemistry Tool, H. S. Rzepa, Theochem, 1997, 398-399, 27-33.
  23. Proceedings of the Electronic Conference on Trends in Organometallic Chemistry (ECTOC-3), H. S. Rzepa and C. Leach, (Eds), ISBN 0-85404-889-8, CD ROM Version, The Royal Society of Chemistry, 1998.
  24. Internet-based Computational Chemistry Tools, H. S. Rzepa, in Encyclopaedia of Computational Chemistry, Wiley, 1998, 1426.
  25. The Application of Chemical Multipurpose Internet Mail Extensions (Chemical MIME) Internet Standards to Electronic Mail and World-Wide Web information exchange, H. S. Rzepa, P. Murray-Rust and B. J. Whitaker, J. Chem. Inf. Comp. Sci., 1998, 38, 976-982. [2.1]
  26. VChemLab: A Virtual Chemistry Laboratory. The storage, retrieval and display of chemical information using standard Internet Tools, H. S. Rzepa and A. P. Tonge, J. Chem. Inf. Comp. Sci., 1998, 38, 1048-1053. [2.1]
  27. Virtual Reality Modelling Language (VRML) in Chemistry, O. Casher, C. Leach, C. S. Page and H. S. Rzepa, Chem. in Brit., 1998, 34(9), 26. [1.1]
  28. A History of Hyperactive Chemistry on the Web: From Text and Images to Objects, Models and Molecular Components. H. S. Rzepa, Chimia, 1998, 52, 653-657.
  29. The Internet as a medium for Science Communication, H. S. Rzepa, Science Communication, Open University Press, 1998.
  30. MoldaNet. A Network Distributed Molecular Graphics and Modelling Program that Integrates Secure Signed Applet and Java 3D Technologies, H. Yoshida, H. S. Rzepa, A. P. Tonge, J. Mol. Graph. Mod., 1998, 16, 144-149. [2.0]
  31. Authentication of Internet-based Distributed Computing Models in Chemistry, A. P. Tonge, H. S. Rzepa and Hiroshi Yoshida, J. Chem. Inf. Comp. Sci., 1999, 39, 483-490. [2.1]
  32. The Internet as a Computational Chemistry Tool, H. S. Rzepa, J. Mol. Struct (Theochem), 1999, 463, 217.
  33. Chemical Markup Language and XML Part I. Basic principles, P. Murray-Rust and H. S. Rzepa, J. Chem. Inf. Comp. Sci., 1999, 39, 928.
  34. Use of Meta-data in Chemical Content Relevancy Ranking, G. V. Gkoutos and H. S. Rzepa, Electronic Conference on Synthesis in Organic Chemistry (ECSOC), CD-ROM, 1999.
  35. Hierarchical display of Chemical Data in Web Browsers, Georgios V. Gkoutos, Henry S. Rzepa and Michael Wright, Internet J. Chem., 2000, 3, article 7.
  36. A Mechanism for Creating Chemically Oriented Internet Search Channels, Georgios V. Gkoutos and Henry S. Rzepa, Internet J. Chem., 2000, 3, article 8.
  37. A Universal approach to Web-based Chemistry using XML and CML, P. Murray-Rust, H. S. Rzepa, M. Wright and S. Zara, ChemComm, 2000, 1471-1472.
  38. JChemTidy: A Tool for Converting Chemical Web Document Collections to an XHTML Representation, G. V. Gkoutos, P. Kenway and H. S. Rzepa, J. Chem. Inf. Comp. Sci., 2001, 41, 253-258.
  39. A robot-based resource discovery tool for adding chemical meta-information and value to web-based documents, G. V. Gkoutos, P. R. Kenway and H. S. Rzepa, New J. Chem., 2001, 635-638.
  40. Development of Chemical Markup Language (CML) as a System for Handling Complex Chemical Content, Peter Murray-Rust, Henry S. Rzepa and Michael Wright, New J. Chem., 2001, 618-634.
  41. A Resource for Transforming HTML and Molfile Documents to XML Compliant Form, G. V. Gkoutos, P. R. Kenway, P. Murray-Rust, H.S. Rzepa and M. Wright, Internet J. Chem., 2001, 4, Article 5.
  42. Chemical Markup, XML and the World-Wide Web. Part II: Information Objects and the CMLDOM, P. Murray-Rust and H. S. Rzepa, J. Chem. Inf. Comp. Sci., 2001, 1113.
  43. Chemical Markup, XML and the World-Wide Web. Part III: Towards a signed semantic Chemical Web of Trust, G. Gkoutos, P. Murray-Rust, H. S. Rzepa and M. Wright, J. Chem. Inf. Comp. Sci., 2001, 1124.
  44. A New Publishing Paradigm: STM Articles as part of the Semantic Web, H. S. Rzepa and P. Murray-Rust, Learned Publishing, 2001, 14, 177.
  45. The Application of XML Languages for Integrating Molecular Resources, G. V. Gkoutos, P. Murray-Rust, H. S. Rzepa and M. Wright, Internet J. Chem., 2001, article 13.
  46. ChemDig: New approaches to Chemically Significant Indexing and Searching of Distributed Web Collections, G. V. Gkoutos, C. Leach and H. S. Rzepa, New J. Chem., 2002, 656-666.
  47. Scientific publications in XML - towards a global knowledge base, P. Murray-Rust and H. S. Rzepa, Data Science, 2002, 1, 84-98.
  48. STMML. A Markup Language for Scientific, Technical and Medical Publishing, P. Murray-Rust and H. S. Rzepa, Data Science, 2002, 1, 1-65.
  49. Chemistry Preprints, H. S. Rzepa, J. Chem. Inf. Comp. Sci., 2002, 42, 767.
  50. P. Murray-Rust and H. S. Rzepa, "Markup Languages- How to structure chemistry related documents", Chemistry Intl., 2002, 24(4), 9-13.
  51. H. S. Rzepa and M. Williamson, "Chemstock: A Web-based Chemical Inventory system built from OpenSource Software Components", Internet J. Chem, 2002, article 6.
  52. P. Murray-Rust and H. S. Rzepa, "Towards the Chemical Semantic Web", Proc. 2002 International Chemical Information Conference, ed. H. Collier (Infonortics), 2002, pp 127-139.
  53. P. Murray-Rust and H. S. Rzepa, chapter in "Handbook of Chemoinformatics. Part 2. Advanced Topics.", ed. J. Gasteiger and T. Engel, 2003, Vol 1.
  54. G. V. Gkoutos, H. S. Rzepa and P. Murray-Rust, "Online Validation and Comparison of Molfile and CML Molecular Atom-Connection Descriptors", Internet J. Chem., 2003, article 1.
  55. P. Murray-Rust and H. S. Rzepa, "Chemical Markup, XML and the Worldwide Web. Part 4. CML Schema", J. Chem. Inf. Comp. Sci., 2003, 43, issue 4.
  56. G. V. Gkoutos, H. S. Rzepa, R. M. Clark, O. Adjei, H. Johal, "Chemical Machine Vision: Automated extraction of chemical meta-data from raster images", J. Chem. Inf. Comp. Sci., 2003, 43, issue 5.
  57. P. Murray-Rust and H. S. Rzepa, "Towards the Chemical Semantic Web. An introduction to RSS", Internet J. Chem., 2003, 6, article 4.
  58. P. Murray-Rust and H. S. Rzepa, "XML for scientific publishing", OCLC Systems and Services, 2003, 19, 162-169.
  59. P. Murray-Rust and H. S. Rzepa, "The Next Big Thing: From Hypermedia to Datuments", J. Digital Inf., 2004, 5, article 248, 2004-03-18.
  60. P. Murray-Rust, H. S. Rzepa, M. J. Williamson and E. L. Willighagen, "Chemical Markup, XML and the Worldwide Web. Part 5. Applications of Chemical Metadata in RSS Aggregators", J. Chem. Inf. Comp. Sci., 2004, 44, 462-469.
  61. P. Murray-Rust, H. S. Rzepa, S. M. Tyrrell and Y. Zhang, "Representation and use of Chemistry in the Global Electronic Age", Org. Biomol. Chem., 2004, in press.
  62. J. Wakelin, P. Murray-Rust, S. Tyrrell, Y. Zhang, H. S. Rzepa, A. Garcia, "CML Tools and Information Flow in Atomic Scale Simulations", Mol. Simulations, 2004, in press.
  63. P. Murray-Rust, H. S. Rzepa and S. Stein, "The INChI as an LSID for molecules in lifescience", W3C Workshop on Semantic Web for Life Sciences, 27-28 October 2004, Cambridge, Massachusetts USA.
  64. P. Murray-Rust, H. S. Rzepa, "Towards a semantic web for chemistry in lifescience", W3C Workshop on Semantic Web for Life Sciences, 27-28 October 2004, Cambridge, Massachusetts USA.
  65. P. Murray-Rust, H. S. Rzepa and Y. Zhang, "Googling for INChIs; A remarkable method of chemical searching", W3C Workshop on Semantic Web for Life Sciences, 27-28 October 2004, Cambridge, Massachusetts USA.

Henry Rzepa, November 2004.