In my previous post on the topic, I introduced the idea that data can come in several forms, most commonly as “raw” or primary data and as a “processed” version of this data that has added value. In crystallography, the chemist is interested in this processed version, carried by a CIF file. However, on the rare occasions when a query arises about the processed component, it can, in principle at least, be resolved by looking at the original raw data, expressed as diffraction images. I established, with much-appreciated help from the CCDC, that since 2016 around 65 datasets in the CSD (Cambridge Structural Database) have appeared with such associated raw data. The problem is how to reconcile the two sets of data easily (the raw data is not stored in the CSD), and one way of doing this is via the metadata associated with the datasets. In turn, if this metadata is suitably registered, one can query the metadata store for such associations, as was illustrated in the previous post on the topic. Here I explore the metadata records for five of these 65 sets to find out their properties, selected to illustrate the five data repositories that thus far host such data for compounds in the CSD.
Raw data and the evolution of crystallographic FAIR data. Journals, processed and raw structure data.
March 28th, 2022
Sir Geoffrey Wilkinson: An anniversary celebration. 23 March, 2022, Burlington House, London.
March 24th, 2022
The meeting covered the scientific life of Professor Sir Geoffrey Wilkinson from the perspective of collaborators, friends and family, and celebrated three anniversaries: the centenary of his birth (2021), the half-century anniversary of the Nobel prize (2023) and 70 years almost to the day (1 April) since the publication of the seminal article on ferrocene (2022).[cite]10.1021/ja01128a527[/cite]
A four-atom molecule exhibiting simultaneous compliance with Hückel 4n+2 and Baird 4n selection rules for ring aromaticity.
March 22nd, 2022
Normally, aromaticity is qualitatively assessed using an electron counting rule for cyclic conjugated rings. The best known is the Hückel 4n+2 rule (n = 0, 1, etc.) for inferring diatropic aromatic ring currents in singlet-state π-conjugated cyclic molecules‡, together with a counterpart 4n rule which infers an antiaromatic paratropic ring current for the system. Some complex rings can sustain both types of ring current in concentric rings or regions within the molecule, i.e. both diatropic and paratropic regions. Open-shell (triplet state) molecules have their own rule; this time the molecule has a diatropic ring current if it follows a 4n rule, often called Baird’s rule. But has a molecule which simultaneously follows both Hückel’s AND Baird’s rule ever been suggested? Well, here is one, as indeed I promised in the previous post.
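The counting rules described above can be written down as a small lookup. What follows is only an illustrative sketch: the function name `ring_current` and its return strings are made up for this example, and the triplet 4n+2 branch uses the usual statement of Baird's rule, which the excerpt itself does not spell out.

```python
def ring_current(n_electrons: int, state: str = "singlet") -> str:
    """Qualitative ring-current prediction from electron counting.

    Hypothetical helper for illustration only; it encodes the
    Hückel (singlet) and Baird (triplet) counting rules and nothing more.
    """
    if state not in ("singlet", "triplet"):
        raise ValueError("state must be 'singlet' or 'triplet'")
    # The rules only address positive, even electron counts.
    if n_electrons <= 0 or n_electrons % 2 != 0:
        return "no rule applies"

    is_4n_plus_2 = n_electrons % 4 == 2  # 2, 6, 10, ...

    if state == "singlet":
        # Hückel: 4n+2 is aromatic, 4n is antiaromatic.
        return "diatropic (aromatic)" if is_4n_plus_2 else "paratropic (antiaromatic)"
    # Baird: the assignments are reversed for the triplet state.
    # (The 4n+2 triplet case is an assumption not stated in the text.)
    return "paratropic (antiaromatic)" if is_4n_plus_2 else "diatropic (aromatic)"


for count, state in [(6, "singlet"), (4, "singlet"), (4, "triplet")]:
    print(count, state, "->", ring_current(count, state))
```

Such a rule is of course only a qualitative first guess; it says nothing about how a real molecule partitions its electrons between σ and π sets.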
More aromatic species with four atoms. B4 and N4.
March 19th, 2022
I discussed in the previous post the small molecule C4 and how, of its sixteen valence electrons, eight were left over after forming the C-C σ-bonds, and these partitioned into six σ and two π. So now to consider B4. This has four electrons fewer, and now the partitioning is two σ and two π (CCSD(T)/Def2-TZVPPD calculation, FAIR DOI: 10.14469/hpc/10157). Again, both these sets fit the Hückel 4n+2 rule (n=0).
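The electron bookkeeping here is simple enough to check by hand, or with a few lines of Python. This is a minimal sketch assuming four two-electron ring σ-bonds in each molecule; the helper names are hypothetical, and the σ/π partitions are simply copied from the text (they come from the calculations, not from this arithmetic).

```python
# Valence electrons per atom.
VALENCE = {"B": 3, "C": 4}


def leftover_electrons(element: str, n_atoms: int = 4, n_sigma_bonds: int = 4) -> int:
    """Valence electrons remaining after filling the two-electron ring sigma bonds."""
    return VALENCE[element] * n_atoms - 2 * n_sigma_bonds


def huckel_n(count: int):
    """Return n if count equals 4n+2 for some n >= 0, otherwise None."""
    if count >= 2 and (count - 2) % 4 == 0:
        return (count - 2) // 4
    return None


# (sigma, pi) partitions of the leftover electrons, as quoted in the text.
PARTITION = {"C": (6, 2), "B": (2, 2)}

for element, (sigma, pi) in PARTITION.items():
    left = leftover_electrons(element)
    assert sigma + pi == left  # the partition must account for every leftover electron
    print(f"{element}4: {left} left over; sigma n={huckel_n(sigma)}, pi n={huckel_n(pi)}")
```

For C4 the six σ electrons correspond to n=1 and the two π electrons to n=0, whereas for B4 both sets correspond to n=0.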
An unusually small (doubly) aromatic molecule: C4.
March 15th, 2022
When you talk π-aromaticity, benzene is the first molecule that springs to mind. But there are smaller molecules that can carry this property; cyclopropenylidene (five atoms) is the smallest in terms of atom count I could think of until now, apart, that is, from H3+, which is the smallest possible molecule that carries σ-aromaticity. So here I have found what I think is an even smaller aromatic molecule containing only four carbon atoms. And it is not only π-aromatic but σ-aromatic.
Raw data: the evolution of FAIR data and crystallography.
March 1st, 2022
Scientific data in chemistry has come a long way in the last few decades. Originally entangled into scientific articles in the form of tables of numbers or diagrams, it was (partially) disentangled into supporting information when journals became electronic in the late 1990s.[cite]10.1021/acs.orglett.5b01700[/cite] The next phase was the introduction of data repositories in the early noughties. Now associated with innovative commercial companies such as Figshare and later the non-commercial Zenodo, such repositories have also spread to institutional form, such as the earlier SPECTRa project of 2006,[cite]10.1021/ci7004737[/cite] and are still evolving.[cite]10.1186/s13321-017-0190-6[/cite] Perhaps the best known, and certainly one of the oldest, examples of curated structural data in chemistry is the CCDC (Cambridge Crystallographic Data Centre) CSD (Cambridge Structural Database), which has been operating for more than 55 years now, even before the online era! Curation is the important context here, since there you will find crystal diffraction data which has been refined into a structural model, firstly by the authors reporting the structure and then by the CSD, which, amongst other operations, validates the associated data using a utility called CheckCIF.[cite]10.1107/s090744490804362x[/cite] What is perhaps not realised by most users of this data source is that the original or “raw” data, as obtained from an X-ray diffractometer and from which the CSD data is derived, is not actually available from the CSD. This primary form of crystallographic data is the topic of this post.
Chasing ever higher bond orders; the strange case of beryllium.
February 7th, 2022
Ever since the concept of a shared two-electron bond was conjured by Gilbert N. Lewis in 1916,[cite]10.1021/ja02261a002[/cite] chemists have been fascinated by the related concept of a bond order (the number of such bonds that two atoms can participate in, however a bond is defined) and by pushing it ever higher for pairs of like atoms. Lewis first showed in 1916[cite]10.1021/ja02261a002[/cite] how two carbon atoms could share two, four or six electrons to achieve a bond order of up to three. It took quite a few decades for this to be extended to four for carbon (and nitrogen), and that only with some measure of controversy and dispute (for one recent brief summary, see[cite]10.1039/D1CP02056K[/cite]).
Data base or Data repository? – A brief and very selective history of data management in chemistry.
January 26th, 2022
Way back in the late 1980s or so, research groups in chemistry started to replace the filing of their paper-based research data by storing it in an easily retrievable digital form. This required a computer database, and initially these were accessible only on specific dedicated computers in the laboratory. From the 1990s onwards these gradually became accessible online, so that more than one person could use them in different locations. At least where I worked, the infrastructures‡ to set up such databases were mostly not then available as part of the standard research provisions and so had to be installed and maintained by the group itself. The database software took many different forms and it was not uncommon for each group in a department to come up with a different solution that suited its needs best. The result was a proliferation of largely non-interoperable solutions which did not communicate with each other. Each database had to be searched locally and there could be ten or more such resources in a department. The knowledge of how the system operated also often resided in just one person, and tended to evaporate when this guru left the group.
Quantum chemistry interoperability (library): another step towards FAIR data.
January 1st, 2022
To be FAIR, data has to be not only Findable and Accessible, but straightforwardly Interoperable. One of the best examples of interoperability in chemistry comes from the domain of quantum chemistry. This strives to describe a molecule by its electron density distribution, from which many interesting properties can then be computed. The process is split into two parts:
Molecule of the year 2021: Infinitene.
December 16th, 2021
The annual “molecule of the year” results for 2021 are now available … and the winner is Infinitene.[cite]10.33774/chemrxiv-2021-pcwcc[/cite],[cite]10.1021/jacs.1c10807[/cite] This is a benzocirculene in the form of a figure-eight loop (the infinity symbol), a shape which is also called a lemniscate[cite]10.1021/jo801022b[/cite] after the mathematical (2D) function due to Bernoulli. The most common class of molecule which exhibits this (well known) motif is the hexaphyrins (hexaporphyrins; porphyrin is a tetraphyrin),[cite]10.1039/b502327k[/cite],[cite]10.1021/ol0521937[/cite],[cite]10.1002/chem.200600158[/cite] many of which exhibit lemniscular topology as determined from a crystal structure. Straightforward annulenes have also been noted to display this[cite]10.1107/S1600536811048604[/cite] (as first suggested here for a [14]annulene[cite]10.1021/ol0518333[/cite]) and other molecules show higher-order Möbius forms such as trefoil knots.[cite]10.1038/NCHEM.1955[/cite],[cite]10.1039/D0CC04190D[/cite] This new example uses twelve benzo groups instead of six porphyrin units to construct the lemniscate. So the motif is not new, but this is the first time it has been constructed purely from benzene rings.