Table 1 Atom and bond properties, and currently reserved extensions, used by the SketchEl molecule format

From: Machines first, humans second: on the importance of algorithmic interpretation of open chemistry data

Atom core properties

Element

An arbitrary string, which typically matches one of the symbols from the periodic table. If not an element, and there is no inline abbreviation for the atom, then the overall representation does not encode a molecule, but rather a template or query.

x, y

2D layout positions, in quasi-Angstrom units, with the idealised bond length being 1.5.

Charge

Formal atomic charge for the chemical species: must be an integer.

Unpaired

Number of unpaired electrons: a whole number. This is used to help calculate the valence, and is primarily relevant only for main block elements.

Virtual hydrogens

By default, implicit hydrogen atoms are calculated automatically for C, N, O, P and S, and are zero for all other elements. Non-default values allow the number of extra hydrogens to be specified explicitly, as 0 or more.

Extensions

An arbitrary list of strings associated with the atom, some of which have prefixes that are reserved (see below).

Bond core properties

from, to

The two connecting atoms for the bond.

Order

Bond order: a whole number, which is typically one of 0, 1, 2, 3, 4 or 5. Values of 4 and 5 are extremely rare, while values of 0 are used extensively for bonding arrangements that do not follow the simple Lewis octet rule.

Stereo type

Flat by default, but can also be inclined or declined (so-called wedge bonds) or non-stereospecific (usually drawn as squiggly lines).

Extensions

An arbitrary list of strings associated with the bond, some of which have prefixes that are reserved (see below).

Atom reserved extension properties

z

Optional third dimension: the existence of z-coordinates implies that the molecule is not a flat 2D depiction but rather a 3D conformation.

Isotope

Specific isotope enrichment, where the default value of 0 implies a natural isotope distribution.

Mapping number

Integer mapping number associated with the atom. This can be used for any purpose, but is often for correlating atoms in a series or a reaction.

Query

Query properties used to specify how to match a variety of atom types.

Abbreviation

Inline abbreviation, containing a terminal substructure fragment that defines the entire molecular species that the placeholder atom represents. Can be recursive, i.e. the abbreviation can contain its own abbreviations.

Bond reserved extension properties

Query

Query properties used to specify how to match a variety of bond types.
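
As a rough illustration of how the core properties in the table might map onto an in-memory data structure, the following Python sketch models atoms and bonds as simple records. It is not the SketchEl implementation: the class and field names, the `ABBREV_PREFIX` value, the deliberately partial `KNOWN_ELEMENTS` set, and the use of integer atom indices for `from`/`to` are all assumptions made for this example, since the table does not spell out the reserved prefixes or the indexing scheme.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional

# Hypothetical prefix marking an inline abbreviation among the extension
# strings; the table does not state the actual reserved prefixes.
ABBREV_PREFIX = "a"

# Deliberately partial element list, just enough for the example.
KNOWN_ELEMENTS = {"H", "C", "N", "O", "P", "S", "F", "Cl", "Br", "I"}


class Stereo(Enum):
    FLAT = "flat"          # default
    INCLINED = "inclined"  # wedge bond
    DECLINED = "declined"  # wedge bond
    UNKNOWN = "unknown"    # non-stereospecific (squiggly line)


@dataclass
class Atom:
    element: str                     # arbitrary string, usually an element symbol
    x: float                         # 2D layout position
    y: float
    charge: int = 0                  # formal charge, any integer
    unpaired: int = 0                # unpaired electrons, whole number
    hydrogens: Optional[int] = None  # None means "calculate automatically"
    extensions: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.unpaired < 0:
            raise ValueError("unpaired must be a whole number")
        if self.hydrogens is not None and self.hydrogens < 0:
            raise ValueError("explicit hydrogens must be 0 or more")

    def is_molecular(self) -> bool:
        """True if the atom is an element or carries an inline abbreviation."""
        if self.element in KNOWN_ELEMENTS:
            return True
        return any(ext.startswith(ABBREV_PREFIX) for ext in self.extensions)


@dataclass
class Bond:
    from_atom: int                   # index of the first connected atom (assumed)
    to_atom: int                     # index of the second connected atom (assumed)
    order: int = 1                   # whole number, typically 0 to 5
    stereo: Stereo = Stereo.FLAT
    extensions: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.order < 0:
            raise ValueError("bond order must be a whole number")
```

Under these assumptions, `Atom("R", 1.5, 0.0).is_molecular()` is false, which corresponds to the table's note that a non-element atom without an inline abbreviation makes the overall structure a template or query rather than a molecule.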