Fig. 14 | Journal of Cheminformatics

Fig. 14

From: PubChem chemical structure standardization

Standardization time statistics. Time was measured as wall time on a mixed-use, heterogeneous compute cluster. a Per-substance standardization time as a non-cumulative histogram. For each bin, the lower (inclusive) and upper (exclusive) boundaries are given. Where the bin units change from seconds to minutes, the boundary of 0.17 min corresponds to 10 s. b Cumulative standardization time per substance (substances sorted by ascending standardization time). 97.95% of all substances account for only 10% of the total standardization time. Within those 97.95%, the average standardization time is 0.0019 s (± 0.0012 s). c Average contribution of each standardization step to the per-substance standardization time. Standardization steps are numbered with Roman numerals: verify element (I), verify hydrogen (II), verify functional groups (III), verify valence (IV), standardize annotations (V), standardize valence bond form (VI), standardize aromaticity (VII), standardize stereochemistry (VIII), standardize explicit hydrogens (IX). The per-substance standardization time is dominated by step (VI), which performs valence bond canonicalization (44 ± 12%).
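
The statistic reported for panel b can be illustrated with a minimal sketch: sort the per-substance times in ascending order, accumulate them, and count how many substances fit within a given share of the total time. The function name `fraction_within_time_share` and the sample timings are hypothetical and are not taken from the article; this is one plausible way to compute such a figure, not necessarily the authors' procedure.

```python
from itertools import accumulate


def fraction_within_time_share(times, share=0.10):
    """Fraction of items, taken in ascending order of time, whose
    cumulative time stays within `share` of the total time."""
    if not times:
        raise ValueError("times must be non-empty")
    ordered = sorted(times)
    budget = share * sum(ordered)
    count = 0
    # Walk the running total and stop once the time budget is exceeded.
    for running_total in accumulate(ordered):
        if running_total > budget:
            break
        count += 1
    return count / len(ordered)


# Hypothetical wall times in seconds (illustrative only, not the paper's data):
# 98 fast substances and 2 slow ones.
sample_times = [0.001] * 98 + [0.5, 1.0]
fast_fraction = fraction_within_time_share(sample_times, share=0.10)
print(f"{fast_fraction:.2%} of substances use 10% of the total time")
```

With this made-up sample, the 98 fast substances together stay under the 10% budget while the first slow substance exceeds it, which mirrors the strongly skewed distribution described in the caption.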
