Table 5 Summary of systematic benchmark comparing v1.4.19 to v2.0

From: The Chemistry Development Kit (CDK) v2.0: atom typing, depiction, molecular formulas, and substructure searching

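| Benchmark | Data set | Input | v1.4.19 Skip | v1.4.19 Time | v1.4.19 Per min | v2.0 Skip | v2.0 Time | v2.0 Per min | Improvement |
|---|---|---|---|---|---|---|---|---|---|
| countheavy | ChEBI 149 | smi | 2112 | 22.51s | 108.2K | 9 | 0.85s | 2.9M | 26.48 |
| countheavy | ChEBI 149 | sdf | 0 | 7.21s | 355.4K | 25 | 3s | 854.1K | 2.4 |
| countheavy | ChEMBL 22.1 | smi | 0 | 8m39.3s | 193.9K | 9 | 10.74s | 9.4M | 48.35 |
| countheavy | ChEMBL 22.1 | sdf | 0 | 3m17.29s | 510.4K | 0 | 53.27s | 1.9M | 3.7 |
| rings -mark | ChEBI 149 | smi | 2112 | 22.91s | 106.3K | 9 | 1.06s | 2.3M | 21.61 |
| rings -mark | ChEBI 149 | sdf | 0 | 8.71s | 294.2K | 25 | 3.11s | 823.9K | 2.8 |
| rings -mark | ChEMBL 22.1 | smi | 0 | 8m45.78s | 191.5K | 9 | 17.09s | 5.9M | 30.77 |
| rings -mark | ChEMBL 22.1 | sdf | 0 | 4m12.01s | 399.6K | 0 | 1m6.54s | 1.5M | 3.79 |
| rings -sssr | ChEBI 149 | smi | 2112 | 27.4s | 88.9K | 9 | 1.43s | 1.7M | 19.16 |
| rings -sssr | ChEBI 149 | sdf | 0 | 11.84s | 216.4K | 25 | 3.78s | 677.8K | 3.13 |
| rings -sssr | ChEMBL 22.1 | smi | 0 | 12m4.62s | 139K | 9 | 27.16s | 3.7M | 26.68 |
| rings -sssr | ChEMBL 22.1 | sdf | 0 | 7m9.58s | 234.4K | 0 | 1m8.17s | 1.5M | 6.3 |
| rings -all | ChEBI 149 | smi | 2126 | 45.28s | 53.8K | 26 | 1.26s | 1.9M | 35.94 |
| rings -all | ChEBI 149 | sdf | 16 | 36.56s | 70.1K | 40 | 3.51s | 730K | 10.42 |
| rings -all | ChEMBL 22.1 | smi | 88 | 12m40.2s | 132.5K | 9 | 24.97s | 4M | 30.44 |
| rings -all | ChEMBL 22.1 | sdf | 90 | 8m5.64s | 207.4K | 0 | 1m5.68s | 1.5M | 7.39 |
| cansmi | ChEBI 149 | smi | 2112 | 36.58s | 66.6K | 9 | 1.91s | 1.3M | 19.15 |
| cansmi | ChEBI 149 | sdf | 35 | 21.15s | 121.1K | 26 | 4.37s | 586.3K | 4.84 |
| cansmi | ChEMBL 22.1 | smi | 14 | 14m33.86s | 115.2K | 9 | 40.84s | 2.5M | 21.4 |
| cansmi | ChEMBL 22.1 | sdf | 0 | 8m59.82s | 186.6K | 0 | 1m29.33s | 1.1M | 6.04 |
| convert -ofmt smi | ChEBI 149 | smi | 2112 | 35.63s | 68.4K | 16 | 1.47s | 1.7M | 24.24 |
| convert -ofmt smi | ChEBI 149 | sdf | 35 | 20.91s | 122.5K | 25 | 4.55s | 563.1K | 4.6 |
| convert -ofmt smi | ChEMBL 22.1 | smi | 14 | 14m26.02s | 116.3K | 37 | 26.2s | 3.8M | 33.05 |
| convert -ofmt smi | ChEMBL 22.1 | sdf | 0 | 8m59.38s | 186.7K | 1 | 1m12.49s | 1.4M | 7.44 |
| convert -ofmt sdf | ChEBI 149 | smi | 2112 | 32.42s | 75.1K | 9 | 10.39s | 234.4K | 3.12 |
| convert -ofmt sdf | ChEBI 149 | sdf | 13 | 17s | 150.7K | 25 | 13.96s | 183.5K | 1.22 |
| convert -ofmt sdf | ChEMBL 22.1 | smi | 0 | 14m25.82s | 116.3K | 9 | 5m26.29s | 308.6K | 2.65 |
| convert -ofmt sdf | ChEMBL 22.1 | sdf | 1 | 8m51.33s | 189.5K | 0 | 6m34.5s | 255.3K | 1.35 |
| convert -gen2d -ofmt sdf | ChEBI 149 | smi | 2112 | 24m28.02s | 1.7K | 9 | 35.86s | 67.9K | 40.94 |
| convert -gen2d -ofmt sdf | ChEBI 149 | sdf | 13 | 35m12.03s | 1.2K | 25 | 42.43s | 60.4K | 49.78 |
| convert -gen2d -ofmt sdf | ChEMBL 22.1 | smi | 0 | 3h27m7s | 8.1K | 9 | 17m44.64s | 94.6K | 11.67 |
| convert -gen2d -ofmt sdf | ChEMBL 22.1 | sdf | 1 | 5h58m30s | 4.7K | 0 | 19m42.77s | 85.1K | 18.19 |
| fpgen -type path | ChEBI 149 | smi | 2112 | 1m38s | 24.9K | 9 | 10.28s | 236.9K | 9.53 |
| fpgen -type path | ChEBI 149 | sdf | 0 | 2m11.03s | 19.6K | 25 | 13.03s | 196.6K | 10.06 |
| fpgen -type path | ChEMBL 22.1 | smi | 0 | 42m56.15s | 39.1K | 9 | 6m34.67s | 255.2K | 6.53 |
| fpgen -type path | ChEMBL 22.1 | sdf | 0 | 47m5.58s | 35.6K | 0 | 7m52.32s | 213.2K | 5.98 |
| fpgen -type maccs | ChEBI 149 | smi | 2150 | 1h37m35s | 416 | 9 | 19.51s | 124.8K | 300.1 |
| fpgen -type maccs | ChEBI 149 | sdf | 48 | 1h44m17s | 409 | 25 | 21.25s | 120.6K | 294.45 |
| fpgen -type maccs | ChEMBL 22.1 | smi | 214 | 20h24m57s | 1.4K | 9 | 13m31.21s | 124.1K | 90.6 |
| fpgen -type maccs | ChEMBL 22.1 | sdf | 225 | 24h41m46s | 1.1K | 0 | 13m26.41s | 124.9K | 110.25 |
| fpgen -type circ | ChEBI 149 | smi | 0 | | | 9 | 4.37s | 557.4K | 0 |
| fpgen -type circ | ChEBI 149 | sdf | 0 | | | 25 | 6.81s | 376.2K | 0 |
| fpgen -type circ | ChEMBL 22.1 | smi | 0 | | | 9 | 2m43.45s | 616.1K | 0 |
| fpgen -type circ | ChEMBL 22.1 | sdf | 0 | | | 0 | 3m42.01s | 453.6K | 0 |
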
  1. The total elapsed real time was measured with the Unix time utility. The throughput is reported in molecules per minute (K = thousand, M = million) as a relatable metric. It was calculated by dividing the number of molecules in the dataset (42704 for ChEBI 149, and 1678393 for ChEMBL 22.1) by the total elapsed time. The ChEBI SMILES input contains 2107 blank (but valid) inputs, which account for the majority of the molecules skipped in v1.4.19. The throughput calculation was adjusted to account for this.
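
As a cross-check on how the Per min values relate to the Time values, the following is a minimal Python sketch, not code from the CDK or from the benchmark itself. The function names (`parse_elapsed`, `per_minute`, `humanize`) are hypothetical. It assumes that the adjustment for blank inputs means subtracting the 2107 blank entries from the ChEBI SMILES molecule count, which is consistent with the tabulated values. The Improvement column appears to match the ratio of the two elapsed times.

```python
import re

# Durations as printed in the table, e.g. "22.51s", "8m39.3s", "1h37m35s".
_DURATION = re.compile(r"^(?:(\d+)h)?(?:(\d+)m)?(?:(\d+(?:\.\d+)?)s)?$")


def parse_elapsed(text):
    """Convert a table-style duration string to seconds."""
    match = _DURATION.match(text.strip())
    if not match or not any(match.groups()):
        raise ValueError(f"unrecognised duration: {text!r}")
    hours, minutes, seconds = match.groups()
    return int(hours or 0) * 3600 + int(minutes or 0) * 60 + float(seconds or 0)


def per_minute(n_molecules, elapsed_seconds, n_blank=0):
    """Molecules per minute; n_blank is the assumed blank-input adjustment."""
    return (n_molecules - n_blank) / elapsed_seconds * 60.0


def humanize(rate):
    """Format a rate in the table's style: 416, 108.2K, 139K, 2.9M."""
    for threshold, suffix in ((1e6, "M"), (1e3, "K")):
        if rate >= threshold:
            text = f"{rate / threshold:.1f}"
            if text.endswith(".0"):
                text = text[:-2]  # the table prints 139K, not 139.0K
            return text + suffix
    return f"{rate:.0f}"


CHEBI, CHEMBL, CHEBI_BLANK = 42704, 1678393, 2107

if __name__ == "__main__":
    # countheavy, ChEBI 149, smi input, v1.4.19 (blank inputs subtracted)
    print(humanize(per_minute(CHEBI, parse_elapsed("22.51s"), CHEBI_BLANK)))
    # countheavy, ChEBI 149, sdf input, v1.4.19 (no adjustment)
    print(humanize(per_minute(CHEBI, parse_elapsed("7.21s"))))
    # countheavy, ChEMBL 22.1, smi input, v1.4.19
    print(humanize(per_minute(CHEMBL, parse_elapsed("8m39.3s"))))
    # Improvement as the ratio of elapsed times (v1.4.19 / v2.0)
    print(round(parse_elapsed("22.51s") / parse_elapsed("0.85s"), 2))
```

Run as a script, this prints 108.2K, 355.4K, 193.9K and 26.48, matching the corresponding countheavy entries in the table.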