Table 1 Overview of indexing and searching errors

From: Sachem: a chemical cartridge for high-performance substructure search

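| Cartridge | Indexing failures (1M) | Indexing failures (10M) | Indexing failures (94M) | Rejected queries (n) | Rejected queries (reason) |
|---|---|---|---|---|---|
| Bingo | 105 | 1024 | 9754 | 0 | |
| OrChem | 2 | 4 | | 12 | Unsupported aromatic bond in SMILES |
| pgchem | 30 | 255 | 2527 | 146 | Fragmented SMILES, queries with [*] |
| RDKit | 72 | 707 | 6911 | 4 | Chemical structure considered invalid |
| Sachem | 0 | 0 | 0 | 0 | |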
 
  1. Measurements are slightly influenced by the indexing and searching errors that some cartridges exhibited during benchmarking. Indexing errors are primarily reported as unacceptable data in the PubChem SDF files, most frequently invalid atom valences or stereochemistry. Note that the beta version of Bingo can lower the number of indexing errors by using algorithms that work with ‘incorrect’ structures (this feature is disabled by default).