From: canSAR chemistry registration and standardization pipeline
| | SureChEMBL (SI1) | PubChem (SI2) | ChEMBL literature (SI3) |
---|---|---|---|
Structures # | 52,074 | 297,864 | 147,008 |
ChEMBL pipeline errors (not uploaded structures) | 849 (1.6%) | 10,692 (3.59%) | 0 |
ChEMBL uploaded structures | 51,225 (98.37%) | 287,172 (96.41%) | 100% |
canSAR pipeline errors (rejected structures) | 114 (0.22%) | 7,431 (2.5%) | 3 (0.002%) |
SDF parsing errors | 0 | 0 | 0 |
Sanitization errors | 110 | 1,540 | 2 |
Standardization errors | 4 | 67 | 0 |
Empty molblock | 0 | 5,824 | 1 |
canSAR accepted structures | 51,960 (99.78%) | 290,433 (97.5%) | 147,005 (99.99%) |
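
The counts in the table can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of either pipeline: the dictionary layout and the `reconcile` helper are hypothetical names introduced here, and it assumes that every percentage is taken relative to the total number of structures in its column, which is consistent with the reported figures (for example, accepted = total − rejected).

```python
# Counts transcribed from the table above, keyed by dataset label.
# The structure of this dictionary is an assumption made for illustration.
TABLE = {
    "SI1": {
        "total": 52_074, "chembl_errors": 849, "cansar_rejected": 114,
        "breakdown": {"sdf_parsing": 0, "sanitization": 110,
                      "standardization": 4, "empty_molblock": 0},
    },
    "SI2": {
        "total": 297_864, "chembl_errors": 10_692, "cansar_rejected": 7_431,
        "breakdown": {"sdf_parsing": 0, "sanitization": 1_540,
                      "standardization": 67, "empty_molblock": 5_824},
    },
    "SI3": {
        "total": 147_008, "chembl_errors": 0, "cansar_rejected": 3,
        "breakdown": {"sdf_parsing": 0, "sanitization": 2,
                      "standardization": 0, "empty_molblock": 1},
    },
}


def reconcile(row):
    """Derive uploaded/accepted counts and error rates for one dataset."""
    total = row["total"]
    rejected = row["cansar_rejected"]
    # The four canSAR error categories should add up to the rejected total.
    if sum(row["breakdown"].values()) != rejected:
        raise ValueError("error categories do not sum to rejected total")
    return {
        "chembl_uploaded": total - row["chembl_errors"],
        "cansar_accepted": total - rejected,
        "chembl_error_pct": 100 * row["chembl_errors"] / total,
        "cansar_rejected_pct": 100 * rejected / total,
    }


if __name__ == "__main__":
    for name, row in TABLE.items():
        result = reconcile(row)
        print(name,
              f"ChEMBL uploaded: {result['chembl_uploaded']:,}",
              f"canSAR accepted: {result['cansar_accepted']:,}",
              f"canSAR rejected: {result['cansar_rejected_pct']:.3f}%",
              sep=" | ")
```

Under that assumption, the derived uploaded and accepted counts match the table for all three datasets, and the four canSAR error categories sum to the rejected totals in each column.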