From: canSAR chemistry registration and standardization pipeline
| PubChem Add_File_4 | | |
|---|---|---|
| Number of structures | | 375,397 |
| PubChem pipeline errors (rejected compounds) | | 375,397 (100%) |
| Errors found by PubChem Checker | Invalid isotope specifications | 141 |
| | Valence check | 364,946 (97.22%) |
| | Identical charges on adjacent atoms or invalid valence after valence bond canonicalization | 10,243 |
| | Exceeds the limit of 999 explicit atoms | 65 |
| canSAR pipeline errors (rejected compounds) | | 285,552 (76.07%) |
| SDF parsing errors | | 0 |
| Sanitization errors | | 270,131 (71.96%) |
| Standardization errors | | 2,954 (0.78%) |
| Empty molblock | | 12,467 (3.37%) |
| canSAR accepted structures | | 89,845 (23.93%) |
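
The canSAR rows of the table can be cross-checked with simple arithmetic: the four rejection categories should add up to the rejected total, and rejected plus accepted should account for every input structure. The following is a minimal sketch of that check; the counts are copied from the table, while the names `TOTAL`, `CANSAR_REJECTED`, `CANSAR_ACCEPTED`, and `percent` are illustrative and not part of the canSAR pipeline.

```python
# Counts copied from the table (PubChem Add_File_4).
# All identifiers here are illustrative, not canSAR pipeline names.
TOTAL = 375_397
CANSAR_REJECTED = {
    "SDF parsing errors": 0,
    "Sanitization errors": 270_131,
    "Standardization errors": 2_954,
    "Empty molblock": 12_467,
}
CANSAR_ACCEPTED = 89_845


def percent(count: int, total: int = TOTAL) -> float:
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100 * count / total, 2)


rejected = sum(CANSAR_REJECTED.values())

# The rejection categories add up to the reported rejected total,
# and rejected + accepted accounts for every input structure.
assert rejected == 285_552
assert rejected + CANSAR_ACCEPTED == TOTAL

print(f"rejected: {rejected:,} ({percent(rejected)}%)")
print(f"accepted: {CANSAR_ACCEPTED:,} ({percent(CANSAR_ACCEPTED)}%)")
```

Running it prints the rejected and accepted shares as reported in the table, 76.07% and 23.93% of the 375,397 input structures.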