Fig. 2 | Journal of Cheminformatics

Fig. 2

From: KnowTox: pipeline and case study for confident prediction of potential toxic effects of compounds in early phases of development

Schematic description of the conformal prediction (CP) workflow. The data are split into a training set and a test set (blue box). The training set is further divided into a calibration set (red box) and a proper training set (violet box). An ML model is fitted on the proper training set and used to predict the compounds of the calibration and test sets. The predictions are transformed into nonconformity scores (nc scores). Calibration is performed by sorting the nc scores of the calibration set class-wise (Mondrian) into two lists, one per class. The nc score of a test compound is then placed in the sorted list of the corresponding class, and its p-value is calculated from its position in that list. Optionally, an additional normaliser model (green box) can be fitted on the descriptors and nc scores of the compounds of the proper training set.
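
The calibration and p-value steps described in the caption can be sketched in a few lines of Python. This is a minimal illustration, not the KnowTox implementation: the function names are hypothetical, the nc score is assumed to be one minus the predicted probability of the class (a common choice that the caption does not specify), the p-value uses the usual (count + 1) / (n + 1) convention, and the optional normaliser model is left out. The class probabilities are assumed to come from an ML model that was already fitted on the proper training set.

```python
from bisect import bisect_left


def nonconformity(prob_of_class):
    """Assumed nc score: 1 minus the predicted probability of the class."""
    return 1.0 - prob_of_class


def calibrate(cal_probs, cal_labels, n_classes=2):
    """Build one sorted list of nc scores per class (Mondrian calibration)."""
    lists = [[] for _ in range(n_classes)]
    for probs, label in zip(cal_probs, cal_labels):
        # Each calibration compound contributes only to the list of its true class.
        lists[label].append(nonconformity(probs[label]))
    return [sorted(scores) for scores in lists]


def p_values(test_probs, cal_lists):
    """Return one p-value per class for a single test compound."""
    result = []
    for label, scores in enumerate(cal_lists):
        alpha = nonconformity(test_probs[label])
        # Place alpha in the sorted list; everything from that index onward
        # is a calibration score greater than or equal to alpha.
        n_greater_equal = len(scores) - bisect_left(scores, alpha)
        result.append((n_greater_equal + 1) / (len(scores) + 1))
    return result


def prediction_set(p, significance):
    """Classes whose p-value exceeds the chosen significance level."""
    return [label for label, value in enumerate(p) if value > significance]


# Toy calibration set: predicted class probabilities and true labels.
cal_probs = [[0.9, 0.1], [0.6, 0.4], [0.8, 0.2], [0.3, 0.7], [0.45, 0.55]]
cal_labels = [0, 0, 0, 1, 1]
cal_lists = calibrate(cal_probs, cal_labels)

# One test compound with predicted probabilities for class 0 and class 1.
p = p_values([0.7, 0.3], cal_lists)
print(p)
print(prediction_set(p, significance=0.2))
```

With this toy data the test compound receives a p-value of 0.5 for class 0 and 1/3 for class 1, so both classes are in the prediction set at a significance level of 0.2, and only class 0 remains at 0.4. Keeping a separate list per class is what makes the calibration Mondrian: each p-value is computed only against calibration compounds of the same class.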
