
Table 2 Measures for evaluating the quality of OpenTox models

From: Collaborative development of predictive toxicology applications

Measures for Classification Tasks

Confusion Matrix: A confusion matrix is a table in which each row represents the instances in a predicted class and each column represents the instances in an actual class. One benefit of a confusion matrix is that it makes it easy to see whether the system is confusing two or more classes.
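
As a purely illustrative sketch (scikit-learn is assumed here; it is not part of the OpenTox services, and the labels and predictions are invented), a confusion matrix for a binary toxicity classifier could be computed as follows:

```python
# Hypothetical example: confusion matrix for a binary toxicity classifier.
from sklearn.metrics import confusion_matrix

y_true = ["toxic", "toxic", "non-toxic", "non-toxic", "toxic"]      # actual classes
y_pred = ["toxic", "non-toxic", "non-toxic", "non-toxic", "toxic"]  # predicted classes

# Note: scikit-learn uses rows = actual class and columns = predicted class,
# i.e. the transpose of the row = predicted / column = actual layout described above.
cm = confusion_matrix(y_true, y_pred, labels=["toxic", "non-toxic"])
print(cm)
# [[2 1]
#  [0 2]]
```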

Absolute number and percentage of unpredicted compounds: Some compounds might fall outside the applicability domain of the algorithm or model. These numbers provide an overview of how well the applicability domain fits the compound set that requires prediction.

Precision, recall, and F2-measure: These measures give an overview of how pure (precision) and how sensitive (recall) the model is. The F2-measure combines the two, weighting recall more heavily than precision.
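
A minimal sketch of these three measures, again assuming scikit-learn and invented labels rather than OpenTox code:

```python
# Hypothetical example: precision, recall and F2-measure for a binary classifier.
from sklearn.metrics import precision_score, recall_score, fbeta_score

y_true = [1, 1, 1, 0, 0, 0, 1, 0]  # 1 = toxic, 0 = non-toxic (actual)
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]  # predicted

precision = precision_score(y_true, y_pred)  # TP / (TP + FP): purity of positive predictions
recall = recall_score(y_true, y_pred)        # TP / (TP + FN): sensitivity to actual positives
f2 = fbeta_score(y_true, y_pred, beta=2)     # F-beta with beta = 2 favours recall over precision
print(precision, recall, f2)                 # 0.75 0.75 0.75 for this toy data
```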

ROC curve plot and AUC: A receiver operating characteristic (ROC) curve is a graphical plot of the true-positive rate against the false-positive rate as the model's discrimination threshold is varied. This gives a good picture of how well a model is performing. As a scalar summary of performance, the area under the curve (AUC) is calculated from the ROC curve: a perfect model has an AUC of 1.0, while a random one has an AUC of 0.5.
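
The ROC curve and its AUC can be sketched in the same spirit (hypothetical scores, scikit-learn assumed); a real evaluation would use the class probabilities returned by the model:

```python
# Hypothetical example: ROC curve points and AUC from predicted probabilities.
from sklearn.metrics import roc_curve, roc_auc_score

y_true = [0, 0, 1, 1, 0, 1]                # actual classes
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]  # predicted probability of class 1

fpr, tpr, thresholds = roc_curve(y_true, y_score)  # one (FPR, TPR) point per threshold
auc = roc_auc_score(y_true, y_score)               # 1.0 = perfect, 0.5 = random
print(auc)
```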

Measures for Regression Tasks

MSE and RMSE: The mean squared error (MSE) and root mean squared error (RMSE) of a regression model are popular ways to quantify the difference between the predicted and the true values.
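
A small numeric illustration with invented values, assuming scikit-learn rather than the OpenTox validation service:

```python
# Hypothetical example: MSE and RMSE for a regression model.
from math import sqrt
from sklearn.metrics import mean_squared_error

y_true = [2.5, 0.0, 2.1, 7.8]   # measured endpoint values
y_pred = [3.0, -0.5, 2.0, 8.0]  # model predictions

mse = mean_squared_error(y_true, y_pred)  # mean of the squared residuals
rmse = sqrt(mse)                          # same units as the endpoint itself
print(mse, rmse)                          # 0.1375 and about 0.371 here
```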

R²: The explained variance (R²) provides a measure of how well future outcomes are likely to be predicted by the model. It compares the variance explained by the model (the variance of its predictions) with the total variance of the data.
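
For completeness, a matching sketch for R² under the same assumptions; scikit-learn's r2_score computes the coefficient of determination R² = 1 - SS_res / SS_tot:

```python
# Hypothetical example: explained variance (R²) for a regression model.
from sklearn.metrics import r2_score

y_true = [3.0, -0.5, 2.0, 7.0]  # measured endpoint values
y_pred = [2.5, 0.0, 2.0, 8.0]   # model predictions

r2 = r2_score(y_true, y_pred)   # 1.0 = perfect fit, 0.0 = no better than predicting the mean
print(r2)
```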