Table 8 Performance of models on subsets of the compiled all-substructure set.

From: An investigation into pharmaceutically relevant mutagenicity data and the influence on Ames predictive potential

| Molecule set | Random Forest (a) | TopKat | Local (b, c) | Local (remaining data) |
| --- | --- | --- | --- | --- |
| Set C | 0.79 | 0.65 | 0.80 | 0.65 |
| Not polyaromatic, ArNH2, ArNO2 | 0.78 | 0.62 | - | |
| Set D | 0.88 | 0.90 | 0.89 | 0.69 |
| Not polyaromatic, ArNH2, ArNO2 | 0.86 | 0.86 | - | |
| Set E | 0.78 | 0.77 | 0.66 | 0.67 |
| Kazius et al. | 0.91 | 0.95 | - | |
| All | 0.90 (b) | 0.89 | - | |
| Nitroaromatics | 0.85 | 0.90 | - | |
| Aryl-amines (not nitroaromatic or polyaromatic) | 0.87 | 0.87 | - | |
| Polyaromatic (not nitroaromatic) | 0.75 | 0.86 | - | |
  1. The four columns denote a random forest model built on the full set of data, the default TopKat [76, 77] model, a local random forest model built only on the indicated set, and the performance of this local model on the rest of the compiled all-substructure set.
  2. (a) Global model, trained on all data; (b) OOB (out-of-bag) performance; (c) trained only on the particular set.
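
The OOB (out-of-bag) performance mentioned in the footnotes can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the toy one-feature data, the threshold "stump" base learner (standing in for the trees of a random forest), and the names `bootstrap_split`, `fit_stump` and `oob_accuracy` are all assumptions made for illustration. The point it shows is that each sample is scored only by the ensemble members whose bootstrap sample did not contain it, so no separate hold-out set is needed.

```python
import random
from collections import Counter


def bootstrap_split(n, rng):
    """Draw n indices with replacement; return (in_bag, out_of_bag) index lists."""
    in_bag = [rng.randrange(n) for _ in range(n)]
    chosen = set(in_bag)
    out_of_bag = [i for i in range(n) if i not in chosen]
    return in_bag, out_of_bag


def fit_stump(xs, ys):
    """Hypothetical base learner: the best single-threshold classifier on one feature."""
    best = None
    for threshold in sorted(set(xs)):
        for high_label in (0, 1):
            preds = [high_label if x >= threshold else 1 - high_label for x in xs]
            n_correct = sum(p == y for p, y in zip(preds, ys))
            if best is None or n_correct > best[0]:
                best = (n_correct, threshold, high_label)
    _, threshold, high_label = best
    return lambda x: high_label if x >= threshold else 1 - high_label


def oob_accuracy(xs, ys, n_models=50, seed=0):
    """Majority-vote accuracy where each sample is judged only by models that never saw it."""
    rng = random.Random(seed)
    n = len(xs)
    votes = [Counter() for _ in range(n)]
    for _ in range(n_models):
        in_bag, out_of_bag = bootstrap_split(n, rng)
        model = fit_stump([xs[i] for i in in_bag], [ys[i] for i in in_bag])
        for i in out_of_bag:
            votes[i][model(xs[i])] += 1
    # Samples that were in-bag for every model have no OOB votes and are skipped.
    scored = [(v.most_common(1)[0][0], ys[i]) for i, v in enumerate(votes) if v]
    if not scored:
        raise ValueError("no sample was ever out-of-bag; increase n_models")
    return sum(pred == truth for pred, truth in scored) / len(scored)


# Toy data (assumed, not from the paper): label is 1 when the feature is >= 0.5.
xs = [i / 20 for i in range(20)]
ys = [1 if x >= 0.5 else 0 for x in xs]
score = oob_accuracy(xs, ys)
print(f"OOB accuracy on toy data: {score:.2f}")
```

The "Local (remaining data)" column corresponds to a different kind of check: there the local model is applied to compounds outside its training set, rather than to the out-of-bag portion of the set it was trained on.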