
Table 2 Impact of individual groups of descriptors on random forest prediction of bond dissociation energies a

From: A big data approach to the ultra-fast prediction of DFT-calculated bond energies

 

| Descriptor set | No. of descriptors | RMSD | MAD | MaxError |
|---|---|---|---|---|
| Selection 2 b | 293 | 7.01 | 4.35 | 56.87 |
| Selection 2 - CN point | 209 | 9.06 | 5.50 | 80.97 |
| Selection 2 - Element pair | 218 | 7.18 | 4.48 | 57.13 |
| Selection 2 - Fragment point | 288 | 7.02 | 4.36 | 58.75 |
| Selection 2 - Aromatic fragment point | 279 | 7.07 | 4.39 | 62.14 |
| Selection 2 - In-ring fragment point | 281 | 7.03 | 4.36 | 55.31 |
| Selection 2 - No-type pair | 269 | 7.06 | 4.43 | 58.95 |
| Selection 2 - No-type bond-breaking difference pair | 284 | 6.96 | 4.32 | 57.70 |
| Selection 2 - π fragment point | 281 | 7.03 | 4.37 | 58.59 |
| Selection 2 - Molecular element pair | 263 | 6.96 | 4.31 | 57.86 |
| Selection 2 - Molecular CN fragment point | 265 | 7.04 | 4.35 | 57.77 |
| Selection 3 c | 209 | 7.00 | 4.32 | 58.36 |

  1. a Results are in kcal/mol and were obtained with the out-of-bag (OOB) RF validation procedure over the training set.
  2. b Combination of the following descriptors: 1) Connection number point descriptors, 2) Simple element pair descriptors, 3) Fragment point descriptors (only the field with the lower value is used), 4) Aromatic fragment point descriptors with corresponding molecular descriptors, 5) In-ring fragment point descriptors, 6) No-type pair descriptors, 7) No-type bond-breaking difference pair descriptors, 8) Conjugated π system fragment point descriptors with corresponding molecular descriptors, 9) Simple element molecular pair descriptors, 10) Molecular connection number fragment point descriptors. Descriptors 1, 4, and 5 are calculated for spheres from 0 to 5. Descriptors 2, 3, 6, and 8 are calculated for spheres from 1 to 5. Descriptor 2 only involves pairs at a distance of one bond. Descriptor 7 involves pairs with interatomic distances from 2 to 7. Descriptor 9 involves pairs with interatomic distances of 1–2.
  3. c Combination of Descriptors 1, 2, 4, 6, 8.
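
One way to read the table is as a leave-one-group-out comparison against Selection 2. The sketch below is not part of the original work; it only transcribes the values above and computes, for each row, how many descriptors were dropped and how RMSD, MAD, and MaxError changed relative to the Selection 2 baseline. It assumes that "Selection 2 - X" means Selection 2 with descriptor group X removed, which is consistent with the reduced descriptor counts. The names `BASELINE`, `ABLATIONS`, and `ablation_deltas` are illustrative.

```python
# Values transcribed from Table 2 (kcal/mol, OOB validation over the training set).
# Each tuple: (number of descriptors, RMSD, MAD, MaxError).
BASELINE = (293, 7.01, 4.35, 56.87)  # Selection 2

# Assumption: each entry is Selection 2 with the named descriptor group removed.
ABLATIONS = {
    "CN point": (209, 9.06, 5.50, 80.97),
    "Element pair": (218, 7.18, 4.48, 57.13),
    "Fragment point": (288, 7.02, 4.36, 58.75),
    "Aromatic fragment point": (279, 7.07, 4.39, 62.14),
    "In-ring fragment point": (281, 7.03, 4.36, 55.31),
    "No-type pair": (269, 7.06, 4.43, 58.95),
    "No-type bond-breaking difference pair": (284, 6.96, 4.32, 57.70),
    "π fragment point": (281, 7.03, 4.37, 58.59),
    "Molecular element pair": (263, 6.96, 4.31, 57.86),
    "Molecular CN fragment point": (265, 7.04, 4.35, 57.77),
}


def ablation_deltas(baseline, ablations):
    """Return {group: (descriptors removed, dRMSD, dMAD, dMaxError)}.

    Positive error deltas mean the model got worse without that group.
    """
    n0, rmsd0, mad0, max0 = baseline
    return {
        group: (
            n0 - n,
            round(rmsd - rmsd0, 2),
            round(mad - mad0, 2),
            round(mx - max0, 2),
        )
        for group, (n, rmsd, mad, mx) in ablations.items()
    }


deltas = ablation_deltas(BASELINE, ABLATIONS)

# Rank groups by how much RMSD increases when the group is removed.
ranked = sorted(deltas.items(), key=lambda kv: kv[1][1], reverse=True)
for group, (removed, d_rmsd, d_mad, d_max) in ranked:
    print(
        f"{group:40s} -{removed:3d} descriptors  "
        f"dRMSD={d_rmsd:+.2f}  dMAD={d_mad:+.2f}  dMaxError={d_max:+.2f}"
    )
```

Read this way, removing the CN point descriptors has by far the largest effect (RMSD rises from 7.01 to 9.06 kcal/mol), while most other groups change RMSD by less than 0.2 kcal/mol.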