jCompoundMapper: An open source Java library and command-line tool for chemical fingerprints

Table 4 Nested Leave-one-out MSE Performance on Sutherland Data Sets

Encoding	ACE	ACHE	BZR	COX2	DHFR	GPB	THERM	THR
DFS	1.93	0.61	0.61	1.09	0.57	0.66	2.08	0.55
ASP	1.93	0.56	0.54	1.07	0.56	0.63	2.08	0.53
AP2D	1.59	0.79	0.68	1.04	0.72	0.64	2.06	0.45
AT2D	1.68	0.69	0.67	0.92	0.69	0.65	1.92	0.45
CATS2D	1.83	0.83	0.96	1.34	0.65	0.62	2.20	0.45
PHAP2PT2D	1.88	0.92	0.98	1.37	0.67	0.62	2.10	0.46
PHAP3PT2D	1.83	0.91	0.85	1.20	0.66	0.58	2.04	0.50
SHED	2.11	1.00	1.13	1.71	1.41	0.72	2.94	0.43
ECFP	2.01	0.66	0.66	0.96	0.57	0.65	2.17	0.47
RAD2D	1.99	0.73	0.79	1.03	0.72	0.67	2.18	0.43
LSTAR	2.29	0.66	0.68	1.00	0.60	0.71	2.31	0.46
AP3D	1.88	0.64	0.59	0.90	0.67	0.70	2.61	0.54
AT3D	2.04	0.60	0.65	0.97	0.58	0.71	2.70	0.59
CATS3D	1.91	0.85	0.84	1.26	0.70	0.74	2.62	0.58
PHAP2PT3D	1.92	0.81	0.85	1.30	0.70	0.76	2.72	0.62
PHAP3PT3D	2.40	0.73	0.81	1.11	0.59	0.82	2.82	0.67
RAD3D	2.43	0.73	0.73	1.04	0.57	0.75	2.75	0.55

Overview of ϵ-support vector regression performance on the Sutherland data sets using the MinMax kernel function, reported as mean squared error (MSE; lower is better). Performance was evaluated using nested leave-one-out cross-validation, with parameter optimization by 10-fold cross-validation repeated twice. Bold values indicate encodings whose MSE is no more than 10% worse than that of the best-performing encoding.
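Two elements of the caption can be made concrete with a short sketch. The caption names the MinMax kernel without defining it; the code below assumes the commonly used definition for non-negative count vectors, the sum of feature-wise minima divided by the sum of feature-wise maxima. It also assumes that the 10% rule means an MSE of at most 1.10 times the lowest MSE in the same column. The function names, the feature keys, and the choice to return 0.0 for two empty fingerprints are hypothetical and are not taken from jCompoundMapper. The MSE values are the rounded ACE entries from the table, so borderline cases could differ from the bold marking in the original.

```python
from typing import Dict, Mapping


def minmax_kernel(x: Mapping[str, float], y: Mapping[str, float]) -> float:
    """MinMax similarity of two sparse, non-negative feature-count vectors.

    Assumed definition: sum_i min(x_i, y_i) / sum_i max(x_i, y_i).
    """
    keys = set(x) | set(y)
    numerator = sum(min(x.get(k, 0.0), y.get(k, 0.0)) for k in keys)
    denominator = sum(max(x.get(k, 0.0), y.get(k, 0.0)) for k in keys)
    # Assumption: two empty fingerprints get similarity 0.0 by convention.
    return numerator / denominator if denominator > 0 else 0.0


def within_tolerance(
    mse_by_encoding: Mapping[str, float], tolerance: float = 0.10
) -> Dict[str, bool]:
    """Flag encodings whose MSE is at most (1 + tolerance) times the best MSE."""
    best = min(mse_by_encoding.values())  # lower MSE is better
    threshold = best * (1.0 + tolerance)
    return {name: mse <= threshold for name, mse in mse_by_encoding.items()}


# Hypothetical feature counts for two molecules (keys are illustrative only).
mol_a = {"C-C": 2, "C-N": 1}
mol_b = {"C-C": 1, "C-O": 3}
similarity = minmax_kernel(mol_a, mol_b)  # minima sum to 1, maxima sum to 6

# A subset of the ACE column from the table above.
ace = {"DFS": 1.93, "AP2D": 1.59, "AT2D": 1.68, "CATS2D": 1.83}
flags = within_tolerance(ace)
```

With these inputs the threshold for ACE is about 1.75, so AP2D and AT2D are flagged while DFS and CATS2D are not.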

ISSN: 1758-2946