Fig. 3 | Journal of Cheminformatics

Fig. 3

From: Machine learning approaches to optimize small-molecule inhibitors for RNA targeting

Importance of chemical features for the binding of small molecules to the RNA target. a Two features are sufficient to determine RNA binding: either solubility (MolLogP) and regularity of the molecule (variance, Chi3v) (left), or the hydroxy amino count (NHOH count) and subdivided van der Waals surface area (SlogP_VSA2) (right). The docking scores of the molecules range from −5 (yellow) to −15.2 (blue). The figure was created using an informative projection optimization function embedded in the Orange environment [22]. b Feature importance and inter-relationships among the top 10 features, shown by the absolute regression coefficient (X-axis), the mean absolute SHapley Additive exPlanations value (SHAP [19], Y-axis), and the absolute correlation of the feature with Y (point size). The distribution of SHAP values for every feature [14] is shown in the right panel. The 10 most influential features are: (1) NHOHCount: count of NH and OH groups in the molecule; (2) TPSA: topological polar surface area [23]; (3) NOCount: count of N and O atoms in the molecule; (4) Estate-VSA8: electrotopological state indices and van der Waals surface area [24] (2.05 ≤ x < 4.69); (5) PEOE_VSA1: captures direct electrostatic interactions based on atomic partial charge (−inf < x < −0.30, y = 0) [25]; (6) SMR_VSA7: captures polarizability by molar refractivity, assuming the correct protonation state (3.05 ≤ x < 3.63, y = 6) [25]; (7) HallKierAlpha: a topological descriptor for shape, size, and molecular complexity [15]; (8) Estate-VSA7: electrotopological state indices and van der Waals surface area [24] (1.81 ≤ x < 2.05); (9) NumHDonors: number of hydrogen bond donors; and (10) VSA_Estate8: topological state indices and van der Waals surface area [24] (6.45 ≤ x < 7.00). The Pearson correlation for all 32 features is presented in Additional file 1: Fig. S1.
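
The three quantities plotted in panel b can be illustrated with a minimal Python sketch. It is not the authors' pipeline: it assumes a plain linear model with independent features, for which the SHAP value of feature j in one sample reduces to coef_j * (x_j - mean(x_j)). The function name `importance_table`, the coefficients, the intercept, and the descriptor values are all hypothetical and chosen only for illustration; none of the numbers come from the paper.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def importance_table(X, y, coefs, names):
    """Return one row per feature with the three panel-b quantities.

    X     : list of samples, each a list of descriptor values
    y     : target values (e.g. docking scores)
    coefs : coefficients of an already fitted linear model (assumed given)
    """
    rows = []
    for j, name in enumerate(names):
        col = [sample[j] for sample in X]
        mu = mean(col)
        # Linear-model SHAP values under the feature-independence assumption.
        shap_vals = [coefs[j] * (x - mu) for x in col]
        rows.append({
            "feature": name,
            "abs_coef": abs(coefs[j]),                         # X-axis
            "mean_abs_shap": mean(abs(s) for s in shap_vals),  # Y-axis
            "abs_corr": abs(pearson(col, y)),                  # point size
        })
    # Rank features by mean |SHAP|, largest first.
    return sorted(rows, key=lambda r: r["mean_abs_shap"], reverse=True)


# Hypothetical toy data: two descriptors for four made-up molecules.
names = ["NHOHCount", "TPSA"]
X = [[1, 40.0], [2, 60.0], [3, 55.0], [4, 90.0]]
coefs = [-1.0, -0.05]   # hypothetical fitted coefficients
intercept = -5.0        # hypothetical intercept
# Synthetic "docking scores" generated from the toy linear model itself.
y = [intercept + sum(c * x for c, x in zip(coefs, sample)) for sample in X]

table = importance_table(X, y, coefs, names)
for row in table:
    print(row)
```

Note that the raw coefficient depends on the scale of each descriptor, whereas the mean absolute SHAP value is expressed in units of the predicted score, which is one reason the two axes of panel b can rank features differently.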
