Fig. 1 | Journal of Cheminformatics

Fig. 1

From: Machine intelligence-driven framework for optimized hit selection in virtual screening


Graphical representation of the A-HIOT workflow and function. The automated hit identification and optimization tool (A-HIOT) uses both ligand and receptor-structure information to bridge the long-standing gap between ligand-based and structure-based virtual screening. The input data for A-HIOT comprise marketed molecules, FDA-profiled molecules, and molecules under clinical trial for an individual protein target or a set of specific protein targets belonging to a similar family. The ligands are transformed into feature-vector representations (x_n, n = 1, …, N). Data preprocessing retains dimensionality and yields a machine-readable dataset. Machine learning and deep learning are combined in a stacked ensemble framework in which random forest and extreme gradient boosting serve as base learners and deep neural networks (deep learning) serve as the super learner. The representative inhibitor-like feature instances obtained in this way constitute the chemical space (CS) module and result in a high-performance predictive classification model (CS-driven stacked ensemble framework). The true positive (TP) molecules are the identified leads/hits, which serve as input for the protein space (PS) module implemented in the A-HIOT framework. The identified leads are further explored for binding patterns by docking simulation within the receptor pocket. Binary fingerprints are computed for each protein–ligand complex to assess the binding pattern. These fingerprints serve as input to deep neural networks, which yield a robust predictive model (PS-driven DNNs framework). The true positives obtained are then concatenated with protein–ligand interaction profiles and re-ranked according to a binding-interaction threshold (the number of interactions in the protein–ligand complex). The collected molecules are the optimized leads, which form the final output of the A-HIOT framework.
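The final re-ranking step described in the caption can be illustrated with a minimal sketch. It assumes that each set bit in a binary fingerprint marks one protein–ligand interaction, so that the interaction number is simply the bit count. The function name `rerank_by_interactions`, the ligand names, the fingerprint values, and the threshold of 3 are hypothetical and chosen only for illustration; they are not taken from the A-HIOT implementation.

```python
from typing import Dict, List, Tuple


def rerank_by_interactions(
    fingerprints: Dict[str, List[int]],
    min_interactions: int,
) -> List[Tuple[str, int]]:
    """Keep ligands whose interaction count meets the threshold, best first."""
    # Assumption: each set bit marks one protein-ligand interaction,
    # so the sum of the bits is the interaction number.
    counts = {name: sum(bits) for name, bits in fingerprints.items()}
    kept = [(name, n) for name, n in counts.items() if n >= min_interactions]
    # Sort by descending interaction count; break ties by name for determinism.
    return sorted(kept, key=lambda item: (-item[1], item[0]))


# Hypothetical fingerprints for three docked true-positive ligands.
example = {
    "ligand_A": [1, 0, 1, 1, 0, 1],
    "ligand_B": [0, 0, 1, 0, 0, 0],
    "ligand_C": [1, 1, 1, 0, 1, 1],
}
optimized_leads = rerank_by_interactions(example, min_interactions=3)
print(optimized_leads)  # [('ligand_C', 5), ('ligand_A', 4)]
```

Here `ligand_B` is dropped because it shows a single interaction, while the remaining ligands are ordered by how many interactions they form with the receptor.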
