Fig. 3 | Journal of Cheminformatics

Fig. 3

From: Filtered circular fingerprints improve either prediction or runtime performance while retaining interpretability

Win-loss statistics comparing the AUPRC scores of different feature selection techniques at increasing bit-vector lengths (using ECFP4). Each bar corresponds to a pairwise comparison of two methods on 76 datasets. Wins of the first method are colored blue and drawn above zero; wins of the second method are colored red and drawn below zero. Intense colors indicate significant wins/losses, which are additionally stated as numbers above each bar. When filtering or folding is compared to unprocessed fragments, feature sets of increasing bit-vector length are compared to the entire unprocessed feature set. Filtering is in general better than folding. The longer the bit-vector, the more similar the folded or filtered feature sets are to the unprocessed feature set. Compared to unprocessed fragments, filtered features yield equally predictive models for RF, less predictive models for SVM, and largely improved models for NB. Folding reduces predictivity compared to unprocessed fragments unless NB is used.
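
The tally behind each bar can be illustrated with a minimal Python sketch. It assumes one AUPRC score per dataset for each of the two methods being compared, plus a per-dataset flag saying whether the difference is significant. The caption does not state which significance test was used, so the flags are simply taken as input here. The function name `win_loss` and the toy scores are hypothetical and not taken from the article.

```python
from typing import Dict, Sequence


def win_loss(
    scores_a: Sequence[float],
    scores_b: Sequence[float],
    significant: Sequence[bool],
) -> Dict[str, int]:
    """Count per-dataset wins and losses of method A against method B.

    scores_a, scores_b: one AUPRC score per dataset for each method.
    significant: per-dataset flag from a significance test chosen by the
    caller (an assumption: the caption does not name the test).
    """
    if not (len(scores_a) == len(scores_b) == len(significant)):
        raise ValueError("all inputs must have one entry per dataset")

    counts = {
        "wins": 0,                # A scores higher (drawn above zero)
        "losses": 0,              # B scores higher (drawn below zero)
        "ties": 0,                # equal scores
        "significant_wins": 0,    # subset of wins (intense color)
        "significant_losses": 0,  # subset of losses (intense color)
    }
    for a, b, sig in zip(scores_a, scores_b, significant):
        if a > b:
            counts["wins"] += 1
            if sig:
                counts["significant_wins"] += 1
        elif a < b:
            counts["losses"] += 1
            if sig:
                counts["significant_losses"] += 1
        else:
            counts["ties"] += 1
    return counts


# Toy example with three made-up datasets (not results from the article).
toy = win_loss([0.8, 0.6, 0.5], [0.7, 0.65, 0.5], [True, False, False])
print(toy)
```

In this sketch the significant counts are subsets of the win and loss totals, which matches the caption's description of intense colors within each bar.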
