Table 2 The characteristics of the data sets used in this paper

From: Fast rule-based bioactivity prediction using associative classification mining

| Data set | hERG | antiTB | Mutagenicity |
| --- | --- | --- | --- |
| Source | PKKB [32] | Prathipati et al. [2] | Jeroen et al. [35] |
| #Compounds | 806 | 3,779 | 4,337 |
| Diversity | 0.90 | 0.90 | 0.93 |
| Class | blocker/non-blocker | active/inactive | mutagen/non-mutagen |
  1. Note: The diversity of each data set is the average pairwise distance between its molecules, calculated from ECFP_6 fingerprints using Pipeline Pilot. The distance between two molecules is defined as (1 - similarity), where the similarity is computed from the specified fingerprint.
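
The diversity measure in the note can be illustrated with a minimal sketch. It assumes Tanimoto similarity, which the note does not name, and represents each fingerprint as a set of on-bit indices. The function names `tanimoto` and `diversity` and the toy fingerprints are hypothetical. The values in the table were produced with Pipeline Pilot, not with this code.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto similarity between two fingerprints given as sets of on-bits.

    Assumption: the note only says "similarity"; Tanimoto is used here
    as a common choice for ECFP-style fingerprints.
    """
    union = len(a | b)
    # Assumption: two empty fingerprints are treated as identical.
    return len(a & b) / union if union else 1.0


def diversity(fingerprints):
    """Average of (1 - similarity) over all unordered pairs of molecules."""
    pairs = list(combinations(fingerprints, 2))
    if not pairs:
        raise ValueError("need at least two fingerprints")
    return sum(1.0 - tanimoto(a, b) for a, b in pairs) / len(pairs)


# Toy fingerprints with made-up bit indices (not real ECFP_6 output).
toy = [frozenset({1, 2, 3}), frozenset({2, 3, 4}), frozenset({7, 8})]
print(round(diversity(toy), 3))
```

A value near 1 means that most pairs of molecules share few fingerprint bits, which is consistent with the high diversity reported for all three data sets.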