
Table 2 The characteristics of the data sets used in this paper

From: Fast rule-based bioactivity prediction using associative classification mining

| Data set | hERG | antiTB | Mutagenicity |
| --- | --- | --- | --- |
| Source | PKKB [32] | Prathipati et al. [2] | Jeroen et al. [35] |
| #Compounds | 806 | 3,779 | 4,337 |
| Diversity | 0.90 | 0.90 | 0.93 |
| Class | blocker/non-blocker | active/inactive | mutagen/non-mutagen |

  1. Note: The diversity of each data set is the average pairwise distance between its molecules, calculated from ECFP_6 fingerprints using Pipeline Pilot. The distance between a pair of molecules is defined as (1 − similarity) for the specified fingerprint.
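
The diversity measure described in the note can be illustrated with a minimal Python sketch. This is not the Pipeline Pilot calculation used for the table. It assumes that similarity is the Tanimoto coefficient (the note does not name the coefficient) and that each fingerprint is a set of integer feature IDs. The names `tanimoto`, `diversity`, and the toy fingerprints are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto similarity between two fingerprints given as sets of feature IDs.

    Assumption: the source only says "similarity"; Tanimoto is used here
    because it is a common choice for circular fingerprints.
    """
    if not a and not b:
        return 1.0  # two empty fingerprints are treated as identical
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def diversity(fingerprints):
    """Average of (1 - similarity) over every unordered pair of molecules."""
    pairs = list(combinations(fingerprints, 2))
    if not pairs:
        raise ValueError("need at least two molecules")
    return sum(1.0 - tanimoto(a, b) for a, b in pairs) / len(pairs)


# Toy fingerprints (hypothetical feature IDs, not real ECFP_6 output).
toy = [frozenset({1, 2, 3}), frozenset({2, 3, 4}), frozenset({7, 8})]
print(round(diversity(toy), 3))  # 0.833
```

For the toy input, the three pairwise distances are 0.5, 1.0, and 1.0, so the average is 2.5 / 3. A value near 0.9, as reported in the table, would mean that most pairs of molecules share few fingerprint features under this definition.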