Table 1 Selected datasets from the epigenomic database

From: Statistical-based database fingerprint: chemical space dependent representation of compound databases

Dataset  Number of compounds  Intra-set similarity median (Tc)  Average “1” bits                  Number of “1” bits in DFP         Number of “1” bits in SB-DFP
                              MACCS(a)         ECFP4(b)         MACCS(a)         ECFP4(b)         MACCS(a)         ECFP4(b)         MACCS(a)         ECFP4(b)
BRD2     234                  0.569            0.152            56.0             54.3             53               27               67               229
BRD3     246                  0.573            0.153            56.6             54.6             53               26               73               231
BRD4     477                  0.486            0.133            55.9             52.8             47               14               71               333
CREBBP   105                  0.694            0.276            56.1             53.9             52               36               50               185
DNMT1    127                  0.403            0.115            55.4             51.7             50               13               62               281
EHMT2    61                   0.636            0.228            62.4             55.7             62               41               56               167
EP300    57                   0.425            0.106            58.2             55.7             53               11               56               285
HDAC10   190                  0.514            0.165            53.2             50.6             50               17               46               272
HDAC11   137                  0.494            0.156            51.2             50.8             48               16               42               229
HDAC1    2740                 0.453            0.149            53.2             51.4             51               15               63               499
HDAC2    767                  0.447            0.149            50.3             48.4             46               13               53               336
HDAC3    669                  0.474            0.147            52.6             50.3             49               13               54               356
HDAC4    452                  0.427            0.135            50.4             46.4             42               10               49               248
HDAC5    112                  0.455            0.153            47.3             44.1             39               13               26               176
HDAC6    1374                 0.474            0.149            54.3             49.8             48               13               62               415
HDAC7    112                  0.489            0.165            50.4             45.8             43               12               28               197
HDAC8    864                  0.500            0.153            54.9             51.2             50               12               52               398
HDAC9    102                  0.494            0.169            52.6             47.4             46               13               29               190
KAT2B    55                   0.583            0.179            50.8             37.3             46               13               44               99
KDM1A    241                  0.380            0.143            44.8             46.2             31               21               31               216
KDM4C    88                   0.359            0.101            48.8             40.3             41               10               38               158
L3MBTL1  50                   0.804            0.551            42.2             36.8             37               27               37               56
L3MBTL3  89                   0.731            0.404            40.4             36.6             37               26               35               83
MAP3K7   96                   0.539            0.137            57.1             60.5             59               35               45               190
MGEA5    67                   0.683            0.316            54.2             39.6             48               19               42               126
NCOA1    51                   0.350            0.105            45.5             43.3             34               11               18               132
NCOA3    157                  0.368            0.109            47.7             44.6             39               10               26               166
PRMT1    61                   0.395            0.076            53.0             53.5             41               9                40               239
Average  350                  0.507            0.178            52               48               46               18               46               232
(a) MACCS keys, 166-bit
(b) ECFP4, 2048-bit