From: Are phylogenetic trees suitable for chemogenomics analyses of bioactivity data sets: the importance of shared active compounds and choosing a suitable data embedding method, as exemplified on Kinases

Multi-dimensional scaling (MDS) of kinases in bioactivity space. The average signed relative stress was low (0.28), meaning that the 2D representation of the kinases loses little information. Gray lines connect similar kinases. Kinases in outlier group 1 (shown in red) are clearly separated from the non-outliers (shown in green), but vary among each other in terms of SAR similarity. In contrast, members of the second group of kinase outliers are packed densely into a small area, indicating that these kinases are very similar to each other in terms of SAR similarity but apparently quite distinct from the non-outliers. However, the kinases in outlier group 2 likely cluster together because most of them share few active compounds with the other kinases in the dataset, which makes accurate comparison of SAR similarities more difficult.
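To illustrate how a 2D embedding and a stress value of this kind can be obtained from a pairwise dissimilarity matrix, here is a minimal sketch. It is not the procedure used for the figure: it assumes classical (Torgerson) MDS and a simple normalised stress (root of the summed squared residuals over the summed squared input distances), which is not the average signed relative stress reported in the caption. The function names and the toy distance matrix are hypothetical.

```python
import numpy as np


def classical_mds(dist, n_components=2):
    """Embed a symmetric distance matrix with classical (Torgerson) MDS."""
    dist = np.asarray(dist, dtype=float)
    n = dist.shape[0]
    # Double-centre the squared distances to obtain a Gram matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    # Keep the largest eigenvalues; clip negatives from non-Euclidean input.
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_components]
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale


def normalised_stress(dist, coords):
    """Root of summed squared residuals over summed squared input distances."""
    dist = np.asarray(dist, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    embedded = np.sqrt((diff ** 2).sum(axis=-1))
    upper = np.triu_indices(dist.shape[0], k=1)  # count each pair once
    residual = ((dist[upper] - embedded[upper]) ** 2).sum()
    return float(np.sqrt(residual / (dist[upper] ** 2).sum()))


# Hypothetical toy data: four items whose distances are exactly planar,
# standing in for a kinase-by-kinase SAR dissimilarity matrix.
points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
toy_dist = np.sqrt(((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1))

coords = classical_mds(toy_dist)
stress = normalised_stress(toy_dist, coords)
print(coords.shape, stress)
```

Because the toy distances come from points that already lie in a plane, the stress here is numerically close to zero. Real bioactivity dissimilarities are generally not exactly embeddable in 2D, so a nonzero stress such as the one reported above is expected.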

