
Table 2 Defects identified in the descriptor calculation software

From: Mordred: a molecular descriptor calculator

Software

Details

CDK

In theory, a roto-translation (rigid rotation plus translation) of a molecule should not change any molecular property. However, the TPSA and LengthOverBreadth descriptors returned different values for the same molecule before and after a roto-translation

The value of ChiPathCluster is invalid because the patterns are not adequate (fixed in the latest version of CDK)
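The roto-translation check described for CDK can be sketched as a small invariance test. This is a minimal illustration, not the procedure used by the authors: the helper names (`rotate`, `roto_translate`, `is_invariant`) and the two toy descriptors are hypothetical, and the descriptors are plain geometric functions rather than TPSA or LengthOverBreadth.

```python
import math
import random


def rotate(coords, axis, angle):
    """Rotate 3D points about a coordinate axis (0 = x, 1 = y, 2 = z)."""
    c, s = math.cos(angle), math.sin(angle)
    # Indices of the two coordinates that mix under a rotation about `axis`.
    i, j = [(1, 2), (2, 0), (0, 1)][axis]
    rotated = []
    for p in coords:
        q = list(p)
        q[i] = c * p[i] - s * p[j]
        q[j] = s * p[i] + c * p[j]
        rotated.append(tuple(q))
    return rotated


def roto_translate(coords, rng):
    """Apply a random rotation about each axis, then a random translation."""
    out = coords
    for axis in range(3):
        out = rotate(out, axis, rng.uniform(0.0, 2.0 * math.pi))
    shift = [rng.uniform(-5.0, 5.0) for _ in range(3)]
    return [tuple(v + d for v, d in zip(p, shift)) for p in out]


def max_pair_distance(coords):
    """Toy descriptor built from interatomic distances only (invariant)."""
    return max(math.dist(a, b) for a in coords for b in coords)


def x_extent(coords):
    """Toy descriptor tied to the coordinate frame (not rotation invariant)."""
    xs = [p[0] for p in coords]
    return max(xs) - min(xs)


def is_invariant(descriptor, coords, rng, trials=20, tol=1e-9):
    """Return True if the descriptor value survives random roto-translations."""
    reference = descriptor(coords)
    return all(
        abs(descriptor(roto_translate(coords, rng)) - reference) <= tol
        for _ in range(trials)
    )
```

A descriptor that depends only on interatomic distances passes such a test, whereas one that reads raw coordinates along a fixed axis does not. Running the same test against a descriptor calculator would flag frame-dependent results of the kind reported above.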

PaDEL

Several molecules (e.g., Cyanidin) produced invalid values for many descriptors (e.g., nH, the hydrogen count, returned 12) under the default configuration, owing to a bug in the aromaticity detection procedure and/or the 3D conformer generator of PaDEL-Descriptor. The bug broke an aromatic ring and attached an invalid hydrogen

Some descriptors use the log-sum-exp (LSE) function in its direct form (\({\text{LSE}}_{1}\) below), which is prone to arithmetic overflow or underflow. The LSE trick (\({\text{LSE}}_{2}\) below), which subtracts the maximum input before exponentiating, should be used to avoid this issue

\(\text{LSE}_{1}(x_{1}, x_{2}, \ldots, x_{n}) = \log\left(\sum_{i=1}^{n} \exp(x_{i})\right)\)

\(\text{LSE}_{2}(x_{1}, x_{2}, \ldots, x_{n}) = x^{*} + \log\left(\sum_{i=1}^{n} \exp(x_{i} - x^{*})\right), \quad x^{*} = \max(x_{1}, x_{2}, \ldots, x_{n})\)
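The difference between the two formulas can be sketched in a few lines of Python. This is an illustrative sketch only, not code from PaDEL-Descriptor or Mordred; the function names `lse_naive` and `lse_stable` are hypothetical.

```python
import math


def lse_naive(xs):
    """LSE_1: log of the sum of exponentials, evaluated directly."""
    return math.log(sum(math.exp(x) for x in xs))


def lse_stable(xs):
    """LSE_2: subtract the maximum x* before exponentiating, add it back after."""
    x_star = max(xs)
    if math.isinf(x_star):
        # Every input is -inf, or some input is +inf; the shift is undefined.
        return x_star
    # Each exponent is <= 0, so exp() cannot overflow, and the term for
    # x* itself is exp(0) = 1, so the sum is at least 1 and log() is safe.
    return x_star + math.log(sum(math.exp(x - x_star) for x in xs))
```

With double-precision floats, `lse_naive([1000.0, 1000.0])` raises `OverflowError` because `math.exp(1000.0)` is out of range, and `lse_naive([-1000.0, -1000.0])` fails because every exponential underflows to zero before the logarithm is taken. `lse_stable` handles both inputs, and the two functions agree on moderate inputs.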

In the constitutional descriptors, incorrect code reuse may have introduced discrepancies into the algorithm implementation

ChemoPy

Cannot calculate exact values of the modified Zagreb index 2
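For reference, an exact calculation can be sketched with rational arithmetic. The table does not state the definition or the cause of the ChemoPy defect, so this sketch assumes the commonly used definition of the modified second Zagreb index: the sum, over all bonds of the hydrogen-suppressed graph, of 1/(d_u d_v), where d_u and d_v are the degrees of the bonded atoms. The function name `modified_zagreb_2` and the edge-list input format are hypothetical.

```python
from fractions import Fraction


def modified_zagreb_2(edges):
    """Sum of 1 / (deg(u) * deg(v)) over all edges, as an exact fraction.

    `edges` is a list of (u, v) atom-index pairs for a hydrogen-suppressed
    molecular graph (assumed input format for this sketch).
    """
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    # Fraction avoids both integer truncation and floating-point rounding.
    return sum(
        (Fraction(1, degree[u] * degree[v]) for u, v in edges),
        Fraction(0),
    )


# n-Butane as a path graph 0-1-2-3: degrees are 1, 2, 2, 1.
butane = [(0, 1), (1, 2), (2, 3)]
index = modified_zagreb_2(butane)  # 1/2 + 1/4 + 1/2
```

Converting the result with `float(index)` only at the end keeps the intermediate sum exact.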