Table 3 Detailed breakdown (%) of top-1 accuracy on 50 K test set for the top-performing structural fingerprints belonging to five sub-categories

From: Reconstruction of lossless molecular representations from fingerprints

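| Representation | Components | MACCS | Avalon | HashAP | TT | AEs | ECFP4 |
|---|---|---|---|---|---|---|---|
| SMILES | \(T_c = 1.0\) | 34.7 | 65.6 | 83.1 | 85.2 | 83.5 | 93.1 |
| | String exact | 22.3 | 44.7 | 58.7 | 57.8 | 52.1 | 64.6 |
| | Stereo | 8.2 | 14.9 | 19.2 | 19.2 | 18.0 | 21.2 |
| | Non-canonical | 1.6 | 3.5 | 4.3 | 4.2 | 3.7 | 4.8 |
| | Others | 2.6 | 2.6 | 0.8 | 4.0 | 9.6 | 2.5 |
| | Invalid | 0.2 | 0.4 | 0.3 | 0.3 | 0.3 | 0.2 |
| | \(\overline{T_{c}}\) | 81.9 | 90.5 | 95.5 | 96.3 | 96.7 | 98.1 |
| SELFIES | \(T_c = 1.0\) | 27.2 | 45.2 | 70.7 | 78.0 | 76.6 | 85.6 |
| | String exact | 17.7 | 31.3 | 50.9 | 54.0 | 49.1 | 60.5 |
| | Stereo | 5.9 | 9.3 | 15.2 | 16.7 | 19.9 | 18.5 |
| | Non-canonical | 1.5 | 2.8 | 4.0 | 4.1 | 3.6 | 4.7 |
| | Others | 2.2 | 1.7 | 0.6 | 3.3 | 8.0 | 1.9 |
| | Invalid | No invalid predictions (all fingerprints) | | | | | |
| | \(\overline{T_{c}}\) | 77.8 | 81.5 | 90.7 | 93.9 | 94.4 | 95.1 |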
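
In the Components column, the String exact, Stereo, Non-canonical, and Others rows appear to itemize the \(T_c = 1.0\) row. The table itself does not state this, so it is treated as an assumption here. The minimal Python sketch below checks that reading for the SMILES block only, using the values printed above; the names `SMILES_ROWS`, `PARTS`, and `residuals` are illustrative and do not come from the paper.

```python
# Values copied from the SMILES block of Table 3 (percent of the 50K test set).
FINGERPRINTS = ["MACCS", "Avalon", "HashAP", "TT", "AEs", "ECFP4"]

SMILES_ROWS = {
    "T_c = 1.0":     [34.7, 65.6, 83.1, 85.2, 83.5, 93.1],
    "String exact":  [22.3, 44.7, 58.7, 57.8, 52.1, 64.6],
    "Stereo":        [8.2, 14.9, 19.2, 19.2, 18.0, 21.2],
    "Non-canonical": [1.6, 3.5, 4.3, 4.2, 3.7, 4.8],
    "Others":        [2.6, 2.6, 0.8, 4.0, 9.6, 2.5],
}

# Assumption (not stated in the table): these four rows itemize the T_c = 1.0 row.
PARTS = ["String exact", "Stereo", "Non-canonical", "Others"]


def residuals(rows, parts=PARTS, total="T_c = 1.0"):
    """Return {fingerprint: total - sum(parts)} in percentage points."""
    out = {}
    for i, name in enumerate(FINGERPRINTS):
        part_sum = sum(rows[p][i] for p in parts)
        # Round to one decimal, matching the precision reported in the table.
        out[name] = round(rows[total][i] - part_sum, 1)
    return out


if __name__ == "__main__":
    for name, diff in residuals(SMILES_ROWS).items():
        print(f"{name:7s} residual = {diff:+.1f}")
```

Under that assumption, every residual in the SMILES block stays within 0.1 percentage points, which is consistent with one-decimal rounding of the reported values.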