Table 2 Performance comparison of the different models on the external three test sets

From: DeepSA: a deep-learning driven predictor of compound synthesis accessibility

| Datasets | Model   | ACC   | Recall | Precision | F-score | AUROC | Threshold |
|----------|---------|-------|--------|-----------|---------|-------|-----------|
| TS1      | DeepSA  | 0.995 | 1.000  | 0.989     | 0.995   | 1.000 | 0.47      |
|          | DeepSA  | 0.995 | 1.000  | 0.990     | 0.995   | 1.000 | 0.50      |
|          | GASA    | 0.987 | 0.999  | 0.976     | 0.987   | 1.000 | 0.50      |
|          | SAscore | 0.989 | 0.992  | 0.986     | 0.989   | 0.999 | 4.50      |
|          | SAscore | 0.665 | 0.331  | 0.998     | 0.497   | 0.999 | 6.00      |
|          | RAscore | 0.919 | 0.867  | 0.967     | 0.914   | 0.982 | 0.50      |
|          | SYBA    | 0.962 | 1.000  | 0.930     | 0.964   | 0.998 | 0.00      |
|          | SCScore | 0.608 | 0.698  | 0.592     | 0.641   | 0.641 | 3.10      |
| TS2      | DeepSA  | 0.840 | 0.746  | 0.861     | 0.799   | 0.913 | 0.47      |
|          | DeepSA  | 0.838 | 0.730  | 0.871     | 0.795   | 0.913 | 0.50      |
|          | GASA    | 0.796 | 0.677  | 0.815     | 0.740   | 0.876 | 0.50      |
|          | SAscore | 0.815 | 0.603  | 0.946     | 0.737   | 0.919 | 3.40      |
|          | SAscore | 0.664 | 0.216  | 0.996     | 0.355   | 0.919 | 6.00      |
|          | RAscore | 0.751 | 0.485  | 0.878     | 0.625   | 0.865 | 0.50      |
|          | SYBA    | 0.787 | 0.627  | 0.834     | 0.716   | 0.862 | 0.00      |
|          | SCScore | 0.395 | 0.442  | 0.341     | 0.385   | 0.373 | 2.30      |
| TS3      | DeepSA  | 0.819 | 0.761  | 0.861     | 0.808   | 0.896 | 0.47      |
|          | DeepSA  | 0.817 | 0.753  | 0.864     | 0.805   | 0.896 | 0.50      |
|          | GASA    | 0.760 | 0.646  | 0.837     | 0.729   | 0.849 | 0.50      |
|          | SAscore | 0.577 | 0.211  | 0.788     | 0.333   | 0.772 | 3.10      |
|          | SAscore | 0.512 | 0.044  | 0.690     | 0.084   | 0.772 | 6.00      |
|          | RAscore | 0.701 | 0.571  | 0.772     | 0.656   | 0.790 | 0.50      |
|          | SYBA    | 0.647 | 0.387  | 0.806     | 0.523   | 0.790 | 0.00      |
|          | SCScore | 0.472 | 0.723  | 0.481     | 0.578   | 0.425 | 2.20      |
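The F-score column is the standard F1 measure, i.e. the harmonic mean of the Precision and Recall columns. A minimal sketch checking this against the TS2 DeepSA row (threshold 0.47), assuming three-decimal rounding of the reported values:

```python
def f_score(precision: float, recall: float) -> float:
    """F1: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# TS2 / DeepSA (threshold 0.47): Precision = 0.861, Recall = 0.746
print(round(f_score(0.861, 0.746), 3))  # → 0.799, matching the F-score column
```

Small discrepancies in the last decimal place for other rows are expected, since the table reports precision and recall already rounded to three decimals.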