
Table 2 Overview of the downstream datasets

From: Prediction of chemical reaction yields with large-scale multi-view pre-training

| Dataset | Split type | # Training | # Test | Out-of-sample type |
| --- | --- | --- | --- | --- |
| Buchwald-Hartwig (3955 reactions) | Test 1 | 3057 | 898 | Ligand-based |
| | Test 2 | 3055 | 900 | Ligand-based |
| | Test 3 | 3058 | 897 | Ligand-based |
| | Test 4 | 3055 | 900 | Ligand-based |
| | Plate 1 | 2880 | 1075 | Ligand-based |
| | Plate 2 | 2515 | 1440 | Ligand-based |
| | Plate 3 | 2515 | 1440 | Ligand-based |
| | Plate 2 new | 2515 | 1440 | Ligand-based |
| | Halide Br | 2636 | 1319 | Reactant-based |
| | Halide Cl | 2637 | 1318 | Reactant-based |
| | Halide I | 2637 | 1318 | Reactant-based |
| | Pyridyl | 2372 | 1583 | Reactant-based |
| | Nonpyridyl | 1583 | 2372 | Reactant-based |
| | Random | 2768 | 1187 | None |
| Suzuki-Miyaura (5760 reactions) | Test 1 | 4320 | 1440 | Ligand-based |
| | Test 2 | 4320 | 1440 | Ligand-based |
| | Test 3 | 4320 | 1440 | Ligand-based |
| | Test 4 | 4320 | 1440 | Ligand-based |
| | Random | 4032 | 1728 | None |