Table 5 Architectures, parameters, and hyperparameters explored for DNNs

From: Multi-channel PINN: investigating scalable and transferable neural networks for drug discovery

**PINN**

| Parameter | Value | Description |
| --- | --- | --- |
| Separated layers | 1, 2, 3, 4 | Number of separated layers in the PINN |
| Concatenated layers | 1, 2 | Number of concatenated layers in the PINN |
| Number of nodes | 256, 512, 1024, 2048 | Number of nodes per layer |
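Read as a model family, these rows describe per-channel ("separated") dense layers applied to each input, followed by shared ("concatenated") layers after the merge. Below is a minimal Keras sketch of that search space; the two-channel setup, the input dimensions `comp_dim` and `prot_dim`, the ReLU activations, and the single sigmoid output head are illustrative assumptions, not the paper's exact configuration.

```python
# Sketch of the PINN search space: separate dense branches per input channel,
# then a concatenation followed by shared dense layers.
from tensorflow.keras import Input, Model, layers

def build_pinn(sep_layers=2, concat_layers=1, nodes=512,
               comp_dim=2048, prot_dim=1024):   # input sizes are assumptions
    comp_in = Input(shape=(comp_dim,), name="compound")
    prot_in = Input(shape=(prot_dim,), name="protein")
    comp, prot = comp_in, prot_in
    for _ in range(sep_layers):                 # 1-4 separated layers explored
        comp = layers.Dense(nodes, activation="relu")(comp)
        prot = layers.Dense(nodes, activation="relu")(prot)
    x = layers.concatenate([comp, prot])
    for _ in range(concat_layers):              # 1-2 concatenated layers explored
        x = layers.Dense(nodes, activation="relu")(x)
    out = layers.Dense(1, activation="sigmoid")(x)
    return Model([comp_in, prot_in], out)
```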

**Dilated CNN**

| Parameter | Value | Description |
| --- | --- | --- |
| Filters | 4, 8, 16, 32 | Number of filters for the dilated CNN |
| Kernel size | 6, 8, 12, 22 | Length of the convolution window for the dilated CNN |
| Embedding | 16, 32 | Dimension of the dense embedding for low-level representations |
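These three settings compose a token-embedding layer feeding a dilated 1-D convolution. A minimal sketch follows, assuming a vocabulary size, sequence length, dilation rate, and pooling layer that the table does not specify.

```python
# Sketch of an embedding + dilated 1-D convolution channel.
from tensorflow.keras import Sequential, layers

def build_dilated_cnn(filters=16, kernel_size=12, embed_dim=32,
                      vocab_size=64, seq_len=100):  # vocab/length are assumptions
    return Sequential([
        layers.Input(shape=(seq_len,)),
        layers.Embedding(vocab_size, embed_dim),    # dense embedding, dim 16 or 32
        layers.Conv1D(filters, kernel_size,
                      dilation_rate=2,              # assumed dilation rate
                      padding="same", activation="relu"),
        layers.GlobalMaxPooling1D(),                # assumed pooling to a vector
    ])
```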

**LSTM, BLSTM**

| Parameter | Value | Description |
| --- | --- | --- |
| Units | 128, 256 | Number of units in the RNN hidden layers |
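The recurrent alternative swaps the convolution for an LSTM, or a bidirectional LSTM, over the same kind of embedded sequence. A minimal sketch, reusing the embedding assumptions from the block above:

```python
# Sketch of an LSTM / BLSTM channel over an embedded sequence.
from tensorflow.keras import Sequential, layers

def build_rnn(units=128, bidirectional=False,
              vocab_size=64, embed_dim=32, seq_len=100):  # assumptions as above
    rnn = layers.LSTM(units)                # 128 or 256 units explored
    if bidirectional:
        rnn = layers.Bidirectional(rnn)     # the BLSTM variant
    return Sequential([
        layers.Input(shape=(seq_len,)),
        layers.Embedding(vocab_size, embed_dim),
        rnn,
    ])
```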

**DNN**

| Parameter | Value | Description |
| --- | --- | --- |
| Lr | 0.0005 | Initial learning rate |
| Initializer | \([-\sqrt{3/\mathrm{fan}_{\mathrm{in}}},\ \sqrt{3/\mathrm{fan}_{\mathrm{in}}}]\) | Initial weights drawn from the LeCun uniform distribution |
| Optimizer | Adam | Optimizer for stochastic gradient descent |
| Weight decay | 0.0, 0.00001 | Learning rate decay over each update |
| Activation function | ReLU, ELU | Neuron activation function |
| Dropout | 0.25, 0.5 | Dropout rate |
| Batch | 1024 | Batch size for training |
| Epochs_training | 400 | Training epochs on a training task |
| Epochs_finetune | 200 | Fine-tuning epochs for a pretrained model on a test task |
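These training hyperparameters translate almost directly into a Keras configuration. In the sketch below the network body and the placeholder data are illustrative; the learning rate, LeCun uniform initializer, Adam optimizer, per-update decay, dropout, batch size, and epoch counts follow the table. `InverseTimeDecay` with `decay_steps=1` reproduces the per-update learning-rate decay described above (the legacy `decay` argument of Keras optimizers).

```python
# Sketch of the DNN training configuration from the table.
import numpy as np
from tensorflow.keras import Sequential, layers, optimizers

x_train = np.random.rand(4096, 1024).astype("float32")            # placeholder data
y_train = np.random.randint(0, 2, (4096, 1)).astype("float32")

model = Sequential([
    layers.Input(shape=(1024,)),                                  # assumed input size
    layers.Dense(512, activation="relu",
                 kernel_initializer="lecun_uniform"),             # U[-sqrt(3/fan_in), sqrt(3/fan_in)]
    layers.Dropout(0.25),                                         # dropout rate 0.25 or 0.5
    layers.Dense(1, activation="sigmoid"),
])

lr = optimizers.schedules.InverseTimeDecay(                       # lr / (1 + 1e-5 * step)
    initial_learning_rate=0.0005, decay_steps=1, decay_rate=1e-5)
model.compile(optimizer=optimizers.Adam(learning_rate=lr),
              loss="binary_crossentropy")
model.fit(x_train, y_train, batch_size=1024, epochs=400)          # then 200 fine-tuning
                                                                  # epochs on a test task
```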