Featured article: Randomized SMILES strings improve the quality of molecular generative models

Recurrent neural networks (RNNs) trained on molecules represented as unique (canonical) SMILES strings can generate large chemical spaces of valid and meaningful structures. In this article, Arús-Pous et al. benchmark models trained on GDB-13 subsets of three sizes (1 million, 10,000 and 1,000 molecules), with three SMILES variants (canonical, randomized and DeepSMILES), two recurrent cell types (LSTM and GRU) and a range of hyperparameter combinations. To guide the benchmarks, they developed new metrics that quantify how well a model has generalized from the training set. The results show that LSTM models trained on 1 million randomized SMILES, a non-unique molecular string representation, generalize to larger chemical spaces than the other approaches and represent the target chemical space more accurately.

Articles

2019

Proceedings of the 11th International Conference on Chemical Structures
Edited by Gerard van Westen and Markus Wagener
Collection published: 14 February 2019

Programming Languages for Chemical Information
Edited by Rajarshi Guha
Collection published: 5 February 2019

2018

BioCreative V.5
Edited by Martin Krallinger, Obdulia Rabal, Anália Lourenço, Alfonso Valencia
Collection published: 14 December 2018

Novel applications of machine learning in cheminformatics
Edited by Ola Spjuth
Collection published: 21 February 2018

2015

Cross journal collection
Jean-Claude Bradley Memorial Series
Edited by Andrew SID Lang, Antony Williams
Collection published: 22 March 2015

2013

6th Joint Sheffield Conference on Chemoinformatics
Collection published: 29 July 2013

2012

The IUPAC International Chemical Identifier (InChI) and its influence on the domain of chemical information
Edited by Antony Williams
Collection published: 13 December 2012

Semantic physical science
Edited by Henry Rzepa, Peter Murray-Rust
Collection published: 3 August 2012

2011

Visions of a semantic molecular future
Collection published: 14 October 2011

RDF technologies in chemistry
Edited by Egon Willighagen, Martin Paul Braendle
Collection published: 13 May 2011

PubChem3D
Collection published: 27 January 2011

Upcoming Special Issues

Learn more about open Calls for Papers and upcoming Special Issues here.

Aims and Scope

Journal of Cheminformatics is an open-access journal publishing original peer-reviewed research in all aspects of cheminformatics and molecular modelling.

Read the full scope here.

New Thematic Series: Programming Languages for Chemical Information

Cheminformatics methods and analyses employ software written in different programming languages, each offering features that make it more or less suitable for a given task. While there is much research on programming languages, we believe it would be useful for practitioners to report on how their preferred language has benefited them in practice.

Benefit from our free funding service

We offer a free open access support service to make it easier for you to discover and apply for article-processing charge (APC) funding.

Learn more here.

Editor profiles

Editors-in-Chief:

Egon Willighagen is a researcher in the BiGCaT Department for Bioinformatics and a teacher at Maastricht University in the Netherlands.

Rajarshi Guha is the Associate Director of Informatics at Vertex Pharmaceuticals, where he leads the cheminformatics group responsible for informatics needs in high-throughput screening.

Associate Editor:

Nina Jeliazkova, Idea Consult Ltd., Bulgaria

BMC is part of Springer Nature

Annual Journal Metrics

  • Speed
    73 days to first decision for reviewed manuscripts only
    61 days to first decision for all manuscripts
    145 days from submission to acceptance
    13 days from acceptance to publication

  • Usage
    400,118 downloads
    1,054 Altmetric mentions
