- Oral presentation
Semantics vs. statistics in chemical markup
Journal of Cheminformatics volume 4, Article number: O16 (2012)
Since the late 1990s, natural language processing (NLP) has shifted decisively from high-precision, low-recall systems built on small sets of hand-written rules to methods based on the statistical analysis of large corpora. The field of chemoinformatics is likewise dominated by statistical and machine-learning approaches. In recent years, however, pharmaceutical companies have increasingly engaged with Semantic Web technologies, which are largely built around the kind of hand-written systems that NLP has moved away from this century. We discuss where our current text analysis and Semantic Web efforts at the Royal Society of Chemistry are headed, and how we are making use of the "unreasonable effectiveness of data".