Predict the results of your experiments
Text and data mining (TDM) with Thieme Chemistry content
Success in digital chemistry today depends on data quality. Thieme and its curated knowledge database, Science of Synthesis (SoS), provide chemical reaction and structure data in synthetic organic chemistry at an unprecedented level of accuracy. Research organizations, academic institutions, and companies use machine-learning (ML) and artificial intelligence (AI) techniques to explore scientific data, train algorithms, and create knowledge from datasets as part of text and data mining (TDM). Thieme Chemistry can support you in this task by providing highly standardized and structured organic synthesis information as XML, .cdx, and SDF/RDF files.
Collaboration of Science of Synthesis/Thieme with IBM Research accelerates discovery in Organic Chemistry
In 2018, IBM launched the RXN for Chemistry cloud platform to help synthetic organic chemists predict the outcome of chemical reactions using an artificial intelligence (AI) model called the Molecular Transformer. In 2021, IBM Research and Thieme Chemistry incorporated expert synthesis data from Science of Synthesis, Thieme’s curated digital publication source on organic chemistry, into RXN for Chemistry. Initial results show that Thieme-trained models predict correct reactions more than twice as often as baseline models when tested on Science of Synthesis chemistry.
Thieme Chemistry content including Science of Synthesis datasets in key figures
Boost the results of ML and AI projects with Thieme Chemistry content
Unleash the full potential of high-quality, curated knowledge databases such as Science of Synthesis by applying ML and AI techniques across a wide range of applications:
- Evidence-based research
- Reaction evaluation
- Synthesis design
- Drug design
- Pattern recognition
- Pattern analysis
- Substructure and similarity searches
- Discovery and innovation
- New insights/knowledge
“The ultimate quality of the data used in model training will determine the future adoption of AI tools in chemical synthesis. Integrating high-quality, curated data from Science of Synthesis provides a once-in-a-lifetime opportunity to boost the performance of RXN for chemistry to unprecedented levels while also unleashing the entire knowledge value contained in hundreds of thousands of high-quality chemical reaction records.”
Dr. Teodoro Laino, IBM Research Europe, Switzerland
How Thieme Chemistry content adds to your success
By applying TDM techniques to Thieme Chemistry content, or by training AI models on it, you can benefit in many ways. Thieme’s cooperation with IBM RXN provides evidence for the following Science of Synthesis characteristics*:
- Inspiring: Find greater diversity in reaction coverage than in patent data.
- Reliable: Science of Synthesis data presents the most reliable synthetic transformations available and has been curated by expert chemists over a 20-year period.
- Comprehensive: Science of Synthesis data covers yields, conditions, reactants, products, reagents, and catalysts. Detailed, proven experimental procedures are also available.
- Consistent: Science of Synthesis data shows exceptionally consistent quality and structure thanks to thorough, high-quality scientific editing (consistent chemical nomenclature and detailed reaction schemes, including solvents and catalysts, across all records).
- Improved results: AI models retrained on Thieme data give better results when evaluated by top academic natural-product research groups and retrosynthesis experts worldwide.
- Exclusive: The unique Science of Synthesis dataset is not available in the public domain.
- Expert: Benefit from over 20 years of work compiling synthetic methods by more than 2,000 expert authors worldwide.
* Initial results show that Thieme-trained models predict correct reactions more than twice as often as baseline models when tested on Science of Synthesis chemistry.