Abstract:
Advances in machine learning make it possible to discover more about organic reactivity than ever before. However, these studies can only succeed if they are built on good data. We are developing the DP5 method for the analysis of NMR data to help automate NMR interpretation and to challenge incorrect assignments.
Knowledge of reaction outcomes allows us to build models of selectivity, even in cases where dynamic effects dominate the choice of pathway. Tracking all of these molecular data is necessary for effective machine learning, but it is hard to achieve with large datasets. New developments in InChI, the International Chemical Identifier, are making this process more tractable.