Chemistry, Big Data and Machine Learning

Tuesday, September 21, 2021 - 12:45 to 14:00
Dr. David Wishart | Laird Lecture
Departments of Computing Science and Biological Sciences, University of Alberta
Event Category: LMC - Lectures in Modern Chemistry
Dr. Tao Huan
Chemistry B250


Computing has been an integral part of chemistry for more than 60 years. Indeed, the use of computers to draw structures, model molecular dynamics or calculate molecular orbitals is now ubiquitous. More recently, computers have been impacting chemistry in a different way. In particular, chemists are now starting to make greater use of "big" data (i.e., chemistry databases), the internet, and techniques such as artificial intelligence and machine learning to make important advances.

In this presentation I will describe some of the work that my lab has been doing in the field of big data, machine learning and chemistry. I will first describe a number of the open-access, web-enabled databases that we have been developing to "house" chemical data. These include DrugBank, the Human Metabolome Database (HMDB) and the Natural Products Magnetic Resonance Database (NP-MRD). I will briefly describe the purpose of each of these databases and the kind of data each one houses. Next, I will describe how we have developed software to help classify and describe the data in these chemical databases so that they are more machine-readable (i.e., computer-readable) and also easier for non-specialists to understand. Then I will describe how we (along with others) have been combining these datasets with machine learning techniques to develop a variety of very fast and accurate tools for predicting chemical properties, chemical reactions, chemical structures and various molecular spectral features. Finally, I will show how some of these databases and software tools are having significant impacts in the fields of medicinal chemistry, drug discovery, biochemistry and organic chemistry.