Chemical engineering is being rapidly transformed by the tools of data science. On the horizon, artificial intelligence (AI) applications will impact a huge swath of our work, ranging from the discovery and design of new molecules to operations and manufacturing, and many areas in between. The first part of the talk will focus on our group’s research on the discovery, characterization and design of new molecules through the lens of data science. I will introduce the field through the concept of a molecular data science life cycle and discuss relevant aspects of five distinct phases of this process: creation of curated data sets, molecular representations, data-driven property prediction, generation of new molecules, and feasibility and synthesizability considerations.
Throughout this portion of the talk, I will primarily discuss research from our group related to the discovery, characterization and design of new ionic liquids for energy applications. Additionally, I will discuss progress related to large-scale data mining of the research literature to identify lead molecule targets.
Time permitting, I will conclude the research portion of the presentation with a brief, broader discussion of applications of data science in ChemE, including the use of data-driven surrogate models in process modeling and optimization, as well as open science and reproducibility in this research area.
Finally, I will discuss graduate education in ChemE data science and outcomes from our NSF National Research Traineeship program, which seeks to train cohorts of researchers at the intersection of ChemE and data science.
Jim Pfaendtner is the Rogel Professor & Chair of Chemical Engineering and Professor of Chemistry at the University of Washington and Staff Scientist at Pacific Northwest National Laboratory. He holds a B.S. in Chemical Engineering (Georgia Tech, 2001) and a Ph.D. in Chemical Engineering (Northwestern University, 2007). He also serves as Associate Vice Provost for Research Computing at the UW. Jim’s research focus is computational molecular science, and his recent teaching interests center on data science skills.