D. Tyler McQuade, Ph.D., professor in the Department of Chemical and Life Science Engineering at Virginia Commonwealth University College of Engineering, is principal investigator of a multi-university project seeking to use artificial intelligence to help scientists come up with the perfect molecule for everything from a better shampoo to coatings on advanced microchips.
The project is one of the first in the U.S. to be selected for $994,433 in funding under the Convergence Accelerator (C-Accel), a new pilot program of the National Science Foundation (NSF). McQuade and his collaborators will pitch their prototype in March 2020 in a bid for additional funding of up to $5 million over five years.
Adam Luxon, a Ph.D. student in the Department of Chemical and Life Science Engineering who has been involved from the beginning, explained it this way: “We want to essentially make the Alexa of chemistry.”
Just as Amazon, Google and Netflix use algorithms to make customized recommendations, the team plans to build a platform and open knowledge network that can combine molecular sciences data from a wide range of sources, including academia, industry and government, and help users make sense of it.
The idea is right in line with the goal of the NSF program: to speed up the transition of convergence research into practice in nationally critical areas such as “Harnessing the Data Revolution.”
The team itself reflects expertise across several specialties. Working with McQuade are James K. Ferri, Ph.D., professor in the Department of Chemical and Life Science Engineering; Carol A. Parish, Ph.D., professor of chemistry and the Floyd D. and Elisabeth S. Gottwald Chair in the Department of Chemistry at the University of Richmond; and Adrian E. Roitberg, Ph.D., professor in the Department of Chemistry at the University of Florida. Two companies are also involved with the project: Two Six Labs, based in Arlington, Virginia, and Fathom Information Design, based in Boston, Massachusetts.
Currently, there is no shared network or central portal where molecular scientists and engineers can harness artificial intelligence and data science tools to build models to support their needs.
What’s more, while scientists have been able to depict what elements make up a molecule, how the atoms are arranged in space and what the properties of that molecule are (such as its melting point), there is no standard way to represent — or predict — molecular performance.
The team aims to fill these gaps by advancing the concept of a “molecular imprint.” The collaborators will create a new system that represents molecules by combining line-drawing, geometry and quantum chemical calculations into a single, machine-learnable format.
They will build a central platform for collecting data, creating these molecular imprints and mining the data, along with machine learning tools for creating performance prediction models.
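As a rough illustration of what a single, machine-learnable record could look like, the Python sketch below bundles a line notation, a 3D geometry and computed quantum values into one flat feature vector. Everything in it is an assumption made for illustration: the `MolecularImprint` class, its field names, the use of a SMILES-style string to stand in for the line drawing, and the very simple features are hypothetical and do not describe the team's actual format.

```python
from dataclasses import dataclass
from math import dist
from typing import Dict, List, Tuple


@dataclass
class MolecularImprint:
    """Hypothetical record combining three views of one molecule."""
    line_notation: str                               # assumed: a SMILES-style string
    geometry: List[Tuple[str, float, float, float]]  # (element, x, y, z), angstroms
    quantum: Dict[str, float]                        # computed values keyed by name

    def to_vector(self, elements=("C", "H", "O", "N")) -> List[float]:
        # Composition features: how many atoms of each tracked element.
        counts = [float(sum(1 for el, *_ in self.geometry if el == e))
                  for e in elements]
        # Geometry feature: the largest distance between any two atoms.
        coords = [(x, y, z) for _, x, y, z in self.geometry]
        span = max((dist(a, b) for a in coords for b in coords), default=0.0)
        # Quantum features: values in sorted-key order so the layout is stable.
        q = [self.quantum[k] for k in sorted(self.quantum)]
        return counts + [span] + q


# Illustrative input only: approximate coordinates for a water molecule.
water = MolecularImprint(
    line_notation="O",
    geometry=[("O", 0.0, 0.0, 0.0), ("H", 0.96, 0.0, 0.0), ("H", -0.24, 0.93, 0.0)],
    quantum={"dipole_debye": 1.85},
)
vector = water.to_vector()
```

A flat list of numbers like `vector` is the kind of input most machine learning libraries accept, which is the point of putting the three representations into one format.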
Parish said, “The ability to compute molecular properties using computational techniques, and to dovetail that data with experimental measurements, will generate databases that will produce the most comprehensive results in the molecular sciences.
“There are many laboratories around the world working in this space; however, there are few organizational structures available that encourage open sharing of these data for the benefit of the community and the common good. We seek to collaborate with others to provide this structure: an open knowledge network or repository where scientists can deposit their molecular-level experimental and computational data in exchange for user-friendly tools to help manage and query the data.”
Potential partners have responded strongly to the idea. Ferri and the others have already collected more than a dozen letters from major corporations such as Dow and Merck expressing interest in participating. Also on board are Idaho National Laboratory and Argonne National Laboratory, as well as national chemical engineering and chemistry organizations.
McQuade said that chemical engineers in major industries, including consumer products and oil and gas, expend a lot of effort running experiments to determine which molecule to use, such as finding the best shampoo additive that doesn’t make babies cry. “The ability to design the properties you want is still more art than science,” he said.
The team also plans to develop a toolkit for processing and visualizing the data.
Roitberg, whose research focuses include advanced visualization, said this could take the form of a virtual reality realm in which a user could find materials that are soluble in water but not oil, for instance, and then be able to browse for similar materials nearby. “We envision a very interactive platform where the user can explore relations between data and desired material properties,” he said.
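The kind of exploration Roitberg describes, filtering on properties and then browsing nearby materials, can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the records, the `water_soluble` and `oil_soluble` flags, the two-number feature vectors and the helper names are all invented placeholders, and "nearby" is taken to mean small Euclidean distance between feature vectors, which the article does not specify.

```python
from math import dist

# Placeholder records: names, flags and feature vectors are invented.
materials = [
    {"name": "material_a", "water_soluble": True,  "oil_soluble": False, "features": (0.0, 0.0)},
    {"name": "material_b", "water_soluble": True,  "oil_soluble": True,  "features": (0.1, 0.0)},
    {"name": "material_c", "water_soluble": True,  "oil_soluble": False, "features": (1.0, 0.0)},
    {"name": "material_d", "water_soluble": False, "oil_soluble": True,  "features": (3.0, 4.0)},
]


def water_not_oil(records):
    """Keep materials flagged soluble in water but not in oil."""
    return [r for r in records if r["water_soluble"] and not r["oil_soluble"]]


def nearest(target, records, k=2):
    """Return the k other records closest to target in feature space."""
    others = [r for r in records if r["name"] != target["name"]]
    return sorted(others, key=lambda r: dist(target["features"], r["features"]))[:k]


hits = water_not_oil(materials)            # step 1: filter on properties
neighbours = nearest(hits[0], materials)   # step 2: browse what lies nearby
```

An interactive or virtual reality front end would sit on top of queries like these; the sketch only covers the lookup logic.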