Constructing AI models that understand chemical principles | MIT News

Among all possible chemical compounds, it’s estimated that between 10^20 and 10^60 may hold potential as small-molecule drugs.

Evaluating each of those compounds experimentally would be far too time-consuming for chemists. So, in recent years, researchers have begun using artificial intelligence to help identify compounds that would make good drug candidates.

One of those researchers is MIT Associate Professor Connor Coley PhD ’19, the Class of 1957 Career Development Associate Professor with shared appointments in the departments of Chemical Engineering and Electrical Engineering and Computer Science and the MIT Schwarzman College of Computing. His research straddles the line between chemical engineering and computer science, as he develops and deploys computational models to analyze vast numbers of possible chemical compounds, design new compounds, and predict reaction pathways that could generate those compounds.

“It’s a really general approach that could be applied to any application of organic molecules, but the first application that we think about is small-molecule drug discovery,” he says.

The intersection of AI and science

Coley’s interest in science runs in the family. In fact, he says, his family includes more scientists than non-scientists, including his father, a radiologist; his mother, who earned a degree in molecular biophysics and biochemistry before going to the MIT Sloan School of Management; and his grandmother, a math professor.

As a high school student in Dublin, Ohio, Coley participated in Science Olympiad competitions and graduated from high school at the age of 16. He then headed to Caltech, where he chose chemical engineering as a major because it offered a way to combine his interests in science and math.

During his undergraduate years, he also pursued an interest in computer science, working in a structural biology lab using the Fortran programming language to help solve the crystal structures of proteins. After graduating from Caltech, he decided to keep going in chemical engineering and came to MIT in 2014 to begin a PhD.

Advised by professors Klavs Jensen and William Green, Coley worked on ways to optimize automated chemical reactions. His work focused on combining machine learning and cheminformatics (the application of computational methods to analyze chemical data) to plan reaction pathways that could make new drug molecules. He also worked on designing hardware that could be used to perform those reactions automatically.

Part of that work was done through a DARPA-funded program called Make-It, which focused on using machine learning and data science to improve the synthesis of medicines and other useful compounds from simple building blocks.

“That was my real entry point into thinking about cheminformatics, thinking about machine learning, and thinking about how we can use models to understand how different chemicals can be made and what reactions are possible,” Coley says.

Coley began applying for faculty jobs while still a graduate student, and accepted an offer from MIT at age 25. He received a mix of advice for and against taking a job at the same school where he went to graduate school, and eventually decided that a position at MIT was too enticing to turn down.

“MIT is a really special place in terms of the resources and the fluidity across departments. MIT seemed to be doing a very good job supporting the intersection of AI and science, and it was a vibrant ecosystem to stay in,” he says. “The caliber of students, the enthusiasm of the students, and just the incredible strength of collaborations definitely outweighed any potential concerns of staying in the same place.”

Chemistry intuition

Coley deferred the faculty position for one year to do a postdoc at the Broad Institute, where he sought more experience in chemical biology and drug discovery. There, he worked on ways to identify small molecules, from billions of candidates in DNA-encoded libraries, that might have binding interactions with mutated proteins associated with diseases.

After returning to MIT in 2020, he built his lab group with the mission of deploying AI not only to synthesize existing compounds with therapeutic potential, but also to design new molecules with desirable properties and new ways to make them. Over the past few years, his lab has developed a variety of computational approaches to tackle those goals.

“We try to think about how to best pair a problem in chemistry with a possible computational solution. And often that pairing motivates the development of new methods,” Coley says. One model his lab has developed, known as ShEPhERD, was trained to evaluate potential new drug molecules by how they will interact with target proteins, based on the drug molecules’ three-dimensional shapes. This model is now being used by pharmaceutical companies to help them discover new drugs.

“We’re trying to give more of a medicinal chemistry intuition to the generative model, so the model is aware of the right criteria and considerations,” Coley says.

In another project, Coley’s lab developed a generative AI model called FlowER, which can be used to predict the reaction products that will result from combining different chemical inputs.

In designing that model, the researchers built in an understanding of fundamental physical principles, such as the law of conservation of mass. They also forced the model to consider the feasibility of the intermediate steps that must happen on the pathway from reactants to products. These constraints, the researchers found, improved the accuracy of the model’s predictions.
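To make the conservation-of-mass idea concrete, here is a minimal sketch of the kind of check such a constraint implies: every atom in the reactants must also appear in the products. This is an illustration only, not FlowER’s actual method. The function names `count_atoms` and `conserves_atoms` are hypothetical, and the parser assumes simple molecular formulas without parentheses or charges.

```python
import re
from collections import Counter


def count_atoms(formula: str) -> Counter:
    """Count atoms in a simple formula such as 'CH4' or 'H2O'.

    Assumption: no parentheses, charges, or hydrates in the formula.
    """
    counts = Counter()
    # An element symbol is one uppercase letter plus an optional lowercase
    # letter, followed by an optional count (a missing count means 1).
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def conserves_atoms(reactants: list[str], products: list[str]) -> bool:
    """Return True if both sides of the reaction contain the same atoms."""
    left = sum((count_atoms(f) for f in reactants), Counter())
    right = sum((count_atoms(f) for f in products), Counter())
    return left == right


# Methane combustion: CH4 + 2 O2 -> CO2 + 2 H2O
print(conserves_atoms(["CH4", "O2", "O2"], ["CO2", "H2O", "H2O"]))
# A proposed product set that loses an oxygen atom is rejected.
print(conserves_atoms(["CH4", "O2", "O2"], ["CO2", "H2O"]))
```

A predictive model that is unconstrained can propose outputs that fail a check like this one; building the constraint into the model rules out such outputs from the start.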

“Thinking about those intermediate steps, the mechanisms involved, and how the reaction evolves is something that chemists do very naturally. It’s how chemistry is taught, but it’s not something that models inherently think about,” Coley says. “We’ve spent a lot of time thinking about how to make sure that our machine-learning models are grounded in an understanding of reaction mechanisms, in the same way an expert chemist would be.”

Students in his lab also work on many different areas related to the optimization of chemical reactions, including computer-aided structure elucidation, laboratory automation, and optimal experimental design.

“Through these many different research threads, we hope to advance the frontier of AI in chemistry,” Coley says.
