Accelerating scientific discovery with AI | MIT News

Several researchers have taken a broad view of scientific progress over the last 50 years and come to the same troubling conclusion: Scientific productivity is declining. It’s taking more time, more funding, and bigger teams to make discoveries that once came faster and cheaper. Although a variety of explanations have been offered for the slowdown, one is that, as research becomes more complex and specialized, scientists must spend more time reviewing publications, designing sophisticated experiments, and analyzing data.

Now, the philanthropically funded research lab FutureHouse is seeking to accelerate scientific research with an AI platform designed to automate most of the critical steps on the path toward scientific progress. The platform is made up of a series of AI agents specialized for tasks including information retrieval, information synthesis, chemical synthesis design, and data analysis.

FutureHouse founders Sam Rodriques PhD ’19 and Andrew White believe that by giving every scientist access to their AI agents, they can break through the biggest bottlenecks in science and help solve some of humanity’s most pressing problems.

“Natural language is the real language of science,” Rodriques says. “Other people are building foundation models for biology, where machine learning models speak the language of DNA or proteins, and that’s powerful. But discoveries aren’t represented in DNA or proteins. The only way we know how to represent discoveries, hypothesize, and reason is with natural language.”

Finding big problems

For his PhD research at MIT, Rodriques sought to understand the inner workings of the brain in the lab of Professor Ed Boyden.

“The entire idea behind FutureHouse was inspired by this impression I got during my PhD at MIT that even if we had all the information we needed to know about how the brain works, we wouldn’t know it because nobody has time to read all the literature,” Rodriques explains. “Even if they could read it all, they wouldn’t be able to assemble it into a comprehensive theory. That was a foundational piece of the FutureHouse puzzle.”

Rodriques wrote about the need for new kinds of large research collaborations as the last chapter of his PhD thesis in 2019, and though he spent some time running a lab at the Francis Crick Institute in London after graduation, he found himself gravitating toward broad problems in science that no single lab could tackle.

“I was interested in how to automate or scale up science and what kinds of new organizational structures or technologies would unlock higher scientific productivity,” Rodriques says.

When Chat-GPT 3.5 was released in November 2022, Rodriques saw a path toward more powerful models that could generate scientific insights on their own. Around that time, he also met Andrew White, a computational chemist at the University of Rochester who had been granted early access to Chat-GPT 4. White had built the first large language agent for science, and the researchers joined forces to start FutureHouse.

The founders started out wanting to create distinct AI tools for tasks like literature searches, data analysis, and hypothesis generation. They began with data collection, eventually releasing PaperQA in September 2024, which Rodriques calls the best AI agent in the world for retrieving and summarizing information in scientific literature. Around the same time, they released Has Anyone, a tool that lets scientists determine if anyone has conducted specific experiments or explored specific hypotheses.

“We were just sitting around asking, ‘What are the kinds of questions that we as scientists ask all the time?’” Rodriques recalls.

When FutureHouse officially launched its platform on May 1 of this year, it rebranded some of its tools. PaperQA is now Crow, and Has Anyone is now called Owl. Falcon is an agent capable of compiling and reviewing more sources than Crow. Another new agent, Phoenix, can use specialized tools to help researchers plan chemistry experiments. And Finch is an agent designed to automate data-driven discovery in biology.

On May 20, the company demonstrated a multi-agent scientific discovery workflow to automate key steps of the scientific process and identify a new therapeutic candidate for dry age-related macular degeneration (dAMD), a leading cause of irreversible blindness worldwide. In June, FutureHouse released ether0, a 24B open-weights reasoning model for chemistry.

“You really have to think of these agents as part of a bigger system,” Rodriques says. “Soon, the literature search agents will be integrated with the data analysis agent, the hypothesis generation agent, an experiment planning agent, and they will all be engineered to work together seamlessly.”

Agents for everybody

Today anyone can access FutureHouse’s agents at platform.futurehouse.org. The company’s platform launch generated excitement in the industry, and stories have started to come in about scientists using the agents to accelerate research.

One of FutureHouse’s scientists used the agents to identify a gene that could be associated with polycystic ovary syndrome and come up with a new treatment hypothesis for the disease. Another researcher at the Lawrence Berkeley National Laboratory used Crow to create an AI assistant capable of searching the PubMed research database for information related to Alzheimer’s disease.

Scientists at another research institution have used the agents to conduct systematic reviews of genes relevant to Parkinson’s disease, finding that FutureHouse’s agents performed better than general agents.

Rodriques says scientists who think of the agents less like Google Scholar and more like a smart assistant scientist get the most out of the platform.

“People who are looking for speculation tend to get more mileage out of Chat-GPT o3 deep research, while people who are looking for really faithful literature reviews tend to get more out of our agents,” Rodriques explains.

Rodriques also thinks FutureHouse will soon get to a point where its agents can use the raw data from research papers to test the reproducibility of their results and verify conclusions.

In the longer run, to keep scientific progress marching forward, Rodriques says FutureHouse is working on embedding its agents with tacit knowledge so they can perform more sophisticated analyses, while also giving the agents the ability to use computational tools to explore hypotheses.

“There have been so many advances around foundation models for science and around language models for proteins and DNA, that we now need to give our agents access to those models and all the other tools people commonly use to do science,” Rodriques says. “Building the infrastructure to allow agents to use more specialized tools for science is going to be critical.”
