
When scientists first sequenced the human genome in 2003, they revealed the full set of DNA instructions that make a person. But we still didn’t know what all those 3 billion genetic letters actually do.
Now Google’s DeepMind division says it’s made a leap in trying to understand the code with AlphaGenome, an AI model that predicts what effects small changes in DNA will have on an array of molecular processes, such as whether a gene’s activity will go up or down. It’s just the sort of question biologists regularly assess in lab experiments.
“We have, for the first time, created a single model that unifies many different challenges that come with understanding the genome,” says Pushmeet Kohli, a vice president for research at DeepMind.
Five years ago, the Google AI division released AlphaFold, a technology for predicting the 3D shape of proteins. That work was honored with a Nobel Prize last year and spawned a drug-discovery spinout, Isomorphic Labs, and a boom of companies that hope AI will be able to propose new drugs.
AlphaGenome is an attempt to further smooth biologists’ work by answering basic questions about how changing DNA letters alters gene activity and, eventually, how genetic mutations affect our health.
“We have these 3 billion letters of DNA that make up a human genome, but every person is slightly different, and we don’t fully understand what those differences do,” says Caleb Lareau, a computational biologist at Memorial Sloan Kettering Cancer Center who has had early access to AlphaGenome. “This is the most powerful tool to date to model that.”
Google says that AlphaGenome will be free for noncommercial users and that it plans to release full details of the model in the future. According to Kohli, the company is exploring ways to “enable use of this model by commercial entities” such as biotech companies.
Lareau says AlphaGenome will allow certain types of experiments now done in the lab to be carried out virtually, on a computer. For instance, studies of people who’ve donated their DNA for research often turn up thousands of genetic differences, each slightly raising or lowering the chance a person gets a disease such as Alzheimer’s.
Lareau says DeepMind’s software could be used to quickly make predictions about how each of those variants works at a molecular level, something that would otherwise require time-consuming lab experiments. “You’ll get this list of gene variants, but then I want to understand which of those are actually doing something, and where can I intervene,” he says. “This system pushes us closer to a good first guess about what any variant will be doing when we observe it in a human.”
Don’t expect AlphaGenome to predict very much about individual people, however. It offers clues to nitty-gritty molecular details of gene activity, not 23andMe-type revelations of a person’s traits or ancestry.
“We haven’t designed or validated AlphaGenome for personal genome prediction, a known challenge for AI models,” Google said in a statement.
Underlying the AI system is the so-called transformer architecture invented at Google that also powers large language models like GPT-4. This one was trained on troves of experimental data produced by public scientific projects.
Lareau says the system will not broadly change how his lab works day to day but could permit new types of research. For instance, sometimes doctors encounter patients with ultra-rare cancers, bristling with unfamiliar mutations. AlphaGenome could suggest which of those mutations are really causing the root problem, possibly pointing to a treatment.
“A hallmark of cancer is that specific mutations in DNA make the wrong genes express in the wrong context,” says Julien Gagneur, a professor of computational medicine at the Technical University of Munich. “This type of tool is instrumental in narrowing down which ones mess up proper gene expression.”
The same approach could apply to patients with rare genetic diseases, many of whom never learn the source of their condition, even if their DNA has been decoded. “We can obtain their genomes, but we are clueless as to which genetic alterations cause the disease,” says Gagneur. He thinks AlphaGenome could give medical scientists a new way to diagnose such cases.
Eventually, some researchers aspire to use AI to design entire genomes from the ground up and create new life forms. Others think the models will be used to create a fully virtual laboratory for drug studies. “My dream would be to simulate a virtual cell,” Demis Hassabis, CEO of Google DeepMind, said this year.
Kohli calls AlphaGenome a “milestone” on the road to that kind of system. “AlphaGenome may not model the whole cell in its entirety … but it’s starting to sort of shed light on the broader semantics of DNA,” he says.