Nature in depth: Now that artificial intelligence can predict protein structures, where will this revolutionary technology go next?

▎ WuXi AppTec content team editor

Last July, DeepMind published a study in the journal Nature describing AlphaFold, an artificial intelligence (AI) system that predicts the three-dimensional structure of proteins from their amino acid sequences. The researchers also released the system's source code, making the technology available to scientists everywhere. Since the paper was published, AlphaFold has set off an unprecedented boom in the life sciences, and many people call it a game-changing scientific breakthrough. Recently, an in-depth feature in Nature examined AlphaFold's transformative impact on the life sciences and its future. In today's article, WuXi AppTec's content team shares the highlights of that feature with readers.

Amazing success

In 2020, the AlphaFold AI system beat all other entrants in the Critical Assessment of Structure Prediction (CASP) competition by accurately predicting the 3D structures of proteins from their amino acid sequences. Its accuracy was comparable to that of structures determined with experimental techniques such as cryo-electron microscopy (cryo-EM), nuclear magnetic resonance, or X-ray crystallography. At the time, multiple media outlets described the result as a breakthrough that would "transform the biological sciences and biomedicine." Dr. Arthur D. Levinson, former CEO of Genentech, called the achievement a "once in a generation advance."

In July 2021, papers describing the AlphaFold and RoseTTAFold AI systems were published in Nature and Science, respectively, and the source code and related information were made openly available so that scientists could use these tools. A week later, DeepMind announced that AlphaFold had predicted structures for 98.5 percent of human proteins, as well as for the proteins of 20 model organisms such as mice, fruit flies, and Escherichia coli. More than 365,000 protein structures were deposited in a public database built in collaboration with the European Bioinformatics Institute (EMBL-EBI). The database now holds nearly 1 million protein structures.

This year, DeepMind plans to release more than 100 million structure predictions. That is close to half of all known proteins and hundreds of times the number of protein structures that have been determined experimentally.

According to DeepMind, more than 400,000 people are now using EMBL-EBI's AlphaFold database. In which areas of life sciences has it had a significant impact?

Resolving protein structures

AlphaFold's ability to resolve protein structures has convinced many biologists. As long as a protein folds into a single, fixed three-dimensional conformation, AlphaFold's predictions are difficult to surpass. Dr. Arne Elofsson, a protein bioinformatician at Stockholm University, said, "It's a one-click solution that probably provides the best model you want."

Even where AlphaFold isn't quite sure, "it's pretty good at telling you when it's not working," says Dr. Elofsson. In such cases, the predicted structure looks somewhat like floating noodles. This usually corresponds to a region of the protein that lacks a defined conformation. Such intrinsically disordered regions, which make up about a third of the human proteome, take on a well-defined shape only when another molecule, such as a signaling partner, is present.

▲ Protein structures that AlphaFold predicts accurately (a) and inaccurately (b and c) (Image source: Reference [5])

The AlphaFold structures stored in the EMBL-EBI database were also put to use immediately. Dr. Christine Orengo's team at University College London is searching the database to identify new types of proteins without having to verify them experimentally. They have discovered hundreds, possibly thousands, of potential new protein families, expanding scientists' understanding of protein form and function. In another project, the team is mining databases of DNA sequences harvested from the ocean and from wastewater in an attempt to discover new enzymes that degrade plastics. By using AlphaFold to quickly predict the structures of thousands of proteins, the researchers hope to better understand how enzymes have evolved to break down plastics, and potentially to improve on them.

Dr. Sergey Ovchinnikov, an evolutionary biologist at Harvard University, says the ability to turn the gene sequence of any protein into a reliable structure provides a powerful tool for evolutionary research. Researchers usually determine relationships between species by comparing gene sequences. For distantly related genes, however, sequence comparisons may fail to find evolutionary kinship because the sequences have changed too much. Protein structures change more slowly than gene sequences, so by comparing structures, researchers may uncover ancient relationships that have been overlooked. "This opens up an amazing opportunity to study the evolution of proteins and the origin of life," said Dr. Pedro Beltrao, a computational biologist at the Swiss Federal Institute of Technology.

Empowering scientific experiments

For scientists who want to determine the detailed structure of a particular protein, AlphaFold's predictions don't necessarily provide an immediate solution. However, they provide an initial model that can be experimentally verified or refined, and that in itself helps in interpreting experimental data. For example, the raw data from X-ray crystallography are X-ray diffraction patterns, and scientists often need a preliminary guess at the protein's structure to interpret them. Dr. Randy Read, a structural biologist at the University of Cambridge in the United Kingdom, said that researchers previously had to cobble together information about related proteins from public protein databases or use experimental methods to obtain an initial model. Now, AlphaFold's predictions allow scientists to solve most X-ray diffraction patterns without resorting to those strategies.

Dr. Read and other researchers have used AlphaFold to solve crystal structures from X-ray data that could not be interpreted without an adequate starting model. "People are solving structures that had gone unsolved for years," said Dr. Claudia Millán Nebot, a former postdoc in Read's lab. She expects to see a large number of new protein structures submitted to public databases, largely thanks to AlphaFold.

Laboratories that specialize in imaging flash-frozen proteins with cryo-EM can also benefit. Dr. Bryan Roth, a structural biologist and pharmacologist at the University of North Carolina at Chapel Hill, said that in some instances AlphaFold's models accurately predicted the distinctive features of G protein-coupled receptors (GPCRs), which are important drug targets. "AlphaFold did a great job of generating the first model, which we then refined with experimental data, which saved us time," he says.

But Dr. Roth adds that AlphaFold isn't always that accurate. In some cases, AlphaFold flagged a structural prediction with a high degree of confidence, but experimental data showed that it was wrong. And even when the software gets the structure right, it can't model what a protein looks like when it binds to a drug or another small molecule (ligand) that substantially alters the protein's structure.

In drug discovery efforts, it is increasingly common for researchers to use computational docking software to screen billions of small molecules to find molecules that may bind to the target protein, suggesting that they could become useful drugs. Dr. Roth is now working with Dr. Brian Shoichet, a medicinal chemist at the University of California, San Francisco, to compare AlphaFold's predictions with experimentally determined structures.

Dr. Shoichet said they limited their work to proteins for which AlphaFold's predictions agreed with the experimentally determined structures. Even in these cases, however, docking against the AlphaFold structure and docking against the experimental structure turned up different compounds. His team is now synthesizing the potential drugs identified using the AlphaFold-predicted structures and testing their activity in the lab.

Aiding drug discovery

Dr. Shoichet said researchers at biomedical and biotech companies are excited about AlphaFold's potential to help with drug discovery. In November 2021, DeepMind launched Isomorphic Labs, which aims to apply AlphaFold and other AI tools to drug discovery.

Dr. Karen Akinsanya, head of therapeutics development at Schrödinger, said her team has had some success using AlphaFold structures for virtual screening and drug candidate design. In some cases, AlphaFold provides structures that can already guide drug discovery. Still, "it's hard to say it's a panacea, because you might do a great job of one structure, but that doesn't mean it can be analogous to all structures," Dr. Akinsanya said. In drug discovery, the structures AlphaFold provides will never completely replace those obtained experimentally, but they may complement experimental approaches and speed up drug development.

Limitations of AlphaFold

While AlphaFold has made rapid progress, scientists say it's important to be aware of its limitations, especially now that researchers who don't specialize in predicting protein structure are also using it.

Scientists have tried using AlphaFold to predict the effect of missense mutations (including those associated with early breast cancer) on protein structure. The results show that the software is not yet able to predict how new mutations affect proteins.

AlphaFold's team is now thinking about how to design a neural network that can handle new mutations. Dr. John Jumper, a scientist at DeepMind, expects that this will require the network to better predict how a protein transitions from an unfolded state to a folded state. Dr. Mohammed AlQuraishi, a computational biologist at Columbia University, said this may require software that relies solely on what it has learned about protein physics to predict structure. "One thing we're interested in is how to make predictions from a single sequence without using evolutionary information," he said. "This is a key issue that remains unresolved."

AlphaFold is designed to predict a single structure, but many proteins adopt multiple conformations that can be important for their function. In addition, AlphaFold predicts the structures of proteins in isolation, whereas many proteins function in combination with ligands such as DNA, RNA, fat molecules, and minerals.

The future of AlphaFold

Although AlphaFold is designed to predict a single structure, once DeepMind released its source code, scientists quickly discovered ways to make it predict interactions between proteins. A few days after the AlphaFold code was released, Dr. Yoshitaka Moriwaki, a protein bioinformatician at the University of Tokyo, found that AlphaFold could accurately predict the interaction between two proteins if their sequences were joined together by a long linker sequence.
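The sequence-joining idea described above can be illustrated with a short Python sketch. It only shows how two sequences might be fused into a single input chain. The function name, the glycine-serine linker unit, and the default linker length are assumptions made for illustration, not details reported in the article.

```python
# Standard one-letter codes for the 20 common amino acids
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def join_with_linker(seq_a: str, seq_b: str,
                     linker_unit: str = "GGGGS", repeats: int = 6) -> str:
    """Fuse two amino acid sequences into one chain via a flexible linker.

    ASSUMPTION: the linker composition ("GGGGS") and its length are
    illustrative choices, not values taken from the article.
    """
    for name, seq in (("seq_a", seq_a), ("seq_b", seq_b)):
        if not seq or not set(seq.upper()) <= STANDARD_AMINO_ACIDS:
            raise ValueError(
                f"{name} must be a non-empty string of standard amino acid letters"
            )
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    # The fused sequence can then be given to a single-chain predictor
    return seq_a.upper() + linker_unit * repeats + seq_b.upper()


if __name__ == "__main__":
    # Two short made-up sequences, used purely for demonstration
    fused = join_with_linker("MKTAYIAKQR", "GSHMLEDPVA")
    print(len(fused), fused)
```

The fused string would be submitted as if it were one protein, and the linker residues would be ignored when inspecting how the two original chains sit relative to each other.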

In October 2021, DeepMind released an update called AlphaFold-Multimer, a system specifically trained to model protein complexes. DeepMind's team applied it to thousands of complexes in a publicly available protein database and found that it was able to predict about 70 percent of known protein-protein interactions.

These tools are already helping researchers discover new binding partners. In a paper recently published in Nature Communications, Dr. Arne Elofsson's team at Stockholm University used AlphaFold, in combination with experimental data, to predict the structures of 65,000 protein pairs that might interact.

These virtual screens provide a good starting point for further experiments. "Just because it looks good doesn't mean it's right," Dr. Elofsson says. "You need experimental data to show you're right."

The team of Professor David Baker at the University of Washington used AlphaFold and RoseTTAFold to model the interactions between all the protein pairs expressed by yeast and found more than 100 previously unknown interactions.

Recently, Professor Baker's team went one step further in a paper published in the journal Nature. Using algorithms that predict protein structure, the team needs only the structural information of a target protein to find miniproteins that bind to it with high affinity. Professor Baker said the discovery is expected to lead to a paradigm shift in drug discovery and molecular biology.

AI tools have not only changed how scientists determine what proteins look like; some researchers are also using them to make entirely new proteins. Professor David Baker said: "Deep learning is completely changing the way my group designs proteins." In a paper published in Nature last December, researchers succeeded in getting an AI to imagine and construct protein structures that had never been seen before.

In this study, instead of feeding the AI system the amino acid sequences of known proteins, the researchers gave it random sequences and introduced mutations into them until the AI's neural network judged that the sequences would fold into stable structures.
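The mutate-and-score loop described above can be sketched in Python as a simple greedy search. This is a toy illustration under stated assumptions: `toy_score` is a hypothetical placeholder for the neural network's judgment and has nothing to do with real protein folding, and the greedy acceptance rule is a simplification chosen for clarity rather than the procedure used in the study.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
HYDROPHOBIC = set("AVILMFWC")


def toy_score(seq: str) -> float:
    """HYPOTHETICAL stand-in for the network's confidence that a sequence folds.

    It returns the fraction of neighbouring residue pairs that mix a
    hydrophobic and a non-hydrophobic residue. It is NOT a folding predictor.
    """
    mixed = sum((a in HYDROPHOBIC) != (b in HYDROPHOBIC)
                for a, b in zip(seq, seq[1:]))
    return mixed / (len(seq) - 1)


def hallucinate(length: int, steps: int, score_fn=toy_score, seed: int = 0):
    """Start from a random sequence and keep mutations that do not lower the score."""
    if length < 2:
        raise ValueError("length must be at least 2")
    rng = random.Random(seed)
    seq = [rng.choice(AMINO_ACIDS) for _ in range(length)]
    best = score_fn("".join(seq))
    for _ in range(steps):
        pos = rng.randrange(length)
        old = seq[pos]
        seq[pos] = rng.choice(AMINO_ACIDS)   # propose a point mutation
        new = score_fn("".join(seq))
        if new >= best:
            best = new                        # accept the mutation
        else:
            seq[pos] = old                    # reject and revert
    return "".join(seq), best


if __name__ == "__main__":
    designed, score = hallucinate(length=40, steps=500)
    print(designed, round(score, 3))
```

In a real setting, `score_fn` would be replaced by a structure-prediction network's confidence in its own prediction, which is the role the article attributes to the AI system.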

The researchers expressed 129 of the proteins imagined by the AI system in bacteria and found that about one-fifth of them folded into the structures the AI had predicted. "This is the first demonstration that these neural networks can be used to design proteins," Dr. Baker said. His team is now using this strategy to design useful proteins, such as protein catalysts for specific chemical reactions. Scientists need only specify the amino acids responsible for a particular catalytic function and then let the AI imagine the rest.

Where the AlphaFold revolution will go is hard to predict, even for experts in the field. Professor Baker said the field is changing so quickly that he expects to see major new breakthroughs using these tools in less than a year.

Dr. Janet Thornton, a computational biologist at EMBL-EBI, believes that one of AlphaFold's biggest influences is convincing biologists to be more open to the insights offered by computational and theoretical methods. "For me, revolution is a change of mindset."

References:

[2] Bryant et al. (2022). Improved prediction of protein-protein interactions

Disclaimer: WuXi AppTec's content team focuses on covering global biomedical and health research. This article is for informational purposes only. The views expressed herein do not represent the position of WuXi AppTec, nor do they indicate that WuXi AppTec supports or opposes them. This article is not a recommendation of any treatment option. For guidance on treatment options, please visit a licensed hospital.

Copyright note: This article is from the WuXi AppTec content team. Individuals are welcome to share it on their social feeds, but media outlets and institutions may not reprint it on other platforms in any form without authorization. To request reprint authorization, reply "Reprint" on the "WuXi AppTec" WeChat public account to obtain the reprint instructions.