New artificial intelligence (AI) tools have emerged that can help scientists discover previously unknown proteins and design entirely new ones. Used properly, this AI could enable more efficient vaccine development, accelerate cancer drug research, and even help discover entirely new substances.
In 2020, DeepMind, an AI research organization and subsidiary of Alphabet, announced AlphaFold, an AI tool that uses deep learning to solve one of the major challenges in biology: accurately predicting the shape of a protein. Proteins are fundamental to life, and understanding their shape is essential to working with them. Earlier this summer, DeepMind announced that AlphaFold could predict the shape of any protein known to the scientific community.
A new tool called ProteinMPNN, developed by a research team at the University of Washington and described in two papers published in the Journal (available here and here), appears to be a perfect complement to AlphaFold’s protein prediction technology.
The two papers, published today, are the latest examples of how deep learning is revolutionizing protein design by giving scientists new research tools. Previously, researchers made proteins by slightly modifying ones that already exist in nature. ProteinMPNN will expand the range of possible proteins by allowing researchers to design them from scratch.
David Baker, one of the scientists behind the papers and director of the Institute for Protein Design at the University of Washington, said, “In nature, proteins fundamentally solve everything related to life, from getting energy from sunlight to making molecules. Everything in biology happens with proteins.”
“Proteins evolved over the course of evolution to solve the problems organisms faced as they evolved,” he said. “However, we are now facing new problems, such as COVID-19. If we could design proteins that are as good at solving new problems as the ones that evolved are at solving old problems, that would be a very powerful tool.”
Proteins are made up of hundreds to thousands of amino acids, linked together in long chains and folded into three-dimensional shapes. AlphaFold helps researchers predict those folded structures, which gives insight into how the proteins will behave.
ProteinMPNN will help solve the opposite problem. If researchers already know the exact structure of the protein they want, ProteinMPNN can help them find amino acid sequences that fold into that shape. The system uses a neural network trained on a very large number of examples of amino acid sequences that fold into three-dimensional structures.
But researchers also have to solve another problem. To design a protein that has real-world applications, such as a new enzyme that digests plastics, researchers must first figure out which protein shape has that function.
To do this, researchers from Baker’s lab use two machine learning techniques that the team calls ‘constrained hallucination’ and ‘inpainting’. Both methods were described in detail in an earlier article.
‘Constrained hallucination’ lets users search at random through the space of all possible protein sequences for one with a specific function. This ‘hallucination’ makes it feasible to explore every possible protein structure, relying on the ability of machine learning to process vast datasets. There are 20 types of amino acids, which can be combined into a huge number of different sequences.
“In nature, we only find a small fraction of possible protein sequences,” says Baker. “So if you limit your search to sequences that exist in nature, you won’t find anything.”
‘Inpainting’ works like a word processor’s autocomplete function, but for protein structures and sequences. This method allows researchers to create completely new proteins that have never been seen in nature, such as giant ring-shaped structures.
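To give a rough sense of the scale Baker is describing, the short Python sketch below counts how many distinct chains can be built from 20 amino acid types at a given length. It is only an illustration of the arithmetic (20 choices at each position, so 20 to the power of the length); the function name `count_sequences` and the example lengths are assumptions made here, not anything taken from the papers or from ProteinMPNN.

```python
# Illustrative arithmetic only; not part of ProteinMPNN or any tool named in the article.

# The 20 standard amino acids, written as one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def count_sequences(length: int) -> int:
    """Return how many distinct amino acid chains of the given length exist."""
    if length < 0:
        raise ValueError("length must be non-negative")
    # Each position can hold any of the 20 amino acids independently.
    return len(AMINO_ACIDS) ** length


if __name__ == "__main__":
    # Hypothetical chain lengths chosen for illustration.
    for n in (10, 100, 300):
        total = count_sequences(n)
        exponent = len(str(total)) - 1  # order of magnitude in base 10
        print(f"length {n}: on the order of 10^{exponent} possible sequences")
```

Even for a modest chain of 100 amino acids, the count is far beyond anything that could be enumerated directly, which is why a learned model is used to guide the search instead.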
Baker’s team is testing whether such ring-shaped structures can be used as components for small machines that operate on the nanoscale. In the future, these nanomachines could be used to unblock arteries, for example.
Using machine learning to design proteins in this way is ‘fantastic work’, says Lynne Regan, professor of biochemistry and bioengineering at the University of Edinburgh.
Machine learning will make the whole process of designing proteins much faster and easier, and will enable researchers to create entirely new proteins and structures on a much larger scale. The software is over 200 times faster than the previous best tools and requires minimal user input, potentially lowering the barrier to entry for protein design.
“This software and other recent developments are changing the field of biomolecular structure prediction and design,” said Jeffrey Gray, professor of chemistry and biomolecular engineering at Johns Hopkins University.
“The implications of these changes for understanding biology, health and disease and designing new molecules to reduce human suffering are impressive,” explains Gray.
Gray said that by combining deep learning tools developed in his lab with tools developed in Baker’s lab, he aims to better understand the immune system and immune-related diseases and to use AI to design treatments.
“AlphaFold’s solution to the problem of protein structure prediction shows the transformative role that AI and machine learning can play in biology,” said Pushmeet Kohli, who leads DeepMind’s AI for Science team. “ProteinMPNN, which designs proteins for specific tasks, is further proof of this paradigm shift and of a new era in biology.”
ProteinMPNN, now freely available on the open source software repository GitHub, will give researchers the tools to create a practically unlimited range of new designs. “Of course, the important thing is, ‘What are you going to design?’” says Baker. (By Melissa Heikkilä)