A.I. Turns Its Artistry to Creating New Human Proteins

Inspired by digital art generators like DALL-E, biologists are building artificial intelligences that can fight cancer, flu and Covid.

Cade Metz is a technology correspondent, covering artificial intelligence, driverless cars, robotics, virtual reality and other emerging areas. He previously wrote for Wired magazine. NY Times

Last spring, an artificial intelligence lab called OpenAI unveiled technology that lets you create digital images simply by describing what you want to see. Called DALL-E, it sparked a wave of similar tools with names like Midjourney and Stable Diffusion. Promising to speed the work of digital artists, this new breed of artificial intelligence captured the imagination of both the public and the pundits — and threatened to generate new levels of online disinformation.

Social media is now teeming with the surprisingly conceptual, shockingly detailed, often photorealistic images generated by DALL-E and other tools. “Photo of a teddy bear riding a skateboard in Times Square.” “Cute corgi in a house made out of sushi.” “Jeflon Zuckergates.”

But when some scientists consider this technology, they see more than just a way of creating fake photos. They see a path to a new cancer treatment or a new flu vaccine or a new pill that helps you digest gluten.

Using many of the same techniques that underpin DALL-E and other art generators, these scientists are generating blueprints for new proteins — tiny biological mechanisms that can change the way our bodies behave.

Our bodies naturally produce about 20,000 proteins, which handle everything from digesting food to moving oxygen through the bloodstream. Now, researchers are working to create proteins that are not found in nature, hoping to improve our ability to fight disease and do things that our bodies cannot on their own.

David Baker, the director of the Institute for Protein Design at the University of Washington, has been working to build artisanal proteins for more than 30 years. By 2017, he and his team had shown this was possible. But they did not anticipate how the rise of new A.I. technologies would suddenly accelerate this work, shrinking the time needed to generate new blueprints from years down to weeks.

“What we need are new proteins that can solve modern-day problems, like cancer and viral pandemics,” Dr. Baker said. “We can’t wait for evolution.” He added, “Now, we can design these proteins much faster, and with much higher success rates, and create much more sophisticated molecules that can help solve these problems.”

Last year, Dr. Baker and his fellow researchers published a pair of papers in the journal Science describing how various A.I. techniques could accelerate protein design. But these papers have already been eclipsed by a newer one that draws on the techniques that drive tools like DALL-E, showing how new proteins can be generated from scratch much like digital photos.

“One of the most powerful things about this technology is that, like DALL-E, it does what you tell it to do,” said Nate Bennett, one of the researchers working in the University of Washington lab. “From a single prompt, it can generate an endless number of designs.”

To generate images, DALL-E relies on what artificial intelligence researchers call a neural network, a mathematical system loosely modeled on the network of neurons in the brain. This is the same technology that recognizes the commands you bark into your smartphone, enables self-driving cars to identify (and avoid) pedestrians and translates languages on services like Skype.

A neural network learns skills by analyzing vast amounts of digital data. By pinpointing patterns in thousands of corgi photos, for instance, it can learn to recognize a corgi. With DALL-E, researchers built a neural network that looked for patterns as it analyzed millions of digital images and the text captions that described what each of these images depicted. In this way, it learned to recognize the links between the images and the words.

When you describe an image for DALL-E, a neural network generates a set of key features that this image may include. One feature might be the curve of a teddy bear’s ear. Another might be the line at the edge of a skateboard. Then, a second neural network — called a diffusion model — generates the pixels needed to realize these features.

The diffusion model is trained on a series of images in which noise — imperfection — is gradually added to a photograph until it becomes a sea of random pixels. As it analyzes these images, the model learns to run this process in reverse. When you feed it random pixels, it removes the noise, transforming these pixels into a coherent image.
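
The noising half of that process can be sketched in a few lines of Python. This is a toy illustration, not the method used by DALL-E or by any lab mentioned here: the step count, the linear noise schedule, the 16-by-16 test image and the function names `forward_diffusion` and `correlation` are all assumptions made for the example. It shows only how an image dissolves into random pixels; the learned reverse step requires a trained neural network and is not shown.

```python
import numpy as np


def forward_diffusion(image, num_steps=1000, beta_start=1e-4, beta_end=0.02, seed=0):
    """Return a list of progressively noisier copies of `image`.

    Assumption: a linear noise schedule, chosen only for illustration.
    """
    rng = np.random.default_rng(seed)
    betas = np.linspace(beta_start, beta_end, num_steps)
    x = image.astype(np.float64)
    trajectory = [x.copy()]
    for beta in betas:
        # Shrink the current image slightly and mix in fresh Gaussian noise.
        noise = rng.standard_normal(x.shape)
        x = np.sqrt(1.0 - beta) * x + np.sqrt(beta) * noise
        trajectory.append(x.copy())
    return trajectory


def correlation(a, b):
    """Pearson correlation between two images, as a rough similarity score."""
    return float(np.corrcoef(a.ravel(), b.ravel())[0, 1])


# Hypothetical stand-in for a photograph: a white square on a black background.
image = np.zeros((16, 16))
image[4:12, 4:12] = 1.0

steps = forward_diffusion(image)
print("similarity after 10 steps:  ", round(correlation(image, steps[10]), 3))
print("similarity after 1000 steps:", round(correlation(image, steps[-1]), 3))
```

After a handful of steps the square is still clearly visible; by the last step the result is statistically close to pure noise. A diffusion model is trained on pairs drawn from sequences like this one so that it can learn to undo each step.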

At the University of Washington, other academic labs and new start-ups, researchers are using similar techniques in their effort to create new proteins.

Proteins begin as strings of chemical compounds, which then twist and fold into three-dimensional shapes that define how they behave. In recent years, artificial intelligence labs like DeepMind, owned by Alphabet, the same parent company as Google, have shown that neural networks can accurately guess the three-dimensional shape of any protein in the body based just on the smaller compounds it contains — an enormous scientific advance.

Now, researchers like Dr. Baker are taking another step, using these systems to generate blueprints for entirely new proteins that do not exist in nature. The goal is to create proteins that take on very specific shapes; a particular shape can serve a particular task, such as fighting the virus that causes Covid.

Much as DALL-E leverages the relationship between captions and photographs, similar systems can leverage the relationship between a description of what the protein can do and the shape it adopts. Researchers can provide a rough outline for the protein they want, then a diffusion model can generate its three-dimensional shape.

A protein diffusion model doing unconditional generation, converting noise into plausible structures. Video by Namrata Anand
Namrata Anand, a former Stanford University researcher. She is now building a company in generative A.I. protein design. Credit: Herve Philippe/TerrificShot Photography

“With DALL-E, you can ask for an image of a panda eating a shoot of bamboo,” said Namrata Anand, a former Stanford University researcher who is also an entrepreneur, building a company in this area of research. “Equivalently, protein engineers can ask for a protein that binds to another in a particular way — or some other design constraint — and the generative model can build it.”

The difference is that the human eye can instantly judge the fidelity of a DALL-E image. It cannot do the same with a protein structure. After artificial intelligence technologies produce these protein blueprints, scientists must still take them into a wet lab — where experiments can be done with real chemical compounds — and make sure they do what they are supposed to do.

For this reason, some experts say that the latest artificial intelligence technologies should be taken with a grain of salt. “Making a new structure is just a game,” said Frances Arnold, a Nobel Laureate who is a professor specializing in protein engineering at the California Institute of Technology. “What really matters is: What can that structure actually do?”

But for many researchers, these new techniques are not just accelerating the creation of new protein candidates for the wet lab. They provide a way of exploring new innovations that researchers could not previously explore on their own.

“What’s exciting isn’t just that they are creative and explore unexpected possibilities, but that they are creative while satisfying certain design objectives or constraints,” said Jue Wang, a researcher at the University of Washington. “This saves you from needing to check every possible protein in the universe.”

Often, artificially intelligent machines are developed to perform skills that come naturally to humans, like piecing together images, writing text or playing board games. Protein-designing bots pose a more profound question, Dr. Wang said: “What can machines do that humans can’t do at all?”
