Synthetic biology is like a reality-altering version of Minecraft. Rather than digital blocks, synthetic biology rejiggers the basic building blocks of life—DNA, proteins, biochemical circuits—to rewire living organisms or even build entirely new ones. In theory, the sky’s the limit on rewriting life: lab-grown meat that tastes like the real thing with far less impact on our environment. Yeast cells that pump out life-saving drugs. Recyclable biofuel.
But there’s a catch: to get there, we first need to be able to predict how changing a gene or a protein ultimately changes a cell.
It’s a tough problem. A human cell carries over 20,000 genes, each of which can be turned on, shut off, or changed in expression levels. So far, synthetic biologists have taken the trial-and-error approach. Part of the reason is that life’s biological circuits are incredibly difficult to decipher. Changes to one gene or protein may trigger a “butterfly effect” type of repercussion that propagates unpredictably through the cell. Rather than getting yeast to pump out insulin, for example, the cell could produce a bastardized, non-working version, or just die off.
Designing new biological circuits takes time—lots of it.
But maybe there’s another way. This month, a team at the Department of Energy’s Lawrence Berkeley National Laboratory, led by Dr. Hector Garcia Martin, suggested it might not be necessary to meticulously tease apart the molecular dance inside a cell to be able to manipulate it. Instead, the team tapped into the power of machine learning and showed that even with a limited dataset, the AI was able to predict how changes to a cell’s genes can affect its biochemistry and behavior.
What’s more, the algorithm could also use simulations to recommend how to improve the next bioengineering cycle. The program predicts how likely an additional genetic change is to achieve a syn-bio project goal, for example making hoppy India Pale Ales (IPAs) without actual hops in the mix.
“The possibilities are revolutionary,” said Garcia Martin. “Right now, bioengineering is a very slow process. It took 150 person-years to create the anti-malarial drug artemisinin. If you’re able to create new cells to specification in a couple weeks or months instead of years, you could really revolutionize what you can do with bioengineering.”
Limits to Power
Like AI and germline genome editing in humans, synthetic biology has the power to change the world. Named one of the “Top Ten Emerging Technologies” by the World Economic Forum in 2016, syn-bio spans many branches of research, from wiping out all mosquitoes with gene drives to designing microbiomes for agriculture that could replace environment-damaging fertilizers. However, metabolic engineering is its current golden child.
Everything alive requires metabolism. The concept in science is a bit different than the everyday vernacular. If you think of the cell as a car manufacturing facility, and every cellular component as raw material, then “metabolism” is the process of making a car out of these raw materials but at a cellular scale. Tweaking the manufacturing process, as happened during Covid-19, can turn a car manufacturer into one that makes ventilators without fundamentally altering the factory. In essence, synthetic biology does the same thing. It tweaks a cell so that its normal production is redirected to something else: a yeast that has no concept of blood sugar can now pump out insulin.
Yet due to its complexity, reprogramming a cell is far harder than rewriting software code. Here’s where AI can help. “Machine learning arises as an effective tool to predict biological system behavior,” said the team.
Rather than fully characterizing how molecular circuits work together, machine learning can extract trends from experimental data, and in turn provide predictions on how a synthetic biology tweak changes a cell. Better yet, it can do so even without completely understanding what’s happening on the ground—a bit similar to predicting day-to-day weather trends even though we’re still relatively blind to its underlying forces.
To accelerate syn-bio, the team engineered an algorithm called ART: Automated Recommendation Tool.
Similar to weather forecasts, the algorithm thrives on probabilities. At its core is a Bayesian approach, commonly used in machine learning to make predictions about the future based on things you’ve already learned. Machine learning algorithms, especially deep learning, generally require thousands or more training examples. However, datasets that large are very rare in synthetic biology. The team adjusted ART so that it learns well from a small number of training instances and, much like biology itself, operates on uncertainties.
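To make the "few examples, explicit uncertainty" idea concrete, here is a minimal sketch of a Bayesian update on a tiny dataset. It is not ART's actual model: it assumes a deliberately simple one-parameter relationship (product output = slope × protein level + noise) with a normal prior on the slope, and every number and name in it is hypothetical.

```python
import math

def posterior_slope(xs, ys, prior_var=10.0, noise_var=1.0):
    """Posterior mean and variance of w in y = w*x + noise, with prior w ~ N(0, prior_var)."""
    precision = 1.0 / prior_var + sum(x * x for x in xs) / noise_var
    mean = (sum(x * y for x, y in zip(xs, ys)) / noise_var) / precision
    return mean, 1.0 / precision

def predict(x_new, mean, var, noise_var=1.0):
    """Predictive mean and standard deviation at a new input x_new."""
    pred_mean = mean * x_new
    pred_std = math.sqrt(x_new * x_new * var + noise_var)
    return pred_mean, pred_std

# Hypothetical measurements: protein level (x) vs. product output (y)
few_x, few_y = [1.0, 2.0], [2.1, 3.9]
more_x, more_y = [1.0, 2.0, 3.0, 4.0, 5.0], [2.1, 3.9, 6.2, 7.8, 10.1]

m_few, v_few = posterior_slope(few_x, few_y)
m_more, v_more = posterior_slope(more_x, more_y)

# Predict at an untested protein level with each posterior
mean_few, std_few = predict(6.0, m_few, v_few)
mean_more, std_more = predict(6.0, m_more, v_more)

print("2 data points:", round(mean_few, 2), "+/-", round(std_few, 2))
print("5 data points:", round(mean_more, 2), "+/-", round(std_more, 2))
```

Even with two data points the model returns a usable prediction, but with a wider error bar. Adding data narrows the uncertainty rather than just shifting the estimate.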
To train ART, the team fed the algorithm data from proteomics—that is, a census of all proteins in a cell—to build a probabilistic model that can then predict how changes to those proteins alter the production line. Going back to the car factory analogy, ART spits out educated guesses on how rejiggering inputs changes how many cars the factory produces. Here, the “cars” are the biological products that scientists want to engineer.
“With a predictive model at hand, ART can provide a set of recommendations expected to produce the desired outcomes,” said the team. For example, ART can predict how to increase the production of wanted biochemicals (drugs, biofuels) or squash unwanted chemicals such as biotoxins. What’s more, it can also predict the levels of a biochemical, say a hop-flavor compound, so that the output is an enjoyable beer.
The team put ART through the wringer with five tests, ranging from a simulated “toy” dataset to real-world inputs. For example, one trial used ART to optimize carbon-neutral biofuel production with living cells. Using previous data from 27 different biological pathways that produce a handful of different biofuel chemicals, the team trained ART to predict a synthetic pathway that’s not only efficient but also automatic, making it possible to potentially scale up production.
What’s especially cool is this: although ART’s predictions for any given fuel chemical weren’t very accurate, altogether they pointed in “the right direction” to improve production. Somehow, ART had learned “the gist” of biofuel manufacturing. In other words, we can still optimize synthetic biology even without knowing the exact underlying mechanisms, a sort of voodoo that was unimaginable before machine learning.
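One way to see how inaccurate predictions can still point in the right direction is to compare absolute error with rank agreement. The sketch below uses made-up numbers, not the team's data: the predicted values are well off the measured ones, yet they order the designs correctly, which is what matters when choosing what to build next.

```python
def ranks(values):
    """Rank values from 0 (smallest) upward; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order):
        result[idx] = rank
    return result

def spearman(a, b):
    """Spearman rank correlation for two tie-free lists of equal length."""
    n = len(a)
    ra, rb = ranks(a), ranks(b)
    d2 = sum((x - y) ** 2 for x, y in zip(ra, rb))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))

# Hypothetical measured vs. predicted production for five engineered designs
measured = [10.0, 25.0, 40.0, 55.0, 90.0]
predicted = [30.0, 35.0, 60.0, 62.0, 70.0]

mean_abs_error = sum(abs(m - p) for m, p in zip(measured, predicted)) / len(measured)
rho = spearman(measured, predicted)

print("mean absolute error:", mean_abs_error)
print("rank correlation:", rho)
print("same best design:", predicted.index(max(predicted)) == measured.index(max(measured)))
```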
The team also tapped ART to improve happy hour: brewing hoppy beer without hops. Here, bioengineered yeast “programmed” with the ability to brew ethanol (alcohol) are then modified to also synthesize chemicals that produce a “hoppy” flavor. Although many people like hops in beer, the team said, growing actual hops requires enormous amounts of water and energy, and the taste varies widely between crops.
Here, ART also worked its magic. The algorithm learned which synthetic biological pathways produce hop-like chemicals in yeast, which involves tinkering with the levels of four proteins. Similar to a seasoned chef, ART was able to predict the level of hop flavor (dictated by the expression levels of those four proteins) needed to brew a perfect pale ale. Another set of training data, from programming cells to make tryptophan, an amino acid used to build proteins, allowed ART to tease out the interaction between five different genes. ART scanned through nearly 8,000 combinations of biochemical pathways to produce tryptophan and, based on probabilities, recommended a way to double the chemical’s production.
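The scan-and-recommend step can be sketched as an exhaustive search over designs, each scored by its probability of beating a target. This is an illustration under stated assumptions, not ART's implementation. The article does not say how the "nearly 8,000" combinations arise; the sketch assumes six expression levels for each of the five genes (6^5 = 7,776). `toy_model` is a made-up stand-in for a trained predictive model.

```python
import itertools
import math

N_GENES = 5    # from the article
N_LEVELS = 6   # assumption: chosen so that 6**5 = 7,776 designs

def toy_model(design):
    """Hypothetical surrogate: returns (predicted mean, predictive std) for one design."""
    # Made-up response that peaks when every gene sits at level 3
    mean = 100.0 - sum((level - 3) ** 2 for level in design)
    # Made-up uncertainty that grows as levels move away from level 2
    std = 5.0 + 2.0 * sum(abs(level - 2) for level in design)
    return mean, std

def prob_exceeds(mean, std, target):
    """P(outcome > target) under a normal predictive distribution."""
    z = (target - mean) / (std * math.sqrt(2.0))
    return 0.5 * (1.0 - math.erf(z))

# Enumerate every combination of expression levels across the five genes
designs = list(itertools.product(range(N_LEVELS), repeat=N_GENES))

target = 95.0  # hypothetical production goal
scored = [(prob_exceeds(*toy_model(d), target), d) for d in designs]
scored.sort(reverse=True)
top_five = scored[:5]

print("designs scanned:", len(designs))
for prob, design in top_five:
    print(design, round(prob, 3))
```

Ranking by probability of success, rather than by predicted mean alone, lets uncertainty enter the choice. A design with a slightly lower predicted output can outrank one with a higher prediction that the model is less sure about.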
Adding machine learning may be the fuel synthetic biology needs to further dominate life itself.
“This is a clear demonstration that bioengineering led by machine learning is feasible, and disruptive if scalable. We did it for five genes, but we believe it could be done for the full genome,” said Garcia Martin. “This is just the beginning… If we could automate metabolic engineering, we could strive for more audacious goals.”