Computing Better Enzymes: Optimizing Directed Evolution
Using computation, researchers narrow the search space for directed evolution, guide mutagenesis, and create de novo enzymes
Enzymes are among nature’s crowning achievements: they accelerate chemical reactions, making life possible. People have co-opted natural enzymes for industrial use for thousands of years (think cheese-making). But it’s only recently that scientists have been able to create made-to-order enzymes for applications ranging from detoxifying deadly nerve gas to converting waste into fuel.
In a process called “directed evolution,” scientists re-enact natural evolution in the laboratory: they iteratively mutate an enzyme and select for mutants with the desired feature. Within months, directed evolution can increase an enzyme’s ability to catalyze a particular reaction by as much as 1000-fold, and sometimes more. “In the past five years alone, there have been over 60 publications containing examples of directed evolution of enzymes for industrial processes,” says Gert Kiss, PhD, a postdoc at Stanford University, who collaborates closely with David Baker, PhD, professor of biochemistry at the University of Washington and a pioneer in directed enzyme evolution.
Several advances are fueling recent progress in the field. Chief among these is the coupling of directed evolution with computation. Though simulating directed evolution exhaustively in silico remains beyond reach, computation can help narrow the search space for directed evolution, guide mutagenesis, and create de novo enzymes for catalytic activities that don’t exist in nature.
Narrowing the Search Space
Directed evolution follows two steps: researchers mutate the starting enzyme using a replication process that randomly introduces errors (a procedure called error-prone PCR); they then select or screen the resulting variants for increased catalytic activity. Enzymes with higher catalytic activity undergo a second round of mutagenesis and selection/screening, and the process is repeated (typically around 10 to 20 times). The number of possible variants is astronomical (20^250, or roughly 10^325, for an average-sized protein of 250 amino acids), so even with large libraries of mutants, one can cover only a tiny fraction of the search space. “You’re shooting with a shotgun in a dark room and you’re just hoping to hit something,” Kiss says. When structural information is available, computational approaches can reduce the search space and improve the odds of a hit. Rather than randomly mutating the whole protein, scientists focus only on those amino acids that are likely to yield dividends, such as those in the active site. “You won’t get around the actual experiments with these approaches,” Kiss says, “but by providing more and more rational input, the process becomes less random and thus more effective.”
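To put the size of that search space in numbers, here is a minimal Python sketch of the arithmetic. The library size of one billion variants and the choice of eight focused positions are illustrative assumptions, not figures from any particular experiment, and the function names are hypothetical.

```python
import math

AMINO_ACIDS = 20  # the 20 standard amino acids


def sequence_space(positions: int, alphabet: int = AMINO_ACIDS) -> int:
    """Number of distinct sequences when `positions` sites can each vary freely."""
    return alphabet ** positions


def log10_coverage(library_size: int, positions: int) -> float:
    """log10 of the best-case fraction of sequence space a library can sample.

    Working in logarithms avoids overflow: 20**250 is far larger than
    the largest floating-point number.
    """
    return math.log10(library_size) - math.log10(sequence_space(positions))


if __name__ == "__main__":
    library = 10**9  # assumed library size, for illustration only
    for n in (250, 8):  # whole protein vs. a handful of chosen positions
        digits = len(str(sequence_space(n)))
        print(f"{n} positions: a {digits}-digit number of variants, "
              f"log10(coverage) = {log10_coverage(library, n):.1f}")
```

Under these assumptions, a billion-member library samples a vanishing fraction of the whole-protein space but a few percent of the space spanned by eight positions, which is the rationale for focusing mutagenesis on a small set of residues.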
For example, a 2012 paper in Chemistry and Biology described the directed evolution of human PON1, a protein that in its native form can weakly detoxify the nerve gas sarin. Researchers used the computational docking program AutoDock to explore the effects of different mutations on substrate binding—and identified eight key amino acids in the active site likely to affect sarin binding. These were subjected to repeated rounds of site-specific mutagenesis. The resulting laboratory-evolved variants had up to 3400-fold increased activity (relative to wild type), enough to block the action of sarin on human target proteins for 24 hours.
To make the problem computationally tractable, modeling programs must rely on certain approximations. For example, many approaches assume a rigid protein backbone when in fact, proteins are “like spaghetti in a bowl—continuously vibrating and breathing—capable of adapting to their environment,” Kiss says. Though they can’t fully model backbone flexibility (as this requires massive computer resources), Kiss and colleagues have applied a program called RosettaBackrub that can incorporate backbone flexibility on a smaller scale.
Using Backrub, as well as several other computational strategies for guiding directed evolution, Kiss and colleagues increased the catalytic activity of the enzyme KE70 (a computer-designed enzyme; see below) by 400-fold. The work is described in a 2011 paper in the Journal of Molecular Biology. “In many cases, the mutations were suggested computationally,” says first author Olga Khersonsky, PhD, a postdoc in David Baker’s lab at the University of Washington. “Many other labs are also now using computation to guide their directed evolution processes.”
Directed evolution can enhance protein function, but often at the cost of protein stability. “It really comes down to what you select for,” Kiss says. “If you care about improving catalytic activity, you might end up losing thermal stability.” Sequence analysis can help here: scientists may focus their mutagenesis on “hotspots” that vary naturally and avoid mutations in more conserved regions. The fact that nature has disfavored changes in conserved regions suggests that such changes are destabilizing. In a 2010 paper in ChemBioChem, researchers mutated four residues in the active site of an esterase enzyme, but allowed only substitutions with amino acids that commonly appear at these sites in other enzymes from the same family (as determined by the alignment program 3DM). Control experiments confirmed that the strategy significantly increased their hit rate.
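The idea of allowing only substitutions that already occur in related enzymes can be sketched in a few lines of Python. This is not the 3DM program or its interface; the toy alignment, the 20 percent frequency cutoff, and the function names are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical alignment of five family members (one letter per residue,
# "-" marks a gap). Real alignments contain hundreds or thousands of sequences.
TOY_ALIGNMENT = [
    "GASHL",
    "GSSHL",
    "GASHI",
    "GTSHV",
    "GA-HL",
]


def allowed_substitutions(alignment, position, min_fraction=0.2):
    """Amino acids seen at `position` in at least `min_fraction` of the sequences."""
    column = [seq[position] for seq in alignment if seq[position] != "-"]
    if not column:
        return []
    counts = Counter(column)
    return sorted(aa for aa, n in counts.items() if n / len(column) >= min_fraction)


def library_size(alignment, positions, min_fraction=0.2):
    """Number of variants when each position is limited to its allowed amino acids."""
    size = 1
    for pos in positions:
        size *= len(allowed_substitutions(alignment, pos, min_fraction))
    return size


if __name__ == "__main__":
    for pos in (1, 4):
        print(pos, allowed_substitutions(TOY_ALIGNMENT, pos))
    print("restricted library:", library_size(TOY_ALIGNMENT, [1, 4]),
          "vs. unrestricted:", 20 ** 2)
```

In this toy case, restricting two positions to the residues seen in the family shrinks the library from 400 variants to 9, so a screen of fixed size samples a much larger share of the candidates.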
Making Enzymes from Scratch
Directed evolution can only work if there is a starting enzyme that has at least a weak ability to catalyze the reaction of interest. When no natural enzymes exist, scientists can now create them from scratch on a computer. “We are generating enzymes for which there was no actual evolutionary pressure in nature,” Kiss says. Though these designed enzymes display only weak catalytic activity (“we’re very good at making bad enzymes!” Kiss says), they provide a starting point for directed evolution.
Baker’s lab has provided some of the first examples of de novo enzyme creation. They first build an idealized active site: using quantum mechanics calculations, they determine which amino acid groups are needed—and in what orientation—to stabilize the transition state of the chemical reaction. Then, using the RosettaMatch program, they search through a database of over 86,000 crystal structures to geometrically fit this theoretical active site (also called a theozyme) into a groove or cavity on an existing protein. “The challenge is that it’s a huge search space. Finding a way to efficiently search through this library of scaffolds to find a good geometry is very challenging,” says Daniela Grabs-Röthlisberger, PhD, a cofounder of Arzeda Corporation, which uses Baker’s technology to make designer enzymes. Finally, they graft the theozyme onto the protein scaffold in silico.
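The geometric search can be illustrated with a deliberately simplified Python sketch. It is not RosettaMatch and does not reflect how that program works internally: it reduces each catalytic group and each scaffold position to a single point and looks for scaffold positions whose pairwise distances match those of the idealized site. The coordinates, the 0.5 angstrom tolerance, and the function name are hypothetical.

```python
import math
from itertools import permutations

# Hypothetical idealized active site: one 3D point per catalytic group.
THEOZYME = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]

# Hypothetical scaffold: one 3D point per candidate residue position.
SCAFFOLD = [
    (10.0, 10.0, 10.0),
    (13.0, 10.0, 10.0),
    (10.0, 14.0, 10.0),
    (20.0, 20.0, 20.0),
    (25.0, 20.0, 20.0),
]


def match_theozyme(theozyme, scaffold, tol=0.5):
    """Find ordered sets of scaffold positions that reproduce the theozyme geometry.

    A set matches when every pairwise distance between the chosen scaffold
    points is within `tol` of the corresponding theozyme distance.
    """
    k = len(theozyme)
    target = {
        (i, j): math.dist(theozyme[i], theozyme[j])
        for i in range(k)
        for j in range(i + 1, k)
    }
    hits = []
    # Brute force over ordered k-tuples; fine for a toy, hopeless at real scale.
    for combo in permutations(range(len(scaffold)), k):
        if all(
            abs(math.dist(scaffold[combo[i]], scaffold[combo[j]]) - d) <= tol
            for (i, j), d in target.items()
        ):
            hits.append(combo)
    return hits


if __name__ == "__main__":
    print(match_theozyme(THEOZYME, SCAFFOLD))
```

Even this toy version shows why the search is expensive: the number of candidate placements grows combinatorially with the number of catalytic groups and scaffold positions, and that is before side-chain orientations are considered. Distance matching alone also cannot tell a geometry from its mirror image, which a real method would need to handle.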
Most of the proteins turn out to be duds: they fail to express, fold, or show the desired activity, or they aggregate in solution. But a few percent work, Kiss says. For example, Baker’s lab created enzymes to catalyze the Kemp elimination reaction (for which no natural enzymes exist). They came up with 57 designs in 17 different scaffolds, 8 of which showed Kemp eliminase activity. Three, including KE70, were further optimized by directed evolution to increase their activities up to 2000-fold. “This is really an uphill battle. But it’s so cool to make progress and to eventually find a way through,” Kiss says. The Kemp elimination reaction is a model reaction that has no practical applications for industry or medicine. But, Kiss says, “we’re now starting to go from proof of principle to a place where we’re starting to apply these methods to real problems.”