Keio University

Learning from the Past: Creating New Proteins by Studying 4 Billion Years of Evolution

Participant Profile

  • Nobuhide Doi


The Japanese phrase *onko-chishin* comes from a passage in the Analects of Confucius: "The Master said, 'He who by reviewing the old can gain knowledge of the new is fit to be a teacher.'" It means "to gain new knowledge and insights by studying and examining things of the past" (from the *Kojien* dictionary). In our laboratory, by studying how proteins (genes) with their diverse and complex functions have evolved over 4 billion years, we aim to learn their construction principles and apply this knowledge to create new proteins useful in the fields of medicine, environment, and energy.

Proteins are incredibly diverse. Not only muscles, but also hair, the transparent eyeball, and most parts of the body are composed of different proteins. Furthermore, the enzymes that digest food, hormones like insulin, and the receptors that sense smells, tastes, and light are all proteins. Thus, proteins perform a wide variety of functions and can be called the most fundamental functional molecules in life. Physically, a protein is a chain-like molecule in which tens to thousands of amino acids, drawn from 20 different types, are linked together. It folds into a unique shape according to its string of amino acids (its sequence) and exhibits a specific function. The great diversity of proteins arises from differences in their amino acid sequences, and this sequence information is stored in the DNA of the genome. Since DNA is a string of only four bases (A, C, G, T), all life encodes proteins by mapping each run of three consecutive bases to one amino acid, translating the four-letter DNA text into a 20-letter amino acid sequence (for example, GGG corresponds to glycine, and GAG to glutamic acid). Using standard techniques in modern genetic engineering, it is easy to synthesize an artificial gene with a desired sequence and create an artificial protein based on that information. However, it is not easy to predict what kind of sequence should be designed to create a protein with a desired function. Therefore, we thought that if we could explore how existing high-performance proteins have evolved and reproduce that process in the laboratory, we might be able to create new proteins.
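The three-bases-to-one-amino-acid mapping described above can be sketched in a few lines of Python. This is a minimal illustration, not a tool used in our laboratory: it builds the standard genetic code from the conventional T, C, A, G ordering, and the names `CODON_TABLE` and `translate` are chosen here for the example. Amino acids are written as one-letter codes (G for glycine, E for glutamic acid), and `*` marks a stop signal.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G order
# (first base varies slowest, third base varies fastest).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"   # codons starting with T
    "LLLLPPPPHHQQRRRR"   # codons starting with C
    "IIIMTTTTNNKKSSRR"   # codons starting with A
    "VVVVAAAADDEEGGGG"   # codons starting with G
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA string into one-letter amino acid codes.

    Reads three bases at a time from the start of the string, ignores any
    trailing bases that do not fill a codon, and stops at a stop codon.
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("GGG"))  # G (glycine)
print(translate("GAG"))  # E (glutamic acid)
```

Because 4 × 4 × 4 = 64 codons cover only 20 amino acids plus the stop signal, several codons map to the same amino acid, which is why some base changes leave the protein unchanged.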

So, what is the mechanism by which protein evolution occurs? As mentioned above, the amino acid sequence of a protein is encoded in the base sequence of DNA. This DNA is copied and passed on to offspring as genes. If this copying were 100% perfect, all life would be clones with the exact same genes, and evolution would not occur. In reality, however, random errors (mutations) occur at a low frequency when the DNA string is copied, and these rewrite the gene's string (for example, if one base in a gene's sequence changes from G to A so that GGG becomes GAG, the corresponding amino acid in the protein's sequence changes from glycine to glutamic acid). If this amino acid change causes the protein to lose its function, and if that function is essential for survival, the individual with that mutation cannot survive and leave offspring, so such "disadvantageous mutations" are eliminated. On the other hand, if this amino acid change improves the protein's function and favors the survival of the individual with the mutation, such "advantageous mutations" are passed down through generations and spread throughout the population. In this way, by repeating random "mutation" and functional "selection," the phenomenon of "evolution" occurs, where a string with low function gradually transforms into a string with high function.
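How quickly an advantageous mutation spreads, or a disadvantageous one disappears, can be illustrated with a textbook selection model. The sketch below is a deliberate simplification and not a description of our experiments: it assumes a very large population of asexually reproducing individuals, ignores chance effects, and gives carriers of the mutation a relative fitness of 1 + s, where the selection coefficient `s` is a parameter introduced here for illustration.

```python
def next_frequency(p, s):
    """Frequency of a mutation after one generation of selection.

    p: current fraction of the population carrying the mutation (0 to 1).
    s: selection coefficient; carriers leave (1 + s) offspring for every
       one offspring left by non-carriers. s > 0 is advantageous,
       s < 0 is disadvantageous, s = 0 is neutral.
    """
    return p * (1 + s) / (1 + s * p)


def trajectory(p0, s, generations):
    """Mutation frequency over time, starting from frequency p0."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(next_frequency(freqs[-1], s))
    return freqs


# A mutation carried by 1% of the population, followed for 500 generations.
advantageous = trajectory(0.01, 0.05, 500)      # 5% fitness advantage
disadvantageous = trajectory(0.01, -0.05, 500)  # 5% fitness cost

print(f"advantageous:    {advantageous[-1]:.4f}")
print(f"disadvantageous: {disadvantageous[-1]:.2e}")
```

In this model even a small advantage is enough for the mutation to take over the population given enough generations, while a small cost drives it toward zero. Real populations are finite, so rare mutations can also be lost or fixed by chance.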

To actually evolve proteins in the laboratory using this principle of repeated "mutation" and "selection," various creative steps are necessary. As an example, let's consider the case of artificially evolving a protein that binds to a certain target molecule to improve its binding affinity (Figure 1). First, using a technique called PCR, we introduce mutations into the protein's gene while amplifying it, creating a diverse population (library) of mutants (Figure 1 ①). Next, this population of genes is biologically converted into a population of proteins (Figure 1 ②). The key point here is to use our original technology to physically link the protein to its gene. Finally, using a column to which the target molecule has been pre-immobilized, we select only the proteins that bind strongly to the target molecule (Figure 1 ③), and then recover the linked genes (Figure 1 ④). The base sequence of the recovered DNA can be easily read by an automated sequencer. We introduce further mutations based on this gene and repeat this evolutionary cycle. By making the "selection" conditions more stringent as the cycles progress, we can gradually evolve proteins with higher binding affinity.

Figure 1: In-vitro evolution of proteins
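The cycle in Figure 1 can be mimicked on a computer as a toy model. The sketch below is only an analogy under stated assumptions: `toy_affinity` is a hypothetical stand-in for real binding measurements (it simply counts matches to an arbitrary "ideal" sequence that an experimenter would never know in advance), the mutation rate and library size are arbitrary, and the physical gene-protein link is represented by scoring and keeping the same string.

```python
import random

BASES = "ACGT"


def error_prone_pcr(gene, copies, rate, rng):
    """Step 1: amplify one gene into a library of randomly mutated copies.

    Each base is replaced by a different base with probability `rate`.
    """
    library = []
    for _ in range(copies):
        variant = "".join(
            rng.choice(BASES.replace(b, "")) if rng.random() < rate else b
            for b in gene
        )
        library.append(variant)
    return library


def toy_affinity(gene, ideal):
    """Hypothetical binding score: fraction of positions matching `ideal`."""
    return sum(a == b for a, b in zip(gene, ideal)) / len(ideal)


def evolve(start, ideal, rounds=10, copies=200, rate=0.03, seed=1):
    """Repeat mutation and selection, raising the stringency each round."""
    rng = random.Random(seed)
    best = start
    threshold = toy_affinity(start, ideal)
    for _ in range(rounds):
        library = error_prone_pcr(best, copies, rate, rng)       # step 1
        # Steps 2-3: because each protein stays linked to its gene,
        # selecting on the score also selects the gene that encodes it.
        binders = [g for g in library if toy_affinity(g, ideal) >= threshold]
        if not binders:
            continue  # nothing passed; repeat at the same stringency
        best = max(binders, key=lambda g: toy_affinity(g, ideal))  # step 4
        threshold = toy_affinity(best, ideal)  # stricter next round
    return best


start = "ACGT" * 5
ideal = "GGCATTACGGATCCATGCAA"  # arbitrary target for the toy score
evolved = evolve(start, ideal)
print(toy_affinity(start, ideal), "->", toy_affinity(evolved, ideal))
```

Raising the threshold to the best score found so far plays the role of making the "selection" conditions more stringent as the cycles progress, so the score of the recovered gene can only stay the same or improve from round to round.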

What we learn from the evolution of life is mainly how to create an initial mutant library that increases the probability of obtaining a protein with high function. For example, it has become clear that organisms speed up evolution by efficiently combining "advantageous mutations" through gene shuffling, and that they generate new functions when the environment changes by accumulating "neutral mutations" that are neither advantageous nor disadvantageous. Indeed, experiments are increasingly showing that incorporating such mechanisms into experimental systems makes it easier for proteins to evolve even in the laboratory.
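The benefit of gene shuffling can be seen in a small sketch. It assumes the simplest possible case, a single crossover between two equal-length genes, and the sequences, positions, and function names are invented for illustration. Two mutants that each carry one advantageous mutation are recombined so that the child carries both, something that would otherwise require the second mutation to arise again by chance in the same lineage.

```python
def recombine(parent_a, parent_b, crossover):
    """Single-crossover shuffle: the start of parent_a joined to the end of parent_b."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same length")
    return parent_a[:crossover] + parent_b[crossover:]


def mutated_positions(seq, wild_type):
    """Positions (0-based) at which seq differs from the unmutated wild type."""
    return {i for i, (a, b) in enumerate(zip(seq, wild_type)) if a != b}


wild_type = "AAAAAAAAAA"
mutant_a = "AATAAAAAAA"   # one mutation, at position 2
mutant_b = "AAAAAAACAA"   # a different mutation, at position 7

child = recombine(mutant_a, mutant_b, crossover=5)
print(child)                                        # AATAAAACAA
print(sorted(mutated_positions(child, wild_type)))  # [2, 7]
```

Note that the outcome depends on where the crossover falls: swapping the parents at the same crossover point would instead give back the wild-type sequence, which is why shuffling is combined with selection to keep only the useful combinations.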

To date, many proteins around the world have already been improved through in-vitro evolution. For example, the green fluorescent protein from the jellyfish *Aequorea victoria*, which was a topic of the 2008 Nobel Prize in Chemistry, has had its application value increased by evolving it through artificial mutations into fluorescent proteins of various colors, such as blue and yellow. It is now widely used in technologies for "visualizing" cells and other applications.

Finally, let me introduce some of the applied research projects currently underway in our laboratory.

In contrast to small-molecule drugs that use conventional chemical synthesis techniques, pharmaceuticals that use proteins, genes, and the like are called biopharmaceuticals. A prime example is antibody drugs. Antibodies are proteins that recognize and bind to foreign substances such as bacteria and viruses that have entered our bodies, attacking and eliminating them. Antibody drugs utilize this property of antibodies to identify and attack cancer cells or the molecules that cause rheumatoid arthritis (Figure 2). Since biopharmaceuticals are molecules that originally exist in the body, they have advantages such as a lower risk of side effects, and many have been developed in recent years. However, they have the disadvantage of being more expensive than chemically synthesized drugs because they require culturing cells to produce the proteins. In our laboratory, we have established an original technology for rapidly evolving proteins in a test tube. We are conducting research to enhance the therapeutic effect of antibody drugs against cancer and viral infections by strengthening their binding affinity, and to make them easier to mass-produce by fragmenting the antibodies to make them smaller, as shown on the right side of Figure 2. By establishing such methods, we expect to be able to improve various antibody drugs in the future, reducing their dosage and production costs.

Figure 2: Structures of major antibody drugs

Biomass was originally an ecological term meaning "biological resource mass," but in recent years it has gained attention as a renewable biological resource for energy. In particular, there are high hopes for the effective use of cellulosic raw materials that do not compete with food sources, such as wood waste and rice straw. Cellulose is a chain of glucose molecules. If it can be broken down into glucose by a cellulose-degrading enzyme (a protein), it can then be used to produce bioethanol through fermentation or be directly converted into electricity, as described later. In our laboratory, we are conducting research to create artificial enzymes with higher cellulose-degrading activity than conventional ones by evolving cellulose-degrading enzymes in the lab.

Meanwhile, biofuel cells are anticipated as a future energy device that can directly convert organic matter (substrates like glucose) into electricity using oxidoreductases (which are also proteins) (Figure 3). The reason Doraemon can move after eating dorayaki (or, in a more modern example, the reason Franky can run on cola) is probably that he is equipped with a biofuel cell. At the current stage, however, biofuel cells are limited in power output and lifespan, and have not yet been put into practical use. While we cannot build robots in our lab, we believe that we can improve the power output and lifespan of biofuel cells by enhancing the activity and stability of oxidoreductases. We are collaborating with other laboratories both inside and outside the university to build an experimental system for evolving these enzymes.

Figure 3: Principle of a biofuel cell

Studying the history of life's evolution is, ultimately, a way of asking where we came from, and is an extremely "interesting" theme in itself. But it is also an empirical rule shown by the history of science that "what is interesting is useful." There is a famous story from over 150 years ago when a lady, after seeing a simple electrical experiment by Michael Faraday, the father of electromagnetism, asked, "It is very interesting, but what is the use of it?" Faraday is said to have replied, "Madam, what is the use of a newborn baby?" We were reminded once again after last year's earthquake of just how indispensable electricity is to our modern lives. By studying the "interesting" mechanisms of past protein (gene) evolution, we hope to create new proteins that are "useful" for biopharmaceuticals and biofuel cells, and thereby contribute, even in a small way, to solving the problems facing humanity.

Gakumon no susume (An Encouragement of Learning) (Research Introduction)
