The genetic code is the basis for all life, allowing the information contained in DNA to be translated into the proteins that perform most of a cell's functions. And yet it is … kind of a mess. Life typically uses a set of about 20 amino acids, while the genetic code has 64 possible combinations. This mismatch means that redundancy is widespread, and a variety of species have evolved variations on what is otherwise a universal genetic code.
Is the code itself important, or is it a historical accident, locked in place by events in the distant evolutionary past? Answering this question was not an option until recently, as individual codes occur hundreds of thousands of times in the genome of even the simplest organisms. However, as our ability to produce DNA has increased, it has become possible to synthesize whole genomes from scratch, allowing the genetic code to be rewritten wholesale.
Now researchers have announced that they have redone the genome of the bacterium E. coli to get rid of some of the genetic code's redundancy. The resulting bacteria grow somewhat more slowly than a normal strain but are otherwise difficult to distinguish from their non-synthetic counterparts.
Codes and Redundancy
The genetic code is written in sets of three DNA bases. Each of the three positions can hold any one of the four bases, meaning that there are 4 × 4 × 4 possible combinations, or 64. In contrast, there are only 20 amino acids, plus at least one of the remaining codes must be used to tell the cell that it should stop translating the code. This results in a mismatch of 43 codes that are not strictly needed. Cells use these additional codes as redundancy. Instead of one stop code, most genomes use three. Eighteen of the 20 amino acids are encoded by more than one set of three bases; three (leucine, serine, and arginine) have as many as six possible codes.
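The arithmetic above can be checked with a short sketch. It assumes the standard genetic code, written as the usual compact 64-letter string with codons enumerated in T, C, A, G order; the names `CODON_TABLE` and `codon_summary` are illustrative and do not come from the study.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code (an assumption: the article does not print a table).
# One letter per codon, codons enumerated in TCAG order; '*' marks a stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each three-base codon to its one-letter amino acid (or '*').
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codon_summary(table=CODON_TABLE):
    """Count codons, stops, and how many codons each amino acid gets."""
    counts = Counter(table.values())
    stops = counts.pop("*")  # remove stop codons from the amino acid tally
    return {
        "total_codons": len(table),
        "stop_codons": stops,
        "sense_codons": len(table) - stops,
        "amino_acids": len(counts),
        # Codes beyond the minimum of 20 amino acids plus one stop signal.
        "spare_codons": len(table) - (len(counts) + 1),
        "single_codon_amino_acids": sorted(a for a, n in counts.items() if n == 1),
        "six_codon_amino_acids": sorted(a for a, n in counts.items() if n == 6),
    }


if __name__ == "__main__":
    for key, value in codon_summary().items():
        print(f"{key}: {value}")
```

Only methionine (M) and tryptophan (W) have a single code each, which is why 18 of the 20 amino acids count as redundantly encoded.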
Is this redundancy useful? The answer is "sometimes." For example, many DNA sequences perform a dual function, encoding both a protein and regulatory information that controls gene activity or allows specific RNA structures to form. The flexibility provided by redundancy makes it easier for one sequence to serve both purposes. The redundancy may also allow for fine-tuning of gene activity, as some codes are translated into proteins more efficiently than others. These factors suggest that the redundancy of the genetic code may have become essential to an organism.
Testing whether this is the case, however, is a nightmare. Even the most compact genomes have hundreds of genes (E. coli strains have between 4,000 and 5,500), and each individual code can occur multiple times in every one of them. Editing them all is possible but very time consuming.
So the researchers simply redid the coding on a computer. They focused on one of the amino acids with multiple redundant codes and edited the sequences, so that more than 1
According to one of the participating researchers (and regular Ars reader), Wolfgang Smith, this is easier said than done. In a project like this, where you are asking questions about the rules of the genetic code, "at some point you have to commit to ordering a genome with synthetic DNA," he told Ars. "That's a pretty big financial investment and not one simple button to press." But press it they did.
Some assembly required
Unfortunately, there is a big gap between the output of a DNA synthesizer and a genome, which is several million bases long. The group had to carry out an entire assembly process, stitching small pieces together into one large segment inside one cell and then transferring it to another cell that carried an overlapping large segment. "Personally, my biggest surprise was how well the assembly process worked," said Smith. "The success rate in each phase was very high, which meant we could do most of the work with standard bench techniques."
During the process, there were some places where the synthetic genome had problems; in at least one case, two major genes overlapped. But the researchers were able to tweak their version to get around the problems they identified. The final genome also had a handful of errors that crept in during assembly, but none of them altered the three codes being targeted.
In the end, it worked. Instead of using 61 of the 64 possible codes for amino acids, the new organism, called Syn61, used only 59. The researchers were then able to delete the genes that normally allow E. coli to use the codes that had been edited out. Usually, these genes are essential; in Syn61, they could be deleted easily. That does not mean the Syn61 strain is entirely fine: it grew more slowly than its normal counterparts. This is probably a result of the cases described earlier, in which DNA sequences perform more than one function. It is possible that the strain could be evolved back to normal growth over time.
Aside from answering basic biology questions, the Syn61 strain may ultimately be useful. There are far more amino acids than the 20 that life uses, and many of them have interesting chemical properties. But to use them, we need spare genetic codes that can be reassigned to the artificial amino acids, which is exactly what this new work has provided.
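The kind of in-frame recoding described above can be sketched in a few lines. This is a minimal illustration, not the researchers' pipeline: the article does not name the codes that were replaced, so the `RECODING` map below (two serine codons and one stop codon swapped for synonyms) is an assumption, and the toy gene and function names are hypothetical. The sketch also assumes uppercase DNA that starts in frame, and it ignores overlapping genes and regulatory sequences.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# ASSUMPTION: illustrative synonymous swaps; the article does not list them.
# Each replacement encodes the same amino acid (or stop) as the original.
RECODING = {"TCG": "AGC", "TCA": "AGT", "TAG": "TAA"}


def codons(seq):
    """Split an in-frame coding sequence into three-base codons."""
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def translate(seq):
    """Translate a coding sequence into one-letter amino acids."""
    return "".join(CODON_TABLE[c] for c in codons(seq))


def recode(seq, mapping=RECODING):
    """Replace every targeted codon with its synonym, codon by codon."""
    return "".join(mapping.get(c, c) for c in codons(seq))


def sense_codons_in_use(table, removed):
    """Count amino acid codons that remain after removing the targets."""
    return sum(1 for c, aa in table.items() if aa != "*" and c not in removed)


if __name__ == "__main__":
    gene = "ATGTCGTCAAGCTAG"  # hypothetical toy gene
    new_gene = recode(gene)
    print(gene, "->", new_gene)
    print(translate(gene), "==", translate(new_gene))
    print("sense codons still in use:", sense_codons_in_use(CODON_TABLE, RECODING))
```

Because every swap is synonymous, the protein sequence is unchanged while two amino acid codes drop out of use, taking the count from 61 to 59. A real genome complicates this picture wherever genes overlap or a coding sequence doubles as a regulatory signal.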
Nature, 2019. DOI: 10.1038/s41586-019-1192-5