How large is the alphabet of DNA?


New sequencing technology is transforming epigenetics research, and could greatly improve understanding of cancer, embryo formation, stem cells and brain function.

“It’s remarkable that for so long, we weren’t aware of these other modifications in human DNA. If we’ve found four more bases since 2009, then who are we to argue that nothing else is there?” - Shankar Balasubramanian

The mechanisms that switch certain genes on or off, and which are thought to play a role in cancer development and stem cell differentiation, can now be accurately detected and studied thanks to a new DNA sequencing method.

The technology, developed by Cambridge Epigenetix, is helping researchers understand modifications to DNA by detecting ‘extra’ DNA bases that until now could not be definitively identified.

There are four standard DNA bases (Guanine, Cytosine, Adenine and Thymine), and the way they are ordered determines the makeup of the genome. In addition to G, C, A and T, there are also small chemical modifications, or epigenetic marks, which affect how the DNA sequence is interpreted and control how certain genes are switched on or off. The study of these marks and how they affect gene activity is known as epigenetics.

The most-studied mark is 5-methylcytosine (5mC), which is formed when a methyl group attaches to the cytosine base of DNA, a process known as methylation. In 2009, a ‘sixth’ base, 5-hydroxymethylcytosine (5hmC), was discovered in human DNA, and subsequently two further modified DNA bases, 5-formylcytosine (5fC) and 5-carboxycytosine (5caC), were also identified.

Professor Shankar Balasubramanian of the Department of Chemistry founded Cambridge Epigenetix in 2012 to develop innovative epigenetic research tools that can identify, decode and help elucidate the function of the ‘extra’ DNA bases.

Standard DNA sequencing methods work by reading the features of the four standard bases, but cannot detect whether a cytosine base has been methylated. To address this shortcoming, a method called bisulfite sequencing was developed. It detects methylation by adding a bisulfite reagent that converts the non-methylated cytosine bases to uracil, a base normally found in RNA. By sequencing bisulfite-treated DNA, researchers can identify which cytosine bases were originally methylated and which were not.

However, because 5hmC and 5mC are both resistant to bisulfite treatment, it is impossible to distinguish between these two epigenetic marks using traditional bisulfite sequencing.
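
This limitation can be illustrated with a toy model. The sketch below is not taken from any real analysis pipeline: the single-letter codes for the modified bases ('m' for 5mC, 'h' for 5hmC) and the function name bisulfite_read are assumptions made purely for illustration. It relies on the fact that uracil is reported as thymine once the treated DNA is sequenced.

```python
# Toy model of a bisulfite sequencing readout (illustrative only).
# Hypothetical encoding: 'C' = unmodified cytosine, 'm' = 5mC, 'h' = 5hmC.

BISULFITE_READOUT = {
    "C": "T",  # unmodified cytosine -> uracil, which is read as thymine
    "m": "C",  # 5mC resists bisulfite and is still read as cytosine
    "h": "C",  # 5hmC also resists bisulfite and is still read as cytosine
}


def bisulfite_read(strand: str) -> str:
    """Return the sequence reported after bisulfite treatment."""
    return "".join(BISULFITE_READOUT.get(base, base) for base in strand)


methylated = "ACmGTCA"         # one 5mC at position 3
hydroxymethylated = "AChGTCA"  # one 5hmC at the same position

print(bisulfite_read(methylated))         # ATCGTTA
print(bisulfite_read(hydroxymethylated))  # ATCGTTA
```

Both strands give the same readout, which is why traditional bisulfite sequencing cannot tell the two marks apart.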

This distinction matters because 5mC and 5hmC are thought to have completely different physiological functions. Research on the link between gene expression and methylation indicates that at certain sites methylation switches the gene off, silencing it, whereas hydroxymethylation switches it on.

“Functionally, they have profoundly different meanings, yet we haven’t been able to tell the difference between them using typical sequencing methods,” said Professor Balasubramanian.

Following the discovery of the fifth and sixth bases, Professor Tony Green from the Department of Haematology encouraged Professor Balasubramanian to think about a new method of sequencing to detect these modifications. Balasubramanian and his PhD student Michael Booth co-invented such a method, known as oxidative bisulfite sequencing.

Oxidative bisulfite sequencing allows researchers to quantitatively measure 5mC and 5hmC at single-base resolution, enabling more accurate DNA sequencing.

The technique works by chemically oxidising 5hmC to 5fC, which, like unmodified cytosine, is susceptible to bisulfite treatment. Once the oxidative bisulfite reaction is complete, both 5hmC and unmodified cytosine have been converted to uracil and appear in the sequence as thymine, so the only bases still read as cytosine are truly 5mC.
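
As a rough sketch of how the two readouts can be combined: bisulfite sequencing reports 5mC and 5hmC together as cytosine, while oxidative bisulfite sequencing reports only 5mC, so the difference at a given position gives an estimate of 5hmC. The encoding ('m' for 5mC, 'h' for 5hmC), the function names and the example fractions below are all hypothetical, and the code does not represent TrueMethyl or any published pipeline.

```python
# Toy model (illustrative only) of combining bisulfite (BS) and oxidative
# bisulfite (oxBS) readouts. Hypothetical encoding:
# 'C' = unmodified cytosine, 'm' = 5mC, 'h' = 5hmC.


def read_after(strand: str, oxidised: bool) -> str:
    """Sequence reported after bisulfite treatment, with or without prior oxidation."""
    out = []
    for base in strand:
        if base == "m":
            out.append("C")  # 5mC is read as C in both protocols
        elif base == "h":
            # Oxidation turns 5hmC into 5fC, which bisulfite then converts.
            out.append("T" if oxidised else "C")
        elif base == "C":
            out.append("T")  # unmodified C is converted in both protocols
        else:
            out.append(base)
    return "".join(out)


def estimate_levels(bs_c_fraction: float, oxbs_c_fraction: float) -> tuple:
    """Estimate (5mC, 5hmC) at one site from the fraction of reads showing C."""
    five_mc = oxbs_c_fraction
    # Clamp at zero: sampling noise can make the oxBS fraction exceed the BS one.
    five_hmc = max(0.0, bs_c_fraction - oxbs_c_fraction)
    return five_mc, five_hmc


strand = "AmGhTC"
print(read_after(strand, oxidised=False))  # ACGCTT
print(read_after(strand, oxidised=True))   # ACGTTT
print(estimate_levels(0.75, 0.5))          # (0.5, 0.25)
```

Clamping the difference at zero is a simplification chosen for this sketch; a real analysis would need a proper statistical treatment of read counts.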

“In one reaction, you can get an accurate representation of methylation without having to factor in the ‘contamination’ from hydroxymethyl C,” said Professor Balasubramanian. “What our research group and Cambridge Epigenetix are doing is bringing this capability to go beyond the standard four letters of the genetic alphabet in a way that benefits from all the general innovation brought from ‘next generation’ sequencing technology, such as the Solexa/Illumina approach.”

Research studies indicate that dynamic regulation of DNA function by these epigenetic marks is essential for normal foetal development and plays an important role in cancer, neurological disorders and other diseases. In addition, it is thought that DNA modification plays a central role in stem cell reprogramming.

“Reprogramming the way DNA functions is fundamental to all living systems,” said Professor Balasubramanian. “It’s remarkable that for so long, we weren’t aware of these other modifications in human DNA. If we’ve found four more bases since 2009, then who are we to argue that nothing else is there?”

Cambridge Epigenetix conducted an alpha trial of its first product, TrueMethyl, late last year, and a beta trial in 13 laboratories around the world earlier in 2013. TrueMethyl is available worldwide, and the company has a range of other epigenetic research tools currently in development.

The company is funded by Cambridge Enterprise, the University’s commercialisation arm, and Syncona Partners.

Photo credit: Queen bee larvae in royal jelly. Worker bees and the queen have exactly the same DNA sequence, but queen larvae are fed royal jelly, which epigenetically modifies their DNA so they grow to be larger and fertile. By Waugsberg via Wikimedia Commons



Cambridge Enterprise exists to help University of Cambridge inventors, innovators and entrepreneurs make their ideas and concepts more commercially successful for the benefit of society, the UK economy, the inventors and the University.

Cambridge Enterprise, University of Cambridge