The power of methylation haplotypes

How NGS technology informed biology and is now transforming cancer diagnostics with an epigenomics Cambrian Explosion

Finding a cell-free cancer signal is like finding a needle in a haystack...

Of the roughly 28 million CpG sites in the human genome, 60% to 80% are methylated at the 5 position of the cytosine ring. The term ‘epigenetics’, naming a field studied for over half a century, is derived from the terms ‘epigenesis’ and ‘genetics’; the term ‘epigenomics’ is “generally used to describe the global, comprehensive view of sequence-dependent processes that modulate gene expression patterns in a cell and has been liberally applied in reference to the collection of DNA methylation state and covalent modification of histone proteins along the genome”.

Before the epigenomics Cambrian Explosion

In the book “A Short History of Nearly Everything”, science writer Bill Bryson describes the Cambrian Explosion as “the moment when complex life burst forth in dazzling profusion”. Most present-day animal lineages first appeared at the beginning of the Cambrian Period, some 542 million years ago. Over a relatively minuscule span of geological time, an astonishing number of species arose.

Currently the field of epigenomics is experiencing ‘unprecedented growth with no sign of deceleration’ (op. cit.). But first, let’s start with a brief history.

Prior to the advent of Next-Generation Sequencing (NGS), bisulfite-treated DNA was analyzed with methods such as MS-PCR (methylation-specific PCR), COBRA (Combined Bisulfite Restriction Analysis), real-time quantitative PCR, pyrosequencing, or even cloning followed by Sanger sequencing on capillary electrophoresis instruments. Later, microarray approaches from commercial vendors appeared for massively parallel interrogation of thousands to hundreds of thousands of CpG sites simultaneously. For example, in 2009 a landmark paper was published interrogating ~4.6 million CpGs in colorectal cancer samples using a custom HD2 microarray from NimbleGen (now a Roche subsidiary).

Decades of research with single-locus approaches (i.e., examining the methylation status of a single CpG site) have led to two FDA-approved methylation-based diagnostics for colorectal cancer screening, namely Exact Sciences’ Cologuard™ and Epigenomics’ Epi proColon®. The Cologuard test assays KRAS DNA mutations as well as methylation markers in the genes NDRG4 and BMP3, with ACTB serving as a reference gene, from stool, using their proprietary Quantitative Allele-Specific Real-time Target and Signal [QuARTS™] technology as a send-out test. Epi proColon interrogates the ‘v2 region’ of the SEPT9 gene, using a real-time PCR assay deployed on the Applied Biosystems® 7500 Fast Dx Real-Time PCR Instrument.

The weaknesses of the single-CpG-locus approach

Single-locus approaches have their advantages, especially PCR-based assays such as real-time PCR, which offer high sensitivity and specificity for trace amounts of DNA. Given that the tumor DNA fraction may be only one part in two hundred (0.5% is an average allele fraction reported across patient samples for circulating tumor DNA), a PCR-based approach meets the requirement for a highly sensitive and specific method to pick up the rare methylated CpG signal. However, PCR-based technology has difficulty looking at more than a few loci, as multiplexing the assay has severe practical limitations.

This inability to look at multiple loci simultaneously means splitting the sample across several smaller reaction vessels, either within the device (say, in microfluidic chambers) or manually by the operator. The time, labor and reagents needed to divide a single sample among multiple reaction vessels then cost a multiple of what a single multiplex assay would.
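
Splitting a sample has a statistical cost as well as a practical one. The sketch below is a simplified illustration, not a model of any particular assay: it assumes each cell-free DNA molecule at a locus is independently tumor-derived with probability 0.5% (the fraction quoted above), and that a sample is divided evenly among reaction vessels. The function names and the total of 1,000 molecules are hypothetical values chosen for illustration only.

```python
def prob_at_least_one_tumor_molecule(n_molecules, tumor_fraction=0.005):
    """Probability that at least one of n independently sampled
    molecules is tumor-derived: 1 - (1 - f)^n."""
    return 1.0 - (1.0 - tumor_fraction) ** n_molecules


def per_vessel_probability(total_molecules, n_vessels, tumor_fraction=0.005):
    """Same probability for a single vessel after the sample has been
    split evenly among n_vessels (integer division, remainder ignored)."""
    molecules_per_vessel = total_molecules // n_vessels
    return prob_at_least_one_tumor_molecule(molecules_per_vessel, tumor_fraction)


if __name__ == "__main__":
    total = 1000  # hypothetical number of molecules covering one locus
    for vessels in (1, 2, 5, 10):
        p = per_vessel_probability(total, vessels)
        print(f"{vessels:>2} vessel(s): P(tumor molecule present) = {p:.3f}")
```

Under these assumptions, every additional split leaves fewer molecules per reaction, so the chance that a given reaction contains any tumor-derived molecule at all goes down.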

Next-Generation Sequencing and an explosion of epigenomics discovery

With the advent of next-generation sequencing, the field of epigenomics is now undergoing a Cambrian Explosion. The ability to look at millions of CpG loci simultaneously from a single sample has opened up a multitude of epigenetic and epigenomic avenues of analysis, and has uncovered a rich vein of biomarkers for additional characterization.

We have spoken briefly here about methylation haplotypes (and expanded upon them here): a methylation haplotype is a unique signature of a string of methylated cytosines, read at single-molecule resolution. Instead of relying on a single 5-methylcytosine residue as a positive signal (again, often present at a 1:200 concentration within the cell-free DNA fraction), the Singlera MethylTitan assay looks at a string of a few to a few dozen 5-methylcytosines as a contiguous set.

Unique patented Singlera MethylTitan assay and analysis technologies

While the specific sources of cell-free DNA in the bloodstream remain to be clearly elucidated, research on non-invasive prenatal testing has shown that these molecules are cleared rapidly from the bloodstream, with a half-life of 16 to 30 minutes. Illustrated below is a region of DNA where the circles represent individual CpG methylation sites, with the different colors representing methylated or unmethylated individual cytosine residues.

Fig 1: Similar healthy and tumor DNA regions with CpG sites marked

The first figure shows the same genomic regulatory region in healthy and tumor DNA, with the methylation status of the CpG residues shown in blue and green respectively. As you can see in the middle portion labelled ‘Cell-free DNA in Plasma’, one strand of tumor cell DNA is highlighted against a high background (typically >99%) of healthy cell DNA.

Fig 2: The challenge of detecting a single CpG locus

The second figure illustrates the challenge of detecting individual CpG sites. Even tumor cell DNA does not show a consistent methylation pattern for the identical region across the population of tumor cells, due to the well-studied (and often underestimated) nature of tumor heterogeneity. At a single CpG site, you are depending on the detection of a single methylated base against an ocean of unmethylated background.

On top of these limitations of detecting an individual CpG site comes the additional challenge that the signal has to survive bisulfite treatment of the incoming native cell-free DNA. Bisulfite treatment damages a large proportion of the DNA, causing not only breakage and nicking of the phosphodiester backbone but also abasic sites (locations where the bases are removed).
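
For readers less familiar with why bisulfite treatment is used despite the damage: it converts unmethylated cytosines so that they read as thymine after amplification and sequencing, while methylated cytosines are protected and still read as cytosine. The sketch below simulates that readout on a single strand. It is a toy illustration only; the function names and the example sequence are hypothetical, and it ignores strand orientation, incomplete conversion and the DNA damage described above.

```python
def bisulfite_convert(sequence, methylated_positions):
    """Simulate the sequencing readout after bisulfite treatment:
    unmethylated C reads as T, methylated C still reads as C."""
    converted = []
    for index, base in enumerate(sequence.upper()):
        if base == "C" and index not in methylated_positions:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def call_cpg_states(reference, converted_read):
    """For each CpG in the reference, report '1' if the read still shows
    a C (methylated) or '0' if it shows anything else (unmethylated)."""
    states = []
    for index in range(len(reference) - 1):
        if reference[index:index + 2] == "CG":
            states.append("1" if converted_read[index] == "C" else "0")
    return "".join(states)


if __name__ == "__main__":
    reference = "ACGTCGACCG"   # hypothetical sequence, CpGs at positions 1, 4 and 8
    methylated = {1, 8}        # hypothetical methylation state of one molecule
    read = bisulfite_convert(reference, methylated)
    print(read, call_cpg_states(reference, read))
```

The string of per-CpG states that comes out of `call_cpg_states` is one way to represent the methylation pattern of a single molecule.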

Fig 3: Singlera's MethylTitan approach analyzing adjacent CpG methylated cytosines

The third figure illustrates Singlera’s MethylTitan approach. By analyzing adjacent methylated CpG cytosines (a given target may have a few to a few dozen CpGs depending on the region), the entire pattern of methylation for that particular region can be compared as a unique signature, instead of relying on the status of an individual base.

Think of it this way: Singlera’s MethylTitan searches for and identifies information-rich sentences, whereas single-CpG assays identify information-poor letters.
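
The difference between letters and sentences can be sketched in a few lines of code. This is a toy illustration of the general idea of a methylation haplotype, not Singlera’s actual analysis method: each read is represented as a hypothetical string of per-CpG states ('1' methylated, '0' unmethylated), and the read counts are invented purely for demonstration.

```python
from collections import Counter


def single_site_fraction(reads, site):
    """Fraction of reads that are methylated at one CpG position."""
    return sum(read[site] == "1" for read in reads) / len(reads)


def haplotype_counts(reads):
    """Count how often each whole-read methylation pattern occurs."""
    return Counter(reads)


def fully_methylated_fraction(reads):
    """Fraction of reads in which every CpG on the molecule is methylated."""
    return sum(set(read) == {"1"} for read in reads) / len(reads)


if __name__ == "__main__":
    # Hypothetical toy data: 200 reads covering a region with 5 CpG sites.
    reads = (
        ["00000"] * 194    # unmethylated background molecules
        + ["01000"] * 5    # background molecules with sporadic methylation
        + ["11111"] * 1    # one fully methylated, tumor-like molecule
    )
    print("site 1 methylated fraction:", single_site_fraction(reads, 1))
    print("fully methylated haplotype fraction:", fully_methylated_fraction(reads))
    print("most common patterns:", haplotype_counts(reads).most_common(3))
```

In this toy data, the single-site view at position 1 mixes the one tumor-like molecule with five sporadically methylated background molecules, whereas the whole-pattern view singles out the one read whose CpGs are all methylated.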

A different chemistry, a different approach, and different results

The Singlera technology uses a unique library preparation methodology (manuscript currently under review) that captures bisulfite-treated cell-free DNA and transforms it into sequence-ready libraries in a straightforward laboratory workflow. By increasing molecular yield, the method preserves more of the scarce cell-free DNA for sequencing, which we expect will open up a new era in the early detection of cancer.

Having the data is only part of the story: it is the data combined with the analysis of methylation haplotypes (adjacent CpG status treated as information strings rather than as discrete single CpG sites) that makes the approach powerful. The clinical data from the ColonES assay is on our Product Sheet, available here: specificity of >99% (a very low false-positive rate), and sensitivity of 97% for Stage I colorectal cancer and 91% for pre-cancerous advanced adenoma (low false-negative rates), from a total of 1,283 clinical samples analyzed.

For more information about ColonES, or to explore a potential partnership with us, please contact us here.


  1. Rivera C.M. and Ren B. Cell 2013. "Mapping human epigenomes" PMID: 24074860
  2. Irizarry R.A. and Feinberg A.P. et al. Nat Genet. 2009. "The human colon cancer methylome shows similar hypo- and hypermethylation at conserved tissue-specific CpG island shores" PMID: 19151715
  3. Bianchi D.W. and Johnson K.L. et al. Early Hum Dev. 2010. "Insights into fetal and neonatal development through analysis of cell-free RNA in body fluids" PMID: 20851538