
Mammography, a common cancer screening test

Current screening approaches for early cancer detection

The American Cancer Society screening recommendations and a thought experiment

A recent article in Scientific American by Vijay Pande, a Stanford professor and general partner at the Andreessen Horowitz venture fund, entitled "How to Engineer Biology" and subtitled 'Billion-dollar bridges rarely fail—whereas billion-dollar drug failures are routine', generated a fair amount of commentary and discussion online.

Biology is not engineering

Suffice it to say, biology is not engineering. One interesting illustration Dr. Pande uses states "The identification of biomarkers (chemical substances we can measure and then target) for disease is currently driven by discovery via a bespoke, one-off process—so the discovery of PSA for prostate cancer, for instance, does not suggest a biomarker for ovarian cancer. Introducing machine learning into the process, however, can turn this handcrafting into assembly-line production. Furthermore, we’re teaching the machine how to fish, allowing not just for reproducibility but for the improvement of accuracy over time, thanks to inputs from additional raw data and identification of complex patterns that humans are incapable of seeing."

There is no doubt that machine learning applied to Big Data has been a boon to biological discovery. Indeed, the Human Genome Project itself would not have been possible without the aid of powerful software running on powerful computers. However, the huge cognitive leap to the claim that machine learning can help identify novel biomarkers for other types of cancer in an assembly-line fashion has been ably addressed by Derek Lowe's commentary "Engineering Biology, For Real?". Biology cannot be broken down into Lego-like blocks when there are multitudes of unknown layers interacting and reacting in unpredictable ways.

The science of statistics

This is not to say that engineering principles are not useful. Pande's further points on Repeatability and Reproducibility, and on Testing and Process Engineering, do find a home in biological statistics, epidemiology, population genetics and even cancer screening. Biological sciences are becoming less descriptive and more quantitative over time, especially as biologists' ability to measure becomes more precise.

In the world of cancer screening, you can describe the Key Performance Indicators (KPIs) as mentioned in Pande's piece as Sensitivity and Specificity. Sensitivity is defined as the True Positive rate - "measuring the actual positives identified as such" (per Wikipedia), while Specificity is the True Negative rate, measuring the extent to which actual negatives are classified as such.
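As a minimal sketch of these two definitions, the Python snippet below computes both rates from the four cells of a confusion matrix. The function names and the counts are hypothetical, chosen only for illustration; they are not data from any of the studies cited here.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """True Positive rate: share of people WITH disease that the test flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """True Negative rate: share of people WITHOUT disease that the test clears."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts (illustration only, not study data):
# 100 people with cancer, 1,000 people without.
tp, fn = 75, 25     # of the 100 with cancer, 75 test positive
tn, fp = 900, 100   # of the 1,000 without, 900 test negative

print(f"sensitivity = {sensitivity(tp, fn):.0%}")
print(f"specificity = {specificity(tn, fp):.0%}")
```

Note that the two rates use disjoint groups of people: sensitivity only looks at those who have the disease, specificity only at those who do not.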

For colorectal cancer (CRC) screening, the two accepted test methodologies for occult blood in stool, the Guaiac Fecal Occult Blood Test and the Fecal Immunochemical Test (G-FOBT and FIT), have reported sensitivities of 50% and 75% respectively, and specificities of 77% and 90% (Table 4 of this reference). These numbers may seem low (a sensitivity of 50%, after all, means the test misses every other person who has the disease, and a specificity of 77% sends roughly one in four healthy people to a follow-up colonoscopy that turns out negative), but they are acceptable because a colonoscopy is a low-risk, relatively low-cost procedure. To quantify the risk of colonoscopy, this 2006 reference of over 16,000 procedures indicates a 0.5% frequency of 'serious complications' and 1 death within 30 days directly attributable to the procedure (1/16,318, or 0.0061%).

While on the topic of CRC, newer non-invasive tests have been approved by the FDA for CRC screening, namely Cologuard from Exact Sciences and Epi proColon from Epigenomics. For Cologuard, the overall sensitivity and specificity are 92% and 87% respectively. For Epi proColon, the overall sensitivity and specificity are 68-72% and 79-82% respectively.

These non-invasive tests, whether stool- or blood-based, can have relatively low specificity and sensitivity. Given that values above 95% or even 99% are generally desirable for both, why are lower values accepted for colorectal cancer screening?

The answer is in the reflex, 'gold standard' follow-up procedure: the invasive colonoscopy. Colonoscopy is widely available in the United States and is already recommended for healthy individuals age 50 and older (at 10-year intervals), so even a test with relatively high false-positive and false-negative rates is useful if it flags individuals who genuinely need to go in for a routine colonoscopy. As the compliance rate for colonoscopy among healthy individuals is approximately 60% in the US, the non-invasive methods are an important mechanism to raise that compliance rate.
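To get a feel for what the sensitivity and specificity figures quoted above mean in practice, here is a rough sketch that converts them into expected counts for one round of screening. The cohort size and, in particular, the prevalence are assumptions made up for illustration (the article gives no prevalence figure), and the function name is hypothetical. Epi proColon is left out because its values are reported as ranges.

```python
def screening_outcomes(n_screened, prevalence, sensitivity, specificity):
    """Expected counts for one screening round (rates are fractions in 0..1)."""
    with_cancer = n_screened * prevalence
    without_cancer = n_screened - with_cancer
    true_pos = with_cancer * sensitivity      # positive test, colonoscopy finds cancer
    false_neg = with_cancer - true_pos        # cancer present but test is negative
    true_neg = without_cancer * specificity   # correctly cleared
    false_pos = without_cancer - true_neg     # positive test, colonoscopy is negative
    return {
        "referred_to_colonoscopy": true_pos + false_pos,
        "cancers_found": true_pos,
        "cancers_missed": false_neg,
        "negative_colonoscopies": false_pos,
    }


# (sensitivity, specificity) pairs as quoted in the text above.
TESTS = {"G-FOBT": (0.50, 0.77), "FIT": (0.75, 0.90), "Cologuard": (0.92, 0.87)}

ASSUMED_PREVALENCE = 0.005  # ASSUMPTION for illustration, not from the article
COHORT = 10_000             # ASSUMPTION: arbitrary cohort size

for name, (sens, spec) in TESTS.items():
    outcome = screening_outcomes(COHORT, ASSUMED_PREVALENCE, sens, spec)
    print(name, {key: round(value) for key, value in outcome.items()})
```

Whatever prevalence is plugged in, the pattern is the same: most positives are false positives, which is tolerable only because the follow-up procedure is low-risk and is recommended for this age group anyway.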

American Cancer Society Screening Recommendations

The usefulness of cancer screening is undeniable. Since the advent of screening mammography for breast cancer, mortality has decreased 30%. For CRC, a decline of 40% from 1975 to 2011 in the US has been observed, with a full 50% of that decline attributed to screening efforts. For cervical cancer, the advent of the Papanicolaou test (commonly known as the 'Pap test' or 'Pap smear') has been credited with a 70% decline since the 1950s.

The American Cancer Society covers these three cancer types, along with the ages and frequency of recommended screenings, online. In addition, it covers risks for endometrial cancer, lung cancer, and prostate cancer, and the steps to take (including some important caveats for prostate cancer screening).

Current cancer statistics and a thought experiment

Two sets of statistics, one from the US National Cancer Institute's Surveillance, Epidemiology and End Results program (NCI-SEER) and the other from an article in the American Cancer Society's journal CA: A Cancer Journal for Clinicians titled "Cancer Statistics in China, 2015", show stark differences in the types of cancers and in mortality. This reflects both the ethnic genetic background of each population and the environmental and other societal differences that all add to cancer risk in a population. (For additional background on environment, heredity and random risk factors see this prior post titled "Why isn't cancer prevention a higher priority?")


2018 Estimate of US Cancer Statistics (link to source)


2015 Estimate of China Cancer Statistics (link to source)

Here is the thought experiment: take the top three types of cancer by deaths in each country, and assume an early detection program that reduces mortality from each of them by 40% (the three screening tests described above achieved 30%, 40% and 70%). What would be the total number of lives saved once a 40% reduction was achieved across the top 3?

For the US, the top 3 by deaths are lung, colorectal and breast, totalling 98,240 lives saved; for China, the top 3 by deaths are lung, stomach and liver, totalling 612,120.
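The arithmetic behind these totals can be sketched in a few lines of Python. The combined top-3 death counts used below are not quoted in the text; they are back-calculated from the lives-saved figures above under the stated 40% assumption, so treat them as implied values rather than numbers read from the source tables. The function names are hypothetical.

```python
def lives_saved(top3_deaths: int, reduction_pct: int = 40) -> float:
    """Deaths avoided if mortality falls by reduction_pct percent."""
    return top3_deaths * reduction_pct / 100


def implied_top3_deaths(saved: int, reduction_pct: int = 40) -> float:
    """Invert the calculation: combined deaths implied by a lives-saved figure."""
    return saved * 100 / reduction_pct


# Lives-saved totals as stated in the text (40% reduction assumed).
STATED = {"US": 98_240, "China": 612_120}

for country, saved in STATED.items():
    deaths = int(implied_top3_deaths(saved))
    print(f"{country}: implied combined top-3 deaths = {deaths:,}")
    # How the result moves across the 30% / 40% / 70% reductions
    # credited to the three established screening tests.
    for pct in (30, 40, 70):
        print(f"  {pct}% reduction -> {lives_saved(deaths, pct):,.0f} lives saved")
```

The result scales linearly with the assumed reduction, so the 40% figure is a midpoint between what mammography (30%) and the Pap test (70%) are credited with, not a prediction.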

While biology is not engineering, there is reason to be optimistic that tools such as methylation haplotyping can point the way to specific and sensitive early detection screening tools to bring the mortality rate of cancer drastically down.


1. Pande, V. Scientific American (2018) How to Engineer Biology. https://blogs.scientificamerican.com/observations/how-to-engineer-biology/

2. Lowe, D. AAAS/Science In the Pipeline (2018) Engineering Biology, For Real? https://blogs.sciencemag.org/pipeline/archives/2018/11/12/engineering-biology-for-real

3. National Cancer Institute United States; Surveillance, Epidemiology, and End Results Program. Cancer Stat Facts: Cancer of Any Site. https://seer.cancer.gov/statfacts/html/all.html

4. Chen W. and He J. et al. CA Cancer J Clin. (2016) Cancer statistics in China, 2015. PubMed PMID: 26808342.