Genomic Messages: How the Evolving Science of Genetics Affects Our Health, Families, and Future - George Annas, Sherman Elias (2015)

APPENDIX B. Limitations of Screening Tests

The purpose of a screening test is to determine whether you have a certain condition or even a certain gene or mutation. In some cases the test may be inconclusive and may need to be repeated. Screening results, however, are not always accurate. The result of the test may indicate that you have a condition or disease you don’t have, a “false positive.” The test may also fail to detect a disease or condition that you do have, a “false negative.” Usually additional tests are conducted to confirm (or reject) the original finding.

The sensitivity of a test is the probability that it returns a positive result for a person who has the condition (a true positive). The specificity of a test is the probability that it returns a negative result for a person who does not have the condition (a true negative). For example, a screening test with 99 percent sensitivity will correctly flag 99 of every 100 people who have the condition or disease. That does not mean, however, that a person with a positive result has a 99 percent chance of having the condition. The likelihood that a positive result is correct varies with the prevalence of the disease in the testing population. In a population where everyone has the condition we are looking for, the test will be accurate 99 percent of the time, and every positive result will be a true positive. But if only one in every thousand patients we are screening has the condition, then even with a test that has 99 percent sensitivity and 99 percent specificity, a patient with a positive result actually has only a 9 percent chance of having the disease! This 9 percent value is called the positive predictive value of the test, and it is calculated by dividing the number of true positive results of a test by the total number of positive results (true positives plus false positives). Extrapolating to rarer conditions affecting fewer than 1 in 100,000 creates an even higher proportion of false positives and an even lower positive predictive value. Of course, if there are no people in the population you are screening who actually have the disease or condition, all positive results will be false positives.
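
For readers who want to check the arithmetic, the positive predictive value can be worked out in a few lines of Python. This is only an illustrative sketch: the function name and the list of example prevalences are our own choices, and the 99 percent sensitivity and specificity are the figures used in the example above.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Return the fraction of positive results that are true positives.

    All three arguments are proportions between 0 and 1.
    """
    # People who have the condition and test positive.
    true_positives = sensitivity * prevalence
    # People who do not have the condition but test positive anyway.
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    total_positives = true_positives + false_positives
    if total_positives == 0:
        return float("nan")  # the test produced no positive results at all
    return true_positives / total_positives


if __name__ == "__main__":
    # Hypothetical populations, screened with a 99%/99% test.
    examples = [
        ("everyone affected", 1.0),
        ("1 in 1,000", 1 / 1_000),
        ("1 in 100,000", 1 / 100_000),
        ("no one affected", 0.0),
    ]
    for label, prevalence in examples:
        ppv = positive_predictive_value(0.99, 0.99, prevalence)
        print(f"{label:>18}: PPV = {ppv:.4%}")
```

With a prevalence of 1 in 1,000 the function returns roughly 0.09, the 9 percent figure given above. At 1 in 100,000 the value falls to about one tenth of one percent, and with no affected people in the population it is zero.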

Because the prevalence of a disease in a symptomatic patient population is much higher than the prevalence in an asymptomatic patient population, a positive result is much more likely to be a true positive in the symptomatic population. A good example is the current practice of breast cancer screening using mammograms. Imagine you are a health care practitioner. If you do a mammogram on a patient with a breast mass, it is more likely that an abnormal mammogram will correctly indicate a true positive rather than a false positive. This is because the prevalence of disease is much higher in a population of women with palpable masses. Likewise, because the prevalence of breast cancer in an asymptomatic population is much lower, an abnormal mammogram in this population is much more likely to be a false positive. (See the discussion of breast cancer in chapter 8 for more on this topic.) One solution to this dilemma is to personalize screening tests based on age and health status. Screening, some have suggested, should be offered to patients who are at high risk and thus are most likely to benefit from screening. Understanding the high false positive rates in screening tests should help you contextualize the news of a positive result. We caution you to be skeptical of screening results and urge you not to immediately assume you have a disease following a positive screening. It is important to proceed with diagnostic testing following any positive screening results.

Much genomic testing involves screening asymptomatic patients for rare conditions, and therefore is likely to result in a high number of false positive results. When genomic testing is performed, patients should be counseled not only on the myriad findings that are of unknown significance, but also on the fact that even positive results of known significance may not be real, especially if the condition is rare. The enormous complexity and sheer size of the human genome make erroneous results likely. For example, 23andMe uses a microarray that reports 99.99 percent reproducibility for detecting SNP variants, meaning that they estimate the error rate is about 0.01 percent. While the error rate for a single SNP is low, 23andMe is not designed to examine a single SNP variant. Rather, it aims to examine thousands of SNPs in a single genome. It has been estimated that with an error rate of 0.01 percent applied to a million SNPs, at least a third of 23andMe customers have an error in the results of one of their “interesting” SNPs. Additionally, the genomic coverage, which measures the average number of times a base in the test genome is matched to a probe and is an indication of the accuracy of data, is variable. Most scientific publications require a coverage depth of 10 times to 30 times. In a recent study, the 23andMe microarray had an average coverage depth of 28.4 times, but 4.3 percent of bases were read at a very low coverage depth (less than 5 times).
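
The way a small per-SNP error rate adds up across many SNPs can be sketched in Python. The sketch assumes that errors occur independently from one SNP to the next, which is a simplification of ours and not a claim about how the estimate cited above was derived. The function names are likewise illustrative.

```python
import math


def expected_errors(n_snps, error_rate):
    """Average number of erroneous results across n_snps SNPs."""
    return n_snps * error_rate


def prob_at_least_one_error(n_snps, error_rate):
    """Chance that at least one of n_snps results is wrong,
    assuming errors are independent of one another."""
    return 1.0 - (1.0 - error_rate) ** n_snps


def snps_needed_for_probability(target_prob, error_rate):
    """Smallest number of SNPs for which the chance of at least one
    error reaches target_prob, under the same independence assumption."""
    return math.ceil(math.log(1.0 - target_prob) / math.log(1.0 - error_rate))


if __name__ == "__main__":
    rate = 0.0001  # the 0.01 percent error rate quoted in the text
    print(expected_errors(1_000_000, rate))          # errors per million SNPs
    print(prob_at_least_one_error(1_000_000, rate))  # chance of any error
    print(snps_needed_for_probability(1 / 3, rate))  # SNPs for a 1-in-3 chance
```

Under these assumptions a million SNPs read at a 0.01 percent error rate yield about 100 erroneous results per genome on average, and a one-in-three chance of at least one error is reached once a little over 4,000 SNPs are in play.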

As more and more genomic studies are conducted, researchers are revisiting previous studies and finding that much of what we thought we knew about the relationship between the genome and disease may not be true. A recent report on cancer genomics in Nature, for example, is titled, “Lists of cancer mutations awash with false positives,” and describes how the use of improved models and reanalysis of cancer genome data decreased the number of genes possibly associated with cancer from 450 to 11. Another study, which evaluated over 600 positive associations between common gene variants and disease, found that only 166 associations had been studied more than 3 times. Of the 166 which had been studied repeatedly, only six were consistently replicated. As the science improves and more data is collected, potential pitfalls can multiply. This is not to suggest that we abandon the investigation but rather that we proceed with caution, recognizing the limitations present in the evolving field of genomics.

In the case of prenatal testing, when positive results can create the possible realization of parents’ worst fears, a positive screening result must be tempered with the understanding that positive results for rare conditions are often inaccurate, and must be confirmed with a diagnostic test. At the 2013 American College of Medical Genetics and Genomics annual meeting, presenters discussed the real problem of both false positive and false negative results in noninvasive prenatal testing. Cases of what were believed to be fetal aneuploidies (an abnormal number of chromosomes) actually reflected maternal tumor DNA. If confirmation testing had been bypassed, not only would perfectly healthy fetuses have been terminated, but maternal cancer would have proceeded undiagnosed.

Nate Silver, in his book The Signal and the Noise, notes statistical pitfalls that have direct applicability to genomics. He points out that statistical analysis of data frequently shows an association or correlation of data elements, but that statistical correlation is very different from causation. A simple example is high glucose levels in patients with diabetes. High sugar levels do not cause diabetes but are a manifestation of the disease. Another example is the higher incidence of automobile accidents in younger drivers. Does youth cause accidents, or is it inexperience, a higher incidence of reckless behavior in younger drivers, or both? There are likely other factors that account for the correlation as well. In genomics research certain genes are associated with a higher incidence of a particular disease, but this association does not mean causation. Many other factors such as epigenetics and the microbiome could be important. All of these reasons support comprehensive counseling prior to genomic screening.

Gilbert Welch goes even further in his book Overdiagnosed, arguing that since all of us harbor at least some genomic abnormalities, “we can all be shown to be at high risk for some disease.” From this proposition he concludes: “So the new world of personal genetic testing has the potential to make all of us sick and arguably poses the greatest threat of overdiagnosis.” Our hope is that if you have a better understanding of the limitations of much of genomic testing, you will be better able to use the results in ways that make your life better, not worse.