Evidence Report/Technology Assessment: Number 5

Evaluation of Cervical Cytology

Summary


Under its Evidence-based Practice Program, the Agency for Health Care Policy and Research (AHCPR) is developing scientific information for other agencies and organizations on which to base clinical guidelines, performance measures, and other quality improvement tools. Contractor institutions review all relevant scientific literature on assigned clinical care topics and produce evidence reports and technology assessments, conduct research on methodologies and the effectiveness of their implementation, and participate in technical assistance activities.

Overview / Reporting the Evidence / New Technologies Assessed / Patient Population and Settings / Methodology / Supplemental Analyses / Findings / Future Research / Availability of Full Report


Overview

Worldwide, carcinoma of the cervix is one of the most common malignancies in women. It was estimated that approximately 13,700 new cases of the disease would occur in the United States in 1998. A woman's lifetime risk of being diagnosed with cervical cancer in the United States is currently 0.83 percent, and the risk of dying from the disease is 0.27 percent.

The incidence of cervical cancer and associated mortality have each decreased over 40 percent since 1973; the decreases are largely attributable to the success of mass screening using the Papanicolaou (Pap) test to diagnose premalignant or early-stage cases. The decreases in invasive cervical cancer incidence and mortality since the introduction of the Pap smear have been so dramatic that it is one of the few interventions to receive an "A" recommendation from the U.S. Preventive Services Task Force even though there are no randomized trials demonstrating its effectiveness.

Despite the indisputably dramatic impact of Pap screening, there is still uncertainty about the details of Pap smear performance, and much could be done to improve the performance of the test and followup of patients after screening. Controversy about the details of Pap smear performance is manifest in differing recommendations about the frequency of screening and the age (if any) at which screening may safely be stopped. A significant proportion of patients and providers fail to comply with even the least demanding recommendations for Pap screening frequency. Numerous barriers to screening have been identified that reduce access to Pap smears and other preventive services.

Recently, efforts to improve Pap smear performance have focused on reducing the number of false negative smears, that is, cases in which premalignant or malignant cells have been misdiagnosed as normal. Measures adopted to improve laboratory performance on this point include manual rescreening of a portion of slides initially evaluated as negative, an approach mandated by Federal law (Clinical Laboratory Improvement Amendments [CLIA]). In addition, several technologies have been developed to improve Pap test screening by reducing the false negative rate. These technologies are a major focus of this report.

Reporting the Evidence

The report addresses three main questions:

  1. What is the accuracy of cervical cytology using conventional Pap smears and new technologies (thin-layer cytology, computer rescreening, algorithm-based decisionmaking technology) for detecting cervical cancer and its precursors?
  2. What are the direct medical costs associated with cervical cancer screening, evaluation, treatment, and followup of cervical cytological abnormalities and treatment and followup of cervical cancer?
  3. What are the effects on total health care cost, morbidity, and mortality of regular cervical cytological screening using thin-layer cytology and computer rescreening using neural network or algorithm-based decisionmaking technology compared with the conventional Pap smear in women participating in a screening program?

On the first point, the report will review published studies comparing cervical cytological diagnosis with clinical diagnosis based on colposcopy or biopsy. The results of this review will form the basis for a meta-analysis.

On the second point, the report will identify and examine current claims data and other datasets to estimate empirically costs associated with cervical cytological screening.

On the third point, the report will review the literature on the effectiveness and cost-effectiveness of cervical cytology screening and use these data to develop a comprehensive cost-effectiveness model to examine the impact of the newer screening technologies. In the absence of definitive clinical trials on key questions of cervical cancer screening, policymakers have relied on decision-modeling studies to integrate epidemiological data on the natural history of cervical cancer precursors, data on the performance of diagnostic tests for early cervical cancer or cervical cancer precursors, and data on cost. These models estimate the efficacy of various screening programs, balance estimated efficacy against estimated cost, and lead to decisions about appropriate screening intervals and age cutoffs.

New Technologies Assessed

Recent developments in specimen processing and interpretation may substantially improve the Pap smear as a diagnostic test for cervical cancer and cancer precursors. Three new devices recently approved by the Food and Drug Administration (FDA) are considered in this report: ThinPrep®, Papnet®, and AutoPap®. The three devices employ three different types of technology: thin-layer cytology (ThinPrep®) and computerized rescreening utilizing neural-network technology (Papnet®) or algorithmic classification (AutoPap®).

Each of these technologies was developed to reduce the false negative rate associated with cervical cytological screening. The two major components of this false negative rate are false negatives related to sampling error and false negatives related to detection error. About two-thirds of false negatives are a result of sampling error and the remaining one-third a result of detection error. Each of the new technologies is directed at one of these components. Thin-layer cytology aims primarily to reduce sampling error, whereas computerized rescreening targets detection error. Because each technology addresses only one component, neither can reduce the false negative rate below the level set by the component it does not address.
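The limit described above can be worked through numerically. The following is a minimal sketch, assuming a hypothetical baseline false negative rate of 20 percent (a value chosen only for illustration, not taken from the report) and the two-thirds/one-third split stated in the text. The function name is likewise illustrative.

```python
# Shares of false negatives by cause, as stated in the summary.
SAMPLING_SHARE = 2 / 3   # false negatives due to sampling error
DETECTION_SHARE = 1 / 3  # false negatives due to detection error


def residual_fn_rate(baseline_fn_rate, share_targeted, fraction_eliminated):
    """False negative rate left after a technology removes `fraction_eliminated`
    of the single error component it targets. The other component is untouched."""
    removed = baseline_fn_rate * share_targeted * fraction_eliminated
    return baseline_fn_rate - removed


# Hypothetical baseline false negative rate, for illustration only.
baseline = 0.20

# Best case for each technology: it eliminates its own component entirely.
floor_thin_layer = residual_fn_rate(baseline, SAMPLING_SHARE, 1.0)
floor_rescreening = residual_fn_rate(baseline, DETECTION_SHARE, 1.0)

print(f"Floor with perfect thin-layer cytology:  {floor_thin_layer:.3f}")
print(f"Floor with perfect computer rescreening: {floor_rescreening:.3f}")
```

Even under these best-case assumptions, one-third of the baseline false negatives remain after thin-layer cytology and two-thirds remain after computerized rescreening.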

Thin-layer cytology is a new technology for processing cytological samples. The sample is collected as in the conventional Pap test using a broom-type device or plastic spatula and endocervical brush combination, but rather than smearing the cytological sample directly onto a microscope slide, this method suspends the sample cells in a fixative solution, disperses them, and then selectively collects cells on a filter. The cells are then transferred to a microscope slide for cytological interpretation. Because cytological samples are fixed immediately after collection, there are fewer artifacts in cellular morphology. Fewer cells on the slide are obscured, both because the process reduces artifactual material such as blood and mucus and because cells are deposited on the slide in a monolayer. Clinical studies of the ThinPrep® 2000 (Cytyc Corporation, Boxborough, MA) have shown that test sensitivity is improved compared with conventional Pap smears. The improvement in sensitivity appears to be greater in populations with a low incidence of cytological abnormalities.

One newly approved device, Papnet®, uses neural-network computerized rescreening of Pap smears initially read as negative by a cytotechnologist. The system works by using automated computerized imaging of Pap smear slides and interpretation of images using a computerized algorithm to identify slides that are likely to contain abnormal cells. The Papnet® system (Neuromedical Systems, Inc.) identifies cells or clusters of cells that require review and can display up to 128 images of the slide likely to contain abnormalities. These images can be reviewed by a cytotechnologist who can decide whether or not to review the slide using light microscopy.

AutoPap® 300 QC system (Neopath, Inc.), an algorithm-based decisionmaking technology, identifies slides exceeding a certain threshold for the likelihood of abnormal cells. The laboratory can select different thresholds corresponding to 10, 15, and 20 percent review rates. In contrast to random rescreening, the population of slides selected by the AutoPap® 300 QC system is enriched with abnormalities and, at the 10-15 percent sort rate, this population of slides should contain 70-80 percent of the slides containing abnormalities missed by manual screening.
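The enrichment claim can be expressed as a simple ratio. The sketch below assumes that a random rescreening sample of a given size would be expected to contain the same proportion of the missed abnormal slides (for example, a 10 percent random sample contains about 10 percent of them). Pairing the low ends (10 percent reviewed, 70 percent captured) and the high ends (15 percent, 80 percent) of the stated ranges is also an assumption made for illustration.

```python
def enrichment_factor(share_of_missed_captured, review_rate):
    """Ratio of missed abnormal slides captured by a selected subset to the
    share a random subset of the same size would be expected to capture."""
    if not 0 < review_rate <= 1:
        raise ValueError("review_rate must be in (0, 1]")
    return share_of_missed_captured / review_rate


# (review rate, share of missed abnormal slides captured); the pairing of
# range endpoints is an illustrative assumption, not stated in the report.
scenarios = [(0.10, 0.70), (0.15, 0.80)]

for review_rate, captured in scenarios:
    factor = enrichment_factor(captured, review_rate)
    print(f"Review {review_rate:.0%} of slides, capture {captured:.0%}: "
          f"about {factor:.1f} times random selection")
```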

A variety of other technologies or clinical strategies have been proposed to improve Pap testing including various devices for collecting a cytological sample from the cervix. Still other technologies have been proposed to augment or replace cervical cytological screening, including colposcopic photographs for review by experts (cervicography) and DNA testing for specific human papillomavirus (HPV). These technologies are not considered in the present report.

Patient Population and Settings

The primary target population for this evidence report is women of average cervical cancer risk in the United States who are candidates for Pap smear screening. For the purposes of our analysis, candidates for Pap smear screening include women between the age of onset of sexual activity and the age of 85.

Although a large proportion of cervical cancer occurs in women with very limited or no screening, we did not examine programs or policies designed to improve screening compliance. Some previous studies have focused on special populations such as elderly women, including elderly women who have not previously been screened.

The principal practice settings considered are primary care practices in the United States (general internal medicine, family practice, adolescent medicine, and obstetrics/gynecology) and government and nongovernment family planning clinics (e.g., Planned Parenthood, public health clinics).

Methodology

The comprehensive review of the literature, from identification of databases through abstraction of individual articles into the evidence tables, was a multistep, sequential process. This process is detailed below.

Literature Sources Used

MEDLINE, CancerLit, HealthSTAR, CINAHL, EMBASE, and EconLit computerized database searches, supplemented by manual journal searches and querying experts and device manufacturers, were the sources used to identify English language reports on the accuracy of cervical cytological screening, costs associated with screening and treatment, and cost-effectiveness.

Citations for the review of accuracy of cervical cytological testing were retrieved with a search strategy that combined various text word and index terms for cervical cytological tests with cervical cancer or dysplasia and sensitivity and specificity. The strategy to retrieve articles on the costs and health outcomes associated with cervical cancer screening combined cervical cytological test terms with terms describing cost analysis and mathematical modeling. Experienced librarians assisted with the design and translation of these search strategies for each database searched.

Screening of Articles

Separate sets of criteria for including articles in the evidence report were developed for the two topics that were the subject of literature reviews (diagnostic testing and cost and health outcomes). In each case, final screening criteria were developed through an iterative process. Each iteration of criteria was pilot-tested by each reviewer/abstractor on a subset of randomly chosen articles.

Articles on diagnostic testing were first screened based on information available through the online databases (primarily title, authors, and abstract when available). Citations were eliminated in Step 1 of the screening process if cervical cytology was not evaluated as a screening test or if the screening test results were not compared with a reference standard. In Step 2 of the screening process, full texts of articles were reviewed to select articles in which a reference standard of colposcopy or histology was used, the screening test and reference standard were reasonably concurrent (i.e., within 3 months), and sufficient data to calculate both sensitivity and specificity were provided (i.e., all cells of a two-by-two table). Of the 939 bibliographic references reviewed, 561, or approximately 60 percent, were excluded during the first screening, and another 293, or 31 percent, during the second screening. Eighty-six articles were included according to these criteria: 84 studies of conventional Pap screening and one study each of ThinPrep® and Papnet®. Because so few studies of the new technologies met the original criteria, we modified the criteria to include studies of the new technologies that used a cytology reference standard and allowed estimation of either sensitivity or specificity. We considered a total of 59 studies (12 on AutoPap®, 27 on Papnet®, and 20 on ThinPrep®) during this final stage of the screening process (Step 3). The net result was the inclusion of 6 studies of AutoPap®, 11 of Papnet®, and 8 of ThinPrep®.

Articles on cost and health outcomes of cervical cytological screening were selected if they assessed the effect of screening on life expectancy or quality, number of cases of cervical cancer, or total health care costs for any of the following cytological screening technologies: conventional Pap smears, thin-layer cytology, or Pap smears with computerized rescreening. Of the 672 articles identified, 638, or 95 percent, were eliminated during the screening process. Thirty-four articles were included in the review.

Data Abstraction Process

Key information was abstracted onto specially designed forms and verified by either duplicate abstraction (two-by-two tables) or overreading by paired clinician-abstractors. Differences were resolved by consensus.

For the diagnostic testing articles, both members of each abstractor team also independently completed two-by-two tables for each study, extracting the key data to calculate sensitivity, specificity, and prevalence and other data to be used in the meta-analysis. The main outcome measures considered were the sensitivity and specificity of cytological abnormality by Pap test for detecting cases, where cytological abnormality was defined by one of three thresholds ranging from atypical squamous cells of undetermined significance (ASCUS) (threshold 1) to low-grade squamous intraepithelial lesion (LSIL) (threshold 2) to high-grade squamous intraepithelial lesion (HSIL) (threshold 3), and where a case was defined as a histological diagnosis of dysplasia or carcinoma. Equivalent categories in other classification schemes were also used. Two-by-two tables were constructed for four different combinations of cytological versus histological thresholds: ASCUS/cervical intraepithelial neoplasia grade 1 (CIN1), LSIL/CIN1, LSIL/CIN2-3, and HSIL/CIN2-3.
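The quantities abstracted from each two-by-two table follow directly from its four cells. The sketch below shows the standard calculations. The class name and the cell counts are hypothetical and are not drawn from any study in the report.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Counts from cross-classifying cytology (test) against histology (reference)."""
    tp: int  # cytology abnormal, histology positive
    fp: int  # cytology abnormal, histology negative
    fn: int  # cytology normal, histology positive
    tn: int  # cytology normal, histology negative

    def sensitivity(self):
        return self.tp / (self.tp + self.fn)

    def specificity(self):
        return self.tn / (self.tn + self.fp)

    def prevalence(self):
        total = self.tp + self.fp + self.fn + self.tn
        return (self.tp + self.fn) / total


# Hypothetical counts at a single cytology/histology threshold pair.
table = TwoByTwo(tp=45, fp=25, fn=30, tn=400)

print(f"Sensitivity: {table.sensitivity():.3f}")
print(f"Specificity: {table.specificity():.3f}")
print(f"Prevalence:  {table.prevalence():.3f}")
```

Each study would contribute one such table per threshold combination, so the same slides can yield different sensitivity and specificity values depending on where the cytological and histological cut points are set.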

Criteria for Evaluating the Quality of Articles

Quality scores for articles on diagnostic testing were assigned according to predetermined methodological criteria based on blind interpretation of screening test results, use of a reference standard of histology, selection of test-negative patients for verification, avoidance of bias in sample collection, description of the spectrum of disease in the sample, publication as a full report (as opposed to abstract), and source of support.

The quality of articles on costs and health outcomes was described according to recently published criteria by an expert panel on cost and effectiveness in medicine.

Supplemental Analyses

Meta-analysis of Pap Test Accuracy

We used the effectiveness score to combine data from multiple studies describing the performance of the conventional Pap test in discriminating between patients with and without cervical lesions. The effectiveness score takes account of both sensitivity and specificity by fitting a receiver operating characteristic (ROC) curve through a logistic odds transformation of the two and thus accounts for their interdependence. The effectiveness score is more normally distributed than either sensitivity or specificity and can be thought of as a gauge of the overall discriminatory ability of the test. Standardized effectiveness scores can be interpreted across different diagnostic tests. In general, a score of 3 reflects a test with good discrimination, whereas a score of 1 reflects a test that does not discriminate between disease positives and disease negatives.
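The summary does not give the formula for the effectiveness score. The sketch below assumes one common form from the diagnostic meta-analysis literature, in which the sum of the logits of sensitivity and specificity (the log diagnostic odds ratio) is rescaled by the square root of 3 divided by pi. Under this assumed form a test with sensitivity and specificity both equal to 0.5 scores 0, so the reference values of 1 and 3 quoted above may rest on a scaling or interpretation that this sketch does not reproduce.

```python
import math


def logit(p):
    """Log odds of a proportion strictly between 0 and 1."""
    if not 0 < p < 1:
        raise ValueError("proportion must be strictly between 0 and 1")
    return math.log(p / (1 - p))


def effectiveness_score(sensitivity, specificity):
    """Assumed form: rescaled log diagnostic odds ratio.
    logit(sens) + logit(spec) equals logit(sens) - logit(1 - spec)."""
    return (math.sqrt(3) / math.pi) * (logit(sensitivity) + logit(specificity))


# Hypothetical sensitivity/specificity pairs, for illustration only.
for sens, spec in [(0.5, 0.5), (0.6, 0.9), (0.9, 0.9)]:
    score = effectiveness_score(sens, spec)
    print(f"sens={sens:.2f} spec={spec:.2f} score={score:.2f}")
```

Because the score depends on sensitivity and specificity only through their combined log odds, two studies that trade sensitivity for specificity along the same ROC curve receive similar scores.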

We used maximum likelihood estimation techniques and a random effects model to calculate summary measures of effectiveness at each of the four explicit diagnostic thresholds (ASCUS/CIN1, LSIL/CIN1, LSIL/CIN2-3, HSIL/CIN2-3). We further evaluated the effect of variations in disease prevalence and in quality of study design and reporting on test discrimination.
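The report's random effects estimates were obtained by maximum likelihood, which is not reproduced here. As a simpler stand-in, the sketch below uses the method-of-moments (DerSimonian-Laird) estimator to show what a random effects summary does: it adds a between-study variance to each study's own variance before weighting. The study scores and variances are hypothetical.

```python
def pool_random_effects(effects, variances):
    """Method-of-moments random effects pooling (a swapped-in technique,
    not the maximum likelihood approach used in the report).
    Returns (pooled estimate, between-study variance, standard error)."""
    k = len(effects)
    weights = [1.0 / v for v in variances]
    total_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / total_w

    # Heterogeneity statistic and its scaling constant.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    c = total_w - sum(w ** 2 for w in weights) / total_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Re-weight each study with the between-study variance added.
    re_weights = [1.0 / (v + tau2) for v in variances]
    total_re = sum(re_weights)
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / total_re
    return pooled, tau2, (1.0 / total_re) ** 0.5


# Hypothetical effectiveness scores and within-study variances.
scores = [1.2, 1.9, 2.6, 1.5]
within_var = [0.10, 0.05, 0.20, 0.08]

pooled, tau2, se = pool_random_effects(scores, within_var)
print(f"Pooled score: {pooled:.2f}  tau^2: {tau2:.2f}  SE: {se:.2f}")
```

The same function would be applied separately at each of the four diagnostic thresholds.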

Cost Analysis

Several available datasets were analyzed to estimate direct medical costs of screening, diagnosing, and treating cervical cancer, calculating separate estimates for women 20-64 years of age and those 65 years and older (eligible for Medicare). For women 20-64, the unit cost of screening, diagnosis, and treatment of cervical cancer was estimated from MEDSTAT data from 1992, 1993, and 1994, inflated to reflect 1994 charges and converted to costs using 1994 cost-to-charge ratios published by the American Hospital Association.

For women over 65, Medicare's resource-based relative value scale (RBRVS) fee schedule for physician services, Medicare's clinical laboratory fee schedule for laboratory services, and national average diagnosis-related group (DRG) payments for hospital admissions were used to identify the payments associated with services received for cervical cancer screening, diagnosis, and treatment. Charges and payment information obtained from all sources were then converted to reflect costs associated with the services provided and all costs were inflated to 1997 dollars.
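The two adjustments described in this section, converting charges to costs and inflating to a common year, are simple multiplications. The sketch below illustrates them. The charge, cost-to-charge ratio, and price index values are hypothetical placeholders, not figures from the MEDSTAT, American Hospital Association, or Medicare sources.

```python
def charge_to_cost(charge, cost_to_charge_ratio):
    """Estimate the cost of a service from its billed charge."""
    return charge * cost_to_charge_ratio


def inflate(amount, index_from_year, index_to_year):
    """Restate an amount in another year's dollars using a price index."""
    return amount * (index_to_year / index_from_year)


# Hypothetical inputs, for illustration only.
billed_charge_1994 = 100.00    # charge for a service, 1994 dollars
cost_to_charge_ratio = 0.60    # assumed ratio
index_1994, index_1997 = 100.0, 110.0  # assumed price index values

cost_1994 = charge_to_cost(billed_charge_1994, cost_to_charge_ratio)
cost_1997 = inflate(cost_1994, index_1994, index_1997)

print(f"Estimated cost in 1994 dollars: {cost_1994:.2f}")
print(f"Estimated cost in 1997 dollars: {cost_1997:.2f}")
```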

Cost-Effectiveness Model

We constructed a 20-state Markov model that follows a cohort of women from age 15 to 85 and assumes that there are no prevalent cases of HPV infection or squamous intraepithelial lesion (SIL) at age 15. Each cycle is 1 year long. No Pap smear screening is compared with the following screening strategies: conventional Pap smears at 1-, 2- and 3-year intervals, thin-layer cytology smears at 1-, 2- and 3-year intervals, and 100 percent computerized rescreening at 1-, 2- and 3-year intervals.
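The mechanics of a Markov cohort model can be shown with a much smaller example. The sketch below uses four hypothetical states and invented annual transition probabilities. It does not reproduce the report's 20 states or its parameter estimates. It keeps only the structure described above: the whole cohort starts disease free at age 15 and advances in 1-year cycles to age 85.

```python
# Toy state list; the report's model has 20 states that are not listed here.
STATES = ["Well", "SIL", "Cancer", "Dead"]

# Hypothetical annual transition probabilities; each row sums to 1.
TRANSITIONS = [
    [0.97, 0.02, 0.00, 0.01],  # from Well
    [0.10, 0.87, 0.02, 0.01],  # from SIL (regression, persistence, progression)
    [0.00, 0.00, 0.90, 0.10],  # from Cancer
    [0.00, 0.00, 0.00, 1.00],  # Dead is absorbing
]


def step(distribution, matrix):
    """Advance the cohort distribution by one cycle."""
    n = len(distribution)
    return [sum(distribution[i] * matrix[i][j] for i in range(n))
            for j in range(n)]


def run_cohort(start, matrix, cycles):
    """Return the state distribution at the start and after each cycle."""
    history = [start]
    for _ in range(cycles):
        history.append(step(history[-1], matrix))
    return history


# No prevalent disease at age 15, then 70 annual cycles to age 85.
start = [1.0, 0.0, 0.0, 0.0]
history = run_cohort(start, TRANSITIONS, cycles=85 - 15)
final = history[-1]

for name, share in zip(STATES, final):
    print(f"{name:>6}: {share:.3f}")
```

A screening strategy would enter such a model by changing transition probabilities, for example by moving detected SIL cases back to the well state in the years when a smear is performed.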

We used a U.S. health system perspective and evaluated the direct and health care-specific costs associated with screening, diagnosis, and treatment of cervical cancer and its precursors. We did not consider other societal costs such as work loss. The model considers the following outcomes: cost per year of life saved, cost per cervical cancer death prevented and per cervical cancer case prevented, and the number of morbid therapies avoided.
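Cost per year of life saved is conventionally reported as an incremental ratio between two strategies. The sketch below shows that calculation. The function name and all cost and life expectancy values are hypothetical and are not results from the report.

```python
def cost_per_life_year_saved(cost_new, cost_old, life_years_new, life_years_old):
    """Incremental cost divided by incremental life expectancy."""
    extra_life_years = life_years_new - life_years_old
    if extra_life_years <= 0:
        raise ValueError("the new strategy must add life years for the ratio to apply")
    return (cost_new - cost_old) / extra_life_years


# Hypothetical per-woman lifetime costs (dollars) and life expectancy (years).
ratio = cost_per_life_year_saved(
    cost_new=1200.0, cost_old=1000.0,
    life_years_new=20.5, life_years_old=20.0,
)
print(f"Cost per year of life saved: {ratio:.0f} dollars")
```

The other outcomes listed above, cost per death prevented and cost per case prevented, follow the same pattern with a different denominator.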

We discounted costs and years of life at 3 percent annually in the base case and varied the discount rate from 0 to 5 percent in a sensitivity analysis.
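Discounting can be illustrated with the standard present value formula, in which an amount occurring t years in the future is divided by (1 + r) raised to the power t. The sketch below applies the 3 percent base case and the 0 to 5 percent range to a hypothetical stream of annual costs. Treating the first year as undiscounted is an assumption of this sketch.

```python
def present_value(annual_amounts, rate):
    """Sum of a yearly stream discounted to year 0.
    The first element is assumed to occur at year 0 and is not discounted."""
    return sum(amount / (1 + rate) ** year
               for year, amount in enumerate(annual_amounts))


# Hypothetical stream: 50 dollars per year for 10 years.
stream = [50.0] * 10

base_case = present_value(stream, 0.03)
print(f"Base case (3 percent): {base_case:.2f}")

# One-way sensitivity analysis over the range stated in the text.
for rate in (0.00, 0.01, 0.02, 0.03, 0.04, 0.05):
    print(f"rate={rate:.0%}  present value={present_value(stream, rate):.2f}")
```

The same function applies to years of life, which the report discounts at the same rate as costs.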

Specific parameter estimates were derived from a preliminary literature assessment conducted for this report and prior published models of cervical cancer screening.

Findings

Important findings regarding the accuracy of cervical cytological screening include the following:

The accuracy of the Pap test is strongly affected by disease prevalence. Higher disease prevalence is associated with higher estimates of sensitivity and lower estimates of specificity (with a greater effect on specificity). These findings are consistent with prevalence as a marker for workup bias and perhaps also reflect an imperfect reference standard that is more specific than sensitive.
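The link between workup bias and apparent accuracy can be illustrated with expected proportions. The sketch below assumes a test with fixed true sensitivity and specificity and varies only the share of test-negative women who receive the reference standard. All numerical inputs are hypothetical.

```python
def observed_accuracy(sens, spec, prevalence, verify_pos, verify_neg):
    """Apparent sensitivity, specificity, and prevalence among verified women
    when test-positive and test-negative women are verified at different rates."""
    tp = prevalence * sens * verify_pos
    fn = prevalence * (1 - sens) * verify_neg
    fp = (1 - prevalence) * (1 - spec) * verify_pos
    tn = (1 - prevalence) * spec * verify_neg
    verified = tp + fn + fp + tn
    return tp / (tp + fn), tn / (tn + fp), (tp + fn) / verified


# Hypothetical true test characteristics and population prevalence.
TRUE_SENS, TRUE_SPEC, TRUE_PREV = 0.60, 0.95, 0.10

full = observed_accuracy(TRUE_SENS, TRUE_SPEC, TRUE_PREV, 1.0, 1.0)
partial = observed_accuracy(TRUE_SENS, TRUE_SPEC, TRUE_PREV, 1.0, 0.10)

for label, (s, c, p) in [("All women verified", full),
                         ("10% of negatives verified", partial)]:
    print(f"{label}: sens={s:.3f} spec={c:.3f} prevalence={p:.3f}")
```

In this sketch, verifying fewer test-negative women raises apparent prevalence and apparent sensitivity and lowers apparent specificity, which is the direction of association described above.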

Important findings regarding the costs of cervical cytological screening and cervical cancer diagnosis and treatment include the following:

Important findings from a review of previously published models of the cost and effectiveness of cervical cytological screening include the following:

Important findings from a new model of cost and effectiveness of cervical cytological screening include the following:

Future Research

Our research suggests several areas for possible future study.

Availability of the Full Report

The full evidence report from which this summary was taken was prepared by Duke University, an AHCPR Evidence-based Practice Center, Durham, NC, under Contract No. 290-97-0014. Print copies may be obtained free of charge from the Publications Clearinghouse by calling 1-800-358-9295. Requestors should ask for Evidence Report/Technology Assessment No. 5, Evaluation of Cervical Cytology (AHCPR Publication No. 99-E010). The Evidence Report is available online on the National Library of Medicine Bookshelf.

AHCPR Publication Number 99-E009
Current as of January 1999


Internet Citation:

Evaluation of Cervical Cytology. Summary, Evidence Report/Technology Assessment: Number 5, January 1999. Agency for Health Care Policy and Research, Rockville, MD. http://www.ahrq.gov/clinic/epcsums/cervsumm.htm

