Interobserver agreement in the evaluation of digitized cervical images

Jose Jeronimo, L. Stewart Massad, Philip E. Castle, Sholom Wacholder, Mark Schiffman

Research output: Contribution to journal › Article

38 Citations (Scopus)

Abstract

OBJECTIVE: To estimate the agreement among multiple expert colposcopists evaluating high-resolution digitized cervigrams taken from patients with a variety of human papillomavirus (HPV) infection states and previous cervigram interpretations.

METHODS: Twenty expert colposcopists evaluated 939 digitized images of the uterine cervix obtained after the application of 5% acetic acid during the ASCUS-LSIL Triage Study. Twenty images selected to represent a broad range were graded by all the colposcopists. The remaining 919 pictures were distributed by stratified random sampling, such that each image was evaluated by two colposcopists, and each expert evaluated 112 images with similar distributions of cervigram diagnoses and HPV DNA test results. We evaluated interrater agreement among the pairs of colposcopists and confirmed the conclusions using the 20 images they all graded.

RESULTS: Pairs of colposcopists agreed on the diagnosis for only 56.8% of images. Similar agreement was seen regarding number of visible lesions (of low-grade or greater). This variability in ratings remained when the images were stratified by final histologic diagnosis or HPV status. The results were confirmed by the presence of large variability in ratings (ranging in some cases from normal to cancer) for the 20 images graded by all colposcopists.

CONCLUSION: Colposcopic diagnosis using static images is poorly reproducible and might reflect similar problems in clinical practice. Researchers should question the use of colposcopic images as a reference standard for teaching and evaluating the presence or severity of disease.

Original language: English (US)
Pages (from-to): 833-840
Number of pages: 8
Journal: Obstetrics and Gynecology
Volume: 110
Issue number: 4
DOI: 10.1097/01.AOG.0000281665.63550.8f
State: Published - Oct 2007
Externally published: Yes

Fingerprint

Human Papillomavirus DNA Tests
Papillomavirus Infections
Triage
Cervix Uteri
Acetic Acid
Teaching
Research Personnel
Neoplasms
Atypical Squamous Cells of the Cervix

ASJC Scopus subject areas

  • Obstetrics and Gynecology

Cite this

Interobserver agreement in the evaluation of digitized cervical images. / Jeronimo, Jose; Massad, L. Stewart; Castle, Philip E.; Wacholder, Sholom; Schiffman, Mark.

In: Obstetrics and Gynecology, Vol. 110, No. 4, 10.2007, p. 833-840.

@article{d57f441115e743ee8fb26a96156d5096,
title = "Interobserver agreement in the evaluation of digitized cervical images",
abstract = "OBJECTIVE: To estimate the agreement among multiple expert colposcopists evaluating high-resolution digitized cervigrams taken from patients with a variety of human papillomavirus (HPV) infection states and previous cervigram interpretations. METHODS: Twenty expert colposcopists evaluated 939 digitized images of the uterine cervix obtained after the application of 5{\%} acetic acid during the ASCUS-LSIL Triage Study. Twenty images selected to represent a broad range were graded by all the colposcopists. The remaining 919 pictures were distributed by stratified random sampling, such that each image was evaluated by two colposcopists, and each expert evaluated 112 images with similar distributions of cervigram diagnoses and HPV DNA test results. We evaluated interrater agreement among the pairs of colposcopists and confirmed the conclusions using the 20 images they all graded. RESULTS: Pairs of colposcopists agreed on the diagnosis for only 56.8{\%} of images. Similar agreement was seen regarding number of visible lesions (of low-grade or greater). This variability in ratings remained when the images were stratified by final histologic diagnosis or HPV status. The results were confirmed by the presence of large variability in ratings (ranging in some cases from normal to cancer) for the 20 images graded by all colposcopists. CONCLUSION: Colposcopic diagnosis using static images is poorly reproducible and might reflect similar problems in clinical practice. Researchers should question the use of colposcopic images as a reference standard for teaching and evaluating the presence or severity of disease.",
author = "Jose Jeronimo and Massad, {L. Stewart} and Castle, {Philip E.} and Sholom Wacholder and Mark Schiffman",
year = "2007",
month = "10",
doi = "10.1097/01.AOG.0000281665.63550.8f",
language = "English (US)",
volume = "110",
pages = "833--840",
journal = "Obstetrics and Gynecology",
issn = "0029-7844",
publisher = "Lippincott Williams and Wilkins",
number = "4",
}

TY - JOUR

T1 - Interobserver agreement in the evaluation of digitized cervical images

AU - Jeronimo, Jose

AU - Massad, L. Stewart

AU - Castle, Philip E.

AU - Wacholder, Sholom

AU - Schiffman, Mark

PY - 2007/10

Y1 - 2007/10

AB - OBJECTIVE: To estimate the agreement among multiple expert colposcopists evaluating high-resolution digitized cervigrams taken from patients with a variety of human papillomavirus (HPV) infection states and previous cervigram interpretations. METHODS: Twenty expert colposcopists evaluated 939 digitized images of the uterine cervix obtained after the application of 5% acetic acid during the ASCUS-LSIL Triage Study. Twenty images selected to represent a broad range were graded by all the colposcopists. The remaining 919 pictures were distributed by stratified random sampling, such that each image was evaluated by two colposcopists, and each expert evaluated 112 images with similar distributions of cervigram diagnoses and HPV DNA test results. We evaluated interrater agreement among the pairs of colposcopists and confirmed the conclusions using the 20 images they all graded. RESULTS: Pairs of colposcopists agreed on the diagnosis for only 56.8% of images. Similar agreement was seen regarding number of visible lesions (of low-grade or greater). This variability in ratings remained when the images were stratified by final histologic diagnosis or HPV status. The results were confirmed by the presence of large variability in ratings (ranging in some cases from normal to cancer) for the 20 images graded by all colposcopists. CONCLUSION: Colposcopic diagnosis using static images is poorly reproducible and might reflect similar problems in clinical practice. Researchers should question the use of colposcopic images as a reference standard for teaching and evaluating the presence or severity of disease.

UR - http://www.scopus.com/inward/record.url?scp=34848822833&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=34848822833&partnerID=8YFLogxK

U2 - 10.1097/01.AOG.0000281665.63550.8f

DO - 10.1097/01.AOG.0000281665.63550.8f

M3 - Article

VL - 110

SP - 833

EP - 840

JO - Obstetrics and Gynecology

JF - Obstetrics and Gynecology

SN - 0029-7844

IS - 4

ER -