Estimating the agreement and diagnostic accuracy of two diagnostic tests when one test is conducted on only a subsample of specimens

Hormuzd A. Katki, Yan Li, David W. Edelstein, Philip E. Castle

Research output: Contribution to journal › Article

7 Citations (Scopus)

Abstract

We focus on the efficient usage of specimen repositories for the evaluation of new diagnostic tests and for comparing new tests with existing tests. Typically, all pre-existing diagnostic tests will already have been conducted on all specimens. However, we propose retesting only a judicious subsample of the specimens by the new diagnostic test. Subsampling minimizes study costs and specimen consumption, yet estimates of agreement or diagnostic accuracy potentially retain adequate statistical efficiency. We introduce methods to estimate agreement statistics and conduct symmetry tests when the second test is conducted on only a subsample and no gold standard exists. The methods treat the subsample as a stratified two-phase sample and use inverse-probability weighting. Strata can be any information available on all specimens and can be used to oversample the most informative specimens. The verification bias framework applies if the test conducted on only the subsample is a gold standard. We also present inverse-probability-weighting-based estimators of diagnostic accuracy that take advantage of stratification. We present three examples demonstrating that adequate statistical efficiency can be achieved under subsampling while greatly reducing the number of specimens requiring retesting. Naively using standard estimators that ignore subsampling can lead to drastically misleading estimates. Through simulation, we assess the finite-sample properties of our estimators and consider other possible sampling designs for our examples that could have further improved statistical efficiency. To help promote subsampling designs, our R package CompareTests computes all of our agreement and diagnostic accuracy statistics.
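The inverse-probability-weighting idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not use the CompareTests API. The function name `ipw_agreement`, the record layout, and the toy counts are all assumptions made for illustration. The sketch weights each retested specimen by (stratum size) / (number retested in that stratum), builds a weighted cross-table of the two tests, and computes percent agreement and Cohen's kappa from it. Variance estimation, which a real analysis would need, is omitted.

```python
from collections import Counter

def ipw_agreement(records):
    """Hypothetical helper: records is a list of (stratum, test_a, test_b),
    where test_b is None for specimens that were not retested.
    Returns (weighted cell proportions, percent agreement, Cohen's kappa)."""
    # Phase 1: every specimen has a stratum; phase 2: only some have test_b.
    n_total = Counter(s for s, _, _ in records)
    n_sampled = Counter(s for s, _, b in records if b is not None)

    # Each retested specimen stands in for n_total/n_sampled specimens of its stratum.
    cells = Counter()
    for s, a, b in records:
        if b is not None:
            cells[(a, b)] += n_total[s] / n_sampled[s]

    total = sum(cells.values())
    p = {k: v / total for k, v in cells.items()}
    cats = sorted({a for a, _ in p} | {b for _, b in p})

    # Observed agreement and chance-expected agreement from weighted marginals.
    p_obs = sum(p.get((c, c), 0.0) for c in cats)
    p_exp = sum(
        sum(p.get((c, j), 0.0) for j in cats) * sum(p.get((i, c), 0.0) for i in cats)
        for c in cats
    )
    kappa = (p_obs - p_exp) / (1.0 - p_exp)  # undefined if p_exp == 1
    return p, p_obs, kappa

# Toy data (made up): 100 specimens stratified by the existing test A.
# All 20 A-positives are retested; only 20 of the 80 A-negatives are.
records = (
    [("pos", 1, 1)] * 15 + [("pos", 1, 0)] * 5
    + [("neg", 0, 1)] * 2 + [("neg", 0, 0)] * 18
    + [("neg", 0, None)] * 60
)
p, p_obs, kappa = ipw_agreement(records)

# Naive agreement that ignores the sampling design, for comparison.
tested = [(a, b) for _, a, b in records if b is not None]
naive_obs = sum(a == b for a, b in tested) / len(tested)
print(round(p_obs, 3), round(naive_obs, 3), round(kappa, 3))
```

In this toy design the A-negative stratum is undersampled, so each retested A-negative specimen carries weight 4. The weighted and naive agreement differ because the naive estimate overrepresents A-positive specimens.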

Original language: English (US)
Pages (from-to): 436-448
Number of pages: 13
Journal: Statistics in Medicine
Volume: 31
Issue number: 5
DOI: 10.1002/sim.4422
State: Published - Feb 28 2012
Externally published: Yes

Fingerprint

Diagnostic Accuracy
Diagnostic Tests
Subsampling
Routine Diagnostic Tests
Inverse Probability Weighting
Estimator
Gold
Verification Bias
Estimate
Statistics
Sampling Design
Costs and Cost Analysis
Stratification
Repository
Minimise
Symmetry
Evaluation
Costs
Standards
Simulation

Keywords

  • Gold standard
  • HPV
  • Kappa
  • Sensitivity
  • Specificity
  • Symmetry test
  • Two-phase design
  • Verification bias

ASJC Scopus subject areas

  • Epidemiology
  • Statistics and Probability

Cite this

Estimating the agreement and diagnostic accuracy of two diagnostic tests when one test is conducted on only a subsample of specimens. / Katki, Hormuzd A.; Li, Yan; Edelstein, David W.; Castle, Philip E.

In: Statistics in Medicine, Vol. 31, No. 5, 28.02.2012, p. 436-448.


@article{0f0d898614744ce1a7667be23e421d90,
title = "Estimating the agreement and diagnostic accuracy of two diagnostic tests when one test is conducted on only a subsample of specimens",
abstract = "We focus on the efficient usage of specimen repositories for the evaluation of new diagnostic tests and for comparing new tests with existing tests. Typically, all pre-existing diagnostic tests will already have been conducted on all specimens. However, we propose retesting only a judicious subsample of the specimens by the new diagnostic test. Subsampling minimizes study costs and specimen consumption, yet estimates of agreement or diagnostic accuracy potentially retain adequate statistical efficiency. We introduce methods to estimate agreement statistics and conduct symmetry tests when the second test is conducted on only a subsample and no gold standard exists. The methods treat the subsample as a stratified two-phase sample and use inverse-probability weighting. Strata can be any information available on all specimens and can be used to oversample the most informative specimens. The verification bias framework applies if the test conducted on only the subsample is a gold standard. We also present inverse-probability-weighting-based estimators of diagnostic accuracy that take advantage of stratification. We present three examples demonstrating that adequate statistical efficiency can be achieved under subsampling while greatly reducing the number of specimens requiring retesting. Naively using standard estimators that ignore subsampling can lead to drastically misleading estimates. Through simulation, we assess the finite-sample properties of our estimators and consider other possible sampling designs for our examples that could have further improved statistical efficiency. To help promote subsampling designs, our R package CompareTests computes all of our agreement and diagnostic accuracy statistics.",
keywords = "Gold standard, HPV, Kappa, Sensitivity, Specificity, Symmetry test, Two-phase design, Verification bias",
author = "Katki, {Hormuzd A.} and Li, Yan and Edelstein, {David W.} and Castle, {Philip E.}",
year = "2012",
month = "2",
day = "28",
doi = "10.1002/sim.4422",
language = "English (US)",
volume = "31",
pages = "436--448",
journal = "Statistics in Medicine",
issn = "0277-6715",
publisher = "John Wiley and Sons Ltd",
number = "5",
}

TY - JOUR

T1 - Estimating the agreement and diagnostic accuracy of two diagnostic tests when one test is conducted on only a subsample of specimens

AU - Katki, Hormuzd A.

AU - Li, Yan

AU - Edelstein, David W.

AU - Castle, Philip E.

PY - 2012/2/28

Y1 - 2012/2/28

N2 - We focus on the efficient usage of specimen repositories for the evaluation of new diagnostic tests and for comparing new tests with existing tests. Typically, all pre-existing diagnostic tests will already have been conducted on all specimens. However, we propose retesting only a judicious subsample of the specimens by the new diagnostic test. Subsampling minimizes study costs and specimen consumption, yet estimates of agreement or diagnostic accuracy potentially retain adequate statistical efficiency. We introduce methods to estimate agreement statistics and conduct symmetry tests when the second test is conducted on only a subsample and no gold standard exists. The methods treat the subsample as a stratified two-phase sample and use inverse-probability weighting. Strata can be any information available on all specimens and can be used to oversample the most informative specimens. The verification bias framework applies if the test conducted on only the subsample is a gold standard. We also present inverse-probability-weighting-based estimators of diagnostic accuracy that take advantage of stratification. We present three examples demonstrating that adequate statistical efficiency can be achieved under subsampling while greatly reducing the number of specimens requiring retesting. Naively using standard estimators that ignore subsampling can lead to drastically misleading estimates. Through simulation, we assess the finite-sample properties of our estimators and consider other possible sampling designs for our examples that could have further improved statistical efficiency. To help promote subsampling designs, our R package CompareTests computes all of our agreement and diagnostic accuracy statistics.


KW - Gold standard

KW - HPV

KW - Kappa

KW - Sensitivity

KW - Specificity

KW - Symmetry test

KW - Two-phase design

KW - Verification bias

UR - http://www.scopus.com/inward/record.url?scp=84856473125&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84856473125&partnerID=8YFLogxK

U2 - 10.1002/sim.4422

DO - 10.1002/sim.4422

M3 - Article

VL - 31

SP - 436

EP - 448

JO - Statistics in Medicine

JF - Statistics in Medicine

SN - 0277-6715

IS - 5

ER -