Background
Egan JP. Signal detection theory and ROC
analysis. New York: Academic Press, 1975.
Fryback DG, Thornbury JR. The efficacy of
diagnostic imaging. Med Decis Making 1991; 11: 88.
Griner PF, Mayewski RJ, Mushlin AI, Greenland P.
Selection and interpretation of diagnostic tests and procedures: principles
and applications. Annals Int Med 1981; 94: 553.
International Commission on Radiation Units and
Measurements. Medical imaging: the assessment of image quality (ICRU Report
54). Bethesda, MD: ICRU, 1996.
Lusted LB. Signal detectability and medical
decision-making. Science 1971; 171: 1217.
McNeil BJ, Adelstein SJ. Determining the value
of diagnostic and screening tests. J Nucl Med 1976; 17: 439.
McNeil BJ, Keeler E, Adelstein SJ. Primer on
certain elements of medical decision making. New Engl J Med 1975; 293: 211.
Metz CE, Wagner RF, Doi K, Brown DG, Nishikawa
RM, Myers KJ. Toward consensus on quantitative assessment of medical imaging
systems. Med Phys 1995; 22: 1057-1061.
National Council on Radiation Protection and
Measurements. An introduction to efficacy in diagnostic radiology and
nuclear medicine (NCRP Commentary 13). Bethesda, MD: NCRP, 1995.
Robertson EA, Zweig MH, Van Steirteghem AC.
Evaluating the clinical efficacy of laboratory tests. Am J Clin Path 1983;
79: 78.
Zweig MH, Campbell G. Receiver-operating
characteristic (ROC) plots: a fundamental evaluation tool in clinical
medicine. Clinical Chemistry 1993; 39: 561. [Erratum published in Clinical
Chemistry 1993; 39: 1589.]
General
Books
Pepe MS. The
statistical evaluation of medical tests for classification and prediction.
Oxford; New York: Oxford University Press, 2004.
Articles
Hanley JA. Receiver operating characteristic
(ROC) methodology: the state of the art. Critical Reviews in Diagnostic
Imaging 1989; 29: 307.
King JL, Britton CA, Gur D, Rockette HE, Davis
PL. On the validity of the continuous and discrete confidence rating scales
in receiver operating characteristic studies. Invest Radiol 1993; 28: 962.
Metz CE. Basic principles of ROC analysis.
Seminars in Nucl Med 1978; 8: 283.
Metz CE. ROC methodology in radiologic imaging.
Invest Radiol 1986; 21: 720.
Metz CE. Some practical issues of experimental
design and data analysis in radiological ROC studies. Invest Radiol 1989;
24: 234.
Metz CE. Evaluation of CAD methods. In
Computer-Aided Diagnosis in Medical Imaging (K Doi, H MacMahon, ML Giger and
KR Hoffmann, eds.). Amsterdam: Elsevier Science (Excerpta Medica
International Congress Series, Vol. 1182), pp. 543-554, 1999.
Metz CE. Fundamental ROC analysis. In: Handbook
of Medical Imaging, Vol. 1: Physics and Psychophysics (J Beutel, H Kundel
and R Van Metter, eds.). Bellingham, WA: SPIE Press, 2000, pp. 751-769.
Metz CE. Receiver operating characteristic (ROC)
analysis: a tool for quantitative evaluation of observer performance and
imaging systems. JACR 2006; 3: 413-422.
Metz CE, Shen J-H. Gains in accuracy from
replicated readings of diagnostic images: prediction and assessment in terms
of ROC analysis. Med Decis Making 1992; 12: 60.
Rockette HE, Gur D, Metz CE. The use of
continuous and discrete confidence judgments in receiver operating
characteristic studies of diagnostic imaging techniques. Invest Radiol 1992;
27: 169.
Swets JA. ROC analysis applied to the evaluation
of medical imaging techniques. Invest Radiol 1979; 14: 109.
Swets JA. Indices of discrimination or
diagnostic accuracy: their ROCs and implied models. Psychol Bull 1986; 99:
100.
Swets JA. Measuring the accuracy of diagnostic
systems. Science 1988; 240: 1285.
Swets JA. Signal detection theory and ROC
analysis in psychology and diagnostics: collected papers. Mahwah, NJ:
Lawrence Erlbaum Associates, 1996.
Swets JA, Pickett RM. Evaluation of diagnostic
systems: methods from signal detection theory. New York: Academic Press,
1982.
Wagner RF, Beiden SV, Metz CE. Continuous vs.
categorical data for ROC analysis: Some quantitative considerations.
Academic Radiol 2001; 8: 328.
Bias
Begg CB, Greenes RA. Assessment of diagnostic
tests when disease verification is subject to selection bias. Biometrics
1983; 39: 207.
Begg CB, McNeil BJ. Assessment of radiologic
tests: control of bias and other design considerations. Radiology 1988; 167:
565.
Gray R, Begg CB, Greenes RA. Construction of
receiver operating characteristic curves when disease verification is
subject to selection bias. Med Decis Making 1984; 4: 151.
Ransohoff DF, Feinstein AR. Problems of spectrum
and bias in evaluating the efficacy of diagnostic tests. New Engl J Med
1978; 299: 926.
Curve Fitting
Dorfman DD, Alf E. Maximum likelihood estimation
of parameters of signal detection theory and determination of confidence
intervals — rating method data. J Math Psych 1969; 6: 487.
Dorfman DD, Berbaum KS, Metz CE, Lenth RV,
Hanley JA, Dagga HA. Proper ROC analysis: the bigamma model. Academic Radiol
1997; 4: 138.
Grey DR, Morgan BJT. Some aspects of ROC
curve-fitting: normal and logistic models. J Math Psych 1972; 9: 128.
Hanley JA. The robustness of the "binormal"
assumptions used in fitting ROC curves. Med Decis Making 1988; 8: 197.
Metz CE, Herman BA, Shen J-H. Maximum-likelihood
estimation of ROC curves from continuously-distributed data. Stat Med 1998;
17: 1033.
Metz CE, Pan X. "Proper" binormal ROC curves:
theory and maximum-likelihood estimation. J Math Psych 1999; 43: 1.
Pan X, Metz CE. The "proper" binormal model:
parametric ROC curve estimation with degenerate data. Academic Radiol 1997;
4: 380.
Dorfman DD, Berbaum KS. A contaminated binormal
model for ROC data: Part II. A formal model. Acad Radiol 2000; 7:427-437.
Swensson RG. Unified measurement of observer
performance in detecting and localizing target objects on images. Med Phys
1996; 23: 1709.
Swets JA. Form of empirical ROCs in
discrimination and diagnostic tasks: implications for theory and measurement
of performance. Psychol Bull 1986; 99: 181.
Statistics
Multi-Case statistical analysis: only case variation considered
Agresti A. A survey of models for repeated
ordered categorical response data. Statistics in Medicine 1989; 8: 1209.
Bamber D. The area above the ordinal dominance
graph and the area below the receiver operating graph. J Math Psych 1975;
12: 387.
Bandos AI, Rockette HE, Gur D. A permutation
test sensitive to differences in areas for comparing ROC curves from a
paired design. Statistics in Medicine 2005; 24: 2873-2893.
DeLong ER, DeLong DM, Clarke-Pearson DL.
Comparing the areas under two or more correlated receiver operating
characteristic curves: a nonparametric approach. Biometrics 1988; 44: 837.
Hajian-Tilaki KO, Hanley JA. Comparison of three
methods for estimating the standard error of the area under the curve in ROC
analysis of quantitative data. Academic Radiol 2002; 9: 1278-1285.
Hanley JA, McNeil BJ. The meaning and use of the
area under a receiver operating characteristic (ROC) curve. Radiology 1982;
143: 29.
Hanley JA, McNeil BJ. A method of comparing the
areas under receiver operating characteristic curves derived from the same
cases. Radiology 1983; 148: 839.
Jiang Y, Metz CE, Nishikawa RM. A receiver
operating characteristic partial area index for highly sensitive diagnostic
tests. Radiology 1996; 201: 745.
Ma G, Hall WJ. Confidence bands for receiver
operating characteristic curves. Med Decis Making 1993; 13: 191.
McClish DK. Analyzing a portion of the ROC
curve. Med Decis Making 1989; 9: 190.
McClish DK. Determining a range of
false-positive rates for which ROC curves differ. Med Decis Making 1990; 10:
283.
McNeil BJ, Hanley JA. Statistical approaches to
the analysis of receiver operating characteristic (ROC) curves. Med Decis
Making 1984; 4: 137.
Metz CE. Statistical analysis of ROC data in
evaluating diagnostic performance. In: Multiple regression analysis:
applications in the health sciences (D Herbert and R Myers, eds.). New York:
American Institute of Physics, 1986, p. 365.
Metz CE. Quantification of failure to
demonstrate statistical significance: the usefulness of confidence
intervals. Invest Radiol 1993; 28: 59.
Metz CE, Herman BA, Roe CA. Statistical
comparison of two ROC curve estimates obtained from partially-paired
datasets. Med Decis Making 1998; 18: 110.
Metz CE, Kronman HB. Statistical significance
tests for binormal ROC curves. J Math Psych 1980; 22: 218.
Metz CE, Wang P-L, Kronman HB. A new approach
for testing the significance of differences between ROC curves measured from
correlated data. In: Information processing in medical imaging (F Deconinck,
ed.). The Hague: Nijhoff, 1984, p. 432.
Thompson ML, Zucchini W. On the statistical
analysis of ROC curves. Statistics in Medicine 1989; 8: 1277.
Wieand S, Gail MH, James BR, James KL. A family
of nonparametric statistics for comparing diagnostic markers with paired or
unpaired data. Biometrika 1989; 76: 585.
Zhou XH, Gatsonis CA. A simple method for
comparing correlated ROC curves using incomplete data. Statistics in
Medicine 1996; 15: 1687-1693.
Multi-Reader Multi-Case statistical analysis
Bandos AI, Rockette HE, Gur D. A permutation
test for comparing ROC curves in multireader studies.
Academic Radiol 2006; 13: 414-420.
Beiden SV, Wagner RF, Campbell G.
Components-of-variance models and multiple-bootstrap experiments: an
alternative method for random-effects, receiver operating characteristic
analysis. Academic Radiol. 2000; 7: 341.
Beiden SV, Wagner RF, Campbell G, Metz CE, Jiang
Y. Components-of-variance models for random-effects ROC analysis: The case
of unequal variance structures across modalities. Academic Radiol. 2001; 8:
605.
Beiden SV, Wagner RF, Campbell G, Chan H-P.
Analysis of uncertainties in estimates of components of variance in
multivariate ROC analysis. Academic Radiol. 2001; 8: 616.
Dorfman DD, Berbaum KS, Metz CE. ROC rating
analysis: generalization to the population of readers and cases with the
jackknife method. Invest Radiol 1992; 27: 723.
Dorfman DD, Berbaum KS, Lenth RV, Chen Y-F,
Donaghy BA. Monte Carlo validation of a multireader method for receiver
operating characteristic discrete rating data: factorial experimental
design. Academic Radiol 1998; 5: 591.
Dorfman DD, Metz CE. Multi-reader multi-case ROC
analysis: comments on Begg’s commentary. Academic Radiol 1995; 2 (Supplement
1): S76.
Gallas BD. One-shot estimate of MRMC variance:
AUC. Academic Radiol 2006; 13: 353-362.
Hillis SL, Obuchowski NA, Schartz KM, Berbaum
KS. A comparison of the Dorfman-Berbaum-Metz and Obuchowski-Rockette methods
for receiver operating characteristic (ROC) data. Stat Med 2005;
24:1579-1607.
Hillis SL, Berbaum KS. Monte Carlo validation of
the Dorfman-Berbaum-Metz method using normalized pseudovalues and less
data-based model simplification. Academic Radiology 2005; 12:1534-1541.
Hillis SL, Berbaum KS. Power estimation for the
Dorfman-Berbaum-Metz method. Academic Radiol 2004; 11: 1260-1273.
Obuchowski NA. Multireader, multimodality
receiver operating characteristic curve studies: hypothesis testing and
sample size estimation using an analysis of variance approach with dependent
observations. Academic Radiol 1995; 2 [Supplement 1]: S22.
Obuchowski NA. Sample size calculations in
studies of test accuracy. Stat Methods Med Res 1998; 7: 371.
Obuchowski NA, Beiden SV, Berbaum KS, et al.
Multireader, multicase receiver operating characteristic analysis: An
empirical comparison of five methods. Academic Radiol 2004; 11: 980-995.
Rockette HE, Obuchowski N, Metz CE, Gur D.
Statistical issues in ROC curve analysis. Proc SPIE 1990; 1234: 111.
Roe CA, Metz CE. The Dorfman-Berbaum-Metz method
for statistical analysis of multi-reader, multi-modality ROC data:
validation by computer simulation. Academic Radiol 1997; 4: 298.
Roe CA, Metz CE. Variance-component modeling in
the analysis of receiver operating characteristic index estimates. Academic
Radiol 1997; 4: 587.
Regression analysis of ROC curves
Pepe MS. The
statistical evaluation of medical tests for classification and prediction.
Oxford; New York: Oxford University Press, 2004.
Pepe MS, Cai TX. The analysis of placement
values for evaluating discriminatory measures. Biometrics 2004; 60: 528-535.
Toledano A, Gatsonis CA. Regression analysis of
correlated receiver operating characteristic data. Academic Radiol 1995; 2
[Supplement 1]: S30.
Toledano AY, Gatsonis C. Ordinal regression
methodology for ROC curves derived from correlated data. Statistics in
Medicine 1996; 15: 1807.
Toledano AY, Gatsonis C. GEEs for ordinal
categorical data: arbitrary patterns of missing responses and missingness in
a key covariate. Biometrics 1999; 55: 488.
Tosteson A, Begg C. A general regression
methodology for ROC curve estimation. Med Decis Making 1988; 8: 204.
Relationships with Cost/Benefit Analysis
Halpern EJ, Alpert M, Krieger AM, Metz CE,
Maidment AD. Comparisons of ROC curves on the basis of optimal operating
points. Academic Radiology 1996; 3: 245-253.
Metz CE. Basic principles of ROC analysis.
Seminars in Nucl Med 1978; 8: 283-298.
Metz CE, Starr SJ, Lusted LB, Rossmann K.
Progress in evaluation of human observer visual detection performance using
the ROC curve approach. In: Information Processing in Scintigraphy (C
Raynaud and AE Todd-Pokropek, eds.). Orsay, France: Commissariat à l'Energie
Atomique, Département de Biologie, Service Hospitalier Frédéric Joliot,
1975, p. 420.
Phelps CE, Mushlin AI. Focusing technology
assessment. Med Decis Making 1988; 8: 279.
Sainfort F. Evaluation of medical technologies:
a generalized ROC analysis. Med Decis Making 1991; 11: 208.
Wagner RF, Beam CA, Beiden SV. Reader
variability in mammography and its implications for expected utility over
the population of readers and cases. Med Decis Making 2004; 24: 561-572.
Generalizations
Anastasio MA, Kupinski MA, Nishikawa RM.
Optimization and FROC analysis of rule-based detection schemes using a
multiobjective approach. IEEE Trans Med Imaging 1998; 17: 1089.
Bunch PC, Hamilton JF, Sanderson GK, Simmons AH.
A free response approach to the measurement and characterization of
radiographic observer performance. Proc SPIE 1977; 127: 124.
Chakraborty DP. Maximum likelihood analysis of
free-response receiver operating characteristic (FROC) data. Med Phys 1989;
16: 561.
Chakraborty DP, Winter LHL. Free-response
methodology: alternate analysis and a new observer-performance experiment.
Radiology 1990; 174: 873.
Chakraborty DP, Berbaum KS. Observer studies
involving detection and localization: Modeling, analysis and validation.
Medical Physics 2004; 31:2313-2330.
Chakraborty DP. A search model and figure of
merit for observer data acquired according to the free-response paradigm.
Phys. Med. Biol. 2006; 51:3449-3462.
Chakraborty DP. ROC Curves predicted by a model
of visual search. Phys. Med. Biol. 2006; 51:3463-3482.
Edwards DC, Metz CE. Evaluating Bayesian ANN
estimates of ideal observer decision variables by comparison with identity
functions. Proc. SPIE 5749: 174-182, 2005.
Edwards DC, Metz CE. Optimization of an ROC
hypersurface constructed only from an observer's within-class sensitivities.
Proc. SPIE 6146: 61460A1-61460A7, 2006.
Edwards DC, Metz CE. Analysis of proposed
three-class classification decision rules in terms of the ideal observer
decision rule. J. Math. Psych. (in press), 2006.
Egan JP, Greenberg GZ, Schulman AI. Operating
characteristics, signal detection, and the method of free response. J Acoust
Soc Am 1961; 33: 993.
Hajian-Tilaki KO, Hanley JA, Joseph L, et al.
Extension of receiver operating characteristic analysis to data concerning
multiple signal detection tasks. Academic Radiol 1997; 4: 222-229.
Metz CE, Starr SJ, Lusted LB. Observer
performance in detecting multiple radiographic signals: prediction and
analysis using a generalized ROC approach. Radiology 1976; 121: 337.
Obuchowski NA, Lieber ML, Powell KA. Data
analysis for detection and localization of multiple abnormalities with
application to mammography. Academic Radiol 2000; 7: 516-525.
Starr SJ, Metz CE, Lusted LB, Goodenough DJ.
Visual detection and localization of radiographic images. Radiology 1975;
116: 533.
Swensson RG. Unified measurement of observer
performance in detecting and localizing target objects on images. Med Phys
1996; 23: 1709.
Papers related specifically to our Current Software
ROCKIT
Dorfman DD, Alf E. Maximum likelihood estimation
of parameters of signal detection theory and determination of confidence
intervals — rating method data. J Math Psych 1969; 6: 487.
Metz CE, Herman BA, Shen J-H. Maximum-likelihood
estimation of ROC curves from continuously-distributed data. Stat Med 1998;
17: 1033.
Metz CE, Herman BA, Roe CA. Statistical
comparison of two ROC curve estimates obtained from partially-paired
datasets. Med Decis Making 1998; 18: 110.
Metz CE. Statistical analysis of ROC data in
evaluating diagnostic performance. In: Multiple regression analysis:
applications in the health sciences (D Herbert and R Myers, eds.). New York:
American Institute of Physics, 1986, p. 365.
Metz CE. Quantification of failure to
demonstrate statistical significance: the usefulness of confidence
intervals. Invest Radiol 1993; 28: 59.
LABMRMC & MRMC
Dorfman DD, Berbaum KS, Metz CE. ROC rating
analysis: generalization to the population of readers and cases with the
jackknife method. Invest Radiol 1992; 27: 723.
Dorfman DD, Metz CE. Multi-reader multi-case ROC
analysis: comments on Begg’s commentary. Academic Radiol 1995; 2 (Supplement
1): S76.
Roe CA, Metz CE. The Dorfman-Berbaum-Metz method
for statistical analysis of multi-reader, multi-modality ROC data:
validation by computer simulation. Academic Radiol 1997; 4: 298.
Roe CA, Metz CE. Variance-component modeling in
the analysis of receiver operating characteristic index estimates. Academic
Radiol 1997; 4: 587.
Hillis SL, Obuchowski NA, Schartz KM, Berbaum
KS. A comparison of the Dorfman-Berbaum-Metz and Obuchowski-Rockette methods
for receiver operating characteristic (ROC) data. Stat Med 2005;
24:1579-1607.
Hillis SL, Berbaum KS. Monte Carlo validation of
the Dorfman-Berbaum-Metz method using normalized pseudovalues and less
data-based model simplification. Academic Radiology 2005; 12:1534-1541.
LABROC4
Metz CE, Herman BA, Shen J-H. Maximum-likelihood
estimation of ROC curves from continuously-distributed data. Stat Med 1998;
17: 1033.
ROCPWR
Metz CE, Wang P-L, Kronman HB. A new approach
for testing the significance of differences between ROC curves measured from
correlated data. In: Information processing in medical imaging (F Deconinck,
ed.). The Hague: Nijhoff, 1984, p. 432.
PROPROC
Pan X, Metz CE. The "proper" binormal model:
parametric ROC curve estimation with degenerate data. Academic Radiol 1997;
4: 380.
Metz CE, Pan X. "Proper" binormal ROC curves:
theory and maximum-likelihood estimation. J Math Psych 1999; 43: 1.