In a typical diagnostic radiology study, each case (i.e., patient) undergoes each of several diagnostic tests (or modalities), and the resulting images are interpreted by each of several readers. Often, each reader is asked to assign a confidence-of-disease rating to each case for each test, based on the corresponding image or set of images, and a receiver operating characteristic (ROC) curve is estimated for each reader from the case-level ratings. The resulting data are called multireader multicase (MRMC) data. The diagnostic tests are then compared with respect to reader-performance outcomes that are functions of the reader ROC curves. A commonly used summary outcome is the area under the ROC curve (AUC).
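To make the step from case-level ratings to an AUC concrete, here is a minimal sketch for a single reader and a single test. It assumes the empirical (Mann-Whitney) AUC estimator, which is only one of several ways to estimate the area under an ROC curve; the function name and the ratings are hypothetical and are not taken from our software.

```python
def empirical_auc(diseased, nondiseased):
    """Empirical (Mann-Whitney) AUC for one reader on one test.

    Estimates the probability that a randomly chosen diseased case receives
    a higher confidence-of-disease rating than a randomly chosen nondiseased
    case, counting ties as one half.
    """
    total = 0.0
    for x in diseased:
        for y in nondiseased:
            if x > y:
                total += 1.0
            elif x == y:
                total += 0.5
    return total / (len(diseased) * len(nondiseased))


# Hypothetical 5-point confidence ratings (illustrative values only).
diseased_ratings = [5, 4, 4, 3, 2]
nondiseased_ratings = [1, 2, 3, 1, 2]

print(empirical_auc(diseased_ratings, nondiseased_ratings))  # 0.9
```

In an MRMC study this calculation would be repeated for every reader-by-test combination, giving the table of reader-performance outcomes that the tests are compared on.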

The methods proposed by Obuchowski and Rockette (OR) [1, 2] and by Dorfman, Berbaum, and Metz (DBM) [3, 4] are the most commonly used methods for analyzing MRMC studies and have performed well in simulations. The OR procedure fits a test-by-reader ANOVA with correlated errors to reader-performance outcomes such as the AUC, while the DBM procedure fits a conventional test-by-reader-by-case ANOVA to case-specific pseudovalues. Although the two methods have been shown to be equivalent [5, 6] when based on the same procedural parameters, the OR procedure is more intuitive and its parameters are more interpretable, because it models observed reader-performance outcomes rather than pseudovalues. For these reasons, we recommend the OR procedure over the DBM procedure, even though DBM analyses can also be performed using our software.
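To illustrate what a case-specific pseudovalue is, the sketch below computes jackknife pseudovalues for one reader-by-test combination. It assumes the empirical AUC as the accuracy estimate and the usual jackknife definition (number of cases times the full-data estimate, minus number of cases less one times the estimate with that case removed). The function names and ratings are hypothetical; this is not the code used in our software.

```python
def empirical_auc(diseased, nondiseased):
    """Empirical (Mann-Whitney) AUC; ties count as one half."""
    total = 0.0
    for x in diseased:
        for y in nondiseased:
            total += 1.0 if x > y else (0.5 if x == y else 0.0)
    return total / (len(diseased) * len(nondiseased))


def jackknife_pseudovalues(diseased, nondiseased):
    """Jackknife pseudovalues for one reader-by-test combination.

    Inputs are lists of ratings, with at least two cases in each group.
    The pseudovalue for case k is n * auc_all - (n - 1) * auc_without_k,
    where n is the total number of cases. Diseased cases come first in
    the returned list, followed by nondiseased cases.
    """
    n_cases = len(diseased) + len(nondiseased)
    auc_all = empirical_auc(diseased, nondiseased)
    pseudovalues = []
    for k in range(len(diseased)):
        auc_without_k = empirical_auc(diseased[:k] + diseased[k + 1:], nondiseased)
        pseudovalues.append(n_cases * auc_all - (n_cases - 1) * auc_without_k)
    for k in range(len(nondiseased)):
        auc_without_k = empirical_auc(diseased, nondiseased[:k] + nondiseased[k + 1:])
        pseudovalues.append(n_cases * auc_all - (n_cases - 1) * auc_without_k)
    return pseudovalues


# Hypothetical 5-point confidence ratings (illustrative values only).
pv = jackknife_pseudovalues([5, 4, 4, 3, 2], [1, 2, 3, 1, 2])
print(pv)
print(sum(pv) / len(pv))  # average of the pseudovalues
```

With the empirical AUC, the pseudovalues average back to the full-data AUC, but an individual pseudovalue is not an observed quantity and can fall outside the range 0 to 1. That is one way to see why a model of the observed reader-performance outcomes is easier to interpret.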

Our lab has created software (see the Software tab) for performing OR and DBM analyses of MRMC data, as well as software for sizing future studies that will be analyzed using these methods.

References:

1. Obuchowski NA, Rockette HE. Hypothesis testing of diagnostic accuracy for multiple readers and multiple tests: an ANOVA approach with dependent observations. Communications in Statistics - Simulation and Computation 1995; 24(2), 285-308.

2. Obuchowski NA. Multireader, multimodality receiver operating characteristic curve studies: hypothesis testing and sample size estimation using an analysis of variance approach with dependent observations. Academic Radiology 1995; 2(Suppl 1), S22-S29.

3. Dorfman DD, Berbaum KS, Metz CE. Receiver operating characteristic rating analysis: generalization to the population of readers and patients with the jackknife method. Investigative Radiology 1992; 27(9), 723-731.

4. Dorfman DD, Berbaum KS, Lenth RV, Chen YF, Donaghy BA. Monte Carlo validation of a multireader method for receiver operating characteristic discrete rating data: factorial experimental design. Academic Radiology 1998; 5(9), 591-602.

5. Hillis SL, Obuchowski NA, Schartz KM, Berbaum KS. A comparison of the Dorfman-Berbaum-Metz and Obuchowski-Rockette methods for receiver operating characteristic (ROC) data. Statistics in Medicine 2005; 24, 1579-1607. doi: 10.1002/sim.2024.

6. Hillis SL. A comparison of denominator degrees of freedom methods for multiple observer ROC analysis. Statistics in Medicine 2007; 26(3), 596-619. doi: 10.1002/sim.2532.