| ROC | |
|---|---|
| Name | Receiver operating characteristic |
| Caption | Example of a receiver operating characteristic curve |
| Field | Signal detection theory, Statistics, Machine learning |
| Introduced | 1940s (World War II radar research) |
| Common uses | Diagnostic testing, Radar, Remote sensing, Medical imaging |
ROC
Receiver operating characteristic (ROC) analysis is a graphical and analytical tool used in Signal detection theory, Statistics, Machine learning, Diagnostic test evaluation, and Remote sensing to assess binary classifier performance. It summarizes the trade-off between the true positive rate and the false positive rate across decision thresholds, facilitating comparison among models such as Logistic regression, Support vector machine, and Random forest classifiers. Widely applied in fields such as Medical imaging, Epidemiology, Radar systems, and Credit scoring, it underpins methods for threshold selection, model calibration, and decision analysis.
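The threshold trade-off described above can be illustrated with a minimal Python sketch (the scores and labels are made-up toy data, not from any cited study):

```python
# Minimal sketch: true/false positive rates of a score-based binary
# classifier at a given decision threshold. Toy data only.

def rates_at_threshold(scores, labels, threshold):
    """Return (tpr, fpr) when predicting positive iff score >= threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    pos = sum(labels)
    neg = len(labels) - pos
    return tp / pos, fp / neg

scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

# Lowering the threshold raises both rates: the ROC trade-off.
print(rates_at_threshold(scores, labels, 0.75))  # strict threshold
print(rates_at_threshold(scores, labels, 0.25))  # lenient threshold
```

Sweeping the threshold from strict to lenient traces the ROC curve from (0, 0) toward (1, 1).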
Origins trace to mid-20th-century work on Radar signal detection during World War II and to psychophysics studies by researchers associated with Signal detection theory at institutions including Harvard University and Bell Labs. Postwar adoption in Meteorology and Remote sensing expanded its use, while formal statistical treatments emerged from decision theory in the tradition of John von Neumann and from later statistical work influenced by Bradley Efron. The area under the curve (AUC) concept was popularized in diagnostic medicine by researchers at Johns Hopkins University and in machine learning via conferences such as NeurIPS and ICML.
Key quantities include the true positive rate (sensitivity), the false positive rate (1 − specificity), and threshold-dependent classification rules as used in Logistic regression and Linear discriminant analysis. The area under the ROC curve (AUC) gives a scalar summary equivalent, up to normalization, to the Mann–Whitney U statistic, and relates to the concordance measures used in assessing the Cox proportional hazards model. The concepts of likelihood ratio, decision threshold, and cost-weighted errors connect to the Neyman–Pearson lemma and to applied ROC analysis in studies registered on ClinicalTrials.gov.
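The AUC/Mann–Whitney connection can be sketched directly: the AUC equals the probability that a randomly chosen positive instance outscores a randomly chosen negative one (with ties counted as one half). A minimal illustration on toy data:

```python
# Minimal sketch of the AUC / Mann-Whitney U relationship: AUC is the
# probability that a random positive outscores a random negative,
# counting ties as 1/2. Toy data only.

def auc_mann_whitney(scores, labels):
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 1, 0, 0]
print(auc_mann_whitney(scores, labels))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two classes.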
Constructing a curve typically involves scoring instances with models such as Naive Bayes, Gradient boosting machine, or Neural network classifiers, then plotting sensitivity against the false positive rate across score thresholds. Empirical curves derive from ranked score lists and can be smoothed using Kernel density estimation or parametric fits such as the binormal model used in Psychometrics. Interpretation often involves comparing curves with nonparametric tests such as the DeLong test, and using the partial AUC for regions of practical interest, as applied in publications in American Medical Association journals and proceedings of IEEE conferences.
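The empirical construction from a ranked score list can be sketched as follows, together with the standard trapezoidal estimate of the AUC (toy data again; real pipelines typically use a library implementation):

```python
# Minimal sketch of building an empirical ROC curve from ranked scores,
# plus a trapezoidal AUC estimate. Toy data only.

def roc_points(scores, labels):
    """Return (fpr, tpr) points, sweeping the threshold high -> low."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    pos = sum(labels)
    neg = len(labels) - pos
    tp = fp = 0
    points = [(0.0, 0.0)]
    for _, y in ranked:
        if y == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / neg, tp / pos))
    return points

def trapezoid_auc(points):
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))

scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 1, 0, 0]
pts = roc_points(scores, labels)
print(trapezoid_auc(pts))
```

In the absence of tied scores, the trapezoidal AUC agrees exactly with the Mann–Whitney formulation.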
Beyond the AUC, related metrics include the partial AUC, Youden's J statistic, and the Net Reclassification Improvement (NRI), which appear in guideline studies from the European Society of Cardiology and the American Heart Association. Calibration measures such as the Brier score, and discrimination indices such as the C-statistic in survival analysis, relate to ROC-based assessment in research from National Institutes of Health-funded groups. Precision–recall curves are often compared with ROC curves under class imbalance, a topic discussed in papers presented at KDD and ICML.
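Youden's J statistic, mentioned above, is J = sensitivity + specificity − 1 = TPR − FPR, maximized over thresholds; the maximizing threshold is a common threshold-selection rule. A minimal sketch on toy data:

```python
# Minimal sketch of Youden's J statistic: J = max over thresholds of
# (sensitivity + specificity - 1) = max(TPR - FPR). Toy data only.

def youdens_j(scores, labels):
    """Return (best J, threshold achieving it), predicting positive
    when score >= threshold."""
    pos = sum(labels)
    neg = len(labels) - pos
    best_j, best_t = 0.0, None
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        j = tp / pos - fp / neg
        if j > best_j:
            best_j, best_t = j, t
    return best_j, best_t

scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 1, 0, 0]
print(youdens_j(scores, labels))
```

Geometrically, J is the maximum vertical distance between the ROC curve and the chance diagonal.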
In Medical imaging, ROC analysis evaluates diagnostic modalities such as Magnetic resonance imaging, Computed tomography, and Mammography in multicenter trials coordinated by agencies such as the National Cancer Institute. In Radar engineering, ROC analysis underlies detection thresholds for systems developed by organizations including Raytheon and Lockheed Martin. Ecological remote sensing studies using MODIS or Landsat products employ ROC curves for land-cover classification accuracy assessment, as cited in work from NASA and USGS. Financial institutions use ROC-based AUC for credit-scoring validation in reports from Federal Reserve-linked research groups.
Limitations include the insensitivity of the full AUC to the clinically relevant operating range, potential misinterpretation under severe class imbalance, and the lack of direct incorporation of costs or prevalence without extensions such as decision curve analysis, endorsed in BMJ methodological papers. Extensions include cost-weighted ROC analysis, covariate-adjusted ROC curves, ROC surfaces for ordinal outcomes, time-dependent ROC curves for censored survival outcomes popularized in studies from Johns Hopkins University and the University of Pennsylvania, and multiclass generalizations using one-vs-rest schemes discussed at NeurIPS and in Springer textbooks.
Category:Statistical charts