7.2.3 - Receiver Operating Characteristic Curve (ROC)


A Receiver Operating Characteristic (ROC) curve is a standard technique for summarizing classifier performance over a range of trade-offs between true positive (TP) rates and false positive (FP) rates (Swets, 1988). The ROC curve is a plot of sensitivity (the ability of the model to predict an event correctly) versus 1 - specificity for the possible cut-off classification probability values \(\pi_0\).

For logistic regression, you can create a 2 × 2 classification table that cross-classifies the predicted response, \(\hat{y}=0\) or 1, against the true value, \(y=0\) or 1. Whether the prediction is \(\hat{y}=1\) depends on some cut-off probability, \(\pi_0\): for example, \(\hat{y}=1\) if \(\hat{\pi}_i>\pi_0\) and \(\hat{y}=0\) if \(\hat{\pi}_i \leq \pi_0\). The most common choice is \(\pi_0 = 0.5\). Then \(\text{sensitivity}=P(\hat{y}=1|y=1)\) and \(\text{specificity}=P(\hat{y}=0|y=0)\).
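As a minimal sketch of the classification rule above, the following Python snippet builds the 2 × 2 table and computes sensitivity and specificity for one cut-off. The function names and the eight observations are hypothetical and chosen only for illustration; they are not the assay data used by the SAS and R programs in this section.

```python
def classification_table(y_true, pi_hat, cutoff=0.5):
    """Count (tp, fn, fp, tn), predicting y_hat = 1 when pi_hat > cutoff."""
    tp = fn = fp = tn = 0
    for y, p in zip(y_true, pi_hat):
        y_hat = 1 if p > cutoff else 0
        if y == 1 and y_hat == 1:
            tp += 1          # event correctly predicted
        elif y == 1:
            fn += 1          # event missed
        elif y_hat == 1:
            fp += 1          # non-event predicted as event
        else:
            tn += 1          # non-event correctly predicted
    return tp, fn, fp, tn


def sensitivity_specificity(y_true, pi_hat, cutoff=0.5):
    """Return (P(y_hat = 1 | y = 1), P(y_hat = 0 | y = 0)) at the given cut-off."""
    tp, fn, fp, tn = classification_table(y_true, pi_hat, cutoff)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical responses and fitted probabilities (illustration only)
y = [1, 1, 1, 1, 0, 0, 0, 0]
pi_hat = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]

sens, spec = sensitivity_specificity(y, pi_hat, cutoff=0.5)
print(sens, spec)  # 0.75 0.75
```

Changing `cutoff` changes both quantities, usually in opposite directions, which is the trade-off the ROC curve displays.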

The ROC curve is more informative than the classification table since it summarizes the predictive power for all possible π0.

The position of the ROC on the graph reflects the accuracy of the diagnostic test. It covers all possible thresholds (cut-off points). The ROC of random guessing lies on the diagonal line. The ROC of a perfect diagnostic technique is a point at the upper left corner of the graph, where the TP proportion is 1.0 and the FP proportion is 0.
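To illustrate how the curve covers all thresholds, here is a rough Python sketch that computes one (1 - specificity, sensitivity) point per cut-off. It assumes the rule \(\hat{y}=1\) when \(\hat{\pi}_i>\pi_0\); the function name `roc_points` and the data are hypothetical, and SAS or R may choose the cut-off grid differently.

```python
def roc_points(y_true, pi_hat):
    """Return (false positive rate, true positive rate) pairs, one per cut-off.

    Cut-offs are the distinct fitted probabilities in decreasing order, plus
    one cut-off below all of them so the curve ends at (1, 1).
    """
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    cutoffs = sorted(set(pi_hat), reverse=True) + [float("-inf")]
    points = []
    for c in cutoffs:
        tp = sum(1 for y, p in zip(y_true, pi_hat) if y == 1 and p > c)
        fp = sum(1 for y, p in zip(y_true, pi_hat) if y == 0 and p > c)
        points.append((fp / n_neg, tp / n_pos))
    return points


# Hypothetical fitted probabilities that separate the two groups perfectly
y_perfect = [1, 1, 0, 0]
pi_perfect = [0.9, 0.8, 0.2, 0.1]
print(roc_points(y_perfect, pi_perfect))
# [(0.0, 0.0), (0.0, 0.5), (0.0, 1.0), (0.5, 1.0), (1.0, 1.0)]
```

In the perfectly separated example the curve passes through the upper left corner (0, 1); fitted probabilities that carry no information about the response would instead trace points near the diagonal.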

The Area Under the Curve (AUC), also referred to as the index of accuracy (A) or, in SAS, the concordance index (c), is an accepted traditional performance metric for a ROC curve. The higher the area under the curve, the better the predictive power of the model. For example, c = 0.8 can be interpreted to mean that a randomly selected individual from the positive group has a test value larger than that of a randomly chosen individual from the negative group 80 percent of the time.
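The pairwise interpretation can be checked directly. The sketch below computes c as the proportion of (positive, negative) pairs in which the positive observation has the larger fitted probability, counting ties as one half, which is a common convention assumed here. The function name and data are hypothetical, and this is not a reproduction of the SAS computation.

```python
def concordance_index(y_true, pi_hat):
    """Proportion of (positive, negative) pairs ranked correctly; ties count 1/2."""
    pos = [p for y, p in zip(y_true, pi_hat) if y == 1]
    neg = [p for y, p in zip(y_true, pi_hat) if y == 0]
    score = 0.0
    for p_pos in pos:
        for p_neg in neg:
            if p_pos > p_neg:
                score += 1.0   # concordant pair
            elif p_pos == p_neg:
                score += 0.5   # tied pair
    return score / (len(pos) * len(neg))


# Hypothetical responses and fitted probabilities (illustration only)
y = [1, 1, 1, 1, 0, 0, 0, 0]
pi_hat = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]

print(concordance_index(y, pi_hat))  # 0.8125
```

Here 13 of the 16 positive-negative pairs are concordant, so c = 13/16 = 0.8125, which matches the area under the empirical ROC curve for the same data.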

For more details, see Agresti (2007), Sections 5.1.6-5.1.8; Agresti (2013), Section 6.2; and the SAS links provided at the beginning.

Here is the SAS program assay4.sas.

SAS program

Here is the resulting ROC graph.

plot

The area under the curve is c = 0.746, which indicates good predictive power of the model.

SAS output

The ctable option prints the classification tables for various cut-off points. Each row of this output summarizes the classification table for the specified Prob Level, \(\pi_0\).

table

Here is the R program file assay.R that corresponds to the SAS program assay4.sas.

R program code

Here is the ROC graph from R output:

R output plot

The area under the curve is c = 0.746, which indicates good predictive power of the model.

R output