The Receiver Operating Characteristic (ROC) curve lets you visualize the quality of a classifier as its decision threshold (class cutoff) changes. In this tutorial, I will show you how to calculate it and give you intuition on how to use it to interpret your model.

From this entry you will learn what the ROC curve is, where its name comes from, how it is constructed from the TPR and FPR measures, how to interpret it together with the AUC-ROC value, and how to plot it in scikit-learn.

This article is part of a series on measuring classification quality. The articles so far include:

  1. Precision, Recall and F1 – Classifier Evaluation Measures
  2. Precision-Recall Curve – How to Plot and Interpret It
  3. ROC Curve (this post)

What is the ROC curve and why was it created – intuitions

Translating "Receiver Operating Characteristic" we get "receiver operating characteristic", but what is it about? Where does this name come from.

To understand the idea behind the ROC curve, let's go back to World War II. Imagine that you are a radar operator on a submarine. The entire crew depends on you, and your job is to inform them in time about an approaching enemy ship (you can choose which side you support: Allies, Germany).

What does your job look like? You sit at the radar for hours, listening and staring at the screen. Most of the time you hear only noise, sometimes animals, sometimes other ships (fishing boats, passenger ships, etc.), and very rarely an enemy ship. How do you measure your effectiveness? On the one hand, you have to detect an enemy ship reliably and report it before it fires its torpedoes. On the other hand, you cannot raise false alarms all the time (the captain and crew would quickly feed you to the sharks).

A good receiver (radar) operator must detect real threats with a high detection rate (sensitivity) and rarely raise false alarms. Let's discuss this in more detail.

How the ROC curve is created – a theoretical explanation step by step

The ROC curve presents the discriminative power of a classifier in a visual way. It is built from two measures, the True Positive Rate (TPR) and the False Positive Rate (FPR), calculated at different classifier cutoff levels (decision thresholds) and plotted on a graph. The curve is constructed similarly to the precision-recall curve, except that different values are plotted on the axes.

Okay, but what do TPR and FPR mean and what are these different classifier cutoff levels?

Let's start by explaining the TPR and FPR metrics (I encourage you to read the article on precision and recall metrics).

Let's assume that we are considering a binary classifier that recognizes two classes, "A" and "B". Let class "A" be the positive class (e.g. the email is spam, the patient is sick, or the picture shows a cat), and class "B" the negative class (the negation of class A). When we run our trained classifier and evaluate its effectiveness on the test set, 4 cases appear, which we can arrange into the so-called "confusion matrix".
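
These four cases can be laid out, for example, like this (rows are the actual class, columns are the class predicted by the model):

                          predicted "A" (positive)   predicted "B" (negative)
  actual "A" (positive)   TP (True Positive)         FN (False Negative)
  actual "B" (negative)   FP (False Positive)        TN (True Negative)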

  1. True Positive Rate (TPR), also known as sensitivity or recall: TPR = TP / (TP + FN), where:
    • TP (True Positives) – number of correctly classified positive cases
    • FN (False Negatives) – number of positive cases incorrectly classified as negative
  2. False Positive Rate (FPR): FPR = FP / (FP + TN) (both rates are computed in the short sketch after this list), where:
    • FP (False Positives) – number of negative cases wrongly classified as positive
    • TN (True Negatives) – number of correctly classified negative cases
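
To make these definitions concrete, here is a minimal sketch of how both rates can be computed with scikit-learn's confusion_matrix (the labels and predictions are made up for illustration):

from sklearn.metrics import confusion_matrix

# made-up true labels (1 = positive class "A") and classifier predictions
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

# for binary labels {0, 1}, ravel() returns the counts in the order TN, FP, FN, TP
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

tpr = tp / (tp + fn)  # True Positive Rate (sensitivity, recall)
fpr = fp / (fp + tn)  # False Positive Rate

print(f"TPR = {tpr:.2f}, FPR = {fpr:.2f}")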

The ROC curve is created by changing the decision threshold of the classifier and calculating the TPR and FPR for each threshold. The points (FPR, TPR) are then plotted on a graph to create the curve.
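
To see what "changing the decision threshold" means in practice, below is a small illustrative sketch (the labels and predicted probabilities are made up): each threshold turns the probabilities into hard class predictions, and each threshold therefore yields one (FPR, TPR) point of the curve.

import numpy as np

# made-up true labels and predicted probabilities of the positive class
y_true = np.array([0, 0, 1, 1, 0, 1, 0, 1])
y_scores = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.6, 0.9])

for threshold in [0.3, 0.5, 0.7]:
    y_pred = (y_scores >= threshold).astype(int)  # positive if the score reaches the threshold
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    tn = np.sum((y_pred == 0) & (y_true == 0))
    print(f"threshold = {threshold}: TPR = {tp / (tp + fn):.2f}, FPR = {fp / (fp + tn):.2f}")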

Interpreting the ROC curve

  1. The ideal ROC curve passes through the point (0,1), which means 100% TPR and 0% FPR.
  2. The diagonal line (y = x) represents a random classifier.
  3. The closer the curve is to the upper left corner, the better the classifier is.

Often, instead of the graph itself, we use the area under this curve as an aggregate measure. AUC-ROC (Area Under the ROC Curve) is a single number representing the performance of the classifier: a value of 1.0 means a perfect classifier, 0.5 corresponds to a random classifier (the diagonal line), and values below 0.5 mean the classifier performs worse than random guessing.
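
If you only need this single number, scikit-learn can compute it directly from the true labels and the predicted probabilities of the positive class; a minimal sketch on made-up data:

from sklearn.metrics import roc_auc_score

# made-up true labels and predicted probabilities of the positive class
y_true = [0, 0, 1, 1, 0, 1]
y_scores = [0.2, 0.4, 0.35, 0.8, 0.1, 0.9]

print(f"AUC-ROC = {roc_auc_score(y_true, y_scores):.3f}")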

How to plot an ROC curve in scikit-learn

Below is an example of implementing an ROC curve in scikit-learn:

I have prepared an environment for you on Replit: https://replit.com/@ksopyla/Krzywa-ROC-scikit-learn?v=1

import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, auc
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification

# Generating sample data
X, y = make_classification(n_samples=1000, n_classes=2, weights=[0.7, 0.3], random_state=42)

# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Model training (e.g. logistic regression)
model = LogisticRegression()
model.fit(X_train, y_train)

# Prediction of class probabilities on the test set
y_pred_proba = model.predict_proba(X_test)[:, 1]

# Calculation of FPR, TPR and thresholds
fpr, tpr, thresholds = roc_curve(y_test, y_pred_proba)

# AUC calculation
roc_auc = auc(fpr, tpr)

# Drawing the ROC curve
plt.figure()
plt.plot(fpr, tpr, color='darkorange', lw=2, label=f'ROC curve (AUC = {roc_auc:.2f})')
plt.plot([0, 1], [0, 1], color='navy', lw=2, linestyle='--')  # diagonal = random classifier
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Receiver Operating Characteristic (ROC) Curve')
plt.legend(loc="lower right")
plt.show()

As a result, we will get a graph similar to the one below.
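
As a side note, recent versions of scikit-learn (1.0+) also provide RocCurveDisplay, which builds essentially the same plot in a single call from a fitted model. A minimal sketch, assuming the model, X_test, y_test and plt from the example above:

from sklearn.metrics import RocCurveDisplay

# draws the ROC curve (with the AUC shown in the legend) directly from the fitted model
RocCurveDisplay.from_estimator(model, X_test, y_test)
plt.show()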

Additional materials

P.S. Featured image generated with Dalle3 prompt: "Please generate me a featured image for the post titled "ROC curve: Key to evaluating classifiers" The image should combine key elements of the post: ROC curve graph, reference to historical context, and aspects of data analysis and machine learning."