The Receiver Operating Characteristic (ROC) curve lets you visualize the quality of a classifier as its decision threshold changes. In this tutorial, I will show you how to calculate it and give you some intuition for using it to interpret your model.
From this entry you will learn:
This article is part of a series on measuring classification quality. The articles so far include:
The name "Receiver Operating Characteristic" sounds cryptic. What is it about, and where does the name come from?
To understand the idea behind the ROC curve, let's go back to World War II. Imagine that you are a radar operator on a submarine. The entire crew depends on you, and your job is to warn them in time about an approaching enemy ship (you can choose which side you support: Allies or Germany).
What is your job like? You sit at the radar and spend hours listening and staring at the screen. Often you hear only noise, sometimes animals, sometimes other ships (fishing, passenger, etc.), and only very rarely an enemy ship. How do you measure your effectiveness? On the one hand, you have to reliably report the enemy ship before it fires its torpedoes. On the other hand, you cannot raise false alarms all the time (the captain and crew would quickly feed you to the sharks).
A good receiver (radar) operator must detect a high share of real threats (high recall) and rarely raise false alarms. Let's discuss this in more detail.
The ROC curve gives a visual profile of a classifier's discriminating power. It is built from two measures, the True Positive Rate (TPR) and the False Positive Rate (FPR), calculated at different classifier cutoff levels and plotted against each other. The curve is constructed similarly to the precision-recall curve, except that different quantities are plotted.
Okay, but what do TPR and FPR mean and what are these different classifier cutoff levels?
Let's start by explaining the TPR and FPR metrics (I encourage you to read the article on precision and recall metrics).
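Let's assume that we are considering a binary classifier that recognizes two classes, "A" and "B". Let class "A" be the positive class (e.g. the email is spam, the patient is sick, or the picture shows a cat) and class "B" the negative class (the negation of class "A"). When we run our trained classifier and want to evaluate its effectiveness on the test set, four cases are possible: true positives (TP), false positives (FP), true negatives (TN) and false negatives (FN). We can enter them into the so-called "confusion matrix". From these counts we get TPR = TP / (TP + FN), the share of actual positives that were detected, and FPR = FP / (FP + TN), the share of actual negatives that raised a false alarm.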
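To make the four cases concrete, here is a minimal sketch in plain Python that counts them and derives TPR and FPR using the standard definitions. The helper names `confusion_counts` and `tpr_fpr` and the labels are made up for illustration; they are not part of any library.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary labels where 1 = positive (class A)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # correctly raised alarms
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # correctly ignored
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed positives
    return tp, fp, tn, fn


def tpr_fpr(tp, fp, tn, fn):
    """TPR = TP / (TP + FN), FPR = FP / (FP + TN)."""
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


# Made-up labels for illustration only: 4 positives, 6 negatives
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
tpr, fpr = tpr_fpr(tp, fp, tn, fn)
print(f"TP={tp} FP={fp} TN={tn} FN={fn}")  # TP=3 FP=1 TN=5 FN=1
print(f"TPR={tpr:.2f} FPR={fpr:.2f}")      # TPR=0.75 FPR=0.17
```

In the submarine story, TPR is the share of enemy ships you actually reported, and FPR is the share of harmless contacts that made you raise the alarm.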
The ROC curve is created by changing the decision threshold of the classifier and calculating the TPR and FPR for each threshold. The points (FPR, TPR) are then plotted on a graph to create the curve.
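The threshold sweep described above can be sketched by hand in a few lines. This is a simplified illustration, not how scikit-learn implements it: the function name `roc_points` and the scores are made up, and the sketch assumes that both classes are present and that an example is classified as positive when its score is greater than or equal to the threshold.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) pairs, one per threshold, from strictest to loosest."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    # Start above every score so that the first point is (0, 0): nothing is flagged.
    thresholds = [float("inf")] + sorted(set(scores), reverse=True)
    points = []
    for thr in thresholds:
        tp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


# Made-up classifier scores for illustration only (higher = more likely positive)
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(y_true, scores)
for fpr, tpr in points:
    print(f"FPR={fpr:.2f} TPR={tpr:.2f}")
```

Lowering the threshold can only add predicted positives, so both FPR and TPR grow (or stay the same) from one point to the next, and the curve always runs from (0, 0) to (1, 1).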
Often, instead of the plot itself, we use the area under this curve (AUC-ROC, Area Under the ROC Curve) as an aggregate measure. It is a single number summarizing the performance of the classifier: 1.0 corresponds to a perfect classifier, while 0.5 corresponds to random guessing.
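To show where that single number comes from, here is a minimal sketch that computes the area with the trapezoidal rule from a list of (FPR, TPR) points. The function name `auc_trapezoid` and the three example curves are made up for illustration; the points are assumed to be sorted by FPR.

```python
def auc_trapezoid(points):
    """Area under a piecewise-linear curve given (x, y) points sorted by x."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        # Width of the segment times its average height
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hand-made ROC curves for illustration only
perfect = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)]       # hugs the top-left corner
random_guess = [(0.0, 0.0), (1.0, 1.0)]              # the diagonal
example = [(0.0, 0.0), (0.0, 0.5), (0.5, 0.5), (0.5, 1.0), (1.0, 1.0)]

print(auc_trapezoid(perfect))       # 1.0
print(auc_trapezoid(random_guess))  # 0.5
print(auc_trapezoid(example))       # 0.75
```

A curve that hugs the top-left corner covers the whole unit square, while the diagonal covers exactly half of it.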
Below is an example of implementing an ROC curve in scikit-learn:
I have prepared an environment for you in replit https://replit.com/@ksopyla/Krzywa-ROC-scikit-learn?v=1
    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.metrics import roc_curve, auc
    from sklearn.model_selection import train_test_split
    from sklearn.linear_model import LogisticRegression
    from sklearn.datasets import make_classification

    # Generate sample data
    X, y = make_classification(n_samples=1000, n_classes=2, weights=[0.7, 0.3], random_state=42)

    # Split into training and test sets
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

    # Train the model (e.g. logistic regression)
    model = LogisticRegression()
    model.fit(X_train, y_train)

    # Predict the probability of the positive class
    y_pred_proba = model.predict_proba(X_test)[:, 1]

    # Calculate FPR, TPR and thresholds
    fpr, tpr, thresholds = roc_curve(y_test, y_pred_proba)

    # Calculate AUC
    roc_auc = auc(fpr, tpr)

    # Draw the ROC curve
    plt.figure()
    plt.plot(fpr, tpr, color='darkorange', lw=2, label=f'ROC curve (AUC = {roc_auc:.2f})')
    # Diagonal reference line: a classifier that guesses at random
    plt.plot([0, 1], [0, 1], color='navy', lw=2, linestyle='--')
    plt.xlim([0.0, 1.0])
    plt.ylim([0.0, 1.05])
    plt.xlabel('False Positive Rate')
    plt.ylabel('True Positive Rate')
    plt.title('Receiver Operating Characteristic (ROC) Curve')
    plt.legend(loc="lower right")
    plt.show()
As a result, we will get a graph similar to the one below
P.S. Featured image generated with Dalle3 prompt: "Please generate me a featured image for the post titled "ROC curve: Key to evaluating classifiers" The image should combine key elements of the post: ROC curve graph, reference to historical context, and aspects of data analysis and machine learning."