
Confusion Matrix

What is a Confusion Matrix?

A confusion matrix is a table that summarizes how well a model is performing by comparing the model’s predictions with the actual ground truth labels. In machine learning (especially for computer vision tasks), a confusion matrix is a specific table layout where each row represents the instances of an actual class and each column represents the instances of a predicted class (or vice versa). The entries along the diagonal are the number of correct predictions for each class, while the off‑diagonal entries show where the model’s predictions were wrong—essentially, where the model got “confused.” This matrix provides a much more detailed snapshot of a model’s performance than a single metric like overall accuracy and lets you see which classes are being mistaken for which others.


Breaking Down the Confusion Matrix Components

Let’s break down the confusion matrix using a simple example. Imagine a binary image classifier that predicts whether a photo contains a cat (positive class) or not (negative class).

  • True Positives (TP): The model predicted “cat” and the image actually contains a cat.
  • False Positives (FP): The model predicted “cat” but the image does not contain a cat (a false alarm).
  • False Negatives (FN): The model predicted “no cat” but the image does contain a cat (a missed detection).
  • True Negatives (TN): The model predicted “no cat” and the image indeed has no cat.
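
To make these definitions concrete, here is a minimal Python sketch that tallies the four counts from a handful of hypothetical labels and predictions (the lists below are invented for illustration):

```python
# Hypothetical ground-truth labels and model predictions for the cat example
y_true = ["cat", "cat", "no cat", "cat", "no cat", "no cat"]
y_pred = ["cat", "no cat", "cat", "cat", "no cat", "no cat"]

tp = sum(t == "cat" and p == "cat" for t, p in zip(y_true, y_pred))        # correct "cat" calls
fp = sum(t == "no cat" and p == "cat" for t, p in zip(y_true, y_pred))     # false alarms
fn = sum(t == "cat" and p == "no cat" for t, p in zip(y_true, y_pred))     # missed cats
tn = sum(t == "no cat" and p == "no cat" for t, p in zip(y_true, y_pred))  # correct "no cat" calls

print(tp, fp, fn, tn)  # 2 1 1 2
```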

In multi‑class settings, the confusion matrix expands to an N × N grid. Each cell (i, j) shows how many examples of class i were classified as class j. The diagonal entries are still the correctly predicted instances for each class, while off‑diagonal values reveal specific misclassifications—valuable hints about where your model is tripping up.
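As a quick sketch of the multi-class case, the snippet below builds a 3 × 3 matrix with scikit-learn's confusion_matrix function (assuming scikit-learn is installed; the three classes and the label lists are invented for illustration):

```python
from sklearn.metrics import confusion_matrix

# Hypothetical three-class labels: rows = actual class, columns = predicted class
labels = ["cat", "dog", "fox"]
y_true = ["cat", "cat", "dog", "fox", "fox", "dog", "cat", "fox"]
y_pred = ["cat", "dog", "dog", "fox", "cat", "dog", "cat", "fox"]

cm = confusion_matrix(y_true, y_pred, labels=labels)
print(cm)
# [[2 1 0]    <- actual "cat": 2 correct, 1 classified as "dog"
#  [0 2 0]    <- actual "dog": all correct
#  [1 0 2]]   <- actual "fox": 1 classified as "cat", 2 correct
```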

Metrics Derived from the Confusion Matrix

Because the confusion matrix captures TP, FP, FN, and TN, you can derive several popular evaluation metrics:

  • Accuracy = (TP + TN) / (TP + TN + FP + FN)
  • Precision = TP / (TP + FP) — when the model predicts a class, how often is it right?
  • Recall = TP / (TP + FN) — of all the actual instances of a class, how many did the model catch?
  • F1 Score = 2 × (precision × recall) / (precision + recall) — the harmonic mean of precision and recall.

Each metric answers a different question, so relying on a single number can be misleading, especially with imbalanced datasets.
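
For a concrete example, the snippet below derives all four metrics from a hypothetical set of binary counts:

```python
# Hypothetical counts from a binary confusion matrix
tp, fp, fn, tn = 80, 10, 20, 90

accuracy = (tp + tn) / (tp + tn + fp + fn)           # 0.850
precision = tp / (tp + fp)                           # 0.889
recall = tp / (tp + fn)                              # 0.800
f1 = 2 * precision * recall / (precision + recall)   # 0.842

print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```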

Debugging Models with Confusion Matrices and FiftyOne

Beyond summarizing performance, a confusion matrix is a practical debugging tool. If you notice your model often confuses “fox” with “cat,” you can dive into those specific samples to understand why. Tools like FiftyOne make this process interactive: generate the matrix, click a cell, and immediately inspect the misclassified images. FiftyOne’s built‑in confusion‑matrix visualization helps you spot patterns quickly—whether you need more diverse training data, different augmentations, or architectural tweaks.
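
As a rough sketch of that workflow (the dataset name and field names below are placeholders, and it assumes a FiftyOne dataset that already contains ground-truth and predicted classifications):

```python
import fiftyone as fo

# Placeholder dataset name; assumes "ground_truth" and "predictions" are
# classification fields on each sample
dataset = fo.load_dataset("my_classification_dataset")

# Compare predictions to ground truth and store the evaluation under "eval"
results = dataset.evaluate_classifications(
    "predictions",
    gt_field="ground_truth",
    eval_key="eval",
)

# Interactive confusion matrix: clicking a cell selects the samples in that
# cell so they can be inspected directly
plot = results.plot_confusion_matrix()
plot.show()

# Launch the App and attach the plot so cell selections filter the samples
session = fo.launch_app(dataset)
session.plots.attach(plot)
```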
