Confusion Matrix

A confusion matrix is a performance measurement tool for classification problems. It is used to evaluate how accurately a classification model predicts each class by comparing the predicted labels to the actual (true) labels. A typical confusion matrix for a binary classification problem is a 2x2 table with the following structure:
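                     Predicted Positive      Predicted Negative
Actual Positive      True Positive (TP)      False Negative (FN)
Actual Negative      False Positive (FP)     True Negative (TN)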



To understand and use a confusion matrix, we first need to define the following terms, since the metrics calculated from the matrix are built on them.



Key Terms:

True Positive (TP): The number of positive instances that were correctly classified as positive. 
True Negative (TN): The number of negative instances that were correctly classified as negative. 
False Positive (FP): The number of negative instances that were incorrectly classified as positive (also called a Type I error). 
False Negative (FN): The number of positive instances that were incorrectly classified as negative (also called a Type II error).

The confusion matrix can help assess how well the model distinguishes normal vs. anomalous data. False positives (normal data classified as anomalous) and false negatives (anomalous data classified as normal) are especially critical here.
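As a quick illustration, the four counts can be tallied directly from arrays of actual and predicted labels. The sketch below uses a small, made-up set of spam-filter labels (separate from the example later in this post), treating 'Spam' as the positive class:

import numpy as np

# Illustrative labels only; 'Spam' is the positive class
actual = np.array(['Spam', 'Spam', 'Not Spam', 'Not Spam', 'Spam'])
predicted = np.array(['Spam', 'Not Spam', 'Not Spam', 'Spam', 'Spam'])

# Tally each cell of the confusion matrix by hand
TP = np.sum((actual == 'Spam') & (predicted == 'Spam'))          # spam correctly flagged
TN = np.sum((actual == 'Not Spam') & (predicted == 'Not Spam'))  # normal mail correctly passed
FP = np.sum((actual == 'Not Spam') & (predicted == 'Spam'))      # Type I error
FN = np.sum((actual == 'Spam') & (predicted == 'Not Spam'))      # Type II error

print(TP, TN, FP, FN)  # 2 1 1 1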

Accuracy, Precision, Recall, and F1-Score are key performance metrics for evaluating the effectiveness of classification models. They are especially important in understanding how well a model handles different types of errors. Let’s break each of them down, with a focus on how they are calculated and when they are useful:

Accuracy

Accuracy is the proportion of correct predictions (both true positives and true negatives) to the total number of predictions. Accuracy = (TP + TN) / (TP + TN + FP + FN)
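For example, a model that produces TP = 50, TN = 40, FP = 5, and FN = 10 on 105 instances has Accuracy = (50 + 40) / 105 ≈ 0.857.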

Precision

Precision is the proportion of correct positive predictions out of all the instances that were predicted as positive. In other words, it answers the question: Of all the instances the model classified as positive, how many were actually positive? Precision = TP/(TP + FP)
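Using the same example counts, Precision = 50 / (50 + 5) ≈ 0.909.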

Recall (Sensitivity or True Positive Rate)

Recall is the proportion of actual positive instances that were correctly predicted by the model. It answers the question: Of all the actual positives, how many did the model correctly identify? Recall = TP/(TP + FN)
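Using the same example counts, Recall = 50 / (50 + 10) ≈ 0.833. High precision with low recall suggests the model is cautious about flagging positives and misses some; high recall with low precision suggests it flags aggressively and raises more false alarms.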

Confusion Matrix with Python

Import the libraries
Import the required libraries for computing metrics and plotting.
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix, precision_score, recall_score, f1_score
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.metrics import classification_report
Spam Data
Sample data with actual and predicted values.

actual = np.array(['Spam', 'Spam', 'Spam', 'Not Spam', 'Spam', 'Not Spam', 'Spam', 'Spam', 'Not Spam', 'Not Spam', 'Spam', 'Not Spam','Spam','Spam'])
predicted = np.array(['Spam', 'Not Spam', 'Spam', 'Not Spam', 'Spam', 'Spam', 'Spam', 'Spam', 'Not Spam', 'Not Spam', 'Not Spam','Spam','Not Spam','Not Spam'])
Confusion Matrix
Get the confusion matrix using sklearn.metrics
# 2. Confusion Matrix
conf_matrix = confusion_matrix(actual, predicted, labels=['Spam', 'Not Spam'])
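With labels=['Spam', 'Not Spam'], rows correspond to the actual class and columns to the predicted class, so the first row holds TP and FN for 'Spam' and the second row holds FP and TN. For the sample arrays above, the matrix works out as follows:

print(conf_matrix)
# [[5 4]
#  [2 3]]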
Display confusion matrix
Display the confusion matrix as a heatmap using seaborn.
sns.heatmap(conf_matrix, 
            annot=True,
            cmap='viridis',
            fmt='g', 
            xticklabels=['Spam','Not Spam'],
            yticklabels=['Spam','Not Spam'])
plt.ylabel('Actual', fontsize=14)
plt.title('Confusion Matrix', fontsize=17, pad=20)
plt.gca().xaxis.set_label_position('top') 
plt.xlabel('Prediction', fontsize=13)
plt.gca().xaxis.tick_top()

plt.gca().figure.subplots_adjust(bottom=0.2)
plt.show()

The above code will generate the following image.




Calculate Accuracy, Precision, Recall

# 3. Accuracy, Precision, Recall
accuracy = accuracy_score(actual, predicted)
precision = precision_score(actual, predicted, pos_label='Spam')
recall = recall_score(actual, predicted, pos_label='Spam')
print(f'Accuracy: {accuracy}') 
print(f'Precision: {precision}') 
print(f'Recall: {recall}')
Output
Accuracy: 0.5714285714285714
Precision: 0.7142857142857143
Recall: 0.5555555555555556
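These values match the counts in the matrix: Accuracy = (5 + 3) / 14 ≈ 0.571, Precision = 5 / (5 + 2) ≈ 0.714, and Recall = 5 / (5 + 4) ≈ 0.556. The f1_score and classification_report functions imported earlier can round out the evaluation; the lines below are a minimal follow-up using the same actual and predicted arrays:

# 4. F1-Score and per-class report
f1 = f1_score(actual, predicted, pos_label='Spam')
print(f'F1-Score: {f1}')  # harmonic mean of precision and recall; 0.625 here
print(classification_report(actual, predicted))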

Conclusion

A confusion matrix is a great way to describe model performance in terms of concrete mathematical quantities, which gives a better understanding of the effectiveness of the model and the data it was evaluated on than a single accuracy number alone.
