Importance of confusion matrix in Cybersecurity

M G GOVARDHAN GOWDA
4 min read · Jun 7, 2021

Are you concerned about cybercrime, and curious how the confusion matrix relates to it? This blog explains the connection.

Cybercrime

What is cybercrime?

Cybercrime is a criminal activity that either targets or uses a computer, a computer network, or a networked device. Most cybercrime is an attack on information about individuals, corporations, or governments. Although the attacks do not take place on a physical body, they do take place on the personal or corporate virtual body, which is the set of informational attributes that define people and institutions on the Internet. In other words, in the digital age, our virtual identities are essential elements of everyday life: we are a bundle of numbers and identifiers in multiple computer databases owned by governments and corporations. Some cybercrime is committed by cybercriminals or hackers who want to make money.

Here are five common types of cybercrime:

  1. Phishing Scams.
  2. Website Spoofing.
  3. Ransomware.
  4. Malware.
  5. IoT Hacking.

Now let us look at the Confusion Matrix

When we get the data, after data cleaning, pre-processing and wrangling, the first step is to feed it to a model and get output, usually as probabilities. But hold on! How can we measure the effectiveness of our model? The better the effectiveness, the better the performance, and that is exactly what we want. This is where the confusion matrix comes into the limelight.

A confusion matrix is a performance measurement for machine learning classification problems where the output can be two or more classes. For a two-class problem, it is a table with 4 different combinations of predicted and actual values.

For a 2 x 2 matrix we have 4 outcomes:

  • True Positive (TP)
  • False Positive (FP)
  • True Negative (TN)
  • False Negative (FN)
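As a rough illustration, the four outcomes above can be tallied from paired lists of actual and predicted labels. The sketch below is only an example: it assumes labels are encoded as 1 (attack) and 0 (no attack), and the function name `confusion_counts` and the sample data are made up for this post.

```python
def confusion_counts(actual, predicted):
    """Tally TP, FP, TN, FN for binary labels (assumed: 1 = attack, 0 = no attack)."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for a, p in zip(actual, predicted):
        if p == 1 and a == 1:
            counts["TP"] += 1   # predicted attack, attack happened
        elif p == 1 and a == 0:
            counts["FP"] += 1   # predicted attack, nothing happened
        elif p == 0 and a == 0:
            counts["TN"] += 1   # predicted no attack, nothing happened
        elif p == 0 and a == 1:
            counts["FN"] += 1   # predicted no attack, attack happened
        else:
            raise ValueError("labels must be 0 or 1")
    return counts

# Hypothetical labels for eight events
actual    = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 0, 1, 0]
counts = confusion_counts(actual, predicted)
print(counts)  # {'TP': 3, 'FP': 1, 'TN': 3, 'FN': 1}
```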

Let’s understand TP, FP, FN, TN in terms of cybercrime.

1) True Positive:

Interpretation: You predicted positive, and the prediction is true (the actual value is positive).

In cybercrime terms: You predicted that cybercrime had happened, and it actually had.

2) False Positive: (Type 1 Error)

Interpretation: You predicted positive, but the prediction is false (the actual value is negative).

In cybercrime terms: You predicted that cybercrime had happened, but it actually had not.

3) True Negative:

Interpretation: You predicted negative, and the prediction is true (the actual value is negative).

In cybercrime terms: You predicted that cybercrime had not happened, and it actually had not.

4) False Negative: (Type 2 Error)

Interpretation: You predicted negative, but the prediction is false (the actual value is positive).

In cybercrime terms: You predicted that cybercrime had not happened, but it actually had.

The false negative is the most critical outcome, because a cyber attack actually happened and the machine learning model did not inform the organization. This can cause huge losses, because the organization does not get the information at the right time and takes no immediate action after the attack.
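One way to put a number on missed attacks is the miss rate (also called the false negative rate), FN / (TP + FN): the share of real attacks the model failed to flag. The snippet below is a minimal sketch; the function name `miss_rate` and the counts are hypothetical, not taken from any real system.

```python
def miss_rate(tp, fn):
    """Fraction of real attacks the model failed to flag: FN / (TP + FN)."""
    actual_attacks = tp + fn
    if actual_attacks == 0:
        return 0.0  # no real attacks in the data, so nothing was missed
    return fn / actual_attacks

# Hypothetical counts: 90 attacks detected, 10 attacks missed
rate = miss_rate(tp=90, fn=10)
print(f"Miss rate: {rate:.0%}")  # Miss rate: 10%
```

A lower miss rate means fewer attacks slip through unnoticed, which is usually what a security team cares about most.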

Just remember: Positive and Negative describe the predicted value, while True and False describe whether that prediction matched the actual value.

Conclusion

A confusion matrix is a tabular summary of the number of correct and incorrect predictions made by a classifier. It is used to evaluate a classification model on a given set of test data, typically through performance metrics such as accuracy and precision. It can only be built if the true values for the test data are known. The matrix itself is easy to understand and implement when testing an ML model.
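To show how such metrics come out of the matrix, here is a small sketch that computes accuracy and precision from the four counts. The formulas are the standard ones; the function names and the counts are hypothetical examples.

```python
def accuracy(tp, fp, tn, fn):
    """Share of all predictions that were correct: (TP + TN) / total."""
    total = tp + fp + tn + fn
    return (tp + tn) / total if total else 0.0

def precision(tp, fp):
    """Share of raised alerts that were real attacks: TP / (TP + FP)."""
    alerts = tp + fp
    return tp / alerts if alerts else 0.0

# Hypothetical counts from a test set of 100 events
tp, fp, tn, fn = 40, 10, 45, 5
acc = accuracy(tp, fp, tn, fn)
prec = precision(tp, fp)
print(f"Accuracy:  {acc:.2f}")   # Accuracy:  0.85
print(f"Precision: {prec:.2f}")  # Precision: 0.80
```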

M G GOVARDHAN GOWDA

MLOPS internship trainee @LinuxWorld informatics Pvt. LTD. || Student @Dayananda Sagar University