
Ace your Machine Learning Interview – Part 3 | by Marcello Politi | Oct, 2022


Dive into Naive Bayes Classifier using Python

This is the third article in my series “Ace your Machine Learning Interview”, in which I go over the foundations of Machine Learning. If you missed the first two articles, you can find them here:

Introduction

Naive Bayes is a Machine Learning algorithm used to solve classification problems, and it is so-called because it is based on Bayes’ theorem.

An algorithm referred to as a classifier assigns a class to each instance of data: for example, classifying whether an email is spam or not spam.

Bayes Theorem

Bayes’ Theorem is used to calculate the probability of a cause, given that the event it may have produced has been observed. The formula we have all studied in probability courses is the following.
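The original figure with the formula is not reproduced here, so I restate the standard form of the theorem:

```latex
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}
```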

So this theorem answers the question: ‘What is the probability that event A will occur, given that event B has occurred?’ The interesting thing is that this formula turns the question around: we can estimate this probability by counting, in the data, how often B occurred whenever A had occurred. In other words, we can answer the original question by looking at the past (the data).

Naive Bayes Classifier

But how then do we apply this theorem to create a Machine Learning classifier? Suppose we have a dataset consisting of n features and a target.

Therefore, our question now is ‘What is the probability of having a certain label y given that those features occurred?’

For example if y = spam/not-spam, x1 = len(email), x2 = number_of_attachments we might ask :

‘What is the probability that y is spam given that x1 = 100 chars and x2 = 2 attachments?’

To answer this question we need only apply Bayes’ theorem, with A = {y} and B = {x1, x2, …, xn}.

But the classifier is not called Bayes Classifier but Naive Bayes Classifier. This is because a naive assumption is made to simplify the calculations, that is, the features are assumed to be independent of each other. This allows us to simplify the formula.
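Since the original figure with the simplified formula is not reproduced, here is a restored version: under the independence assumption, both the likelihood and the evidence factorize over the features, giving

```latex
P(y \mid x_1, \ldots, x_n) = \frac{P(y) \prod_{i=1}^{n} P(x_i \mid y)}{P(x_1) \cdot P(x_2) \cdots P(x_n)}
```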

In this way, we can calculate the probability that y = spam. Next, we calculate the probability that y = not-spam and see which one is more likely. But if you think about it, between the two labels, the one with the higher probability will be the one with the larger numerator, since the denominator is always the same: P(x1) · P(x2) · …

For simplicity, we can therefore drop the denominator, since it does not affect the comparison.

Now we choose the class that maximizes this probability, which is exactly what argmax does.
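Restoring the decision rule from the missing figure, it reads:

```latex
\hat{y} = \arg\max_{y} \; P(y) \prod_{i=1}^{n} P(x_i \mid y)
```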

Naive Bayes Classifier for Text Data

This algorithm is often used in the field of NLP for textual data. This is because we can treat the individual words that appear in a text as features, and the naive assumption is that these words are independent of each other (which, of course, is not actually true).

Suppose we have a dataset in which each row holds a single sentence, and each column tells us whether or not a given word appears in that sentence. We have eliminated uninformative words such as articles (stop words).

Now we can calculate the probability that a new sentence is good or bad in the following way.
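As a sketch of this setup (the sentences and labels below are made up for illustration, not taken from the article), a binary bag-of-words representation paired with a Bernoulli Naive Bayes model can be built in a few lines of scikit-learn:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import BernoulliNB

# Toy dataset: one sentence per row, each labeled good or bad
sentences = ["good great movie", "great fun", "bad awful movie", "awful boring"]
labels = ["good", "good", "bad", "bad"]

# binary=True: each column records whether a word appears, not how often
vectorizer = CountVectorizer(binary=True)
X = vectorizer.fit_transform(sentences)

model = BernoulliNB()  # uses Laplace smoothing (alpha=1.0) by default
model.fit(X, labels)

# Probability that a new sentence is good or bad
new = vectorizer.transform(["great movie"])
print(model.predict(new))        # -> ['good']
print(model.predict_proba(new))  # per-class probabilities, summing to 1
```

BernoulliNB fits the binary “does this word appear?” features described above; for raw word counts, MultinomialNB would be the usual choice.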

Let’s code!

Implementing the Naive Bayes algorithm in sklearn is very simple: just a few lines of code. We will use the well-known Iris dataset, which consists of four features: sepal length, sepal width, petal length, and petal width.
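The original code screenshot is not reproduced, so here is a minimal sketch. GaussianNB is used since the Iris features are continuous; the particular split and metric are my choices, not necessarily the author’s:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

# Load the Iris dataset: 150 samples, 4 continuous features, 3 classes
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

# GaussianNB models P(x_i | y) as a per-feature, per-class Gaussian
clf = GaussianNB()
clf.fit(X_train, y_train)

y_pred = clf.predict(X_test)
print(accuracy_score(y_test, y_pred))  # typically well above 0.9 on Iris
```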

Advantages

The main benefit of the Naive Bayes algorithm is its simplicity of use. Although it is a basic and dated algorithm, it still solves some classification problems excellently and efficiently, even if its applicability is limited to specific cases. Summarizing:

  • Works well with many features
  • Works well with large training datasets
  • Converges fast during training
  • Also performs well on categorical features
  • Robust to outliers

Disadvantages

On the drawbacks side, the following should be specially mentioned. The algorithm requires knowledge of all the probabilities involved in the problem, in particular the prior and conditional probabilities, and this is often difficult and expensive information to obtain. Moreover, the algorithm provides only a “naive” approximation of the problem, because it ignores the correlations between the features of an instance.

If a probability is zero because a feature value was never observed in the training data, you have to apply Laplace smoothing.
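With Laplace (add-one) smoothing, a zero count no longer zeroes out the whole product. Using my notation (not the article’s), the smoothed conditional estimate is

```latex
P(x_i \mid y) = \frac{\operatorname{count}(x_i, y) + \alpha}{\operatorname{count}(y) + \alpha K}
```

where alpha = 1 for classic Laplace smoothing and K is the number of possible values of the feature x_i.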

Handle Missing Values

You can simply skip (marginalize over) missing values. Suppose we toss a coin 3 times, but we forgot the result of the second toss. We can sum over all the possibilities for that second toss.
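Formally, “skipping” the missing value means marginalizing over it. For the three-toss example, with the second toss x2 unknown:

```latex
P(x_1, x_3) = \sum_{x_2 \in \{H, T\}} P(x_1, x_2, x_3)
```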

Naive Bayes is one of the main algorithms to know when approaching Machine Learning. It has been used heavily, especially on problems with text data, such as spam email recognition. As we have seen, it still has its advantages and disadvantages, but when you are asked about basic Machine Learning, certainly expect a question about it!

Marcello Politi

Linkedin, Twitter, CV



