
Updated Machine Learning Specialization by DeepLearning.AI | by Dimid



Everything new is a well-forgotten old, isn’t it?

Photo by Dietmar Becker on Unsplash

On April 18th, 2022, at Coursera’s 10th anniversary, Andrew Ng (founder of DeepLearning.AI, co-founder of Coursera, and one of the most famous popularizers of machine learning and AI) released a new Machine Learning Specialization. It is an updated version of the original Stanford Machine Learning course created back in 2010, which has been taken by about 5 million people since its initial release.

I took the original course some time ago, and I took the updated one as soon as it was completed in July. Let’s take a closer look at these learning materials and find out what exactly has been updated and whether it is worth taking this course in 2022.

TL;DR

Yes, it is. In my view, this is now the best way to start your machine learning journey, especially if you are a complete beginner and have never heard about data science and machine learning before.

In my previous article on this topic, I shared my thoughts about Stanford’s original Machine Learning course. There I gave my personal opinion about the drawbacks of the course and suggestions on what it would be worth supplementing it with.

Now that I have taken both versions, original and updated, my further reasoning will rely on the theses expressed in that article (mostly in its Comparison with the Original Course section). So, if you want to get familiar with my initial thoughts, the link to my previous article is shared below.

In this section, I’m going to compare the two versions of the course, based largely on what I’ve stated before. I don’t know whether someone from the DeepLearning.AI team read my review, but almost all of my claims were properly considered and addressed. 🙂

Outdatedness

In my previous article, I said that the main drawback of this course is that it is outdated. Of course, I was not the only one who understood this, and I think this was the main reason for the update.

Although the basics the course covers will never change, it’s much more pleasant to know that all the information you receive, including assignments and practical examples, is current and relevant today.

Given the numerous advantages of the course, eliminating this one disadvantage would already be enough to make it very good learning material. But its creators went further, refining the curriculum and the ways of interacting with students, not only updating the course but also making it more complete, broad, and interesting.

Python Instead of Octave

In the original course, practical assignments had to be solved in Octave or MATLAB. And although there were ways around it, the official implementation didn’t use Python. Fortunately for everyone, this has been fixed, and now the course uses Python.

This means not only that you don’t need to learn a programming language that you will almost certainly never use again in your life (sorry, MATLAB programmers). It also means that together with Python, you will get basic knowledge of the libraries you will use almost every day if you start working in data science or machine learning: NumPy, scikit-learn, and TensorFlow. It’s just great!

It is important to note that the course does not teach you Python, so you need to understand the basics of the language and of programming in general; variables, conditional expressions, loops, and functions will be quite enough (classes and object-oriented programming in its purest form are not used in the course). So if you are a complete beginner, you should probably take some Python basics course first.
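To give a feel for what working with these libraries looks like, here is a minimal sketch of fitting a linear regression with scikit-learn. This is my own toy example with made-up numbers, not code from the course:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy data: house sizes in hundreds of square feet vs. prices in $1000s
# (the numbers are made up for illustration)
X = np.array([[10], [15], [20], [25]])  # one feature per row
y = np.array([200, 290, 410, 500])      # one target per row

model = LinearRegression()
model.fit(X, y)

# Predict the price of an unseen house of size 18 (i.e., 1800 sq. ft.)
prediction = model.predict(np.array([[18]]))
```

The course’s own labs follow roughly this pattern: prepare NumPy arrays, fit a model, then inspect its predictions and learned parameters.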

Ensemble Learning

The original course did not mention ensemble models at all, but the new one covers Decision Trees in detail (which also wasn’t the case before), explains the motivation for ensemble learning, and demonstrates the Random Forest and Boosting models.

Although ensemble learning is explained here only in the context of decision trees and not other models, it is a step in the right direction that gives you a more complete picture of the machine learning model zoo.

You will also be shown the XGBoost library, one of the most powerful tools of classical machine learning (by classical I mean machine learning without deep neural networks), so you can add it to your toolbox right away!
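The course demonstrates XGBoost itself; as a dependency-light sketch of the same two ensemble ideas (bagging and boosting), here is a hypothetical example using scikit-learn’s built-in models on synthetic data:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Synthetic binary classification data as a stand-in for a real dataset
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Bagging-style ensemble: many decorrelated trees vote together
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)

# Boosting-style ensemble: trees are built sequentially,
# each one correcting the errors of the previous ones
boosting = GradientBoostingClassifier(random_state=0)
boosting.fit(X_train, y_train)

forest_acc = forest.score(X_test, y_test)
boosting_acc = boosting.score(X_test, y_test)
```

XGBoost exposes a very similar scikit-learn-style API (`XGBClassifier` with `fit`/`predict`), which is part of why it slots so easily into a beginner’s toolbox.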

Categorical Variables

The original course also did not mention categorical variables at all, which seems very strange to me, because this will confuse beginners in the very first task they decide to solve on their own (categorical variables are very common). This fault has been corrected: the updated course tells you about one-hot encoding.
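To illustrate, here is what one-hot encoding looks like in pandas, on a made-up three-row dataset (the course’s own examples and tooling may differ):

```python
import pandas as pd

# Tiny toy dataset with one numeric and one categorical feature
df = pd.DataFrame({
    "size": [1200, 850, 1500],
    "city": ["London", "Paris", "London"],
})

# One-hot encoding: each category value becomes its own 0/1 column,
# so models that expect numeric inputs can consume the feature
encoded = pd.get_dummies(df, columns=["city"])
```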

You will have to get acquainted with the other encoding methods by yourself, but since this course is designed for beginners, I do not consider this a disadvantage.

Unsupervised Learning Algorithms: Clustering, Anomaly Detection, and Dimensionality Reduction

As for unsupervised learning algorithms, the situation has remained about the same. Only K-Means is considered for clustering, and Gaussian Mixture Models (GMM) for anomaly detection. Dimensionality reduction is mentioned only as an auxiliary technique and receives less attention than before, when there were a couple of separate lectures on Principal Component Analysis (PCA).
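As a quick illustration of the clustering part, here is a minimal K-Means sketch on two artificial blobs of points (my own toy example, not a course lab):

```python
import numpy as np
from sklearn.cluster import KMeans

# Two well-separated blobs of 2D points
rng = np.random.default_rng(0)
blob_a = rng.normal(loc=0.0, scale=0.5, size=(50, 2))
blob_b = rng.normal(loc=5.0, scale=0.5, size=(50, 2))
X = np.vstack([blob_a, blob_b])

# Ask K-Means for two clusters; it alternates between assigning points
# to the nearest centroid and moving each centroid to its cluster's mean
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)
```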

Although it may seem that I consider many other approaches overlooked (and last time I thought so), I’ve changed my mind, and I’ll talk about that a bit later.

Sorry, Support Vector Machine

Before moving on to more general considerations, to be completely honest, I should note that the Support Vector Machine materials were cut from the updated course, although the original contained them.

Perhaps SVM did not fit into the course due to time constraints, but I think it was removed because of its complexity: this algorithm, with its kernel trick, is quite difficult to understand and is probably the most difficult aspect of classical supervised learning.

So I think it was removed so as not to scare off newcomers; in any case, trading SVM for Decision Trees, Random Forest, and Boosting is a good exchange.

Structure

The updated specialization has become much more structured. The original course consisted of 11 weeks whose materials were not separated from each other.

Now the content of the specialization is divided into three courses, each of which has a specific topic and an internal division by weeks. This makes it much easier to navigate the course, return to it, and search for particular materials. By the way, these three courses are:

  1. Supervised Machine Learning: introduction to machine learning, regression (linear regression), and classification (logistic regression);
  2. Advanced Learning Algorithms: neural networks, decision trees, random forest, boosting, and practical advice for applying machine learning;
  3. Unsupervised Learning, Recommenders, Reinforcement Learning: clustering, anomaly detection, recommender systems, and reinforcement learning.

See the specialization homepage to get a more detailed description.

More Labs, More Visualizations, and More Interactivity

Another good decision, in my opinion, was to include more practical labs and let people see more code. Since the code can now be executed in Jupyter Notebooks, the authors’ hands were freed. The labs (almost all of which are optional, so no one will be scared off) have become more numerous, detailed, and diverse.

At the same time, many of them contain very intuitive and interactive visualizations. The quality of some, let’s be honest, could have been better (I honestly don’t understand why they used plain matplotlib even for interactive plots instead of, for example, Plotly), but some of them really do give you very useful insights.

To get an idea of what kind of interactivity is meant, you can see the most recent specialization demo on YouTube.

The Third Course is Amazing

Although I said it would be possible to mention more algorithms and approaches for unsupervised learning tasks, the third course of the specialization covers much more than the analogous part of the original.

Clustering and anomaly detection are covered during the first week only. The second week is devoted to two algorithms behind the most commercially successful machine learning application, recommender systems: collaborative filtering and content-based filtering. The third week is dedicated to Deep Q-Learning (DQL), a reinforcement learning algorithm, which is really awesome!

First, stories about reinforcement learning are always very inspiring, at least to me, since I think this is the most “machine”-like part of machine learning. Second, the teaching talent of Andrew Ng and the course team makes this field, one of the most complex in science, or at least its basics, understandable for everyone.
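The course builds up to Deep Q-Learning with a neural network; to hint at the core idea without one, here is a tabular Q-learning sketch on a tiny made-up chain environment (entirely my own toy example, not course code):

```python
import numpy as np

# Toy environment: states 0..4 in a chain; reaching state 4 gives
# reward 1 and ends the episode, every other move gives reward 0.
n_states, n_actions = 5, 2  # actions: 0 = left, 1 = right

def step(state, action):
    next_state = min(state + 1, n_states - 1) if action == 1 else max(state - 1, 0)
    reward = 1.0 if next_state == n_states - 1 else 0.0
    done = next_state == n_states - 1
    return next_state, reward, done

Q = np.zeros((n_states, n_actions))     # the Q-table: value of each (state, action)
alpha, gamma, epsilon = 0.1, 0.9, 0.2   # learning rate, discount, exploration rate
rng = np.random.default_rng(0)

for _ in range(500):                    # 500 training episodes
    state, done = 0, False
    while not done:
        # epsilon-greedy action selection: mostly exploit, sometimes explore
        action = rng.integers(n_actions) if rng.random() < epsilon else int(Q[state].argmax())
        next_state, reward, done = step(state, action)
        # Q-learning update: move Q(s, a) toward reward + gamma * max_a' Q(s', a')
        Q[state, action] += alpha * (reward + gamma * Q[next_state].max() - Q[state, action])
        state = next_state
```

DQL replaces the table `Q` with a neural network that approximates it, which is what lets the approach scale to large state spaces.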

Putting it all together, the third course shows an extremely broad picture of real-world machine learning applications and takes the specialization to the next level.

The Specialization was Definitely Designed for Beginners

For various reasons of its authors, the specialization was made as simple as possible, sometimes even too simple. This is not a drawback but rather an advantage for the majority of learners; in any case, it is a feature with several consequences.

Most of all, this concerns people who are familiar with high school mathematics (and understand what a derivative is, what an optimization problem is, why gradient descent really works, etc.). For you, a lot will be left unsaid in the course, as if it lacked additional lectures describing the math behind certain algorithms. This is because the course tries to do without math at all, and sometimes that looks strange, for example not mentioning backpropagation when talking about neural networks.
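For readers who do want a taste of the omitted math, the idea behind gradient descent fits in a few lines: repeatedly step against the derivative of the function you are minimizing. A toy example (mine, not the course’s) minimizing f(w) = (w - 3)^2:

```python
# Minimize f(w) = (w - 3)^2 by gradient descent.
# The derivative is f'(w) = 2 * (w - 3); stepping against it
# always moves w toward the minimum at w = 3.
w = 0.0              # initial guess
learning_rate = 0.1

for _ in range(100):
    gradient = 2 * (w - 3)
    w -= learning_rate * gradient
```

The same loop, with partial derivatives taken over all model parameters (computed via backpropagation in the case of neural networks), is what trains essentially every model in the specialization.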

Of course, you can fill in these gaps by reading one of the many books or by taking a more advanced ML course. And if you are a computer science student, there may be more effective ways for you specifically to dive into the world of AI, data science, and machine learning. After all, you really don’t need a strong mathematical background to start exploring ML, but you may need to know quite a lot of mathematics if you decide to pursue it seriously.

The hypothesis I formed while completing the course was fully confirmed when I watched the Accelerating Your AI Career discussion by the DeepLearning.AI team.

Apparently, Andrew Ng and the team have done everything possible to give anyone the opportunity to complete this specialization and understand what machine learning is. This can be helpful for:

  • students who want to find ML-related jobs,
  • established professionals who want to change their field of activity,
  • or just people who are interested in what ML is and whether there is a real danger from Artificial Intelligence, at least at the current moment.

Andrew Ng mentioned that one of the goals of spreading knowledge about AI and ML is to make AI more accessible for people in non-technical specialties, because it will still be very useful for them. AI-related products and goods are entering our lives more and more actively, and by understanding the basics, you can greatly facilitate your work, or see opportunities to do so, no matter what profession you work in. We are talking about automation: if programming made it possible to automate routine actions, then AI is making it possible, right before our eyes, to automate routine actions that require some level of intelligence.

This specialization is one of the steps toward achieving this goal. It does not tell you at once about all the approaches to training neural networks or all the pitfalls of dimensionality reduction. But no specialization, course, or book can. Instead, it gives you the broadest possible understanding of machine learning, and it is deep enough to get you engaged in the field. The lack of math may annoy some enthusiasts, but it lowers the entry threshold so much that it has far more pros than cons.

If you:

  • have never heard about machine learning,
  • are afraid that your knowledge of math is not enough, or
  • are not working in IT or in a technical occupation,

this specialization is an excellent choice. It will reward your decision to explore new things without scaring you with complex terms and formulas. And when you finish it, you will be able to start implementing your own project, or continue getting acquainted with the AI world, without fear, with the help of more advanced materials.

In the worst case, learning about AI will be interesting, and in the best one, it can change your life for the better.


