How To Learn Data Science If You Want To Accomplish Your Goals | by Ken Jee | Nov, 2022


Periodically, I like to refresh my advice about how I would go about learning data science from ground zero. The data domain is changing rapidly, and, as my own knowledge grows, I think it is important to re-evaluate the approach that I would recommend.

My advice for learning data science has changed since my previous efforts at imparting this wisdom. In this article, I give you my updated approach about how I would take on learning this field and have some fun along the way. As a disclaimer, I don’t think there is one “correct” way to learn data science. Different things work for different people, and your own experimentation is integral to your success in any career. Be sure to read until the end because I answer the most commonly asked question I get on this topic: how long does it take to learn data science?

If you prefer a video format, watch here:

Starting off, I want to debunk the idea that you can “learn data science.” This implies that data science is a static subject that can be learned in its entirety. For better or for worse, data science is constantly evolving and growing. I don’t know a single person, myself included, who could possibly know the whole field. Learning data science is a journey, not a destination, and coming in with this mindset can make the process far more enjoyable for you. I see so many students who get overwhelmed by how large the field is. It will be completely overwhelming if your goal is to learn the entire domain. On the other hand, it becomes manageable if you focus on learning a little at a time and growing your knowledge with specific, smaller goals in mind.

With that being said, let’s jump straight into something that I would change about my prior approach. When looking back, my advice about the very beginning has been extremely vague. I usually say something like “learn enough Python and statistics to get started with projects.” While this isn’t bad advice, this time around I want to get more into the weeds about exactly how you should start this learning process.

[Image: Swimming without direction in the ocean is like learning without planning out the steps. Image by author.]

The real first step is getting an understanding of the components of the field and creating a learning program for yourself to navigate the journey. If we just jump into the ocean without any clear direction and start swimming, we get tired out really easily and may just give up. On the other hand, if we have a map and a clear objective, we at least know what we are getting ourselves into. You need to create this map for your data science learning before you do anything else. The really cool part of this is that by creating this map, you will also learn a lot about the field of data science in general.

[Image: Symbol for a learning map. Image by author.]

So, how would I create a map? There are a couple ways to do this that vary in the level of overhead work. The easiest way is to take an online course or certificate that lays it out for you. That is the huge benefit of online courses: they can lay out the entire learning path for you so you just have to follow along. On the other hand, they cost money. All the information is out there for you to learn for free if you are willing to put in the time to create this roadmap for yourself. To be clear, there is no right or wrong here. Whether or not you decide to pay an organization, do what’s worth it to you. If courses are your speed, I have a discount code for 365 DataScience in the appendix below; if not, I’ve included some links to my favorite free resources.

The next way would be to look at multiple online courses, university courses, and other resources and to get a general feel for how they lay out the path. Most paid courses let you see how the offering is structured. You can then make your own roadmap based on the classes and concepts that you see there. By doing this, you also get a feeling for what skills and techniques are needed in the domain. The fun here is you get to see what is most interesting and appealing to you and adapt your roadmap to that. This is going to be really important later on.

Ok, let’s set up what my learning plan would look like for myself. I encourage you to do your own research here and adjust this based on your interests and aptitudes.

If I were to lay out a learning plan for myself, I would almost certainly start with learning Python. Coding languages allow you to build things. If you can build things, you can apply almost anything you’re working on to a real problem. I look at learning programming like building out my toolset. I could build a shed with just my bare hands, but it would be a heck of a lot easier with a hammer and a drill. Python is my power tool. I’ve personally almost always felt that coding, not math, was what held me back from picking things up faster. To be clear, math is very important; I just wouldn’t personally focus on it first.

For the programming, I would make sure I had a solid understanding of the basics like variables, loops, and functions. I would also really focus on learning how to use imported libraries like pandas. In fact, I recommend looking through as much of the pandas documentation as possible. In my mind, coding for data science isn’t really coding. You are mostly leveraging tools that other people built to serve a specific purpose. For example, I think having a really great understanding of pandas would serve most data scientists better than having a phenomenal understanding of pure Python. I’ve included some free and paid resources for learning Python in the appendix below. Additionally, if you are interested in my approach for learning coding, check out this article I’ve written on the matter.
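To make the "basics versus libraries" point concrete, here is a minimal sketch that answers the same question twice: once with plain variables, a loop, and a function, and once with pandas. The passenger records and the function name `average_fare_by_class` are invented for illustration and are not from any real dataset.

```python
import pandas as pd

# Toy records, made up for this example (loosely Titanic-flavored).
passengers = [
    {"name": "A", "ticket_class": 1, "fare": 80.0},
    {"name": "B", "ticket_class": 1, "fare": 100.0},
    {"name": "C", "ticket_class": 2, "fare": 30.0},
    {"name": "D", "ticket_class": 2, "fare": 20.0},
    {"name": "E", "ticket_class": 3, "fare": 8.0},
    {"name": "F", "ticket_class": 3, "fare": 7.0},
    {"name": "G", "ticket_class": 3, "fare": 9.0},
]


def average_fare_by_class(rows):
    """Pure-Python version: variables, a loop, and a function."""
    totals, counts = {}, {}
    for row in rows:
        cls = row["ticket_class"]
        totals[cls] = totals.get(cls, 0.0) + row["fare"]
        counts[cls] = counts.get(cls, 0) + 1
    return {cls: totals[cls] / counts[cls] for cls in totals}


manual = average_fare_by_class(passengers)

# pandas version: the library does the grouping and averaging for you.
df = pd.DataFrame(passengers)
with_pandas = df.groupby("ticket_class")["fare"].mean()

print(manual)
print(with_pandas)
```

Both versions give the same averages. The loop shows what is happening under the hood, while the `groupby` line is closer to what day-to-day data science code tends to look like.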

Previously, I would have told you to get started with projects right after this. In this article, I would recommend an extra step for most people. I’ve gotten a ton of feedback that most people don’t know where to start with projects after learning some of these basic skills. There is a really good solution to this. Look at other people’s projects. You can go on Kaggle and see the other projects people have done. You get to see all their code and all the comments they leave about their thought process. To me, this is an absolute goldmine. You get to have a front row seat to how brilliant data scientists approach a problem. I’m definitely not brilliant, but here’s a video of me walking through the Titanic dataset:

When starting with projects, they don’t have to be original. You can go through the exact same analysis that someone else did and still learn something. A typical learning session could just be you having someone else’s project on half the screen, and typing it line by line and running it on the other half of the screen. As you do this, you can change the parameters, experiment with the different visuals, and see how it all works as you go. Obviously, you shouldn’t take credit for this work or publish it as your own, but you can absolutely learn from it in this way. Many people think that they won’t learn anything with this approach, but I personally do this and it is the single thing that has taken me the furthest recently.

While you’re going through these different workbooks, you’ll inevitably start seeing tools, algorithms, and techniques that you aren’t familiar with. Take note of these and research what they are.

Ok, now you’re getting familiar with the process. I recommend starting to get familiar with some of the statistics and algorithms you will be using. You will want a solid foundational understanding of statistics (e.g., central tendency and probability theory), linear algebra, and calculus (you could probably wait a little for this one). Start learning the difference between classification, regression, and clustering algorithms, and start thinking about the types of problems that you can solve with each. Are there datasets you are seeing that you could apply these algorithms to? Are there questions you have that could be answered in one of these categories?
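If the difference between these three categories feels abstract, a toy sketch may help. The code below uses only the Python standard library and made-up data (hours studied, exam scores, pass/fail labels). The helper names `fit_line`, `nearest_centroid`, and `two_means` are hypothetical, and each is a deliberately simplified stand-in for the real algorithms you would normally get from a library.

```python
from statistics import mean

# Toy data, invented for illustration: hours studied per student.
hours = [1.0, 2.0, 3.0, 8.0, 9.0, 10.0]

# Regression: predict a NUMBER from labeled examples.
scores = [52.0, 54.0, 56.0, 66.0, 68.0, 70.0]


def fit_line(xs, ys):
    """Least-squares line for a single feature: returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    slope = numerator / denominator
    return slope, y_bar - slope * x_bar


slope, intercept = fit_line(hours, scores)
predicted_score = slope * 5.0 + intercept  # expected score after 5 hours

# Classification: predict a CATEGORY from labeled examples.
labels = ["fail", "fail", "fail", "pass", "pass", "pass"]


def nearest_centroid(xs, ys, new_x):
    """Assign new_x to the label whose average x value is closest."""
    centroids = {
        label: mean([x for x, y in zip(xs, ys) if y == label])
        for label in set(ys)
    }
    return min(centroids, key=lambda label: abs(centroids[label] - new_x))


predicted_label = nearest_centroid(hours, labels, 4.0)

# Clustering: find GROUPS when there are no labels at all.


def two_means(xs, iterations=10):
    """Tiny 1-D k-means with k=2: returns a cluster index (0 or 1) per point."""
    centers = [min(xs), max(xs)]  # simple deterministic starting guess
    assignment = [0] * len(xs)
    for _ in range(iterations):
        assignment = [
            0 if abs(x - centers[0]) <= abs(x - centers[1]) else 1 for x in xs
        ]
        for k in (0, 1):
            members = [x for x, a in zip(xs, assignment) if a == k]
            if members:
                centers[k] = mean(members)
    return assignment


clusters = two_means(hours)

print(predicted_score, predicted_label, clusters)
```

The thing to notice is what each problem is given and what it returns: regression and classification both learn from known answers (numbers or categories), while clustering only sees the raw values and has to discover the groups on its own.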

This is where projects would become the main focus of my learning. I would do as many projects as I could find. I would do them on Kaggle, with my own data, with any data I could find, really. In the appendix below, I have added some data science project video playlists.

My friend the Data Professor says that the best way to learn data science is by doing data science, and I couldn’t agree more. Projects are the first place where you are doing real data science. Earlier, I mentioned that being introspective about what parts of data science are exciting to you was really important. This is where it plays a key role. In the early stages, you should focus your projects on things you’re interested in. The most important thing you can do with a project is actually make progress on it. If you’re excited enough about the topic or the techniques you’re using, you increase the odds that you learn as much as possible.

After learning the basics of Python and doing some projects, the world is really your oyster. I recommend doing more projects that are focused on skills that you have found to be relevant to your own journey. For example, in most companies, SQL is really important. If your goal is to get a job, it could be very worthwhile to pick that up. I don’t start with SQL because I think it is very easy to learn compared to Python and, if you can learn Python, you should be able to pick up SQL pretty quickly. If you’re fascinated with image analysis, you should probably direct your learning and projects towards deep learning or some of the other techniques there.
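As a rough illustration of why SQL tends to come quickly once you know some Python, here is a small sketch that runs a query from Python using the built-in `sqlite3` module and an in-memory database. The `orders` table and its rows are invented for this example. The same question is then answered with a plain Python loop so you can compare the two.

```python
import sqlite3

# Toy data, made up for this example: (customer, amount) pairs.
orders = [("ana", 20.0), ("ben", 35.0), ("ana", 15.0), ("ben", 5.0), ("cy", 50.0)]

# SQL version: describe WHAT you want and let the database work out how.
conn = sqlite3.connect(":memory:")  # temporary database that lives in RAM
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", orders)
rows = conn.execute(
    """
    SELECT customer, SUM(amount) AS total
    FROM orders
    GROUP BY customer
    ORDER BY total DESC
    """
).fetchall()
conn.close()

# Python version: spell out HOW to compute the same totals step by step.
totals = {}
for customer, amount in orders:
    totals[customer] = totals.get(customer, 0.0) + amount
ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)

print(rows)
print(ranked)
```

If the loop version makes sense to you, the query is mostly new vocabulary for ideas you already have: `GROUP BY` plays the role of the dictionary keys, `SUM` is the running total, and `ORDER BY` is the sort.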

As you can tell, after a certain point in time, you really need to adapt your plan to fit your exact interests and aptitudes. You probably don’t want to hear this, but this is something you need to do for yourself.

[Image: Ken’s learning map. Image by author.]

And that is all there really is to it. If I want to learn a new skill or technique after this point, I read up on it and try to apply it as quickly as possible. Your projects and your work become a reference for how you have used many algorithms or techniques in the past.

As you grow, your iteration loops become tighter, and you want to focus more on good learning habits than anything else. I created the #66DaysOfData initiative to help build good habits in this process. You’re welcome to join in any time; I’ve left some links in the appendix below about what it is.

Most of you are probably wondering: How long does this process take? And that is a very difficult question. To be honest, I think you can get a good understanding of the basics and do projects in as little as 3 months. Most people will probably take around 6 months, though. I really don’t recommend focusing too much on how long it takes. This is a lifelong learning process, so it is not important if you learn it in 3 months, 6 months, 1 year, or even 5 years as long as you acquire the knowledge.

One thing I want to end on is the concept of goals. When you create your roadmap, start thinking about your goals for learning. What concepts would you like to learn? What analysis would you like to do? Most people shouldn’t be learning data science just to know the material; the focus should be on what you want to use these skills to achieve. Have these things in mind when you learn, but don’t be afraid to adjust accordingly. How could you possibly set accurate goals if you know so little about the field in the beginning? Your goal setting, your projects, and your learning have to evolve as you continue to grow in the domain. I see so many people getting disappointed that they didn’t accomplish what they set out to do when they really had no clue what they were actually setting out to do to start with.

It may be some extra work, but I recommend reading this article again and thinking about your learning plan. Share your plan and goals below so that we can all keep each other accountable!

If you enjoyed this article, remember to follow me on Medium for more content like this and sign up for my newsletter to get weekly updates on my content creation and on additional learning resources in the data science industry! Also, consider supporting me and thousands of other writers by signing up for a membership.

Thank you so much for reading and good luck on your data science journey!

