
5 Reasons Why Everyone Should Take a Data Science Class | by Elise Landman | Oct, 2022



Why you should have an understanding of data and its science

Image from Unsplash.

Data Science is a hot topic at the moment, which you could consider a temporary hype — or not, depending on your point of view. Whatever view you follow, one thing is for sure: data science is here to stay. And it will be everywhere.

Google Ngram Viewer for “data science” from 1920–2019. Data from Google, Image by Author.

So, what makes it so popular, and why so suddenly? Some of the reasons for the rise of “data science”, also depicted by the above graph from the Google Ngram Viewer tool, can be summarized in three main points:

  • increase in computational power over the last few years
  • various breakthroughs and developments of algorithms and frameworks
  • more and more data has been collected, and there is a growing need to make use of that data

… and time will reveal even more reasons why data science as a field is, and can be, so important.

Knowledge of data science is not just useful for those who develop models, process data, analyze statistics, and code all day. I suggest that everyone take a class in data science, no matter what industry you work in.

Data Science is useful for everyone.

Why?

Statistics is a major building block of data science. Many popular methods used in data science rely at their core on classic statistical models. Taking a data science class will give you a basic understanding of statistics, and will generally help you understand descriptions, situations, and implications better.

Correlation vs Causation

You will understand that correlation does not always equal causation, and that you cannot make assumptions about certain events just because there is a correlation between them. Causation means that one thing causes another, whereas correlation simply means that two events tend to vary together.

We often say “correlation does not imply causation”.

A simple example: we generally observe that ice cream sales and the number of shark attacks both increase when the weather is warm and sunny. But one does not cause the other. They are related because good weather leads more people to eat ice cream at the beach and to go for a swim in the sea, which in turn increases the likelihood of a shark attack.

In your everyday life, this will help you understand situations and analyze claims better. It will make you more capable of distinguishing good arguments from false ones, and will help you formulate valid arguments for your peers.
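The ice cream and shark attack example can be sketched as a small simulation. All numbers below are made up for illustration (the temperature range, the slopes, and the noise levels are assumptions, not real data): temperature drives both quantities, so they correlate strongly with each other, but once the effect of temperature is removed from each, the correlation should largely disappear.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def residuals(ys, zs):
    """What is left of ys after removing a straight-line fit on zs."""
    n = len(ys)
    my, mz = sum(ys) / n, sum(zs) / n
    slope = (sum((z - mz) * (y - my) for y, z in zip(ys, zs))
             / sum((z - mz) ** 2 for z in zs))
    return [(y - my) - slope * (z - mz) for y, z in zip(ys, zs)]


# Hypothetical data: temperature is the common cause of both series.
rng = random.Random(0)
temperature = [rng.uniform(10, 35) for _ in range(2000)]
ice_cream_sales = [10 * t + rng.gauss(0, 20) for t in temperature]
shark_attacks = [0.2 * t + rng.gauss(0, 1) for t in temperature]

# Raw correlation between the two series.
raw = pearson(ice_cream_sales, shark_attacks)

# Correlation after removing the temperature effect from both series.
controlled = pearson(residuals(ice_cream_sales, temperature),
                     residuals(shark_attacks, temperature))

print(f"raw correlation:          {raw:.2f}")
print(f"controlled for weather:   {controlled:.2f}")
```

In this sketch neither series appears in the formula for the other, so any correlation between them can only come from the shared temperature term.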

Mean vs Median

Understanding the difference between the mean and the median is also something that is usually undervalued. People tend to use the mean more frequently than the median. Nevertheless, the mean can hide information about the data that only becomes visible when we also look at the median.

The mean is the average over all data points; the median splits the data in half, i.e. 50% of our points are above and 50% are below the median value.

As an example, let’s look at datasets A and B of people’s salaries below:

Example of two Datasets with Mean & Median Values. Image by Author.

We can see that group A has a much lower mean, close to the median value. For group B, we changed one value in the dataset to a much bigger number than in group A, and instantly we can see how much the mean increases. Suddenly we realize that the mean is quite a poor indicator for the data in this set.

We can see that the median stayed the same for both groups, no matter the change in the mean. We now get a much better understanding of our data: for group B the average is fairly high, but this is likely caused by some extreme values in our data, as our median is much lower.

In your everyday life, this will help you interpret data better and will make you more sceptical when data is presented to you. It will make you understand that there is more to data than just the mean, and will make you a more credible and precise presenter, should you ever present data yourself.
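The same effect can be reproduced in a few lines of Python. The salary values below are hypothetical stand-ins (the figures from the original image are not available here); group B differs from group A in exactly one value.

```python
from statistics import mean, median

# Hypothetical salaries; group B has one extreme value replacing the last entry.
group_a = [30_000, 32_000, 35_000, 38_000, 40_000]
group_b = [30_000, 32_000, 35_000, 38_000, 400_000]

for name, salaries in (("A", group_a), ("B", group_b)):
    print(f"Group {name}: mean = {mean(salaries):,.0f}, "
          f"median = {median(salaries):,.0f}")
```

A single outlier roughly triples the mean of group B, while the median of both groups stays at the middle value.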

… there is more!

There are far more benefits to understanding the basics of statistics; the two above represent only a small portion, and very basic examples, of its potential value in your everyday life.

This is fairly similar to, and goes hand in hand with, reason 1. You can hardly have a good understanding of data without knowledge of statistics. One of the main things I realized as soon as I started getting familiar with data is that you get a much better understanding of the general tools and technologies around it.

By that I mean:

What is Data?

You will understand what forms and types data can come in: from structured to unstructured data, and from quantitative and qualitative values to nominal, ordinal, continuous and discrete ones.

You will understand how and where data can be stored. To name a few options:

  • CSV, XML and XLS files, as well as plain text files
  • relational and non-relational databases
  • binary data, like images.

You will also build up an understanding of how data can be accessed: we can read in a CSV file with the help of a programming language (e.g. R or Python), or query our relational data with SQL directly from our database.
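Both access paths can be sketched with Python's standard library. The CSV content, the table name `salaries`, and its columns are invented for this illustration; an in-memory SQLite database stands in for whatever relational database you might actually use.

```python
import csv
import io
import sqlite3

# Hypothetical CSV content; in practice this would come from a file on disk.
csv_text = "name,salary\nAna,30000\nBen,35000\nCy,40000\n"

# 1) Read the CSV into a list of dictionaries, one per row.
rows = list(csv.DictReader(io.StringIO(csv_text)))

# 2) Load the same rows into a relational table and query it with SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE salaries (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO salaries VALUES (?, ?)",
    [(row["name"], int(row["salary"])) for row in rows],
)
high_earners = conn.execute(
    "SELECT name FROM salaries WHERE salary > 32000 ORDER BY name"
).fetchall()
conn.close()

print(rows[0])
print(high_earners)
```

Note that the CSV reader returns every value as text, which is why the salary is converted with `int` before it goes into the table.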

You will understand the 4 (or even up to 7) V’s of data: volume, variety, veracity, velocity, etc.

In everyday life, this will not only give you a good general overview of IT, but will also be handy knowledge for IT-related conversations and small talk you might be having.

How do I show Data?

Data itself is a first step; explaining and visualizing it is the next.

Most basic data science classes will touch upon the subject of data visualization, and very soon you will be hearing the sentence:

“Better not use pie charts for visualizing data.”

At first you may think: why? But the more examples you see, the more you realize there is something to it, and soon you might be advocating against pie charts yourself.

Bar Chart vs Pie Chart Comparison. Image by Author.

In the above image, we have a good example of why pie charts are, in general, less favorable for visualization than, for example, bar charts. The human eye has a hard time interpreting and estimating angles, and it gets harder the more pieces the pie chart has. If the pie chart had not been annotated, would you have correctly estimated the difference between the “Debit Card” and “Digital Wallet” group values? Maybe, but it is often easier and clearer to read bar charts, which let us compare even slight differences in values.

In your everyday life, this will make you understand that you should take your interpretations of pie charts with a grain of salt, as these might mislead you. It will also help you create better presentations and be a more impactful presenter of your data.
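To make the angle argument concrete, here is a rough sketch with invented numbers. Only the labels "Debit Card" and "Digital Wallet" come from the figure; the other categories and all percentages are assumptions. It computes the slice angle each value would get in a pie chart and renders the same values as a plain-text bar chart.

```python
# Hypothetical shares in percent; the values from the original figure are unknown.
payment_share = {
    "Credit Card": 34,
    "Debit Card": 27,
    "Digital Wallet": 25,
    "Cash": 14,
}


def pie_angles(shares):
    """Angle in degrees that each category would occupy in a pie chart."""
    total = sum(shares.values())
    return {label: 360 * value / total for label, value in shares.items()}


def text_bar_chart(shares, width=40):
    """Render the shares as horizontal bars made of '#' characters."""
    longest = max(shares.values())
    lines = []
    for label, value in shares.items():
        bar = "#" * round(width * value / longest)
        lines.append(f"{label:<15}{bar} {value}")
    return "\n".join(lines)


angles = pie_angles(payment_share)
angle_gap = angles["Debit Card"] - angles["Digital Wallet"]
chart = text_bar_chart(payment_share)

print(f"Angle difference between the two slices: {angle_gap:.1f} degrees")
print(chart)
```

With these numbers the two slices differ by only a few degrees, which is hard to judge by eye, whereas the two bars share a common baseline and differ visibly in length.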

In almost every class in data science, you will likely have to deal with some introductory lessons in programming. This might sound scary to you — believe me, it’s not. You do not have to bring your programming skills to expert level in order to benefit from them.

Prior to taking my first data science class, I had never done any programming. So of course, it was fairly tough, but every new thing you start learning is tough in the beginning. The question is: how long will you stick with learning it? Or, how consistent will you be? No matter how consistent you stay, having even a basic understanding of programming already has its perks. Here’s what I mean:

Seeing through a more logical lens

Firstly, you will start getting a feeling for thinking more logically, more programmatically. It will help you see the world through a more logical lens. Most of our modern-world solutions are based on technology and have been programmed in some way. You will be enlightened by finally understanding, at a high level, how these things could potentially work.

Automating your own tasks

Going further, you might even start working on your own little programming projects. You might start writing your own scripts and automate daily tasks. That sounds cool, doesn’t it? I can confirm: it is.

I wrote myself a bunch of scripts to automate certain time-intensive tasks, such as copy-pasting or scraping content, or finding duplicate images on my PC. It is such a rewarding feeling to see these projects come to life after hours of work.
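As an idea of how small such a script can be, here is a minimal sketch of a duplicate finder. It is not the author's script, just one possible approach under a simplifying assumption: two files count as duplicates only if their bytes are exactly identical, so a resized or re-encoded copy of an image would not be detected. The function name `find_duplicates` and the demo file names are made up for this example.

```python
import hashlib
import tempfile
from collections import defaultdict
from pathlib import Path


def find_duplicates(folder):
    """Group files under `folder` whose contents are byte-for-byte identical."""
    by_hash = defaultdict(list)
    for path in sorted(Path(folder).rglob("*")):
        if path.is_file():
            # Files with the same SHA-256 digest are treated as duplicates.
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            by_hash[digest].append(path)
    return [paths for paths in by_hash.values() if len(paths) > 1]


# Small demo in a temporary folder with two identical files and one different one.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "a.jpg").write_bytes(b"same bytes")
    (Path(tmp) / "b.jpg").write_bytes(b"same bytes")
    (Path(tmp) / "c.jpg").write_bytes(b"other bytes")
    groups = find_duplicates(tmp)
    duplicate_names = [[p.name for p in group] for group in groups]

print(duplicate_names)
```

For a real photo folder, reading every file fully into memory may be slow; hashing in chunks or comparing file sizes first would be natural refinements.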

You might now ask yourself what programming language to start with. In my opinion, nearly any programming language will give you a more logical view of the world. Nevertheless, some languages are better suited to certain tasks, and more flexible, than others. I personally started with R, continued (and stayed) with Python, and also looked into SQL, Matlab and Java.

When it comes to ease of use and support for data science-related content, I can definitely recommend starting with Python.

You might find yourself in a situation where you talk to people and the conversation turns to data science. You will now be part of this conversation, since you have a basic understanding of what it is about. This might help you have more meaningful conversations, and make it easier than before to connect with some people or teams.

Your understanding of data science, no matter what industry you are in, will make you more knowledgeable and will give you the opportunity to be the connection between teams: the person who understands both one side of the story and the data science part of it.

You might read an article about some AI algorithm taking over the world, or some breakthroughs with a newly developed prediction algorithm: now you will have a brief understanding of what this consists of. Not (yet) detailed of course, but enough to understand the meaning of the content such an article might have.

In everyday life, social connections to people and staying up to date on technology are two very important things for our future. Technology is everywhere, and there will not be less of it. It is necessary to understand what surrounds us every day.

After a few introductory classes in data science, you will understand better why data is so valuable, what you can achieve with it, and why, for example, the online ads business has been so successful.

You will understand that the technologies in data science are great and can be used for good, but that data can also be concerning, for example when it is biased. There might also be concerns about data ethics, or about the lack of interpretability of a certain algorithm.

In everyday life, this will give you a good understanding of why there is such potential, but also concern about data. You will start asking yourself more questions about your own data, and about your data privacy.


