
Pandas & Python Tricks for Data Science & Data Analysis — Part 3 | by Zoumana Keita | Feb, 2023



Photo by Andrew Neel on Unsplash

A couple of days ago, I shared some Python and Pandas tricks to help Data Analysts and Data Scientists quickly learn new valuable concepts that they might not be aware of. This is also part of the collection of tricks I share daily on LinkedIn.

Replace values from a dataframe based on conditions

If you want to replace values in a dataframe based on a condition:

✅ You can use the built-in 𝗺𝗮𝘀𝗸() method from Pandas. It replaces the values where the condition is True and keeps all the others.

Below is an illustration 💡
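A minimal sketch of the idea, assuming a made-up dataframe (the column names and the threshold of 50 are hypothetical, chosen only for illustration):

```python
import pandas as pd

# Hypothetical data, used only for illustration
df = pd.DataFrame({"name": ["Ann", "Bob", "Cid"], "score": [35, 72, 48]})

# mask() replaces the values where the condition is True
# and keeps the values where it is False.
df["score_capped"] = df["score"].mask(df["score"] < 50, 50)

print(df)
```

If you need the opposite behavior (keep values where the condition is True), the 𝘄𝗵𝗲𝗿𝗲() method is the counterpart.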

Apply colors to your Pandas dataframe

Have you ever wanted to quickly find some information JUST by looking at your dataframe❓

Things like:

✨ Which values are negative in each column?

✨ What is the maximum or the minimum value of each column?

✨ Which values are below or above the average?

The list goes on and on…

A great way of viewing such information is by using colors 🎨

✅ 𝗗𝗮𝘁𝗮𝗙𝗿𝗮𝗺𝗲.𝘀𝘁𝘆𝗹𝗲 is a built-in accessor that returns a Styler object, which provides a high-level interface for styling your dataframe.

Print Pandas dataframe in Markdown

It is always better to print your dataframe in a way that makes it easier to understand.

✅ One way of doing that is to render it in Markdown format using the .𝚝𝚘_𝚖𝚊𝚛𝚔𝚍𝚘𝚠𝚗() method, which relies on the optional tabulate package.

Let me know in the comments which one is your favorite.

✨ With Markdown ✅ or without markdown ❌

SQL-like queries through dataframe

Pandas’ power can’t be explored enough in Data Science💻📊

As a Data Analyst or Data Scientist, you might want to filter 🔎 through your data to find relevant insights.

✅ This can be achieved using the built-in 𝗾𝘂𝗲𝗿𝘆() method in Pandas.

It filters rows using a boolean expression written as a string, which reads almost like a natural language sentence! 💬

Below is an illustration 🚀
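A minimal sketch, assuming a made-up dataframe (the names, ages, cities, and the min_age variable are hypothetical):

```python
import pandas as pd

# Hypothetical data, used only for illustration
df = pd.DataFrame({
    "name": ["Ann", "Bob", "Cid", "Dee"],
    "age": [25, 41, 37, 33],
    "city": ["Paris", "Paris", "Lyon", "Paris"],
})

min_age = 30

# The expression is a string; Python variables are referenced with "@"
result = df.query("age > @min_age and city == 'Paris'")

print(result)
```

The same filter written with boolean indexing would be df[(df["age"] > min_age) & (df["city"] == "Paris")], which works just as well but is a bit noisier to read.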

Transform Scikit-learn preprocessing output to a Pandas dataframe

If you have been taking a closer look 🧐 at the Scikit-learn preprocessing module, you might have noticed that its transformers return a NumPy array 🔢 by default.

This can make it difficult to keep track of the original names of the features in the data.

Wouldn’t it be nice to have a Pandas 🐼 dataframe instead without any additional lines of code to keep those features’ names?

✅ This can be achieved using the 𝘀𝗲𝘁_𝗼𝘂𝘁𝗽𝘂𝘁 API introduced in version 1.2 of Scikit-learn.

Below is an illustration 💡

Extract periods from the Datetime column

Days, weeks, months, or quarters 🗓… Each one can play an important role depending on the task at hand.

✅ With the 𝘁𝗼_𝗽𝗲𝗿𝗶𝗼𝗱() method, available through the .dt accessor of a datetime column, you can extract each of these periods from the date column.

Below is an illustration 💡
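A minimal sketch, assuming a made-up date column (the dates and the new column names are hypothetical):

```python
import pandas as pd

# Hypothetical data, used only for illustration
df = pd.DataFrame({
    "date": pd.to_datetime(["2023-01-15", "2023-02-20", "2023-07-04"]),
})

# to_period() is reached through the .dt accessor of a datetime column
df["month"] = df["date"].dt.to_period("M")    # monthly period
df["quarter"] = df["date"].dt.to_period("Q")  # quarterly period
df["week"] = df["date"].dt.to_period("W")     # weekly period

print(df)
```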

Count the occurrences of each element in a list

Still using loops 🔁 to determine how often each item occurs in a list?

Maybe there is a better and much more elegant Pythonic 🐍 way!

✅ You can use the 𝗖𝗼𝘂𝗻𝘁𝗲𝗿 class from Python's built-in collections module to count how often each element occurs in a list.

Below is an illustration 💡
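A minimal sketch with a made-up list (the fruits are hypothetical):

```python
from collections import Counter

# Hypothetical data, used only for illustration
fruits = ["apple", "banana", "apple", "cherry", "banana", "apple"]

# Counter builds a dictionary-like object mapping each item to its count
counts = Counter(fruits)

print(counts["apple"])        # 3
print(counts.most_common(2))  # [('apple', 3), ('banana', 2)]
```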

Combine elements from multiple lists

Are you trying to aggregate elements from multiple lists?

❌ Stop using 𝗳𝗼𝗿 loops 🔁 and adopt the following approach.

✅ The Python built-in 𝘇𝗶𝗽() function.

Below is an illustration 💡
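A minimal sketch with three made-up lists (the values are hypothetical):

```python
# Hypothetical data, used only for illustration
names = ["Ann", "Bob", "Cid"]
ages = [25, 41, 37]
cities = ["Paris", "Lyon", "Dakar"]

# zip() pairs up the elements found at the same position in each list
combined = list(zip(names, ages, cities))

print(combined)
# [('Ann', 25, 'Paris'), ('Bob', 41, 'Lyon'), ('Cid', 37, 'Dakar')]
```

Keep in mind that zip() stops at the shortest input, so lists of different lengths are silently truncated.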

Create multiple lists from aggregated elements

When trying to aggregate elements from multiple lists, the most elegant and Pythonic way is to use the built-in 𝘇𝗶𝗽() function.

Now, what if you want to proceed the other way around: create multiple lists from those aggregated elements❓

❌ Forget 𝗳𝗼𝗿 loops 🔁

✅ Just combine the 𝘇𝗶𝗽() function with the 𝗮𝘀𝘁𝗲𝗿𝗶𝘀𝗸 * (unpacking) operator

Below is an illustration 💡
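A minimal sketch with a made-up list of pairs (the values are hypothetical):

```python
# Hypothetical data, used only for illustration
records = [("Ann", 25), ("Bob", 41), ("Cid", 37)]

# The asterisk unpacks the list, so each tuple becomes a separate
# argument to zip(), which then regroups the elements by position.
names, ages = zip(*records)

# zip() yields tuples; convert them if you need lists
names, ages = list(names), list(ages)

print(names)  # ['Ann', 'Bob', 'Cid']
print(ages)   # [25, 41, 37]
```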

I am a big fan of list comprehension

Don’t just code in Python like most people. Take the shortcut, which is often the more efficient approach too.

Imagine that you want to create a list with only the even numbers from an existing one. The most obvious idea is to use a “for” loop. But the most elegant one is to use a list comprehension, which is more compact and, for simple cases like this one, easier to read.

Below is an illustration 💡
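A minimal sketch comparing both approaches on a made-up list (the numbers are hypothetical):

```python
# Hypothetical data, used only for illustration
numbers = [1, 2, 3, 4, 5, 6, 7, 8]

# With a "for" loop
evens_loop = []
for n in numbers:
    if n % 2 == 0:
        evens_loop.append(n)

# With a list comprehension
evens = [n for n in numbers if n % 2 == 0]

print(evens)  # [2, 4, 6, 8]
```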

Where there is list comprehension there is a dictionary comprehension

Similarly to a list comprehension, it is also possible to write a dictionary comprehension. It provides the same benefits as a list comprehension.

Let’s consider building a dictionary where the keys are the indices and the values are the actual numbers from the original list, with the constraint that each number must be even.

Below is an illustration 💡
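A minimal sketch on a made-up list (the numbers are hypothetical), using enumerate() to get each element's index:

```python
# Hypothetical data, used only for illustration
numbers = [1, 2, 3, 4, 5, 6, 7, 8]

# Keys are the positions in the original list, values are the even numbers
even_by_index = {i: n for i, n in enumerate(numbers) if n % 2 == 0}

print(even_by_index)  # {1: 2, 3: 4, 5: 6, 7: 8}
```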

Thank you for reading! 🎉 🍾

I hope you found this list of Python and Pandas tricks helpful! Keep an eye on this series, because it will be updated with more tricks on a daily basis.

Also, if you like reading my stories and wish to support my writing, consider becoming a Medium member. With a $5-a-month commitment, you unlock unlimited access to stories on Medium.

Would you like to buy me a coffee ☕️? → Here you go!

Feel free to follow me on Medium, Twitter, and YouTube, or say Hi on LinkedIn. It is always a pleasure to discuss AI, ML, Data Science, NLP, and MLOps stuff!

Before you leave, find the previous two parts of this series below:

Pandas & Python Tricks for Data Science & Data Analysis — Part 1

Pandas & Python Tricks for Data Science & Data Analysis — Part 2




