Techno Blender
Digitally Yours.

Automate Your Mundane Excel Reporting with Python | by Nik Piepenbreier | May, 2022



Learn How to Use Python to Automate Excel Reporting

Photo by Campaign Creators on Unsplash

Excel is powerful and it’s everywhere. But it’s also repetitive and manual. Sure, you could use tools like VBA to automate reporting, but using a general-purpose language like Python allows you to automate broader aspects of your reporting processes (like moving and emailing files).

By the end of this post, you’ll have learned how to:

  1. Combine multiple Excel files into a single file
  2. Summarize Excel data with Pandas pivot tables
  3. Add header rows to your Excel reports
  4. Add dynamic charts to your Excel files with Python
  5. Style your Excel files with Python

Let’s get started!


In order to follow along with this tutorial, we’ll be working with three libraries: os, pandas, and openpyxl. The first of these, os, is bundled with Python so you don’t need to install it. The other two, however, need to be installed using either pip or conda.

Let’s take a look at how you can install these libraries. With pip, run pip install pandas openpyxl; with conda, run conda install pandas openpyxl.

Using either method above in your terminal will install the required libraries. Now, let’s take a look at the data that you can use to follow along with this tutorial.

Now that we have the libraries installed, we can import the libraries and classes that we’ll be using:
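A minimal sketch of those imports follows. The exact set is an assumption, inferred from the objects named later in the tutorial (loading a workbook, BarChart, and Reference):

```python
import os                      # standard library: list the files in a folder

import pandas as pd            # read, combine, and summarize the data
from openpyxl import load_workbook              # open an existing workbook
from openpyxl.chart import BarChart, Reference  # used in the chart section
```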

You can download the files here. The zip file contains 3 different Excel files, each containing sales information for a different month. While storing data like this makes sense, it can make analyzing the data quite difficult.

Because of this, we need to first combine all of these files. That’s what you’ll learn in the next section!

The Pandas library stores data in DataFrame objects, which can be thought of as Excel tables (though this is a bit of a simplification). Let’s break down what we want to do and then we’ll take a look at how to do this using Python:

  1. Collect all of our files into a Python list
  2. Loop over each file and append it to a Pandas DataFrame

Let’s write some code and see how we can make this happen in Python!
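Here is a minimal sketch of those two steps. The function name combine_excel_files and the folder and output names in the usage comment are hypothetical. The sketch uses pd.concat to do the appending, since DataFrame.append was removed in pandas 2.0:

```python
import os

import pandas as pd


def combine_excel_files(folder):
    """Read every .xlsx file in `folder` and stack the rows into one DataFrame."""
    # Section 1: collect the names of the workbooks found in the folder
    files = sorted(f for f in os.listdir(folder) if f.endswith(".xlsx"))

    # Section 2: start with an empty DataFrame and append each file to it
    combined = pd.DataFrame()
    for name in files:
        df = pd.read_excel(os.path.join(folder, name))
        combined = pd.concat([combined, df], ignore_index=True)
    return combined


# Usage (hypothetical folder and output file names):
# combined = combine_excel_files("files")
# combined.to_excel("all_months.xlsx", index=False)
```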

Let’s break down what we did here:

  1. In Section 1, we first loaded the path where the files are saved and used the os.listdir() function to get a list of all the files contained in that folder.
  2. In Section 2, we first created an empty DataFrame, then looped over each file, loaded it into a DataFrame, and appended it to our combined DataFrame.
  3. Finally, we saved the combined DataFrame to its own Excel file.

That was easy, wasn’t it? There are many different ways to accomplish this task, and that’s the beauty of Python!

Now that our data is loaded, let’s learn how to summarize our data with Pandas.

Photo by Scott Graham on Unsplash

In this section, you’ll learn how to use Pandas to create a summary table! For this, we’ll use the aptly-named Pandas .pivot_table() function.

The function is designed to feel familiar to anyone who has used pivot tables in Excel. With it, we can decide how we’d like to summarize our data. Let’s say we wanted to figure out the total amount sold by each Salesperson. We could write the following:
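A minimal sketch of that summary, using a few made-up rows in place of the combined sales data. The Amount column name and the sample values are assumptions; only Salesperson is named in the text:

```python
import pandas as pd

# Hypothetical sample rows standing in for the combined sales data
combined = pd.DataFrame({
    "Salesperson": ["Ana", "Ben", "Ana", "Ben"],
    "Amount": [100, 200, 50, 25],
})

# Total amount per salesperson; aggfunc defaults to 'mean', so we ask for 'sum'
totals = combined.pivot_table(index="Salesperson", values="Amount", aggfunc="sum")
print(totals)
```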

In the code above, we used the Pandas .pivot_table() function to create a pivot table. Much of this should feel similar to an Excel pivot table. By default, however, Pandas will use an aggregation function of 'mean', so we need to set it to 'sum'.

In this section, you’ll learn how to add descriptive header rows to your Excel reports, in order to make them more print-ready. For this, we’ll start using the Openpyxl library.

Openpyxl works by loading the workbook into memory. From there, you can access different attributes and objects within it, such as worksheets and the cells within these worksheets.

The key difference here is that you’re working directly with the workbook. With Pandas, we simply saved to the workbook. (I’ll admit there’s much more to this, but it’s a good way of thinking about it.)
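A minimal sketch of adding header rows, assuming the summary has already been saved to a workbook. The function name add_header_rows and the file name and header text in the usage comment are hypothetical. The sketch passes 1 as the row index because openpyxl numbers rows starting from 1:

```python
from openpyxl import load_workbook


def add_header_rows(path, title, subtitle):
    """Insert three rows at the top of the active sheet and label two of them."""
    wb = load_workbook(path)   # load the workbook into memory
    ws = wb.active             # the worksheet that opens by default

    ws.insert_rows(1, amount=3)    # push the existing rows down by three
    ws["A1"].value = title
    ws["A2"].value = subtitle

    wb.save(path)              # write the changes back to the same file


# Usage (hypothetical file name and header text):
# add_header_rows("totals.xlsx", "Sales by Salesperson", "Monthly report")
```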

In the code above:

  1. We loaded a workbook object, wb. From there, we accessed the worksheet.
  2. We were able to manipulate the worksheet by using the .insert_rows() method to insert three rows.
  3. These rows were added at the very top of the worksheet, above the existing data. We then assigned helpful values to two cells.
  4. Finally, we saved the workbook using the Openpyxl .save() method.

Photo by Towfiqu barbhuiya on Unsplash

In this section, we’ll take a look at adding a chart to the Excel file. One of the great things about the Openpyxl library is that we can create native Excel charts that stay linked to the underlying data, so they update when the data changes.

In order to make our chart dynamic, we need to create Reference() objects which, as the name might imply, hold references to places in our workbook.

We’ll be creating a bar chart and attaching both data and categories to it. Let’s take a look at how we can do this:
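A minimal sketch of a bar chart tied to worksheet cells. The layout is an assumption made for illustration: names in column A, amounts in column B, a header in row 4, data in rows 5 to 7, and the chart anchored at D4. The sample values are made up:

```python
from openpyxl import Workbook
from openpyxl.chart import BarChart, Reference

wb = Workbook()
ws = wb.active

# Hypothetical summary table: header in row 4, data in rows 5-7
rows = [("Salesperson", "Amount"), ("Ana", 150), ("Ben", 225), ("Cara", 90)]
for offset, (name, amount) in enumerate(rows):
    ws.cell(row=4 + offset, column=1, value=name)
    ws.cell(row=4 + offset, column=2, value=amount)

# References point at cells, so the chart reads its values from the sheet
data = Reference(ws, min_col=2, min_row=4, max_row=7)        # header + amounts
categories = Reference(ws, min_col=1, min_row=5, max_row=7)  # names only

chart = BarChart()
chart.add_data(data, titles_from_data=True)  # first cell becomes the series title
chart.set_categories(categories)

ws.add_chart(chart, "D4")  # top-left corner of the chart sits at cell D4
```

Calling wb.save() with a file name of your choice afterwards writes the workbook, chart included, to disk.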

Let’s break down what the code above does:

  1. We set two Reference objects: one for our data and one for our categories. These Reference objects tie our chart to specific cells, allowing it to remain dynamic.
  2. We add a BarChart object. On the chart, we use the .add_data() and .set_categories() methods to pass in our Reference objects.
  3. Finally, we add the chart to a specific anchor point on our worksheet.

In this final section, we’ll take a look at how we can style our workbook using openpyxl. openpyxl can apply the named cell styles that already exist in Excel to cells in a workbook.

This means that we can add styles like Title and currency. Let’s see how this works, using the .style attribute:
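A minimal sketch of assigning built-in styles by name. The cell positions and sample values are assumptions; Title, Headline 1, and Currency are names of built-in styles that openpyxl recognizes:

```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active

# Hypothetical report content: two header cells and two amounts
ws["A1"] = "Sales by Salesperson"
ws["A2"] = "Monthly report"
ws["B5"] = 150
ws["B6"] = 225

# Header cells take a built-in style by name
ws["A1"].style = "Title"
ws["A2"].style = "Headline 1"

# Styles are set one cell at a time, so loop over the rows holding amounts
for row in range(5, 7):
    ws[f"B{row}"].style = "Currency"
```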

In the example above, we used the .style attribute to assign different styles to different cells. The reason we use a for loop is that openpyxl doesn’t let you assign a style to a whole range at once, so we assign it one cell at a time.

Great! You’ve automated your boring Excel reporting! Now, what should you look at next? I’d recommend thinking about other elements that you may need to automate. For example:

  • How can you add names to worksheets?
  • How can you email your resulting file automatically?
  • How can you style values as tables?

Thanks for reading all the way through the tutorial! I hope that you enjoyed it and that you learned something. Using Python to automate your work can be a really rewarding thing to learn. If you want to learn more about it, please consider subscribing to my YouTube channel by clicking here.


