
PowerBI Performance Improvement Strategies Used by Professionals | by Saleha | Dec, 2022


Photo by Campaign Creators on Unsplash

Creating a report is simple. Anyone can do it. However, what distinguishes professionals from others is the report’s performance. Performance is crucial when delivering the report. Improving the performance of Power BI can have several benefits. First, it can make the tool more responsive and easier to use, which can improve user satisfaction and productivity. Second, faster performance can allow for more complex and data-intensive operations, which can enable more advanced analysis and insights. Overall, improving the performance of Power BI can help to ensure that the tool is effective and efficient in supporting the needs of users and their data analysis tasks.

Before we begin, here is a quick summary of Power BI. Power BI is a data analytics and visualization platform that collects, analyzes, and visualizes data from various sources. It includes tools for data acquisition, cleaning, and transformation, as well as a range of pre-built visualizations and dashboards. It connects to a wide range of data sources, including databases, cloud services, and web APIs, which makes it easy to integrate data from multiple sources, analyze it in a single platform, and share the resulting insights with others in the organization.

Let’s get started. I’ve broken down the report creation process into four steps. They are as follows:

  • Data acquisition: Getting Data into PowerBI
  • Data modeling: Establishing relation mapping between data tables
  • Model advancement: Creating measures and new columns if needed
  • Report creation (Add visualization): Put all visuals together

This post focuses on the first step, data acquisition, which includes both Data Connection and Data Load. In the Data Connection section, we explore the two connection types, DirectQuery and Import mode. In the Data Load section, we explore the distinction between Full Load and Incremental Load, along with recommended practices for both Data Connection and Data Load.

Note: Performance improvement for data modeling, model advancement, and report creation will be covered in subsequent articles.

Report generation process

Getting data into Power BI is called Data Acquisition. I divide the data acquisition process into two logical parts: first, creating a connection to the data source, and second, defining how Power BI loads the data on refresh.

1.1: Type of data connection

Data can be acquired in Power BI in two ways:

i. Direct Query

Direct Query is the method where your dashboard will be directly querying the data source at run time. Every filter and interaction with the report will kick off further queries.

Direct Query Diagram

Benefits of using Direct Query:

  • It enables you to interact with your data in real time, ensuring you always have the most up-to-date information. This is especially handy when working with massive, continually changing data sets.
  • You can connect to a wide range of data sources. This includes relational databases, big data sources, and even streaming data.

Considerations to Make When Using DirectQuery (performance wise):

  • The response time of the back-end source determines how quickly visuals refresh, because every DirectQuery request is forwarded to the source database. The recommended response time for DirectQuery visuals, including the return of the requested data, is under five seconds.
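As a minimal sketch of checking a source against the five-second guideline, the snippet below times a simple row-count query. It uses an in-memory SQLite table as a stand-in for the real database; the table name `sales`, the helper `timed_row_count`, and the constant `THRESHOLD_SECONDS` are assumptions made for illustration, not part of Power BI.

```python
import sqlite3
import time

# Hypothetical stand-in for the real source: an in-memory SQLite table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany(
    "INSERT INTO sales (amount) VALUES (?)",
    [(float(i),) for i in range(10_000)],
)

THRESHOLD_SECONDS = 5.0  # the five-second guideline for DirectQuery visuals


def timed_row_count(connection, table):
    """Return (row_count, elapsed_seconds) for SELECT COUNT(*) on `table`."""
    start = time.perf_counter()
    # Table names cannot be bound as parameters; `table` is trusted here.
    (count,) = connection.execute(f"SELECT COUNT(*) FROM {table}").fetchone()
    return count, time.perf_counter() - start


row_count, elapsed = timed_row_count(conn, "sales")
needs_tuning = elapsed > THRESHOLD_SECONDS
print(f"{row_count} rows in {elapsed:.4f}s, tuning suggested: {needs_tuning}")
```

Against a real source, the same timing wrapper would go around a query sent through that database's own driver.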

It is critical that your data source respond quickly. When I have to use DirectQuery against a SQL database, I start by running a SQL query that returns the row count of the table in question. If it takes more than five seconds, I consider database performance tuning. Here are four methods for optimizing performance.

  • Use indexes:
    Indexes are an important aspect of database performance tuning because they allow the database system to quickly locate and retrieve data from tables, which considerably improves query efficiency. Instead of scanning the entire table, the database searches the index and finds the relevant rows.
    – However, indexes do not always improve database performance, and they can even hurt it, because every index must be maintained when rows are inserted or updated. If a query does not filter on the indexed columns, the index will not be used, and the query has to scan the entire table to retrieve the necessary data. If the indexed column is not selective, so that a lookup still matches a large share of the rows, the index may not provide much of a boost because the database must still read a significant number of rows.
  • Type of Joins:
    Joins are an important part of database performance tuning because they can greatly affect the performance of queries that use multiple tables.
    – There are several types of table joins, including inner joins, outer joins, and cross joins. Inner joins are the most common and return rows that have matching values in both tables. Left and right outer joins return all rows from one table and the matching rows from the other. Cross joins return every possible combination of rows from both tables, so their result grows very quickly. Choosing the join type that returns only the rows the report needs, and avoiding accidental cross joins, improves the efficiency of these queries.
  • Create a query plan:
    – A query plan is a detailed description of the actions the database system will take to execute a SQL query. It describes the order of operations, the data structures and algorithms to be used, and other details of how the query will be carried out. Query plans are an important part of database performance tuning because they show how a query will run and help identify areas for improvement.
    – Tuning based on a query plan frequently involves altering the query, the database design, or the database configuration. This could include making better use of indexes, changing the schema to optimize data access, or using other approaches to increase query efficiency.
    – Understanding and analyzing query plans is a crucial component of database performance tuning because it helps ensure that queries are executed efficiently.
  • Query Hints:
    – A query hint is a directive that is used in a SQL query to provide additional information or instructions to the database system on how to perform the query. Query hints can be used to influence the query strategy and hence the query’s performance. Query hints are often used when the database optimizer is unable to select the best plan for a query and the user wishes to provide further direction to the system.
    – A query hint, for example, could be used to identify a certain index to use for a query or to specify how a query should be processed.
    – Performance tuning with query hints can be an effective technique to increase the performance of individual queries, but it must be handled with care. Overuse of query hints can lead to complex and fragile queries that are difficult to maintain and understand, and that may not perform well in different situations.
    – In general, query hints should be used only as a last resort, after all other performance tuning strategies have been explored.
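The effect of an index on a query plan can be sketched as follows. This is an illustration under assumptions rather than a tuning recipe: it uses SQLite's `EXPLAIN QUERY PLAN` on a hypothetical `orders` table, and the helper `plan_for` and the index name `idx_orders_customer` are invented for the example. Other databases expose plans through their own commands and report them in different wording.

```python
import sqlite3

# Hypothetical table standing in for a DirectQuery source.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(5_000)],
)

QUERY = "SELECT SUM(amount) FROM orders WHERE customer_id = 42"


def plan_for(connection, sql):
    """Return the plan description strings SQLite reports for `sql`."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[3] for row in rows]  # the fourth column holds the description


# Without an index, the filter on customer_id requires a full table scan.
plan_before = plan_for(conn, QUERY)

# With an index on the filtered column, the plan switches to an index search.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = plan_for(conn, QUERY)

# A filter on a column that is not indexed still cannot use the index.
plan_unindexed = plan_for(conn, "SELECT COUNT(*) FROM orders WHERE amount > 10")

print(plan_before, plan_after, plan_unindexed, sep="\n")
```

Comparing `plan_before` with `plan_after` shows the plan changing from a scan to a search that names the index, while `plan_unindexed` illustrates the caveat above: an index only helps queries that filter on the indexed columns. The exact plan text varies between SQLite versions.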

ii. Import

Import copies the data from your source and stores it as a snapshot. Whenever the report requests data, it reads from the stored copy instead of connecting to the source.

Import Mode Diagram

Benefits of using Import:

  • Since data is cached, the performance is better than direct query.
  • All DAX expressions are fully supported.

Considerations to make when using Import Mode:

  • Data is not real-time without refresh.
  • Depending on the size of the imported data, a large amount of memory and disk space is needed: on your computer during development, and on the cloud or on-premises server once the report is published.
  • PowerBI file size is limited to 1 GB; if the file is larger, upgrade to a premium license, allowing 50 GB of data.

In a nutshell,

Import mode is the best choice when the data size is modest and some delay is acceptable, because you cannot see the most recent data until the next refresh. When you need up-to-date data, or the dataset is too large to import, DirectQuery is the better option, but you must ensure that your data source performs well.
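The difference between the two connection types can be sketched in a few lines. This is a conceptual illustration only, not how Power BI is implemented: the classes `ImportedTable` and `DirectTable` and the `readings` table are hypothetical, and an in-memory SQLite database plays the role of the data source.

```python
import sqlite3

# Hypothetical data source.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE readings (value INTEGER)")
source.executemany("INSERT INTO readings VALUES (?)", [(1,), (2,), (3,)])


class ImportedTable:
    """Import-style access: keeps a local snapshot, refreshed on demand."""

    def __init__(self, connection):
        self._connection = connection
        self.refresh()

    def refresh(self):
        # Copy the whole table from the source into local memory.
        self._rows = self._connection.execute(
            "SELECT value FROM readings"
        ).fetchall()

    def total(self):
        # Answered from the snapshot, with no round trip to the source.
        return sum(value for (value,) in self._rows)


class DirectTable:
    """DirectQuery-style access: every request is sent to the source."""

    def __init__(self, connection):
        self._connection = connection

    def total(self):
        (result,) = self._connection.execute(
            "SELECT SUM(value) FROM readings"
        ).fetchone()
        return result


imported = ImportedTable(source)
direct = DirectTable(source)

source.execute("INSERT INTO readings VALUES (10)")  # the source changes

stale_total = imported.total()      # 6: still the old snapshot
live_total = direct.total()         # 16: sees the new row immediately
imported.refresh()
refreshed_total = imported.total()  # 16: up to date after a refresh
```

The imported copy answers from memory but lags behind the source until `refresh()` runs, while the direct version is always current at the cost of a query per request.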

Photo by Claudio Schwarz on Unsplash

1.2: Data Loading

When it comes to reports, the data-loading process is crucial. A typical scenario is loading a large dataset from a cloud or on-premises data source into a report.

The data source in Power BI can be an Excel spreadsheet, a JSON file, or any other type of input file, such as XML or CSV. The source is loaded into Power BI with the Get Data button on the Home tab of Power BI Desktop; this is the first step in creating a dataset. This step can take some time, and it may require additional setup if you do not already have access to the data source.

Make sure to use efficient methods for loading data into the report, since this improves overall performance. There are two main load types in Power BI.

i. Full Load

This is the default load type in Power BI. A full load retrieves the complete dataset on every refresh and discards the previously loaded data. It is simple and works for any dataset, but it becomes slow as the data grows.

Full Load for Import Mode

A full load is a good fit when the data is small or when most of it changes between refreshes. It becomes expensive when the data is large, because everything is wiped and reloaded each time. This is where an incremental load helps.

ii. Incremental Load

Assume you have data going back five years. Only the recent data changes, and the rest remains intact. An incremental load is ideal in this situation: it refreshes only the recent data and leaves the historical data untouched.

When you apply an incremental load, it creates dynamic partitions on your data. The partitions separate data that needs to be refreshed regularly from data that can be refreshed less frequently. Because a refresh touches only the recent partitions, it finishes faster and uses fewer resources.
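The partition idea can be sketched in plain Python. This is a simplified illustration, not Power BI's actual mechanism: the yearly partitioning, the `source_rows` list, and the helpers `load_partition` and `refresh` are all assumptions made for the example.

```python
from datetime import date

# Hypothetical source rows: (order_date, amount).
source_rows = [
    (date(2020, 5, 1), 100.0),
    (date(2021, 7, 15), 250.0),
    (date(2022, 3, 9), 80.0),
    (date(2022, 11, 2), 40.0),
]


def load_partition(rows, year):
    """Fetch only the rows that belong to one yearly partition."""
    return [row for row in rows if row[0].year == year]


def refresh(model, rows, years_to_refresh):
    """Reload the listed partitions and leave all others untouched."""
    for year in years_to_refresh:
        model[year] = load_partition(rows, year)
    return model


# The initial load builds every partition once.
all_years = sorted({row[0].year for row in source_rows})
model = refresh({}, source_rows, all_years)
historical_2020 = model[2020]

# Later, a new 2022 order arrives; only the 2022 partition is reloaded.
source_rows.append((date(2022, 12, 24), 60.0))
refresh(model, source_rows, years_to_refresh=[2022])
```

After the second refresh, the 2022 partition contains the new order while the 2020 and 2021 partitions are the same objects that were loaded initially, which is where the savings come from.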

Incremental loading therefore reduces the time and resources required to refresh the dataset and can make Power BI reports more responsive. The image below shows the refresh cycle of an incremental load.

Incremental Load

To use incremental loading, you first need to identify which column in your dataset determines which records are new or updated. This is typically a timestamp. You can then configure the load to bring in only records where the value in this column is greater than the maximum value from the previous load. In Power BI, incremental refresh is set up on a date/time column that is filtered with the reserved RangeStart and RangeEnd parameters.

For example, let’s say you have a dataset containing customer orders, and you want to use incremental loading to only load new orders. You could use the order date as the column to determine which records are new. On the first load, PowerBI would load all of the orders in the dataset. On subsequent loads, PowerBI would only load orders with a date greater than the maximum date from the previous load.
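The customer-orders example can be sketched as a watermark-based load. This is a minimal illustration under assumptions: the `orders` table, the `loaded` list standing in for the report's stored data, and the helper `incremental_load` are hypothetical, and SQLite replaces the real source. Dates are stored as ISO strings so that string comparison matches date order.

```python
import sqlite3

# Hypothetical source table of customer orders.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders (order_id INTEGER, order_date TEXT)")
source.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "2022-12-01"), (2, "2022-12-02"), (3, "2022-12-03")],
)

loaded = []  # stand-in for the rows already held by the report


def incremental_load(connection, target):
    """Append only orders dated after the latest date already loaded."""
    # The watermark is the maximum order date seen so far ("" on first run).
    watermark = max((row[1] for row in target), default="")
    new_rows = connection.execute(
        "SELECT order_id, order_date FROM orders "
        "WHERE order_date > ? ORDER BY order_date",
        (watermark,),
    ).fetchall()
    target.extend(new_rows)
    return len(new_rows)


first_count = incremental_load(source, loaded)   # first run loads all orders
source.execute("INSERT INTO orders VALUES (4, '2022-12-04')")
second_count = incremental_load(source, loaded)  # later runs load only new ones
```

The first call loads all three orders and the second loads only the one added afterwards. A strict greater-than comparison would miss an order that arrives late with a date equal to the current watermark, so a real setup usually re-reads a short recent window instead.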

In a nutshell,

Use Full Load when the data is small enough that reloading everything is cheap. However, if you have a significant amount of data and the majority of the history does not change, Incremental Load is the best option. It decreases load time and so improves performance.

In this article, we explored the data acquisition considerations that improve the performance of Power BI reports. We covered Data Connection, which includes Import mode and DirectQuery, and Data Load, which includes Full Load and Incremental Load. In the following post, we will discuss the factors to consider while developing a data model for optimal performance.

If you liked this article and want to read the next part of this series, please follow me to get a notification when it is released. You can also subscribe to Medium to read articles like this at your convenience.

Image credits

All images, unless otherwise noted, are by the author.


