Matplotlib vs. Plotly: Let’s Decide Once and for All


Deep and rapid comparison in terms of 7 key aspects

Goofy Image by Author

There is an annoying habit of soccer fans. Whenever a young but admittedly exceptional player emerges, they compare him to legends like Messi or Ronaldo. They choose to forget that the legends have been dominating the game since before the newbies had regrown teeth. Comparing Plotly to Matplotlib was, in a sense, similar to that in the beginning.

Matplotlib had been in heavy use since 2003, and Plotly had just come out in 2014.

Many were bored with Matplotlib by this time, so Plotly was warmly welcomed for its freshness and interactivity. Still, the library couldn’t hope to steal the top spot as the king of Python plotting packages from Matplotlib.

In 2019, things changed dramatically when Plotly released its Express API in July. This fueled an explosion of interest in the library, and people started using it left and right.

With another major version (5.0.0) released in June last year, I think Plotly has matured more than enough to be compared to Matplotlib.

With that said, let’s get started:

Custom function to plot the scores. The full function body can be seen on this GitHub gist I created.

1. API usability

Let’s start by comparing the ease of use of their APIs. Both offer high-level and low-level interfaces to interact with the core functionality.

1.1 Consistency of higher-level APIs (Pyplot vs. Express)

On the one hand, Plotly Express excels in consistency. It only contains higher-level functions to access the built-in plots. It does not introduce new ways of performing existing functionality — it is a wrapper. All plot calls to Express return the core Figure object.

On the other hand, the PyPlot interface packages all plotting functions and customizations into a single, new API. Even though plot calls have the same signature, customization functions differ from those in the object-oriented API of Matplotlib.

This means you have to spend your time learning the differences if you want to switch interfaces.

Besides, creating plots returns different objects under the hood in Matplotlib. For example, plt.scatter returns a PathCollection object whereas plt.boxplot returns a dictionary. This is because Matplotlib implements different base classes for each plot type. It can be genuinely confusing for many.

plot_scores(mpl=0, px=1)

1.2 Amount of code required to switch between APIs

To switch from PyPlot to the OOP API of Matplotlib, you simply change the way you interact with the core data structures, such as figure and axes objects. Calls to plots have similar signatures, and the parameter names do not change.
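A small side-by-side sketch of that switch, using toy data of my own. The plot call is identical in both halves; only the object you call it on and the names of the customization functions differ.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt

x = [1, 2, 3, 4]
y = [1, 4, 9, 16]

# PyPlot (state-based) interface
plt.figure()
plt.scatter(x, y, s=20, alpha=0.5)
plt.title("PyPlot")
plt.xlabel("x")
pyplot_ax = plt.gca()  # grab the implicit axes for inspection

# Object-oriented interface: same scatter call, same parameter names
fig, ax = plt.subplots()
ax.scatter(x, y, s=20, alpha=0.5)
ax.set_title("OOP")   # customization functions gain a set_ prefix
ax.set_xlabel("x")
```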

Switching from Plotly Express to Plotly Graph Objects (GO) requires a steep learning curve. The function signatures to create all plots change, and GO adds many more parameters to each plot call. Even though this is done to introduce more customizations, I think plots end up unusually complex.

Another disadvantage is that GO moves some of the core parameters outside the plot calls. For example, creating logarithmic axes can be achieved directly inside a plot in Plotly Express. In GO, you do this with the update_layout or update_xaxes/update_yaxes functions. This is not the case in PyPlot or the OOP API of Matplotlib (parameters don’t move or change names).

plot_scores(mpl=1, px=1)

1.3 Customization API

Even though there is a separate section on customization, we have to talk about it in terms of API.

All customizations in Matplotlib have separate functions. This allows you to change the plot in chunks of code and use loops or other procedures.

In contrast, Plotly uses dictionaries extensively. While this offers a certain consistency in how you interact with the plots and data, it comes at a high cost in code readability and length. As many people prefer the update_layout function, its arguments often end up as a jungle of nested dictionaries.

You may pause and think about these differences between the APIs, but Matplotlib’s is more Pythonic and readable.

plot_scores(2, 1)

2. Speed

To see the true difference between speeds, we have to use bigger datasets. I will import the diamonds dataset from Seaborn and compare the time it takes to create a simple scatterplot.

I will be using the %%timeit magic command, which runs the same chunk of code several times and reports the mean and standard deviation of the run times.

Measuring Matplotlib:
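A rough way to reproduce this kind of measurement outside a notebook is sketched below. It is not the original benchmark: it uses the standard timeit module instead of %%timeit, and synthetic carat and price arrays of about 54,000 points as a stand-in for the diamonds data, so your numbers will differ.

```python
import timeit

import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt

# Assumption: synthetic stand-in for the diamonds dataset (two numeric columns)
rng = np.random.default_rng(0)
carat = rng.uniform(0.2, 5.0, 54_000)
price = carat * 4000 + rng.normal(0, 500, carat.size)

def mpl_scatter():
    fig, ax = plt.subplots()
    ax.scatter(carat, price)
    fig.canvas.draw()  # force rendering so drawing time is included
    plt.close(fig)

# Three independent runs; timeit.repeat returns a list of seconds per run
mpl_times = timeit.repeat(mpl_scatter, number=1, repeat=3)
print(f"Matplotlib: mean {np.mean(mpl_times):.3f}s, std {np.std(mpl_times):.3f}s")
```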

Measuring Plotly:

Matplotlib is almost 80 times faster than Plotly, with lower SD errors. Maybe this is because Plotly renders interactive plots. Let’s check the speeds once again, this time turning off the interactivity:

Unfortunately, turning off interactivity didn’t help much. Matplotlib crushes Plotly in terms of speed:

plot_scores(3, 1)

3. Number of supported plot types

In this area, Plotly takes the lead. From the Plotly API reference, I counted close to 50 unique plots. Plotly is especially superb when it comes to certain types of charts.

For example, it has dedicated support for financial plotting, and it provides the figure_factory subpackage to create more complex, custom charts.

On the other hand, Matplotlib has a modest selection of plots. I don’t think they would quite match the rich selection that Plotly offers even if we added the plots from Seaborn:

plot_scores(3, 2)

4. Interactivity

Um, how can we compare interactivity if only Plotly has this feature?

Not many know this, but outside Jupyter Notebooks, Matplotlib plots render in an interactive mode by default.

Screenshot by author

Unfortunately, this level of interactivity is nothing compared to Plotly’s. So, let’s raise Plotly’s score by one:

plot_scores(3, 3)

Now, for a tiebreaker — apart from general interactivity, Plotly also offers custom buttons:

sliders:

and many more features that take the whole user experience to the next level. This deserves another point:

plot_scores(3, 4)

5. Customizations

For many data scientists, customizations are everything. You might want to create custom themes and use brand colors depending on your project’s data (an excellent example can be seen here for visualizing Netflix data).

With that said, let’s see the most important components you can tune and their differences between the packages.

5.1 Colors and Palettes

Both Matplotlib and Plotly have dedicated sub-modules for colors and palettes.

Matplotlib allows users to change the color of plot elements using color labels, hex codes, RGB and RGBA systems. Most notably, under mpl.colors.CSS4_COLORS, you can pass more than 100 CSS color labels.

Plotly indeed implements the same features, but Matplotlib offers colors from other plotting software like Tableau. Besides, passing colors and palettes in Matplotlib makes less of a mess.

In Plotly, there are no fewer than six arguments dealing with different palettes. In comparison, MPL has only two flexible parameters, color and cmap, that can adapt to any color system or palette you pass.
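A short sketch of the Matplotlib side, with toy data. One detail worth flagging: in scatter, cmap works together with the c argument, which carries the values to be mapped to colors.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.colors as mcolors
import matplotlib.pyplot as plt

print(len(mcolors.CSS4_COLORS))    # number of named CSS colors
print(mcolors.to_hex("tab:blue"))  # Tableau palette name -> #1f77b4

x = [1, 2, 3, 4]
y = [4, 3, 2, 1]

fig, axes = plt.subplots(1, 3, figsize=(9, 3))
axes[0].scatter(x, y, color="rebeccapurple")  # CSS color label
axes[1].scatter(x, y, color="#1f77b4")        # hex code
axes[2].scatter(x, y, c=y, cmap="viridis")    # values mapped through a colormap
```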

plot_scores(4, 4)

5.2 Default styles

During casual analysis, there is no need to go beyond the default settings. These types of analyses can often last long, so these defaults must produce charts in as high quality as possible on the fly.

I think all of us can agree that Matplotlib defaults, well, suck. Look at this scatterplot of height vs. weight of Olympic athletes created in both libraries:

Plotly’s looks better.

Besides, I like how Plotly adheres to data visualization best practices, like using extra colors only when necessary.

For example, Plotly uses uniform colors rather than a palette when creating bar charts or boxplots. Matplotlib does the opposite — it colors every bar or boxplot even though the color doesn’t add new information to the plot.

plot_scores(4, 5)

5.3 Themes

Plotly takes this section solely because it has the fantastic ‘Dark Mode’ (call me subjective, I don’t care). It is easier on the eye and gives plots a feel of luxury (especially when used with red, my favorite):

It looks so slick in Jupyter Lab!

plot_scores(4, 6)

5.4 Global settings

The reason why it is taking me such a long time to switch to Plotly is its lack of features to control the global settings.

Matplotlib has the rcParams dictionary, which you can easily tweak to set plotting options globally. You would think that a library that heavily depends on dictionaries would have a similar dictionary, but no!
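For reference, this is the kind of rcParams tweaking I mean. The specific values are arbitrary; rc_context is shown as a way to apply a setting only temporarily.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt

# Global settings: every figure created afterwards picks these up
plt.rcParams["figure.figsize"] = (8, 5)
plt.rcParams["axes.grid"] = True
plt.rcParams["font.size"] = 12

fig, ax = plt.subplots()
print(fig.get_size_inches())

# Temporary override that only applies inside the with block
with plt.rc_context({"lines.linewidth": 3}):
    fig2, ax2 = plt.subplots()
    (line,) = ax2.plot([0, 1], [0, 1])
```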

Plotly really disappoints me in this regard.

plot_scores(5, 6)

5.5 Axes

The most critical components of axes are tick marks and tick labels.

Honestly, to this day, I don’t have a full grasp on controlling ticks in Matplotlib. This is because Matplotlib does not have a consistent API for controlling axes.

You might blame me for not trying hard enough, but I figured out everything I needed to learn about controlling ticks in Plotly by looking at the documentation once.

Plotly has a single way of working with ticks (through update_xaxes/update_yaxes or update_layout). These functions do not change when you switch between Express and GO, while in Matplotlib, that is not the case.

plot_scores(5, 7)

5.6 How about controlling individual aspects of charts?

We will have to give this one to Matplotlib. I am not doing this just because Plotly is already in the lead and I must keep up the intrigue.

Matplotlib implements individual components as separate classes, making its API very granular. More granular means more options to control the objects visible on the plot.

Take boxplots as an example:

Even though the plot looks bland, it allows virtually unlimited customization. For example, under the boxes key of the returned dictionary, you can access each of the boxplots as Patch objects:

These objects open the doors to all the magic that happens under the hood of Matplotlib. They are not limited to boxplots either, and you can access Patch objects from many other plots. Using these patch objects, you can customize every line, corner, and dot around the shapes in the plot.
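Here is a sketch of that workflow with random data. One assumption to note: the boxes come back as Patch objects only when patch_artist=True is passed; by default Matplotlib draws them as Line2D objects.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
data = [rng.normal(loc, 1.0, 100) for loc in (0, 1, 2)]

fig, ax = plt.subplots()
# patch_artist=True makes each box a Patch instead of a Line2D
parts = ax.boxplot(data, patch_artist=True)
print(sorted(parts.keys()))  # the components exposed by the returned dict

# Style each box individually through its Patch object
for box, color in zip(parts["boxes"], ["tab:blue", "tab:orange", "tab:green"]):
    box.set_facecolor(color)
    box.set_alpha(0.6)

# Other components (medians, whiskers, caps, fliers) are Line2D objects
for median in parts["medians"]:
    median.set_color("black")
```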

plot_scores(6, 7)

6. Scatterplots

Scatterplots play a pivotal role in statistical analysis.

They are used to understand correlation and causation, detect outliers and plot the line of best fit in linear regression. So, I decided to dedicate an entire section to compare both libraries’ scatterplots.

For that, I will choose the height vs. weight scatterplot from the earlier section:

Matplotlib’s default scatterplot.

I know, disgusting to look at. However, watch as I perform some customizations that turn the plot into a piece of (well, I won’t say art):

The same scatterplot after decreasing marker sizes and making them less opaque

Before applying the last step, we can see that the dots are grouped around distinct rows and columns. Let’s add jittering and see what happens:

The same scatterplot after jittering.
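The transformations can be approximated with the sketch below. It uses synthetic heights and weights rounded to whole units rather than the Olympic athletes data, and the jitter helper and its width are my own choices.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt

rng = np.random.default_rng(42)

# Assumption: synthetic measurements rounded to whole units, like the real data
height = np.round(rng.normal(175, 10, 5000))
weight = np.round(rng.normal(70, 12, 5000))

def jitter(values, width=0.5):
    """Add uniform noise in [-width, width] to break up rounded values."""
    return values + rng.uniform(-width, width, size=values.shape)

fig, (left, right) = plt.subplots(1, 2, figsize=(10, 4))
left.scatter(height, weight)  # defaults: large, opaque markers
right.scatter(jitter(height), jitter(weight), s=1, alpha=0.3)  # small, transparent, jittered
left.set_title("Initial")
right.set_title("After transformations")
```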

Now compare the initial plot to the last:

Left — initial plot. Right — after transformations.

We wouldn’t even have known the dots are grouped as rows and columns in Plotly. It does not allow changing the marker size any smaller than the default.

This means we wouldn’t be able to jitter the distributions to account for the fact that weights and heights were rounded off at discrete values.

plot_scores(7, 7)

7. Documentation

Wow! It has been neck-and-neck up to this point.

As a final component and a tiebreaker, let’s compare the documentation.

When I was a beginner, Matplotlib documentation was the last place I expected to find answers to my questions.

First, you can’t read a single page of the documentation without opening several other linked pages. The documentation is a jumbled monstrosity.

Second, its tutorials and example scripts seem to be written for academic researchers, and it is almost like Matplotlib tries to intimidate beginners on purpose.

In contrast, Plotly is much more organized. It has a full API reference, and its tutorials are mostly stand-alone.

It is not perfect, but at least it is nice to look at — it does not feel like you are reading a newspaper page from the 90s.

plot_scores(7, 8)

> 2022 update: Matplotlib completely revamped its docs, and it is positively pleasing to look at. We could call this section a tie.

Summary

Truthfully, I went into writing this comparison entirely sure that Matplotlib would end up winning.

By the time I was halfway through, I knew that the opposite would happen, and I was right.

Plotly is genuinely a remarkable library. Even though I let my personal preferences and biases affect the scores, no one can deny that the package has already achieved many milestones and is still developing rapidly.

This post was not meant to convince you to ditch one package and use the other. Rather, I wanted to highlight the areas each package excels at and show what you can create if you add both libraries to your toolbelt.

Thank you for reading!

