8 Tips for Creating Data Visualizations in Python using Bokeh | by Payal Patel | Dec, 2022


Photo by Lukas Blazek on Unsplash

Python is a great open-source tool to create data visualizations. There are many data visualization libraries available including Matplotlib, Seaborn, and Bokeh.

Bokeh is a Python data visualization library designed to create interactive charts. Although it is free to use, learning its specifics often takes a significant amount of time.

Below are a few tips I’ve learned along the way to create data visualizations using Bokeh.

Note: All examples shown, including code and datasets, are available here.

1 — Format chart title, axis titles, and axis labels for easy readability

Well-formatted titles and labels can improve the readability and usability of a chart. Titles, labels and axes that are easy to read allow users to quickly view and process a visualization.

When designing a data visualization, it’s important to keep in mind the format of the titles and labels including font-weight, font-size, and font-color.

For example, the following bar chart displays the average salary for data science professionals by experience level. This visualization is using the default values for the titles and labels.

Image by Author: Bar graph with default title and axis labels

Changing the default settings can enhance a data visualization. Below are a few techniques I use when formatting chart titles and axes using Bokeh.

  • Bold Chart & Axis Titles: Bold titles to make them stand out. This makes it easier for the user to quickly find identifying information.
  • Center Chart Title: By default, the chart title is left aligned. Center the title to balance the visualization with a symmetrical element.
  • Increase Font of the Chart & Axis Titles: Increase the font-size of the titles to make them easier to read. Ensure the axis titles are smaller than the chart title, so they don’t overpower the chart.
  • Set Axis Range: By default, Bokeh fits the axis ranges to the data, so an axis does not necessarily start at 0. As we can see in the example above, this results in bars that appear to float above the x-axis. To set the range of the x- or y-axis, use the Range1d class.
  • Modify Axis Labels with Tick Formatters: Bokeh contains several tick formatters, such as NumeralTickFormatter and CategoricalTickFormatter. These tick formatters change the format of the tick labels on the x- and y-axes. In the above example, the NumeralTickFormatter can remove the scientific notation from the y-axis. Check the Bokeh documentation to view the formatters available for your version.
  • Override Axis Labels with Custom Labels: Customization helps when the labels contain large numbers in the thousands, millions, etc. Use custom labels to create an alias, or shorthand. For example, if a label shows a value of “1,000,000”, a custom label can change the display to “1M” for easy readability.
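The techniques in this list can be sketched in a few lines. This is a minimal example rather than the code behind the charts in this article: the experience levels, salary figures, font sizes, and tick positions are made-up values chosen only for illustration.

```python
from bokeh.models import NumeralTickFormatter, Range1d
from bokeh.plotting import figure

# Hypothetical data: average salary (USD) by experience level
levels = ["Entry", "Mid", "Senior", "Executive"]
salaries = [60000, 88000, 138000, 199000]

p = figure(x_range=levels, title="Average Salary by Experience Level",
           x_axis_label="Experience Level", y_axis_label="Average Salary (USD)")
p.vbar(x=levels, top=salaries, width=0.8)

# Bold, centered, larger chart title
p.title.text_font_style = "bold"
p.title.align = "center"
p.title.text_font_size = "16pt"

# Bold axis titles, kept smaller than the chart title
p.xaxis.axis_label_text_font_style = "bold"
p.yaxis.axis_label_text_font_style = "bold"
p.xaxis.axis_label_text_font_size = "12pt"
p.yaxis.axis_label_text_font_size = "12pt"

# Start the y-axis at 0 so the bars sit on the x-axis
p.y_range = Range1d(0, 250000)

# Show plain numbers with thousands separators instead of scientific notation
p.yaxis.formatter = NumeralTickFormatter(format="0,0")

# Optional shorthand labels: fix the tick positions, then override their text.
# Overridden ticks show the custom label; the rest still use the formatter.
p.yaxis.ticker = [0, 50000, 100000, 150000, 200000, 250000]
p.yaxis.major_label_overrides = {
    50000: "50K", 100000: "100K", 150000: "150K", 200000: "200K", 250000: "250K",
}
```

Call show(p) to render the chart. In practice you would likely pick either the tick formatter or the custom labels for a given axis, not both.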

The following image shows the bar graph after applying these techniques. While the changes are subtle, they improve the readability and usability of the chart.

Image by Author: Bar graph with modified titles and labels

2 — Create Interactive Legends

Legends provide useful context for data visualizations, allowing users to quickly identify similar data points, whether through color, shape, or size. Using the Bokeh library in Python, users can create interactive legends, which let viewers hide or mute portions of the data. Interactive legends are helpful when there are a large number of groups, or if several data points overlap.

For example, the following line graph displays the total acres burned by county due to California wildfires from 2013 to 2019. There are 6 counties represented on this line graph, with several overlapping lines.

Image by Author: California Wildfires Line Graph

Creating an interactive legend allows users to select items to remove from view. Items can be “hidden” or “muted” by setting the legend’s click_policy to “hide” or “mute”, as seen with the command below.

# Set Legend Click Policy for Figure 'p' 
p.legend.click_policy="mute"

I prefer to mute items in a legend as muted items appear grayed-out, rather than completely removed from view. This way users can focus on the groups they want to, while ensuring the full dataset is represented.

Note: If muting items, the muted_color and muted_alpha fields should be specified for each line plotted. View the full code for this visualization here.
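To make the note above concrete, here is a minimal sketch of a line graph with a muting legend. The county names and acreage numbers are placeholders, not the wildfire dataset used in this article.

```python
from bokeh.plotting import figure

# Hypothetical data: acres burned per year for three made-up counties
years = [2013, 2014, 2015, 2016, 2017, 2018, 2019]
acres = {
    "County A": [1200, 900, 4300, 2100, 8800, 6400, 1500],
    "County B": [800, 1100, 3900, 2500, 7600, 9100, 1300],
    "County C": [300, 700, 1200, 900, 2200, 3100, 600],
}
colors = ["firebrick", "navy", "green"]

p = figure(title="Acres Burned by County",
           x_axis_label="Year", y_axis_label="Acres Burned")

# Plot each county as its own line so it gets its own legend entry.
# muted_color and muted_alpha control how the line looks once muted.
for (county, values), color in zip(acres.items(), colors):
    p.line(years, values, legend_label=county, color=color, line_width=2,
           muted_color="gray", muted_alpha=0.2)

# Clicking a legend entry now grays out the matching line
p.legend.click_policy = "mute"
```

Call show(p) and click a county in the legend to mute it. Switching the click policy to "hide" removes the line from view instead.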

In the California Wildfires Line Graph, the click_policy is set to “mute”, with gray as the muted_color and 0.2 as the muted_alpha value. By muting specific counties, users can compare the remaining counties more quickly and efficiently. For example, comparing wildfires in Los Angeles with wildfires in San Diego is difficult in the full chart because several lines overlap. In the following image, all counties except for Los Angeles and San Diego have been muted, allowing for easier comparison of the total acres burned between these two counties.

Image by Author: California Wildfires Line Graph with Select Counties Muted

Interactive legends can be applied to other visualizations as well. The following scatterplot shows the relationship between student math and reading scores by race. Adding an interactive legend helps here as there are several overlapping data points. To create a scatterplot with an interactive legend, plot each group individually. We can see that by muting groups A, B, and C, we are able to compare groups D and E with ease. View the full code for this data visualization here.
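Plotting each group individually can be sketched as follows. The records, group names, and colors are invented for illustration, and the grouping is done in plain Python to keep the example self-contained.

```python
from collections import defaultdict

from bokeh.plotting import figure

# Hypothetical records: (group, math score, reading score)
records = [
    ("group A", 62, 70), ("group A", 55, 58),
    ("group B", 71, 75), ("group B", 80, 78),
    ("group C", 90, 95), ("group C", 66, 72),
]
colors = {"group A": "crimson", "group B": "royalblue", "group C": "seagreen"}

# Split the records so each group can get its own glyph and legend entry
by_group = defaultdict(lambda: {"math": [], "reading": []})
for group, math_score, reading_score in records:
    by_group[group]["math"].append(math_score)
    by_group[group]["reading"].append(reading_score)

p = figure(title="Student Math and Reading Scores",
           x_axis_label="Math Score", y_axis_label="Reading Score")

# One scatter call per group: a single call for all points would produce
# only one legend entry, leaving nothing to mute individually
for group, scores in by_group.items():
    p.scatter(scores["math"], scores["reading"], size=8, color=colors[group],
              legend_label=group, muted_color="gray", muted_alpha=0.1)

p.legend.click_policy = "mute"
```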

Image by Author: Scatterplot with interactive legend

3 — Maximize space by placing the legend outside the figure

In Python, many data visualizations, such as line graphs, scatterplots, and bar graphs, allow you to add a legend with a simple command. Often users will either leave the default placement of the legend, or move the legend somewhere within the figure containing the visualization, such as on the top left or right. While in most cases this is fine, there are cases when the legend covers key areas of the visualization. In the Average Chocolate Rating by Country bar chart below, the default legend location covers Belgium and the U.K., making it difficult to determine if one is greater than the other.

Image by Author: Bar graph with default legend placement

If a visualization has several data points, or if adding a legend to a visualization results in covering key information, place the legend to the side of the figure.

Image by Author: Bar graph with legend placed to the right of the figure

By moving the legend outside of the frame, we are able to see the visualization in its entirety while keeping the legend as a useful reference.

To add a legend to the right of the figure p, use the following command.

p.add_layout(Legend(), 'right')

Note: To use the legend feature in Bokeh, import the Legend class as shown below. Range1d is only needed if you also set the axis range, as described in tip 1. View the full code for this visualization here.

from bokeh.models import Legend
from bokeh.models import Range1d
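Here is a minimal sketch of the full pattern, with made-up countries, ratings, and colors. One assumption worth calling out: the empty Legend is added before any glyphs are drawn, so that the legend_label entries end up in it. As far as I can tell, adding it afterwards leaves the automatically created legend inside the plot and an empty one on the right.

```python
from bokeh.models import Legend
from bokeh.plotting import figure

# Hypothetical data: average chocolate rating by country
countries = ["Belgium", "U.K.", "France"]
ratings = [3.10, 3.05, 3.25]
colors = ["saddlebrown", "peru", "tan"]

p = figure(x_range=countries, title="Average Chocolate Rating by Country")

# Place an empty legend to the right of the plot area first
legend = Legend()
p.add_layout(legend, "right")

# Draw one bar per country; each legend_label is added to the legend above
for country, rating, color in zip(countries, ratings, colors):
    p.vbar(x=[country], top=[rating], width=0.8, color=color,
           legend_label=country)
```

Call show(p) to render the chart with the legend outside the plot area.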

4 — Add tooltips

Tooltips, often referred to as hover text, are the text that appears when you move your cursor over a visualization, or parts of a visualization. Use tooltips to share additional information with viewers. The Bokeh library allows for tooltips to be added to several visualization types including bar charts, line graphs, and scatterplots.

To add tooltips to a visualization, import the HoverTool class, as shown below.

from bokeh.models.tools import HoverTool

Take for example the following bar chart, Average Chocolate Rating by Country. The tooltip in this chart displays the Country Name and Average Rating for the country the cursor is hovering on.

Image by Author: Bar graph with tooltip enabled

Similar to formatting chart titles and labels, you’ll also want to keep in mind how you style the text in a tooltip!

In the following bar chart, Average Salary by Experience Level, the tooltip contains information regarding the Experience Level and Average Salary in US Dollars.

Image by Author: Bar graph with tooltip enabled

By default, the Average Salary text is not automatically formatted as currency; however, with a couple of modifications, we can format the text to include a dollar sign, and commas. The following line of code was added when creating the data visualization to format the tooltip. View the full code for this visualization here.

# Add hover text
p.add_tools(HoverTool(tooltips=[("Experience Level", "@types"),
                                ("Average Salary (USD)", "$@values{0,0}")]))
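The @types and @values references in the tooltip point at columns of the data source behind the bars. The sketch below shows one way that could fit together. The column names match the snippet above, but the data is invented and the full code for this article may be organized differently.

```python
from bokeh.models import ColumnDataSource, HoverTool
from bokeh.plotting import figure

# Hypothetical data; the column names are what the tooltip refers to with "@"
source = ColumnDataSource(data=dict(
    types=["Entry", "Mid", "Senior", "Executive"],
    values=[60000, 88000, 138000, 199000],
))

p = figure(x_range=source.data["types"],
           title="Average Salary by Experience Level")
p.vbar(x="types", top="values", width=0.8, source=source)

# "@types" and "@values" are replaced with the hovered bar's column values.
# "{0,0}" adds thousands separators; the leading "$" is shown as literal text.
hover = HoverTool(tooltips=[
    ("Experience Level", "@types"),
    ("Average Salary (USD)", "$@values{0,0}"),
])
p.add_tools(hover)
```

Any extra column added to the data source can be shown in the tooltip the same way, even if it is not plotted on either axis.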

Tooltips can also contain information outside of the x and y axis. The following scatterplot shows student math and reading scores by gender. The tooltip in this example shows the gender, math score, and reading score for each student.

Image by Author: Scatterplot with tooltip enabled

5 — Use tabs to organize data visualizations

With Bokeh, data visualizations can be displayed using tabs. Similar to a dashboard, each tab consists of its own content. Tabs display multiple visualizations that are related to one another, without having to generate a dashboard, or scroll through several images in a Jupyter Notebook. They are also useful to display different views of the same graph.

The following image shows how to use tabs to display variations of a scatterplot. The first scatterplot shows student math and reading scores by gender, while the second scatterplot shows student math and reading scores by race.

Image by Author: Tabs Object in Bokeh Example

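To create an object with tabs, import the Tabs and Panel widgets with the command below. Note that Bokeh 3.0 renamed Panel to TabPanel, so on Bokeh 3.0 or later, import TabPanel from bokeh.models instead and use it wherever Panel appears.

from bokeh.models.widgets import Tabs, Panel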

Once the figures have been created, they can be added to a tabbed object. The following code snippet shows how the tabbed object for the student grades scatterplots was created. View the full code for this visualization here.

# Create tab panel for scatterplot visualizations

# Create the two panels
tab1 = Panel(child = p, title = 'Student Math and Reading Scores by Gender')
tab2 = Panel(child = r, title = 'Student Math and Reading Scores by Race')

# Add the tabs into a Tabs object
tabs_object = Tabs(tabs = [tab1, tab2])

# Output the plot
show(tabs_object)

While the above example shows one visualization per tab, it is possible to add multiple visualizations to each tab — just remember to keep in mind the layout and overall flow!
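As a rough sketch of a tab that holds more than one visualization, the example below places two figures side by side with row() inside the second tab. The scores are invented, and the try/except import is my own workaround for the Panel to TabPanel rename in Bokeh 3.0, not something taken from the code for this article.

```python
from bokeh.layouts import row
from bokeh.plotting import figure

try:
    from bokeh.models import TabPanel, Tabs  # Bokeh 3.0 and later
except ImportError:
    from bokeh.models import Panel as TabPanel, Tabs  # Bokeh 2.x

# Hypothetical scores
math_scores = [55, 62, 71, 80, 90]
reading_scores = [58, 70, 75, 78, 95]
writing_scores = [60, 68, 73, 82, 93]


def make_scatter(title, x, y, x_label, y_label):
    """Build a simple scatterplot for one pair of score lists."""
    fig = figure(title=title, x_axis_label=x_label, y_axis_label=y_label)
    fig.scatter(x, y, size=8)
    return fig


p = make_scatter("Math vs. Reading", math_scores, reading_scores,
                 "Math Score", "Reading Score")
r = make_scatter("Math vs. Writing", math_scores, writing_scores,
                 "Math Score", "Writing Score")
s = make_scatter("Reading vs. Writing", reading_scores, writing_scores,
                 "Reading Score", "Writing Score")

# First tab holds a single figure; second tab holds two figures in a row
tab1 = TabPanel(child=p, title="Math and Reading")
tab2 = TabPanel(child=row(r, s), title="Writing Comparisons")

tabs_object = Tabs(tabs=[tab1, tab2])
```

Call show(tabs_object) to render the tabs.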

6 — Remove gridlines

By default, gridlines appear on data visualizations created using Bokeh. Reduce visual clutter by removing gridlines from the visualization. This makes it easier for users to view and interpret the data at hand.

Looking at the Average Salary by Experience Level bar graph below, we see the gridlines that are automatically added.

Image by Author: Bar graph with default gridlines

By removing the gridlines, the visualization becomes less cluttered, as seen in the following image.

Image by Author: Bar graph with gridlines removed

In Bokeh, removing gridlines is a quick process, and can be done by setting the grid_line_color to None (the Python value, not the string “None”). View the full code for this visualization here.

p.xgrid.grid_line_color = None
p.ygrid.grid_line_color = None
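If removing every gridline makes values hard to read, a middle ground is to drop only the vertical gridlines and fade the horizontal ones. This is a sketch of that variation with made-up data and styling values, not the approach used for the charts above.

```python
from bokeh.plotting import figure

# Hypothetical data
levels = ["Entry", "Mid", "Senior", "Executive"]
salaries = [60000, 88000, 138000, 199000]

p = figure(x_range=levels, title="Average Salary by Experience Level")
p.vbar(x=levels, top=salaries, width=0.8)

# Remove the vertical gridlines entirely
p.xgrid.grid_line_color = None

# Keep the horizontal gridlines, but make them faint and dashed
p.ygrid.grid_line_alpha = 0.3
p.ygrid.grid_line_dash = [6, 4]
```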

7 — Use pre-defined colors and color palettes

Color is a key part of any data visualization, and deciding on the right colors to use can take time. The Bokeh library comes with several pre-defined colors and color palettes.

Available color palettes can vary depending on the version of Bokeh you are using. To view the list of individual color names, see the Bokeh documentation here.

To see the available color palettes for a specific version, check the Bokeh official documentation, or run the below command. This command lists the available color palettes based on the Bokeh version running.

import bokeh.palettes
bokeh.palettes.all_palettes.keys()

Bokeh color palettes come in different sizes. To view the HEX colors in a color palette, and the sizes available, use the following command. This command lists the available sizes for the ‘Set3’ color palette, including the HEX colors for each size.

bokeh.palettes.all_palettes['Set3']

To import a color palette of a specific size, run the following command. This command imports the size 3 Set3 color palette.

from bokeh.palettes import Set3_3

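Alternatively, import all sizes of a color palette by specifying the capitalized palette name, which gives a dictionary keyed by size. The following example imports all sizes of the Cividis palette. (The lowercase cividis is a function instead: cividis(n) generates a palette of n colors.)

from bokeh.palettes import Cividis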
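The different ways of getting at a palette can be compared side by side. This sketch assumes a reasonably recent Bokeh release, where the capitalized Cividis is a dictionary keyed by size and the lowercase cividis is a function that returns a requested number of colors. The bar chart data is made up.

```python
from bokeh.palettes import Cividis, Set3_3, all_palettes, cividis
from bokeh.plotting import figure

# A fixed-size palette: three HEX color strings
three_colors = list(Set3_3)

# The same palette looked up by name and size
same_three = list(all_palettes["Set3"][3])

# All predefined sizes of the Cividis palette
cividis_sizes = sorted(Cividis.keys())

# A Cividis palette of any length, generated on demand
five_colors = list(cividis(5))

# Use a palette to color bars, one color per category (hypothetical data)
categories = ["A", "B", "C"]
p = figure(x_range=categories, title="Bars Colored with Set3_3")
p.vbar(x=categories, top=[3, 5, 2], width=0.8, color=three_colors)
```

Print any of these variables to inspect the HEX values before using them in a chart.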

It can be difficult to interpret HEX colors. To quickly view these HEX colors, you can use a function like the one below.

from IPython.display import Markdown, display

def printColorPalette(color_palette):
    display(Markdown('<br>'.join(
        f'<span style="color: {color}; font-family: courier;"><span>{color}: </span>&#9608;&#9608;&#9608;&#9608;&#9608;&#9608;&#9608;</span>'
        for color in color_palette
    )))

This function takes a list of HEX color codes and displays each code next to a block of the corresponding color. The following image shows output for the Cividis, Set3, and Spectral color palettes of various sizes.

Image by Author: Using function to print various color palettes

For more examples view the full code here.

8 — Display visualizations directly in Jupyter Notebook

When creating Bokeh visualizations in Jupyter Notebook, the default setting displays the output in a new webpage. Display the visualization directly in a notebook to quickly troubleshoot and develop visualizations.

To display Bokeh data visualizations in Jupyter Notebook, import the following.

from bokeh.io import output_notebook, show
from bokeh.resources import INLINE

Prior to developing any visualizations, call Bokeh’s output_notebook() function as shown below.

output_notebook(resources=INLINE)

Once the output for the visualizations has been set, use the show() function for each data visualization to show the output in the notebook.
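Put together, the steps above might look like the sketch below. The helper names make_scatter and show_in_notebook are hypothetical and the scores are invented. The Bokeh calls are wrapped in functions here only so the snippet can be imported without rendering anything.

```python
from bokeh.io import output_notebook, show
from bokeh.plotting import figure
from bokeh.resources import INLINE


def make_scatter():
    """Build a small scatterplot from hypothetical scores."""
    p = figure(title="Math vs. Reading Scores",
               x_axis_label="Math Score", y_axis_label="Reading Score")
    p.scatter([55, 62, 71, 80, 90], [58, 70, 75, 78, 95], size=8)
    return p


def show_in_notebook(fig):
    """Route Bokeh output to the notebook, then render the figure inline."""
    output_notebook(resources=INLINE)
    show(fig)
```

In a notebook cell, show_in_notebook(make_scatter()) renders the chart below the cell. In a real notebook you would normally call output_notebook(resources=INLINE) once near the top and then just call show() for each figure.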

For example, the following image displays how the show() function is called to display figure p, a scatterplot, within a Jupyter Notebook.

Image by Author: Data visualization displayed in Jupyter Notebook

Displaying the visualizations directly in the notebook helps keep the visualizations in one document. This makes it easy to reference the visualizations later without having to rerun the entire notebook.

These are a few ways to enhance your data visualizations with Bokeh! All examples, including code and datasets, are available here.


