
Analyze Data from the World Health Organization Global Health Observatory | by Randy Runtsch | Nov, 2022



Use the WHO’s Global Health Observatory OData API, Python, and Tableau to analyze worldwide health-related data

Holding hands with care. Photo by National Cancer Institute on Unsplash.

The World Health Organization (WHO), founded in 1948, is the United Nations (UN) agency that connects nations, partners and people to promote health, keep the world safe, and serve the vulnerable. The WHO’s Global Health Observatory (GHO) is the UN’s gateway to healthcare statistics for its 194 member countries. This article shows you how to retrieve healthcare statistics data with the GHO OData API and Python and transform it from a JSON structure into a CSV file. The CSV file will be ready to import into data visualization tools such as Tableau.

The sections below cover these topics:

  • The GHO OData API and indicators
  • A Python program to retrieve and transform GHO healthcare statistics
  • A Tableau workbook to visualize healthcare statistics data

The GHO OData API is described here. The datasets you can access with the API are identified as indicators.

A full alphabetized list of indicators is in this Indicators Index. See example indicators in the screenshot below.

First page from the GHO indicators index. Screenshot by Randy Runtsch.

Clicking on the name of any indicator will present information about the data, its metadata, related index, and sometimes visualizations. For example, scrolling through the indicators index and clicking on “Adolescent birth rate (per 1000 women aged 15–19 years)” returns the visualization shown below.

Data visualization for the GHO “Adolescent birth rate…” indicator. Screenshot by Randy Runtsch.

The indicator used in this article is called “Probability of dying between age 30 and exact age 70 from any of cardiovascular disease, cancer, diabetes, or chronic respiratory disease.” You can find it on the Indicators page by searching on “cardiovascular” as shown in the screenshot below.

The indicator of interest on the GHO indicators web page. Screenshot by Randy Runtsch.

To retrieve a dataset with the GHO OData API, you will need to know its indicator code (IndicatorCode) value. To find this value, navigate to this indicator page in JSON format:

https://ghoapi.azureedge.net/api/Indicator
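The same indicator list can also be searched from Python rather than by eye. The following is a minimal sketch, not part of the original program. It assumes the response is a JSON object whose "value" key holds the list of indicators, and that each entry carries "IndicatorCode" and "IndicatorName" fields; the function names are hypothetical.

```python
import json
from urllib.request import urlopen

INDICATOR_URL = "https://ghoapi.azureedge.net/api/Indicator"


def fetch_indicators(url=INDICATOR_URL):
    """Download the indicator list and return it as a list of dictionaries.

    Assumes the OData response wraps the list in a "value" key.
    """
    with urlopen(url, timeout=60) as response:
        return json.load(response)["value"]


def find_indicators(indicators, text):
    """Return (IndicatorCode, IndicatorName) pairs whose name contains text.

    The match is case-insensitive; entries without a name are skipped.
    """
    needle = text.lower()
    return [
        (item["IndicatorCode"], item["IndicatorName"])
        for item in indicators
        if needle in (item.get("IndicatorName") or "").lower()
    ]
```

Calling find_indicators(fetch_indicators(), "cardiovascular") should list the matching indicator codes and names, provided the service is reachable.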

In the first screenshot below, the IndicatorCode of “AIR_10” is the identifier for the “Ambient air pollution…” indicator.

List of GHO indicators in JSON format. Screenshot by Randy Runtsch.

By scanning or searching the page, you will find an IndicatorCode value of “NCDMORT3070” for the indicator called “Probability (%) of dying between age 30…” We will use this IndicatorCode value to identify the dataset to retrieve with the GHO OData API.

Indicator with IndicatorCode value of “NCDMORT3070.” Screenshot by Randy Runtsch.

Datasets retrievable with the GHO OData API can be viewed in JSON format by appending the indicator code to the URL https://ghoapi.azureedge.net/api/. For the dataset used in this article, the full URL is https://ghoapi.azureedge.net/api/NCDMORT3070. Navigate to this address in a web browser to view the data, as shown in the screenshot below.
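The URL construction described above can be sketched in a few lines of Python. This is an illustration only: the helper names are hypothetical, and the download function assumes the records arrive under a "value" key in the JSON response.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "https://ghoapi.azureedge.net/api"


def dataset_url(indicator_code, base_url=BASE_URL):
    """Build the dataset URL by appending the indicator code to the base URL."""
    return f"{base_url.rstrip('/')}/{quote(indicator_code)}"


def fetch_records(indicator_code):
    """Download one dataset and return its records as a list of dictionaries."""
    with urlopen(dataset_url(indicator_code), timeout=60) as response:
        return json.load(response)["value"]
```

For example, dataset_url("NCDMORT3070") yields the address given above for the dataset used in this article.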

Note that you may need to install a browser extension to view the data in properly formatted JSON. On my computer, I use the free JSON-Handle extension for the Microsoft Edge browser. It can be found here.

The Python program described below will use the fields shown within the red boxes above. It will retrieve all records for the dataset identified by the IndicatorCode value of “NCDMORT3070” but will process only records with a SpatialDimType of “COUNTRY.” From each of those records, it will extract the values for country code (SpatialDim), year (TimeDim), sex (Dim1), and numeric value (NumericValue).
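The filtering and field selection just described might look like the sketch below. The field names come from the text above. The Dim1 codes for sex (MLE, FMLE, BTSX, possibly prefixed with SEX_) are an assumption about the data, as are the function names, and the sample records are made-up placeholders rather than real GHO values.

```python
# Assumed Dim1 codes for sex; check them against the JSON before relying on this.
SEX_LABELS = {"MLE": "Male", "FMLE": "Female", "BTSX": "Both sexes"}


def sex_label(dim1):
    """Convert a Dim1 code to a readable label, passing unknown codes through."""
    code = str(dim1 or "")
    if code.startswith("SEX_"):
        code = code[4:]
    return SEX_LABELS.get(code, code)


def country_rows(records):
    """Yield (country_code, year, sex, value) for country-level records only."""
    for record in records:
        if record.get("SpatialDimType") != "COUNTRY":
            continue  # skip regions and other non-country groupings
        yield (
            record["SpatialDim"],
            record["TimeDim"],
            sex_label(record.get("Dim1")),
            record["NumericValue"],
        )


# Placeholder records for illustration only (not real GHO data).
sample_records = [
    {"SpatialDimType": "COUNTRY", "SpatialDim": "AAA", "TimeDim": 2000,
     "Dim1": "MLE", "NumericValue": 1.5},
    {"SpatialDimType": "REGION", "SpatialDim": "RRR", "TimeDim": 2000,
     "Dim1": "BTSX", "NumericValue": 2.5},
]

for row in country_rows(sample_records):
    print(row)
```

Only the first placeholder record is printed, because the second has a SpatialDimType other than “COUNTRY.”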

For the project described here, I used Microsoft Visual Studio Community 2022 for Python programming and Tableau Public Desktop and the Tableau Public website for data visualization. For Python programming, feel free to use whatever editor or integrated development environment (IDE) you prefer.

Visual Studio Community and Tableau Public are free tools that you can install from these locations:

Please note that while the commercial version of Tableau will allow you to save data visualization workbooks to a local drive or server, in Tableau Public, all visualization workbooks may be saved only to the Tableau Public server. Plus, the visualizations will be visible to the general public.

Program Overview

The Python program shown in the next section is divided into these two modules:

  • Class c_who_mortality_data, in module c_who_mortality_data.py, retrieves data for indicator NCDMORT3070 in JSON format, converts it to a Python list of dictionaries, and writes mortality records to a CSV file.
  • Module get_who_mortality_data.py is the driver program. It simply calls class c_who_mortality_data with the name of the output file where the program will write the data records in CSV format.

Here is the pseudocode of the program:

  • Call c_who_mortality_data with the output file name.
  • Request the data for NCDMORT3070 in a JSON stream.
  • Convert the JSON-formatted data to a Python list of dictionaries.
  • Open the output CSV file in write mode.
  • Write the column headers to the file.
  • Iterate through each record in the list. For each record where SpatialDimType = “COUNTRY,” create an output record in CSV format with the country code, year, sex (converted from the Dim1 value), and value. Write the output record to the file.

Note that dataset NCDMORT3070 contains SpatialDimType values in addition to “COUNTRY.” The records with those other values will be ignored by this program.
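The pseudocode above can be sketched as a single class. This is a minimal illustration under stated assumptions, not the author's listing: it uses urllib from the standard library, assumes the records sit under a "value" key in the response, assumes Dim1 sex codes of MLE, FMLE, and BTSX (optionally prefixed with SEX_), and adds a hypothetical optional records argument so the class can be exercised without a network connection.

```python
import csv
import json
from urllib.request import urlopen


class c_who_mortality_data:
    """Retrieve NCDMORT3070 records and write country-level rows to a CSV file."""

    URL = "https://ghoapi.azureedge.net/api/NCDMORT3070"
    # Assumed Dim1 codes for sex; unknown codes are written unchanged.
    SEX = {"MLE": "Male", "FMLE": "Female", "BTSX": "Both sexes"}

    def __init__(self, out_file_name, records=None):
        self.out_file_name = out_file_name
        # Hypothetical convenience: supplying records skips the download.
        self.records = self.get_data() if records is None else records
        self.write_data_to_csv()

    def get_data(self):
        """Request the dataset and convert it to a list of dictionaries."""
        with urlopen(self.URL, timeout=60) as response:
            return json.load(response)["value"]

    def sex_label(self, dim1):
        """Convert a Dim1 code such as MLE or SEX_MLE to a readable label."""
        code = str(dim1 or "")
        if code.startswith("SEX_"):
            code = code[4:]
        return self.SEX.get(code, code)

    def write_data_to_csv(self):
        """Write the header and one row per country-level record."""
        with open(self.out_file_name, "w", newline="", encoding="utf-8") as f:
            writer = csv.writer(f)
            writer.writerow(["country_code", "year", "sex", "value"])
            for record in self.records:
                if record.get("SpatialDimType") != "COUNTRY":
                    continue  # ignore regional and other aggregates
                writer.writerow([
                    record["SpatialDim"],
                    record["TimeDim"],
                    self.sex_label(record.get("Dim1")),
                    record["NumericValue"],
                ])
```

A driver module would then need only one call, such as c_who_mortality_data("c:/who_data/mortality.csv"), assuming the output folder already exists.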

The Code

The c_who_mortality_data.py and get_who_mortality_data.py Python modules are presented below.

The Python program to retrieve, reformat, and write GHO data identified with indicator “NCDMORT3070.” Code written by Randy Runtsch.

The Python program shown here writes records to a file called ‘c:/who_data/mortality.csv’. Following are some example records from the top of the file.

Sample records from the mortality CSV file. Screenshot by Randy Runtsch.

The first record contains the column names of “country_code,” “year,” “sex,” and “value.” Subsequent records contain the data values.

To create a map and bar graph of the mortality probability values written to the CSV file by the Python program, I loaded the data into Tableau Public. Also, to convert the 3-character country codes to country names, I loaded this country master file into Tableau and joined the datasets on the country codes.
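For readers who want to see what the join on country codes does, here is a rough Python equivalent. It is only a sketch: the join in this article was done in Tableau, the country_name column and the layout of the country master file are hypothetical, and the in-memory CSV text holds placeholder values rather than real data.

```python
import csv
import io


def add_country_names(mortality_rows, country_rows):
    """Left-join mortality rows to country names on the 3-character code.

    Rows whose code has no match keep an empty country_name.
    """
    names = {row["country_code"]: row["country_name"] for row in country_rows}
    return [
        {**row, "country_name": names.get(row["country_code"], "")}
        for row in mortality_rows
    ]


# Tiny in-memory stand-ins for the two CSV files (placeholder values only).
mortality_csv = "country_code,year,sex,value\nAAA,2000,Male,1.5\nZZZ,2000,Male,2.5\n"
countries_csv = "country_code,country_name\nAAA,Country A\n"

joined = add_country_names(
    csv.DictReader(io.StringIO(mortality_csv)),
    csv.DictReader(io.StringIO(countries_csv)),
)
for row in joined:
    print(row)
```

Keeping unmatched rows, as a left join does, makes missing country codes easy to spot before building a map.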

Tableau dashboard built with data from GHO dataset identified with indicator “NCDMORT3070.” Dashboard created by Randy Runtsch.

While instructions to recreate the Tableau Public Dashboard are beyond the scope of this article, you can view and download it here. After downloading it, you can manipulate it in Tableau Public Desktop or Tableau Desktop.

This article provided instructions for writing a Python program to use WHO’s GHO OData API to retrieve global health-related data regarding mortality caused by select diseases. You should be able to adapt the program to retrieve and process other GHO datasets as well.

I hope you found this article helpful. Please let me know if you have any questions.


