
How to Perform JSON Conversion, Serialization, and Comparison in Python | by Lynn Kwong | Jul, 2022



Learn basic JSON operations with simple examples


JSON (JavaScript Object Notation) is a language-independent text format commonly used to exchange data between applications. API responses are a good example: because they are normally in JSON, the backend and frontend can exchange data freely without knowing each other’s technical details. In this post, we will introduce the common use cases of JSON in Python, a popular language for backend development and data engineering/analysis.

JSON and dictionary

First, we should know that JSON is a string format, so it is different from the dictionary data type in Python. A JSON string can be parsed into corresponding data in any modern programming language. Most commonly, a JSON string is parsed into one of two data types: an object or an array. An object is an unordered set of key/value pairs and corresponds to the dictionary data type in Python, while an array is an ordered collection of values and corresponds to the list data type in Python.

Conversion between JSON string and data

As mentioned above, a JSON string can be parsed into an object or an array, and vice versa, in all modern programming languages. In Python, the json library can be used for this type of conversion. We use the loads function to convert a JSON string into an object or an array, and the dumps function to perform the opposite conversion. Note that the s in loads and dumps stands for string, which means they work on a JSON string. The versions without the s (load and dump) work with JSON files instead, as will be introduced later.

The code snippet below demonstrates the common conversions between a JSON string and an object/array.
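The original snippet is not available here, so the following is a minimal sketch of the conversions described above. The sample strings and the variable names (other than dict_from_json, which the post refers to later) are assumptions chosen for illustration.

```python
import json

# A JSON object and a JSON array, both held as ordinary Python strings.
json_obj_str = '{"name": "John", "age": 30, "hobbies": ["reading", "hiking"]}'
json_arr_str = '[1,2,3]'

# loads: JSON string -> Python data (object -> dict, array -> list).
dict_from_json = json.loads(json_obj_str)
list_from_json = json.loads(json_arr_str)
print(type(dict_from_json).__name__, type(list_from_json).__name__)  # dict list

# dumps: Python data -> JSON string.
obj_dumped = json.dumps(dict_from_json)
arr_dumped = json.dumps(list_from_json)
print(arr_dumped)                  # [1, 2, 3]
print(arr_dumped == json_arr_str)  # False: a space was added after each comma
```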

Interestingly, when we dump the array back to a JSON string, the result is different from the original one. If you check carefully, you will see a subtle difference. When we do not specify the separators, a space is added after the item separator, which is a comma by default. We can specify custom separators to make the result the same. Note that we need to specify both the item separator and the key separator, in that order, even if we only want to change one of them:
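As a small sketch of this point (the sample array is an assumption), passing separators as an (item_separator, key_separator) tuple without trailing spaces reproduces the compact original:

```python
import json

original = '[1,2,3]'
data = json.loads(original)

# Default separators are (', ', ': '), so a space follows each comma.
default_dump = json.dumps(data)
print(default_dump)  # [1, 2, 3]

# Both separators must be given, even though only the item separator matters here.
compact_dump = json.dumps(data, separators=(",", ":"))
print(compact_dump)              # [1,2,3]
print(compact_dump == original)  # True
```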

Actually, the separators parameter is more commonly used to customize the representation of a JSON object. We can use different separators to make the dumped string either more compact or more human-readable:
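A hedged illustration with an assumed sample dictionary: the first call produces a compact string, the second a pretty-printed one using the indent and sort_keys parameters.

```python
import json

# Sample data, assumed for illustration.
record = {"name": "John", "age": 30, "hobbies": ["reading", "hiking"]}

# Compact: no whitespace after the item or key separators.
compact = json.dumps(record, separators=(",", ":"))
print(compact)
# {"name":"John","age":30,"hobbies":["reading","hiking"]}

# Human-readable: one entry per line, two-space indentation, keys sorted.
pretty = json.dumps(record, indent=2, sort_keys=True)
print(pretty)
```

Both strings parse back to the same dictionary; only the whitespace and key order differ.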

The indent parameter pretty-prints the output: each key (and each array item) is placed on its own line and indented according to its nesting level, which improves readability. The sort_keys parameter sorts the keys in alphabetical order.

Add a custom serializer for the values that cannot be serialized

In the example above, all the values of the target dictionary (dict_from_json) can be serialized. In practice, some values cannot be serialized by default, notably the Decimal and date/datetime types, for which json.dumps raises a TypeError:
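The failure can be reproduced with a short sketch. The order dictionary and its field names are assumptions made up for this example.

```python
import json
from datetime import date, datetime
from decimal import Decimal

# Hypothetical record mixing types the default encoder does not support.
order = {
    "price": Decimal("19.99"),
    "order_date": date(2022, 7, 1),
    "created_at": datetime(2022, 7, 1, 12, 30, 0),
}

try:
    json.dumps(order)
except TypeError as exc:
    # The encoder stops at the first value it cannot handle.
    error_message = str(exc)
    print(error_message)
```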

In this case, we need to create a custom serializer function and pass it to the default parameter:
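One possible serializer is sketched below. The function name custom_serializer, the sample data, and the choice to render Decimal as a string and dates in ISO 8601 format are assumptions; other representations (for example, converting Decimal to float) are equally valid depending on the consumer.

```python
import json
from datetime import date, datetime
from decimal import Decimal


def custom_serializer(value):
    """Called by json.dumps only for values it cannot serialize itself."""
    if isinstance(value, Decimal):
        return str(value)  # keep exact precision by serializing as a string
    elif isinstance(value, (date, datetime)):
        return value.isoformat()
    # !r shows the repr of the value, which helps when debugging.
    raise TypeError(f"Cannot serialize {value!r}")


order = {
    "price": Decimal("19.99"),
    "order_date": date(2022, 7, 1),
    "created_at": datetime(2022, 7, 1, 12, 30, 0),
}

serialized = json.dumps(order, default=custom_serializer)
print(serialized)
# {"price": "19.99", "order_date": "2022-07-01", "created_at": "2022-07-01T12:30:00"}
```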

Note that in the custom serializer, we use !r in the f-string to show the representation of the value, which can be handy for debugging purposes. If you comment out one of the if/elif conditions and run the json.dumps command again, you will see the corresponding error:
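To illustrate, here is a sketch of a serializer whose Decimal branch has been left out. The function name and error wording are assumptions; the point is that the !r conversion puts the value's repr into the message.

```python
import json
from datetime import date, datetime
from decimal import Decimal


def dates_only_serializer(value):
    # The Decimal branch is deliberately missing here.
    if isinstance(value, (date, datetime)):
        return value.isoformat()
    raise TypeError(f"Cannot serialize {value!r}")


try:
    json.dumps({"price": Decimal("19.99")}, default=dates_only_serializer)
except TypeError as exc:
    error_message = str(exc)
    print(error_message)  # Cannot serialize Decimal('19.99')
```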

Compare the difference between two JSONs

Sometimes we need to compare the difference between two JSON objects. For example, we can check and compare the schemas of some tables that can be exported as JSON and fire some alerts if the schemas of some important tables are changed.

The jsondiff library can be used to compare the differences between two JSON objects in Python:

If we want control over how the result is displayed, we can use the syntax, marshal, and dump parameters to customize it.

We can use the syntax parameter to specify how the values and actions should be displayed.

We can use the load parameter to load the input data from JSON strings, and similarly use the dump parameter to dump the result into a JSON string, which can be written to a file directly, as will be introduced soon.

Read and write JSON

We can write data to a file as JSON with the json.dump function. Note that there is no s in the function name. The one with an s (json.dumps) is for working with strings, not files. A JSON file is just a plain text file, and by convention its extension is .json. Let’s write the difference between the two schemas returned by jsondiff to a file named schema_diff.json:
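A sketch of the write step is shown below. To keep it runnable with only the standard library, result is a plain dictionary standing in for the jsondiff output, and the file is created in a temporary directory; both are assumptions of this example.

```python
import json
import tempfile
from pathlib import Path

# Stand-in for the diff result; a real run would use the jsondiff output.
result = {"name": "varchar(100)", "created_at": "datetime"}

# A temporary directory is used here to avoid cluttering the working directory.
output_path = Path(tempfile.mkdtemp()) / "schema_diff.json"

# json.dump (no s) serializes the data straight into the open file object.
with open(output_path, "w", encoding="utf-8") as f:
    json.dump(result, f, indent=2)

print(output_path.read_text(encoding="utf-8"))
```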

A file named schema_diff.json will be created containing the JSON representation of the variable result. If the dictionary/list contains values that are not serializable, we need to pass a serializer function to the default parameter, as demonstrated earlier in this post.

Finally, we can then use the json.load function to load data from a JSON file:
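The read step can be sketched like this. The example first creates a small schema_diff.json in a temporary directory so that it runs on its own; the file content is an assumption.

```python
import json
import tempfile
from pathlib import Path

# Create a sample file so the example is self-contained.
input_path = Path(tempfile.mkdtemp()) / "schema_diff.json"
input_path.write_text(
    '{"name": "varchar(100)", "created_at": "datetime"}', encoding="utf-8"
)

# json.load (no s) reads and parses JSON from an open file object.
with open(input_path, encoding="utf-8") as f:
    loaded = json.load(f)

print(loaded)  # {'name': 'varchar(100)', 'created_at': 'datetime'}
```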

In this post, the basics of JSON and how to use it in Python are introduced with simple examples. We have learned how to read and write JSON objects, either from a string or from a file. We also know how to write a custom serializer for JSON objects containing data that cannot be serialized by the default serializer. Finally, we can use the jsondiff library to compare the difference between two JSON objects, which can be handy for data monitoring.

