Mastering File Handling in Python: A Comprehensive Guide Part 8

Mr Stucknet
Published in Python’s Gurus
4 min read · May 23, 2024

Working with JSON

JSON is the acronym for JavaScript Object Notation, and it is a subset of the JavaScript language. It has been around for almost two decades now, so it is well known and widely adopted by most languages, even though it is actually language-independent. You can read all about it on its website (https://www.json.org/), but we are going to give you a quick introduction to it now.

JSON is based on two structures: a collection of name/value pairs, and an ordered list of values. It’s quite straightforward to see that these two structures map to the dict and list data types in Python, respectively. As data types, JSON offers strings, numbers, objects, arrays, and the literal values true, false, and null. Let’s see a quick example to get us started:

# json_examples/json_basic.py
import sys
import json

data = {
    'big_number': 2 ** 3141,
    'max_float': sys.float_info.max,
    'a_list': [2, 3, 5, 7],
}
json_data = json.dumps(data)
data_out = json.loads(json_data)
assert data == data_out  # json and back, data matches

We begin by importing the sys and json modules. Then we create a simple dictionary with some numbers and a list inside. We wanted to test serializing and deserializing very big numbers, both int and float, so we put in 2 ** 3141 and the biggest floating point number our system can handle.

We serialize with json.dumps(), which takes data and converts it into a JSON formatted string. That data is then fed into json.loads(), which does the opposite: from a JSON formatted string, it reconstructs the data into Python. On the last line, we make sure that the original data and the result of the serialization/deserialization through JSON match.
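Since this series is about files, it is worth noting that the json module also provides dump() and load(), which work on file objects instead of strings. The following is only a minimal sketch: the file name data.json and the use of a temporary directory are our own assumptions, chosen so the example cleans up after itself.

```python
import json
import tempfile
from pathlib import Path

data = {'a_list': [2, 3, 5, 7], 'name': 'example'}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / 'data.json'  # hypothetical file name

    # json.dump() writes the JSON text straight to an open file
    with open(path, 'w', encoding='utf-8') as f:
        json.dump(data, f)

    # json.load() reads it back from an open file
    with open(path, encoding='utf-8') as f:
        data_out = json.load(f)

assert data == data_out  # file and back, data matches
```

The only difference from dumps()/loads() is where the JSON text lives: in a file rather than in a string.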

Let’s see what JSON data would look like if we printed it:

# json_examples/json_basic.py
import json

info = {
    'full_name': 'Sherlock Holmes',
    'address': {
        'street': '221B Baker St',
        'zip': 'NW1 6XE',
        'city': 'London',
        'country': 'UK',
    }
}
print(json.dumps(info, indent=2, sort_keys=True))

In this example, we create a dictionary with Sherlock Holmes’ data in it. If, like us, you are a fan of Sherlock Holmes, and are in London, you will find his museum at that address (which we recommend visiting; it’s small but very nice).

Notice how we call json.dumps, though. We have told it to indent with two spaces, and sort keys alphabetically. The result is this:

$ python json_basic.py
{
  "address": {
    "city": "London",
    "country": "UK",
    "street": "221B Baker St",
    "zip": "NW1 6XE"
  },
  "full_name": "Sherlock Holmes"
}

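The similarity with Python is evident. One difference to watch out for is that JSON does not accept a comma after the last element of an object or array, whereas in Python it is customary to add one. JSON also requires double quotes around strings, and spells its constants true, false, and null.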
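To see these differences in practice, here is a small sketch. The two JSON strings are made up for illustration, and we deliberately do not rely on the exact wording of the error message, which varies between Python versions.

```python
import json

# Python's True and None are written as true and null in JSON
literals = json.dumps({'flag': True, 'nothing': None})
print(literals)  # {"flag": true, "nothing": null}

# A trailing comma is fine in a Python dict literal, but not in JSON text
valid_json = '{"city": "London", "country": "UK"}'
invalid_json = '{"city": "London", "country": "UK",}'

parsed = json.loads(valid_json)
print(parsed)  # {'city': 'London', 'country': 'UK'}

trailing_comma_rejected = False
try:
    json.loads(invalid_json)
except json.JSONDecodeError as err:
    trailing_comma_rejected = True
    print(f'Invalid JSON: {err}')
```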

Let me show you something interesting:

# json_examples/json_tuple.py
import json

data_in = {
    'a_tuple': (1, 2, 3, 4, 5),
}
json_data = json.dumps(data_in)
print(json_data)  # {"a_tuple": [1, 2, 3, 4, 5]}
data_out = json.loads(json_data)
print(data_out)  # {'a_tuple': [1, 2, 3, 4, 5]}

In this example, we have used a tuple instead of a list. The interesting bit is that, conceptually, a tuple is also an ordered list of items. It doesn’t have the flexibility of a list, but still, it is considered the same from the perspective of JSON. Therefore, as you can see by the first print, in JSON a tuple is transformed into a list. Naturally then, the information that the original object was a tuple is lost, and when deserialization happens, a_tuple is actually translated to a Python list. It is important that you keep this in mind when dealing with data, as going through a transformation process that involves a format that only comprises a subset of the data structures you can use implies there may be information loss. In this case, we lost the information about the type (tuple versus list).
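One consequence of this loss is that the round trip no longer gives us back an object equal to the original. The sketch below checks that, and shows one way to restore the type by hand, assuming we know in advance which field was meant to be a tuple (the name restored is ours).

```python
import json

data_in = {'a_tuple': (1, 2, 3, 4, 5)}
data_out = json.loads(json.dumps(data_in))

# A list never compares equal to a tuple, so the round trip is lossy
assert data_out != data_in
assert isinstance(data_out['a_tuple'], list)

# JSON cannot tell us this was a tuple; we have to know it ourselves
restored = {'a_tuple': tuple(data_out['a_tuple'])}
assert restored == data_in
```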

This is actually a common problem. For example, you can’t serialize all Python objects to JSON, as it is not always clear how JSON should restore that object when deserializing. Think about datetime, for example. An instance of that class is a Python object that JSON won’t be able to serialize. If we transform it into a string such as 2024-03-04T12:00:30Z, which is the ISO 8601 representation of a date with time and time zone information, what should JSON do when deserializing? Should it decide that the string can be deserialized into a datetime object and convert it, or should it simply treat it as a string and leave it as it is? What about data types that can be interpreted in more than one way?
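We can check both halves of this problem with a quick sketch. The key name when and the specific timestamp are only illustrative. Note that isoformat() writes the UTC offset as +00:00 rather than Z.

```python
import json
from datetime import datetime, timezone

now = datetime(2024, 3, 4, 12, 0, 30, tzinfo=timezone.utc)

# Passing a datetime directly to json.dumps() fails
serialization_failed = False
try:
    json.dumps({'when': now})
except TypeError as err:
    serialization_failed = True
    print(f'Cannot serialize: {err}')

# Converting it to an ISO 8601 string first works...
as_text = json.dumps({'when': now.isoformat()})
print(as_text)  # {"when": "2024-03-04T12:00:30+00:00"}

# ...but on the way back JSON leaves it as a plain string
data_out = json.loads(as_text)
print(type(data_out['when']))  # <class 'str'>
```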

The answer is that when dealing with data interchange, we often need to transform our objects into a simpler format prior to serializing them with JSON. The more we manage to simplify our data, the easier it is to represent that data in a format like JSON, which has limitations.
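As a rough illustration of this idea, the sketch below simplifies a record before serializing it and rebuilds the richer types afterwards. Everything here is hypothetical: the event record, its field names, and the simplify() and restore() helpers are ours, and other conventions would work just as well.

```python
import json
from datetime import datetime, timezone

# Hypothetical record mixing JSON-friendly and JSON-unfriendly types
event = {
    'name': 'meeting',
    'when': datetime(2024, 3, 4, 12, 0, 30, tzinfo=timezone.utc),
    'tags': {'work', 'urgent'},
}


def simplify(record):
    """Return a copy that uses only types JSON understands."""
    return {
        'name': record['name'],
        'when': record['when'].isoformat(),  # datetime -> ISO 8601 string
        'tags': sorted(record['tags']),      # set -> sorted list
    }


def restore(simple):
    """Rebuild the richer Python types; we decide how to interpret them."""
    return {
        'name': simple['name'],
        'when': datetime.fromisoformat(simple['when']),
        'tags': set(simple['tags']),
    }


json_data = json.dumps(simplify(event))
event_out = restore(json.loads(json_data))
assert event_out == event  # simplified, through JSON, and back
```

The point is that the knowledge about which string is really a datetime lives in our code, not in the JSON text.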

In some cases, though, and mostly for internal use, it is useful to be able to serialize custom objects, so, just for fun, we are going to show you how with two examples: complex numbers (because we love math) and datetime objects.

That’s it for today. See you tomorrow.
