Python best practices: Docstrings

Ethan Jones
Published in Geek Culture
3 min read, Jan 5, 2022

Even though docstrings offer developers a key way to improve their code readability, they are often overlooked in both teaching and practice.

Introduction

As per PEP 257, a docstring is a string literal that occurs as the first statement in a module, function, class, or method definition. Such a docstring becomes the __doc__ special attribute of that object. It describes what the module, class, function, or method does, including its parameters and return value, should they be present.
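To make the __doc__ point concrete, here is a minimal sketch. The function `area` is a made-up example, not something from a real project; the idea is only to show where the docstring ends up at runtime.

```python
import inspect


def area(width, height):
    """Return the area of a rectangle with the given width and height."""
    return width * height


# The string literal written as the first statement of the function
# is stored on the function object as its __doc__ attribute.
print(area.__doc__)

# inspect.getdoc() returns the same text with indentation cleaned up.
# The built-in help() relies on it to display documentation.
print(inspect.getdoc(area))
```

Calling `help(area)` in an interactive session should show the same sentence, which is why a well-written docstring pays off even outside the source file.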

In my many encounters with teaching material for Python, I have never come across any content explaining what docstrings are and the advantages of using them. Throughout the rest of this post we will explore best practices, some examples as well as the benefits of a clear, well-written docstring.

Best practices & benefits

A docstring is conventionally written directly beneath a function or class definition line using triple double quotation marks, and is broken down into three clearly defined sections:

  • Functionality description.
  • Breakdown of the parameters, if any.
  • Breakdown of the return value, if any.

The aim of any docstring is to explain the part of the code it is referring to so that anyone new viewing it will be able to understand, in a general sense, what the section does.

From this, we can almost deduce the advantages of having docstrings in your projects:

  • Increased readability, especially for those who have never viewed the code before.
  • Ease of testing: clear descriptions of functions make it easier to write thorough unit tests.
  • Helps you write clear and consistent code: it’s often easy to get carried away, so reviewing docstrings keeps functions to the point.
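As a rough illustration of the testing benefit, the sketch below pairs a documented function with unit tests derived from its docstring. Both `clamp` and its tests are hypothetical examples written for this post; the point is that each sentence of the docstring suggests a test case.

```python
import unittest


def clamp(value, low, high):
    """
    Restrict a number to the closed interval [low, high].
    Raises ValueError if `low` is greater than `high`.

    Parameters
    ----------
    value : float
        The number to restrict.
    low : float
        Lower bound of the interval.
    high : float
        Upper bound of the interval.

    Returns
    -------
    float
        `low` if `value` is below it, `high` if `value` is above it,
        otherwise `value` unchanged.
    """
    if low > high:
        raise ValueError("low must not exceed high")
    return max(low, min(value, high))


class ClampTests(unittest.TestCase):
    # Each test mirrors one statement made in the docstring above.

    def test_value_below_range_returns_low(self):
        self.assertEqual(clamp(-5, 0, 10), 0)

    def test_value_above_range_returns_high(self):
        self.assertEqual(clamp(42, 0, 10), 10)

    def test_value_inside_range_is_unchanged(self):
        self.assertEqual(clamp(7, 0, 10), 7)

    def test_inverted_bounds_raise_value_error(self):
        with self.assertRaises(ValueError):
            clamp(1, 10, 0)


# Run the tests programmatically so the example also works when pasted
# into a notebook or an interactive session.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ClampTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

In a real project the tests would normally live in their own file and be run with `python -m unittest`; they are kept next to the function here only to show how closely they follow the docstring.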

Example

For this example, let’s look at 2 different versions of a Python function docstring…

Version A

def filter_systems(data, threshold, keep_duplicates=False):
    """
    A filter for systems that don't match the threshold.
    """

vs. Version B

def filter_systems(data, threshold, keep_duplicates=False):
    """
    A simple filter that removes systems that are beneath the user-defined x-value threshold.

    Parameters
    ----------
    data : pandas.DataFrame
        A pandas DataFrame containing the systems from the xyz dataset. Must include columns 'x' and 'y'. All timestamp columns must be timezone aware.
    threshold : float
        The x-value threshold applied by the filter. Must be between 1 and 10.
    keep_duplicates : bool
        Defaults to False. Determines whether duplicate systems are kept by the filter; when False, duplicates are removed.

    Returns
    -------
    filtered_data : pandas.DataFrame
        A pandas DataFrame containing the filtered systems from the xyz dataset. Includes columns 'x' and 'y'. All timestamp columns are timezone aware.
    """

Of the two versions above, Version B paints a much clearer picture of what the function does, right? I haven’t written the function that the docstring describes, yet you probably already have a good idea of what it would look like!
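To show how far Version B's docstring alone can take you, here is one possible sketch of the function body. It is a guess based purely on the docstring, not the original implementation. The assumptions are mine: "between 1 and 10" is treated as inclusive, rows with an x-value below the threshold are dropped, and duplicates are removed with pandas' `drop_duplicates`. The timezone check is left out for brevity, and the parameter is spelled `keep_duplicates` as in the docstring.

```python
import pandas as pd


def filter_systems(data, threshold, keep_duplicates=False):
    """Sketch of the function described by the Version B docstring."""
    # Assumption: the documented range 1 to 10 is inclusive.
    if not 1 <= threshold <= 10:
        raise ValueError("threshold must be between 1 and 10")

    # The docstring requires columns 'x' and 'y' to be present.
    missing = {"x", "y"} - set(data.columns)
    if missing:
        raise ValueError(f"data is missing columns: {sorted(missing)}")

    # Remove systems beneath the threshold, i.e. keep x >= threshold.
    filtered_data = data.loc[data["x"] >= threshold]

    # Assumption: "duplicate systems" means fully identical rows.
    if not keep_duplicates:
        filtered_data = filtered_data.drop_duplicates()

    return filtered_data


# Made-up sample data, only to exercise the sketch.
systems = pd.DataFrame({"x": [0.5, 3.0, 3.0, 7.0], "y": [1, 2, 2, 3]})
print(filter_systems(systems, 2.0)["x"].tolist())  # [3.0, 7.0]
```

Notice that almost every line maps back to a sentence in the docstring. With Version A, none of the validation or the duplicate handling could have been inferred.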

Conclusion

In this blog post we have explored what docstrings are, why they are important, and the benefits they bring to your code. Hopefully you found this useful and will consider the points raised here in your future projects!

All the best

~Ethan
