The Hidden Gems of Python Standard Library: Never miss a crash in production

The Python Standard Library is a pool of great utilities and tools that can make life easier for any developer. Given its size, it’s easy to miss some of the modules or features that are available. For this reason I decided to start a series of articles that will reveal some of the features that I consider the best little-known gems available in the Standard Library.

This article showcases a way to never miss a crash in your production software with very little effort, and all you need is something the Standard Library provides out of the box.

You should never miss a crash

As much as we try to prevent them with automated and manual testing, bugs are something every developer and software project has learnt to live with.

Sometimes it looks like users have some kind of superhuman skills that allow them to trigger unexpected conditions and put the software in states that we could never foresee.

The users’ environments and the actions that triggered the unexpected conditions and exposed the bugs are valuable information. The bug itself is something that we might never have discovered without the help of the user.

All this information is lost forever unless the user is willing to invest a lot of time in reporting the bug and helping developers investigate it, providing the steps required to reproduce the problem and details about any input or software environment used to trigger it.

The problem is that in the majority of cases, the user doesn’t care enough and will just stop using the software. Very few users report the crashes they face. In the case of asynchronous actions they might not even notice that they faced a crash and just think that the software froze.

That’s why it’s vital for the success of a software project to be able to detect and gather all crashes: unless they are registered somewhere, we will never know they existed, we will never be able to fix them, and our software will continue to pile up bugs we never knew existed.

Report software crashes using logging

Many people know the Python logging module, but fewer people know that it can log not just messages, but tracebacks too. And even fewer know that it can be configured with a set of very flexible and powerful handlers that are able to send the logged messages almost anywhere: from the most obvious case of writing them to log files, to leveraging the syslog protocol, sending them across the network or even delivering them by email.

By configuring a logger with a proper handler, we can have our logged messages sent by email. This will allow us to receive a notification every time something is logged to that logger.

At this point, we can call our configure_crashreport function to set up the notification by email.
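A minimal sketch of how such a setup might look, built on logging.handlers.SMTPHandler. The signature of configure_crashreport, the "__crashes__" logger name and all the SMTP values are assumptions made for illustration, so adapt them to your own mail server.

```python
import logging
import logging.handlers

# Dedicated logger for crashes; the name "__crashes__" is an arbitrary choice.
crashlogger = logging.getLogger("__crashes__")


def configure_crashreport(mailhost, fromaddr, toaddrs, subject,
                          credentials=None, tls=False):
    """Attach an SMTPHandler to crashlogger, at most once."""
    if any(isinstance(h, logging.handlers.SMTPHandler)
           for h in crashlogger.handlers):
        return  # already configured, avoid sending duplicate emails

    handler = logging.handlers.SMTPHandler(
        mailhost=mailhost,           # "host" or ("host", port)
        fromaddr=fromaddr,
        toaddrs=toaddrs,             # list of recipients
        subject=subject,
        credentials=credentials,     # (username, password) or None
        secure=() if tls else None,  # an empty tuple enables STARTTLS
    )
    crashlogger.addHandler(handler)


# Placeholder values (assumptions): replace them with your own SMTP settings.
configure_crashreport(
    mailhost=("smtp.example.com", 587),
    fromaddr="crashes@example.com",
    toaddrs=["developers@example.com"],
    subject="Crash report",
    credentials=("smtp-user", "smtp-password"),
    tls=True,
)
```

Creating the handler does not open any connection: the SMTP server is contacted only when a record is actually emitted, so it is worth logging a test message once after deployment to verify the settings.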

Once logging is properly configured, every time we log something through crashlogger we will receive it by email.

But we are interested in receiving some very specific messages by email: the crashes. So while we can log anything we want, the interesting part comes from combining this with the ability to log tracebacks.
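To see what logging a traceback means in practice, here is a small sketch that writes to an in-memory buffer instead of sending an email. The demo_logger name and the buffer are hypothetical and only there to make the result easy to inspect.

```python
import io
import logging

# Hypothetical logger that writes to an in-memory buffer.
buffer = io.StringIO()
demo_logger = logging.getLogger("traceback_demo")
demo_logger.addHandler(logging.StreamHandler(buffer))
demo_logger.propagate = False  # keep the demo output out of the root logger

try:
    1 / 0
except ZeroDivisionError:
    # Inside an except block, exception() logs at ERROR level and
    # appends the traceback of the exception being handled.
    demo_logger.exception("Something went wrong")

logged_text = buffer.getvalue()
print(logged_text)
```

The printed text contains the message followed by the full traceback. Any handler attached to the logger, an email handler included, receives the same content.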

In particular, we want to track every crash of our main function (or of our main loop) and send it by email.

To do so we can create a decorator, which we will apply to the function whose crashes we want to be notified about.

In the case of a web application, this function would probably be the one that serves each request.

Then we can apply the decorator to the main entry point of our software to actually get notified of crashes.

When main is called, the program will crash with a ZeroDivisionError and, if we called configure_crashreport beforehand, we will receive the crash by email.
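A minimal sketch of both steps, the decorator and its application to an entry point. The crashreport name, the log message format and the deliberately broken main are assumptions made for illustration.

```python
import functools
import logging

crashlogger = logging.getLogger("__crashes__")


def crashreport(f):
    """Log any exception raised by f to crashlogger, then re-raise it."""
    @functools.wraps(f)
    def _crashreport(*args, **kwargs):
        try:
            return f(*args, **kwargs)
        except Exception:
            # exception() records the message together with the traceback.
            crashlogger.exception(
                "%s crashed with args=%r kwargs=%r", f.__name__, args, kwargs
            )
            raise  # the crash still propagates, we only report it
    return _crashreport


@crashreport
def main():
    # Deliberate bug, used only to demonstrate the report.
    return 3 / 0
```

Re-raising inside the decorator keeps the program’s behaviour unchanged: the report is a side effect, and the exception reaches the caller exactly as it would without the decorator.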

We could even use sys.excepthook instead of a decorator to track every uncaught exception, but it’s usually best to keep behaviours explicit: by using a decorator we make it very clear that every exception raised by that function will be reported.
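For comparison, a sketch of the sys.excepthook alternative. The log_uncaught_exception name and the choice to skip KeyboardInterrupt are assumptions, not part of the recipe.

```python
import logging
import sys

crashlogger = logging.getLogger("__crashes__")


def log_uncaught_exception(exc_type, exc_value, exc_traceback):
    """Report an uncaught exception, then fall back to the default hook."""
    if not issubclass(exc_type, KeyboardInterrupt):
        crashlogger.error(
            "Uncaught exception",
            exc_info=(exc_type, exc_value, exc_traceback),
        )
    # Preserve the usual traceback printed on stderr.
    sys.__excepthook__(exc_type, exc_value, exc_traceback)


# From now on every exception that reaches the top level is reported.
sys.excepthook = log_uncaught_exception
```

The hook applies to the whole process, which is exactly why it is less explicit than the decorator. It is also not invoked for exceptions raised in threads started through the threading module, which go through threading.excepthook on Python 3.8 and later.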

Web Applications and WSGI

For web applications it’s easy to apply this pattern to the WSGI function, decorating it so that any crash that happens while serving a request is sent by email.

There are solutions available for any WSGI-compatible web framework that provide middlewares able to trap any application failure.

One such solution is the Backlash project from the TurboGears2 web framework, which can be used with any other framework and has no dependency on TurboGears itself.

You can wrap your application in the TraceErrorsMiddleware and get notified through whatever reporter you configured (to get emails we can use the EmailReporter).

Backlash also includes other reporters, such as the Sentry one, which will organise your crashes on Sentry instead of sending them by email, so you don’t have to aggregate them yourself.

Tools like Backlash can also include additional features, like tracing slow requests or debugging a live application. Whichever way you go, through the standard library or through a third-party tool, you should really track your crashes in production and make sure your users do not face failures and crashes you are not aware of!
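A minimal sketch of the same idea for a WSGI callable. The crashreport_wsgi name and the example application are hypothetical; only the (environ, start_response) calling convention comes from the WSGI specification.

```python
import functools
import logging

crashlogger = logging.getLogger("__crashes__")


def crashreport_wsgi(app):
    """Wrap a WSGI callable so that crashes are logged with request details."""
    @functools.wraps(app)
    def _wrapped(environ, start_response):
        try:
            return app(environ, start_response)
        except Exception:
            crashlogger.exception(
                "Crash while serving %s %s",
                environ.get("REQUEST_METHOD"), environ.get("PATH_INFO"),
            )
            raise  # let the server produce its own error response
    return _wrapped


@crashreport_wsgi
def application(environ, start_response):
    # Hypothetical application with a deliberate bug on /crash.
    if environ.get("PATH_INFO") == "/crash":
        raise RuntimeError("simulated failure")
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello"]
```

Note that this sketch only catches exceptions raised while the application callable runs. Errors raised later, while the server iterates over a lazily generated response body, would need the returned iterable to be wrapped as well.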

Most of “The Hidden Gems of Python Standard Library” articles are related to recipes that come from the Modern Python Standard Library Cookbook, which I recently had the chance to write.

I think those pieces of the standard library are so helpful in everyday life that they should be spread as much as possible and thus will be part of a series of articles I’ll be publishing on this blog.

Passionate software developer and manager, TurboGears2 core Contributor and current maintainer of Beaker, DEPOT, DukPY and a few other Python libraries.