Tip 69 Checkpoint, It Saves Time

Pythonic Programming — by Dmitry Zinoviev (81 / 116)

The Pragmatic Programmers

--


★★★ 2.7, 3.4+ There is nothing more frustrating than irrecoverably deleting a half-finished PhD dissertation or watching your program crash after several hours (or days) of heavy-duty computations. Frequent backups help with the former. Checkpointing helps with the latter.

Checkpointing is a form of real-time backup that is easy to implement in Python via pickling (Tip 16, Pickle It). To create a checkpoint, save the hard-to-obtain values of your variables to files so that, if the program crashes, you can restore them from the files instead of recomputing them.
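As a minimal sketch of that idea, assuming Python 3, a value can be saved with pickle.dump and restored with pickle.load. The variable names, the file name result.p, and the temporary directory are hypothetical and chosen only for illustration.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for a value that was expensive to compute.
expensive_value = {"iterations": 1000, "weights": [0.25, 0.5, 0.25]}

# A temporary directory keeps this sketch self-contained; a real program
# would use a persistent location that survives a crash.
checkpoint_path = os.path.join(tempfile.mkdtemp(), "result.p")

# Save the checkpoint. Pickle files must be opened in binary mode.
with open(checkpoint_path, "wb") as outfile:
    pickle.dump(expensive_value, outfile)

# Restore the checkpoint, for example after a restart.
with open(checkpoint_path, "rb") as infile:
    restored_value = pickle.load(infile)
```

The restored object is an equal copy of the original, not the same object in memory.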

For example, your program may call fictitious and very time-consuming functions foo and bar:

​ result1 = foo(original_data)
​ result2 = bar(result1)

You may want to add one or two checkpoints, depending on whether you treat the final result the same way as the intermediate results. Create a directory for the checkpoint data. Before recalculating a result, check whether it already exists. If a checkpoint covers multiple results, store them as a tuple or one after another in the same file; pickle supports several objects per file.

import pickle

try:
    with open('checkpoints/result1.p', 'rb') as infile:
        result1 = pickle.load(infile)
except FileNotFoundError:  # use IOError in Python 2.7
    result1…
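The excerpt breaks off here, so what follows is not the author's code. It is one possible sketch of the whole pattern, assuming Python 3. The functions foo and bar and the list original_data are trivial stand-ins for the fictitious ones above, and the checkpoint helper, the run function, and the call_log list are hypothetical names introduced only for illustration.

```python
import os
import pickle
import tempfile

call_log = []  # Hypothetical: records which stand-in functions actually ran.

def foo(data):
    """Stand-in for a very time-consuming function."""
    call_log.append("foo")
    return [x * 2 for x in data]

def bar(data):
    """Stand-in for another very time-consuming function."""
    call_log.append("bar")
    return sum(data)

def checkpoint(path, compute, *args):
    """Return the value pickled at path, or compute it and save it first."""
    try:
        with open(path, "rb") as infile:
            return pickle.load(infile)
    except FileNotFoundError:
        result = compute(*args)
        with open(path, "wb") as outfile:
            pickle.dump(result, outfile)
        return result

# A temporary parent directory keeps the sketch self-contained; a real
# program would keep "checkpoints" somewhere that survives a crash.
checkpoint_dir = os.path.join(tempfile.mkdtemp(), "checkpoints")
os.makedirs(checkpoint_dir, exist_ok=True)

original_data = [1, 2, 3]

def run():
    """Compute both results, reusing any checkpoints that already exist."""
    result1 = checkpoint(os.path.join(checkpoint_dir, "result1.p"),
                         foo, original_data)
    result2 = checkpoint(os.path.join(checkpoint_dir, "result2.p"),
                         bar, result1)
    return result1, result2

first_run = run()   # Computes both results and writes both checkpoints.
second_run = run()  # Loads both results from the checkpoint files.

# Several objects can share one file: dump them one after another,
# then load them back in the same order.
combined_path = os.path.join(checkpoint_dir, "combined.p")
with open(combined_path, "wb") as outfile:
    pickle.dump(first_run[0], outfile)
    pickle.dump(first_run[1], outfile)
with open(combined_path, "rb") as infile:
    loaded1 = pickle.load(infile)
    loaded2 = pickle.load(infile)
```

With this layout, the second call to run never invokes foo or bar. Deleting only result2.p before a rerun would cause bar alone to be recomputed, because result1 is still restored from its own file.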
