WINNING GAMES WITH PYTHON

Memoization and Decorators

GABRIEL MANUEL SIDIK
Published in Tech Front
Jun 4, 2020


(Click here for the first article of the series, and here for the previous article!)

Recursion — solving the subproblems of a larger problem

To calculate the probabilities of the various outcomes in an extended battle in Risk, we would have to compute the probabilities through a recursion tree/recurrence equations.

Imagine if we were to do this calculation by hand — it’s a little tedious but not that hard. In order to calculate the outcome probability distribution of a 4v2 deathmatch, we have to calculate the distributions of 4v0, 3v1, 2v2 deathmatches respectively. In the above image, we can see that these are “subtrees” to the larger 4v2 recurrence tree.

That’s going to take some calculations but it is still pretty manageable! However, if we were to perhaps analyze an EPIC deathmatch (say, a 15v15 deathmatch), we’ll probably decide to let a program run it.

We can use the recursive get_distribution_of_results function (introduced in the previous article).
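The function itself was defined in the previous article; as a rough sketch of its shape (not the exact original code), assuming standard Risk dice rules — the attacker rolls up to 3 dice, the defender up to 2, the highest dice are compared pairwise, and ties go to the defender — it might look like this, with `round_losses` being a helper name introduced here for illustration:

```python
from itertools import product

def round_losses(a_dice, d_dice):
    """Enumerate all dice rolls for one battle round and return
    a dict {attacker_losses: probability}."""
    counts = {}
    for rolls in product(range(1, 7), repeat=a_dice + d_dice):
        attacker = sorted(rolls[:a_dice], reverse=True)
        defender = sorted(rolls[a_dice:], reverse=True)
        # Compare highest vs highest, second-highest vs second-highest, ...
        lost = sum(1 for a, d in zip(attacker, defender) if a <= d)  # ties favor defender
        counts[lost] = counts.get(lost, 0) + 1
    total = 6 ** (a_dice + d_dice)
    return {lost: n / total for lost, n in counts.items()}

def get_distribution_of_results(attackers, defenders):
    """Return {(attackers_left, defenders_left): probability} for a fight to the death."""
    if attackers == 0 or defenders == 0:
        return {(attackers, defenders): 1.0}  # battle over: one side is wiped out
    a_dice, d_dice = min(3, attackers), min(2, defenders)
    distribution = {}
    for a_lost, p in round_losses(a_dice, d_dice).items():
        d_lost = min(a_dice, d_dice) - a_lost  # total units lost per round = min(dice)
        sub = get_distribution_of_results(attackers - a_lost, defenders - d_lost)
        for outcome, q in sub.items():
            distribution[outcome] = distribution.get(outcome, 0.0) + p * q
    return distribution
```

Notice how every call re-solves its sub-battles from scratch — that naive recomputation is exactly what makes the larger runs so slow.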

Let’s see the results -

It’s taking a while, isn’t it?

A timed run of get_distribution_of_results(15,15) takes ~15* seconds on my local machine! Not too shabby, but let’s try it to calculate the distribution for a slightly bigger 16v16 deathmatch!

16v16 takes around ~45* seconds. And the 17v17 takes almost 2 minutes! (~110* seconds)

Time Taken — *Note: this is just a quick and dirty test on my local machine to showcase the difference in runtimes

Why is this the case? Well, to demonstrate, let’s visualize the recurrence tree for the 4v4 deathmatch:

Trees all the way down…

Notice that the 4v4 recurrence tree (the entire tree) is ~3 times bigger than the 3v3 sub-tree (bounded in the light blue box). We can also see this similar relationship between the 3v3 (light blue) and 2v2 subtrees (light red)!

This is a fundamental problem with naive recursive algorithms — particularly those which break the original problem into multiple overlapping sub-problems. Because the same sub-problems get solved again and again, such recursive algorithms have exponential computational complexity with respect to the problem size.

Every time we move 1 level higher in the tree, there’ll be ~3 times more computations required. Going up just 2 levels multiplies the work by roughly 3² ≈ 9, almost tenfold! This is consistent with what we observed comparing the time taken for 15v15 (~15s) against 17v17 (~110s).

If we were to attempt to solve this manually with pen and paper, we would have to calculate the subproblems one by one, scribbling through quite a few pieces of rough paper, sticky notes, or memos to record the results we have already computed.

Wait! Hang on… perhaps we don’t have to keep doing the same calculations over and over again! If we were doing these calculations by hand, we could use the same results from other sub-problems which have already been solved!

Visualizing this from the tree diagram, we have -

Visualization of calculations (using in-order traversal) — Calculated results (thick boxes) will be reused later instead of recalculating repeated configurations (dotted boxes). Grayed-out boxes are not calculated at all.

If we “memoize” or “cache” our calculated results, we would not need to repeat many of the subproblems that we have previously solved.

Memoization

Memoization is an idea in computer science where previous/intermediate computations are stored and recorded so that repetitions of the same calculation need not take place.

We can implement the exact same concept for our problem as well!

Let’s break down the pseudocode for this problem:
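Expressed as language-agnostic pseudocode (with memory acting as our store of already-computed results), it reads roughly like this:

```
function solve(problem):
    if problem is in memory:        # already solved before?
        return memory[problem]      # reuse the recorded answer
    result = combine solve(subproblem) for each subproblem of problem
    memory[problem] = result        # record the new answer for later
    return result
```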

One approach to implement this properly in code is to rewrite the entire function. But is there a way to easily transform our previously recursive function get_distribution_of_results into a memoized version?

Decorators

The answer, in short, is yes! Using decorators, we can “decorate” our function and give it a “memo feature”.

A decorator is a function which takes in a function as an input (?!), and outputs another function (?!) (typically an extended version of the input function).

This idea is pretty counterintuitive: a function that takes in a function as an input and outputs another function?

Let’s see how it would look:
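A minimal sketch of such a `memoize` decorator, using the names `memo` and `memoized_function` described below (a reconstruction, not necessarily the article's exact code), might be:

```python
def memoize(func):
    memo = {}  # our cache ("memory" in the pseudocode): arguments -> result

    def memoized_function(*args):
        # Check the memo first; only compute (and record) unseen subproblems
        if args not in memo:
            memo[args] = func(*args)
        return memo[args]

    return memoized_function
```

Applying it is then a one-liner, e.g. `get_distribution_of_results = memoize(get_distribution_of_results)` (or equivalently an `@memoize` line above the `def`). Since the recursive calls look up the function name at call time, they too go through the cache.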

Here, we extend the original get_distribution_of_results function with the ability to memoize. Within the memoize function, we introduce a dictionary memo which is used as our cache (memory in the pseudocode).

We define a new function memoized_function within the memoize function and return it.

The memoized_function acts as an extension of our original function by checking the memo first for the subproblem — if the result exists, we simply return the previously calculated answer; otherwise, we have no choice but to solve the subproblem. However, every time we solve a new subproblem, we add the result to our memo! This ensures that we do not repeat the same calculations that we have previously done.

Running this “upgraded” get_distribution_of_results function allows us to solve much larger problems much more quickly! For example, running a 100v100 problem takes only 0.365 seconds.

In addition, we get the added benefit of “memory”: if we run a 90v90 problem right after calculating the results for 100v100, we get the resultant probability distribution almost instantaneously, since the 90v90 subproblem has already been solved and our new get_distribution_of_results function “remembers” its solution!

Probability Distribution for a 100v100 deathmatch
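As an aside, Python’s standard library ships this exact pattern: functools.lru_cache wraps a function with a memo cache for us, so we could also decorate a recursive function like this (shown on a toy Fibonacci function rather than our battle analyzer):

```python
from functools import lru_cache

@lru_cache(maxsize=None)  # unbounded cache, equivalent to a plain memo dict
def fib(n):
    # Naively exponential recursion, made linear by the cache
    return n if n < 2 else fib(n - 1) + fib(n - 2)
```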

Conclusion

This has been a challenging article — we’ve touched on 3 pretty challenging concepts:

  1. Computational Complexity
  2. Memoization/Caching
  3. Decorator Functions

Don’t worry if you still don’t feel like you have a strong grasp on these concepts — this article covered a lot of ground!

Using these concepts, we have managed to make our Risk battle analyzer much more efficient. In the next few articles, we will focus more on the statistical aspects of the board game, leveraging our memoized function so that computation time won’t hold us back!

Till next time!

(Find the first article of the series here)

Further Resources

Computational Complexity

Memoization/Caching

Decorator Functions

