Pushing Google Sheets to the limit of coding

Photo from adventofcode.com

Part of the job of working in tech is trying to find the best and most efficient solutions to any problem. This therefore requires the likes of the talented engineers and data scientists that we have at Gousto to be able to find intuitive ways to make our code as easy and efficient to use as possible.

But what if you’re not very good at coding? If you’re someone like me who is only a Python novice, then working to these standards can be quite challenging. Luckily though, traditional coding is not the only option and with how advanced spreadsheet applications like Excel and Google Sheets have become now, there is a lot more scope for these wider known tools to achieve the same results. Granted, these won’t be anywhere near as efficient or dynamic as a well written piece of code, but as long as you get to the same answer in the end that’s all that matters, right? Right?

Enter the Advent of Code; a yearly challenge that pits programmers to come up with and write some code, or at least it assumes that you will write some code, in order to solve problems. These range in difficulty from 5 minute jobs to the kind of thing that would make your heart drop in a FAANG interview. The concept is very simple: each day, two new problems are released which give you an input file which requires an algorithm to be written to format it and produce a single value as an output result.

When I first saw this challenge a thought dawned on me: when I first entered the professional world, I was quite shocked at how much of the corporate world is run off spreadsheets rather than purpose built applications. So then considering that, how much of this challenge would be possible using only spreadsheets? And then here we are…

Instead of going through each and every solution, I’ve summarised some of the more useful techniques that I employed to carry me through this challenge along with some of the biggest blockers:

Splitting out text

The biggest issue with this challenge was that the input files were all plain text files with often multiple data points per line. When this is pasted into G-Sheets, this will split out into rows, but will not split out into columns. This causes a big issue as for any formula in G-Sheets, you need the input data to be in a very particular format; one which is not provided by the data. Therefore, before any work can begin, these need to be formatted into something more workable for a spreadsheet, which means splitting out the data points, and sometimes even the individual characters.

The first time I did this, it was a very static process of using =MID(A1,x,1) to manually pull out each string or character. This works for small strings, but becomes infeasible for anything of reasonable length. As a result, I had to eventually come up with a more dynamic solution (=REGEXEXTRACT(A1, REPT(“(.)”, LEN(A1)))) which automatically splits out any string its given, no matter the complexity.

This was a pattern that showed up several times during the challenge where I would start with a manual solution, before having to upgrade to something more dynamic out of necessity.

Iterations

One of the biggest advantages that coding has over spreadsheets is the ability to loop and iterate. Now if you’re wondering how I replicated this, the answer is very much: ehhhhhhhh…… kind of. Simply put, you can’t loop and iterate in spreadsheets to the same capacity as a piece of code can, so instead it’s becomes more a case of finding a hacky solution to get around having to use it in the first place.

The first way, which is annoyingly the most painful but at the same time my most used method, is to just repeat the process that you want to iterate over the number of times that you need to iterate. If this is something small that you are repeating then this is pretty simple, but when you have a 140x140 matrix of formulas that you are repeating 534 times, you’re better off finding an easier solution, unless you’re okay with repetitive strain disorder.

This also naturally leads into the processing problem that G-Sheets has. With the example above (which was the perfect way to spend Christmas morning, thanks AoC…), G-Sheets will be on the verge of collapsing under its own weight when you have only 20 matrices. To get past this point then, it becomes a case of repeating the process as much as G-Sheets will allow, then delete the majority of the iterations to clear up the space (remember to keep a record through!). Repeat this until your brain is numb and you’re there!

Another way to get around iteration though is to find a more dynamic system through bucketing. This was required for several days where even a poorly optimised code would have taken days to run. For a lot of these problems, a much better solution is to look for if the results that you are counting can be categorised into a number of buckets. This means that instead of individually counting everything, you can instead just count the size of these defined buckets and save yourself a lot of time.

Manual Intervention

The big benefit of using spreadsheets over coding is that you have the opportunity for manual intervention. This means that instead of coding some things, you just do it instead! This may sound quite archaic, but actually leads to some days where using G-Sheets was actually a far quicker option than coding the solution!

One example of this was when there was a matrix of values representing holes where the number in each cell showed the depth of the hole. The challenge was to determine the 3 largest holes which from a programming exercise requires some tricky coding. Not for G-Sheets though! All I did here was add some conditional formatting to show where the edges of the holes were and then I just looked until I saw the 3 biggest holes.

If you’re using spreadsheet, you can also make your solutions colourful as well!

Brute Force

Along with manual intervention, there is another solution that we can employ from spreadsheets which is just brute forcing the answer. There were a couple of problems where using spreadsheets can get you close to the answer, but not give you the actual result. These happen when there are assumptions that you have to make in order for spreadsheets to be able to work at all.

A good example of this was when the problem needed you to find the quickest route through a complex maze. To avoid circular references, an assumption had to be made that each step through the maze had to move towards the end and couldn’t backtrack. As this assumption worked for a smaller version of the puzzle, I knew it would get very close to the solution, but may not actually reach it. So once I had my flawed answer, it was simply a case of submitting lower results until I was successful.

The website does have buffers before you can submit a new answer specifically to stop this behaviour, but if you’re willing to use G-Sheets to solve these problems, then you’re also going to be willing to wait it out!

Conclusion

Overall, this was a great experience in order to really push G-Sheets and myself to the limit. When you purposely set yourself a barrier which restricts you from choosing the easiest option, it forces you to think outside the box and leads to some of the most creative solutions to a problem. I would highly recommend this kind of exercise to anyone who wants to discover new ways to tackle systems that they already use.

And looking at G-Sheets specifically, it performed far better than what I initially anticipated. Over the 49 exercises in total, only 16% weren’t do-able in G-sheets which is a resounding achievement for spreadsheets (There were even 10% of the problems that could have been done on pen and paper!).

But for anyone who has read this article and is now thinking “great, I don’t need to learn coding anymore, I can just use spreadsheets!” I’m going to stop you right there! One thing that becomes quickly apparent is that the main reason that this works is because you only ever need to produce a one time output for each problem which allows for a more manual approach. For most business cases, you need a system that you can use multiple times consistently which G-Sheets often just does not allow.

So no need to change any of your working practises, but maybe give a little more appreciation to your spreadsheet application- it’s a lot more powerful than you think!

--

--

--

Gousto Engineering & Data Blog

Recommended from Medium

Is There Any Free API To Get Metal Rates In United Arab Emirates Dirham?

32-bit PCs cant download Windows 10 build 15025 ISOs released later this week

Differences between static and dynamic libraries

The table of contents below will give you an idea of what this guide covers. You don’t have to

Apache PIG — Part I

The Redux Guide I Wish I Had When I First Started

Search Algortihms in the Real World

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
James Davis

James Davis

More from Medium

Evolution of the data team at Motorway

Requirement Specification Template

Reverse ETL is only a portion of the solution

Incomplete bridge under construction

Why Data Economy is subsuming the API Economy