12 versions of the same algorithm in JavaScript, Python, and Go

The algorithm

I recently had to write nearly the same code in Go and Python on the same day, and I realized I had written it many times before in different languages. But it does point up some interesting language differences. This article explores many different ways to write the same code.

The problem is fairly straightforward: you have a list of pairs — say, owners and their pets. But what you want to do is be able to get a list of all pets for every pet owner, and you have to do it often enough that simply traversing the original list isn’t what you want.

We want to convert this JSON array of owners and individual pets:

[
{"owner":"Kent", "pet":"Shiner"},
{"owner":"Mary", "pet":"Pumpkin"},
{"owner":"Mary", "pet":"Tasha"},
{"owner":"Paige", "pet":"Sushi"},
{"owner":"Sarah", "pet":"Shiner"},
{"owner":"Sarah", "pet":"Mr. Wigglesworth"},
{"owner":"Sarah", "pet":"Pupparoo"},
{"owner":"Sarah", "pet":"Lucy"}
]

Into this JSON map of owner to array of pets:

{
"Kent": ["Shiner"],
"Mary": [ "Pumpkin", "Tasha"],
"Paige": ["Sushi"],
"Sarah": ["Shiner", "Mr. Wigglesworth", "Pupparoo", "Lucy"]
}

1) JavaScript at its most basic

It won’t surprise anyone to know that there’s way more than one way to do it in JS.

Just to save space for all these examples, let’s assume that the variable “input” is set to the JSON listed above.

In vanilla JS, the most generic, traditional, compatible way to do it is like this:

Note that we need to test to see if the owner already exists in the output; if not, we create a new property for the new owner name containing a list with one element, the pet name. Otherwise, we just push the pet name onto the existing list.

If I saw this in a code review, I wouldn’t necessarily be unhappy, but I’d be thinking the developer needed to learn a little more about what JS can do. In particular, I’d rather see a forEach loop over the elements and using hasOwnProperty rather than just relying on indexing returning undefined.

2) Modern Vanilla JavaScript ES5

With those problems fixed, we get a more modern and idiomatic but still vanilla JS version of it. Note that the sense of the test has been reversed so we swapped the two if clauses.

3) Into the future

If we know we have ES6 available, we can use the safer constructs let, const, and for…of to modernize this:

The let doesn’t really matter here, but the const is a nice security blanket.

4) Basic Python as written by a JS programmer

Let’s take a look at how we might handle the same problem in Python.

We can certainly write it the same way we’d write the JS code:

If I saw THIS in a code review, I would be bothered by it, because this is not at all idiomatic Python.

5) Python by a JS programmer who has read PEP 8

Python prefers the “better to ask forgiveness than permission” model of programming, so instead of testing with has_key you should use try…except. I also personally prefer to use the named function dict(), rather than {}, because it’s clearer (and in other contexts has some real advantages).

This is ok; it’s pretty explicit and reasonably idiomatic. It still doesn’t feel completely “Pythonic” to me, though.

6) Python without an if statement

Here’s a way to do it without having an explicit conditional:

The get() function lets you return an item from a dictionary with a default value if it’s not found. This has to be three lines in the loop, though, because the append() function modifies its list in place, and the designers of Python chose to not have it return a value (so that people would realize that it modifies its target).

7) JS without the if (requires lodash)

Now that we’ve seen that it can work, we can do that last one in JS the same way, provided we use one of my favorite JS libraries, lodash, which has a Python-style get() function (as well as an each() function that’s a bit more general than the one built into JS.

8) Go with a generic data model

Lately, I’ve been writing a lot of code in Go. One of the things I like about it is that it has a very strong sense of idiom. There’s a Go way of doing things.

Here’s how I’d do it in Go while still treating the input and output as generic data structures:

In JS and Python, the JSON input data is literally the same as the program source, so I didn’t include it in the implementations for those languages. But for Go, we need to decode the JSON into something Go can understand. Here I’ve used []map[string]string. One of the things I like about Go is that type definitions are read left to right, very naturally — so that is read “slice of map of string to string”. Which describes the input data. (For the purposes of this article, it’s reasonable to think of a slice as an array.)

The output format is different: map[string][]string, which is read as “map of string to slice of string”.

In Go, there are two things that help us keep the loop simple:

  • Attempting to access an element of a map that doesn’t exist returns the zero value of the type of the map. So we get an empty string slice if the owner hasn’t yet been seen.
  • The append() function doesn’t modify its target; it creates and returns a new slice.

These two features let us write the body of this loop naturally and idiomatically in one line.

Note that the JSON encoding and decoding here ignore any errors that could occur, just to keep things small. Normally, you handle all errors in Go where they occur.

9) Go with a bit more structure

Here’s how I’d do it using a Go data structure to make the code more readable:

Note that now we define the Pet data type (with tags to help the JSON decoder read the data). This lets us refer to the fields by name rather than indexing into a generic map. Otherwise the code is nearly identical.

10) Back to Python for more compactness

Here’s an alternative to get the body of the Python loop smaller — initialize output with empty lists:

That first line, for those who haven’t seen it, is called a “list comprehension”. Let’s break it down:

[[e["owner"], []] for e in input]

This iterates over input and builds a new list, where each element of the list is itself a list containing the owner name and an empty list. So it looks like:

[ ['Kent', []], ['Mary', []], ['Paige', []], ['Sarah', []] ]

We then pass that list-of-lists to the dict() function, which interprets it as a request to build a dictionary with the first element of these pairs as the key, and the second element of the pairs as the value. (I mentioned that dict() has useful features!) Basically, it’s our output dictionary with all the keys but with empty lists as values:

{ ‘Sarah’: [], ‘Paige’: [], ‘Mary’: [], ‘Kent’: [] }

Then the for loop can just use these elements without having to test if they’re already arrays or not.

The problem with this strategy is that it requires looping over the input list twice. For small lists that’s not a big deal, but if you were dealing with many items it might be more of an issue.

11) Shorter Python — now we’re just playing code golf

For fun, we can make this even more compact. Instead of building a list by modifying it in place, we can use concatenation with the append function. And since we’re no longer modifying the list, we can use tuples instead (a tuple is like a list but is not allowed to change, which lets Python be more efficient about it):

I’m not convinced this is as clear as the previous method, but it does show how you can get the body of the loop into one line with no nested iteration. But we still have that initialization of output and the for loop.

12) Do it all in one line — Python gets obscure

For our last trick, let’s figure out how to do this in Python with just one line.

We can do it with nested comprehensions. The best way I found to pull this off is to build a set of the owner names so that we can iterate over it, searching the whole input array on each pass to find only the items that match this owner. (In case you haven’t encountered one before, a set is an unordered list of items where duplicates are discarded.)

This is what our one-liner looks like:

This is O(n²) so it’s REALLY not a good idea. It iterates over the entire list once for each individual owner.

It’s also nearly incomprehensible. We can break it down for a bit more readability:

Let’s just say that if you brought that to me in a code review, I’d congratulate you on your overweening cleverness and then sentence you to fixing obscure bugs in other people’s code until you got it out of your system.

Wrap-up

Proponents of the language Perl like to use the motto TMTOWTDI (“There’s More Than One Way To Do It”). It seems like that could also apply to JavaScript as well. I’m pretty convinced that the various JS versions I’ve put here would all be seen in the field, and in the absence of a strong style guide, many of them would even be seen in the same codebase.

Python also allows for a fairly wide variety of expression, but PEP 8 does guide users toward a “Pythonic” style, which generally applies to code that is compact and uses the language features without being obscure.

Go is aggressively obvious; clever compactness is explicitly not the Go way of doing things. The only real questions for the different Go versions above is whether an explicit type for the input data was wanted or needed.

Having been in the position of maintaining large codebases in many languages, including the three above, I’m really coming to appreciate the Go way of doing things. It may be a bit wordy sometimes but I find I can read Go code a year later and still know what it does.

Did I miss something above? Would you have done any of them a different way? Please let me know in the comments.