Why we moved to Go (Golang)…

Miguel Mendez
Yik Yak Engineering
Jan 24, 2017

In this post we will discuss why we made the decision to adopt Go (aka golang) as the standard programming language for backend development at Yik Yak, even though the original backends were mostly written in PHP.

Before going further, I must admit that I had serious reservations about discussing the language topic at all for the following reasons:

  1. I used to work at Google. So I could be perceived as having a bias from the outset.
  2. I worked on the Google Web Toolkit and an early, early, early version of Dart, both of which colored my language preferences.
  3. I’ve witnessed many fruitless static versus dynamic language debates which tend to avoid context and devolve into matters of pure preference. Everything has a time and place.
  4. I have seen very good PHP code. Usually written by a relatively small group of engineers who know how to avoid the Pit of Despair.

Having said all of that, given that the goal of this series is to share our journey, this choice must be discussed because it shaped a lot of the transformational work that followed. I will share the reasoning and try to be clear on the points which are a matter of preference while trying not to spark low-value debates.

In General

Early on in a startup’s life, the focus is on the search for a compelling problem to solve which can be profitable. The code is generally a secondary concern because you are trying to iterate quickly to clarify and validate the problem domain you’re going after. Now, because speed is of the essence, you want to keep things light and easy.

These criteria cast dynamically typed languages in a favorable light since they don’t enforce much structure or discipline.

Now what happens if you do hit on a profitable problem? Well, then you will need to expand the code’s features, make it able to serve a larger set of users and scale up your engineering org, all while generating revenue. This is where the fast and loose nature of dynamically typed languages may become a problem. If you have engineers who are generally good with a dynamically typed language then you are probably okay for a while or forever (think of Facebook).

In Yik Yak’s case, there wasn’t enough in-depth knowledge about the finer points and gotchas of PHP, so seeds were sown which caused problems down the road. The resulting performance problems were initially addressed with money and larger machines, but it was clear that it would be difficult to scale the system to handle the traffic numbers we were expecting while still being cost-effective. Scaling the engineering team was also a concern since PHP itself makes it harder to enforce good discipline.

PHP — Efficiency Baseline

You can write relatively efficient PHP code if you focus on it and you understand which parts of the ecosystem to use and which to avoid. PHP is an interpreted language, so source code is often loaded and parsed dynamically in order to serve incoming requests. This load-and-parse on demand places an upper bound on the efficiency that you can achieve. Even if you use OPcache, the bytecode still needs to be interpreted, which means the interpreter’s efficiency bounds that of your code.

The servers ran under PHP-FPM, which handles each request in its own isolated OS process. An OS process consumes more machine resources than a thread within a process. To serve more concurrent requests per server with respectable response times, you need bigger machines, but bigger machines cost more money, which weighs against your operational efficiency.

Another problem that can impact efficiency is how to share state when each request is handled in its own process. The easiest solution is to have each request/process simply reload the information from a persistent location, but now you are slowing down every request because of the extra load. You can use something like APC, but that isn’t obvious to an engineer new to PHP. There is no clear path that leads there from the outset.

PHP — Please Help Me Learn

Early on in the startup’s life, the objective was to make the code work at all. It was clear that there were some fundamental misunderstandings about how PHP worked, but PHP as a whole gives very little guidance on the right way to do things or when you are clearly doing something wrong.

The classic example is a typo. While analyzing our production runtime logs we would see several warnings about “PHP Notice: Undefined Property” or “PHP Notice: Undefined Variable”. Sometimes the system would still sort of do the right thing, but it was concerning to say the least. The key issue, as mentioned before, is that PHP is a dynamically typed language. For example, it cannot know (in the general case) what properties are on an object until the code in question is executed.

In practice, you think that the code works because it has done so for a while. But then you get that one request which causes the server to execute a slightly different code path that contains a typo… and there go the next few hours while you play a character on CSI who wants to know what went wrong.

Another issue in our old code stemmed from not realizing that, by default, PHP passes arguments to functions by value and not by reference. For example, if you pass an array to a function and it modifies elements in the array, those modifications will not be reflected in the original array. This small but insidious misunderstanding appears to be one of the reasons why whole blocks of logic for updating array elements had been copied and pasted around the code base, instead of being factored into a reusable function. This made the code much harder to understand, maintain and evolve.
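
The original PHP is not reproduced here. As a rough sketch of the same pitfall in Go, whose fixed-size arrays are also copied when passed to a function, the first helper below silently modifies a copy while the second takes a pointer and updates the caller's data. The function names are hypothetical and chosen only for illustration.

```go
package main

import "fmt"

// bumpCopy receives a copy of the array, so the caller never sees the change.
func bumpCopy(votes [3]int) {
	votes[0]++
}

// bumpInPlace receives a pointer to the array, so the change is visible to the caller.
func bumpInPlace(votes *[3]int) {
	votes[0]++
}

func main() {
	votes := [3]int{1, 2, 3}

	bumpCopy(votes)
	fmt.Println(votes) // prints [1 2 3]

	bumpInPlace(&votes)
	fmt.Println(votes) // prints [2 2 3]
}
```

In PHP the corresponding fix is to declare the parameter by reference (for example `&$votes`), which is what makes a shared, reusable update function possible.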

PHP — Attempts at Cleanup

Even though it is definitely not sexy, I prefer to try and clean up what is there first before advocating a more radical approach, so attempts were made to clean up the code base. However, these cleanups often had their own unintended consequences which only showed up in production.

Case in point, a simple refactoring to rename a method failed to update one of the method call sites in the code. This went undetected until that one request triggered the unlucky code path and the PHP interpreter tried to execute the code which still used the old method name… and a production incident ensued.

Removing dead code also caused problems. In one case a method was no longer being called, so the import statement at the top of the file which pulled in the method was removed. Unfortunately, removing that import also removed its transitive imports and their declarations, which the remaining code still called. Yet, there was no way to know this statically. Invariably, this was discovered in production when the pagers started alerting.

Even whitespace changes can be an issue. As part of cleaning up the code, the code was auto-formatted to a standard form. Unfortunately, this reformat accidentally introduced an extra newline at the top of a PHP file. This manifested itself as Android devices failing to register, while the iOS devices were fine. It turned out that this extra newline made its way into this particular HTTP response payload, which broke the HTTP parsing for Android only.

Could something like this happen with another language or system? Possibly, but the PHP ecosystem makes it more likely to happen and harder to find due to its very loose nature.

Language Requirements

So we had a system that worked, but was inefficient. It was written in a language that could be used to good effect, but had been used in a very inefficient manner, and the language had no built-in mechanisms for catching problems ahead of time, which complicated cleanup. We had a group of developers who were used to dynamically typed languages. And we were in the process of growing the engineering team, so we needed tools that would aid in large-scale engineering.

We took a step back and decided that if there was to be a language change it would have to satisfy the following requirements:

  1. Statically typed with support for large-scale engineering.
  2. Not interpreted due to efficiency concerns.
  3. For recruiting purposes, a sexy language that developers were excited to learn and work in.
  4. Garbage collected.
  5. Familiar to engineers with dynamically typed language experience.
  6. Support for systems programming.

The first two items knock out PHP, Python, Node.js and all similar languages. The third requirement knocks out Java. The fourth requirement rules out C++. And the fifth and sixth requirements lead us to Go.

Answer — Migrate to Go

Given those facts, Go was a Go! It is a statically typed, compiled language, so you can catch problems early on. Yet it feels like a dynamically typed, interpreted language, so it isn’t too large a leap for the developers we had at the time.
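
As a small sketch of what catching problems early looks like in Go, a misspelled field or a call to a renamed method is rejected by the compiler rather than surfacing on a rarely exercised request path. The `Message` type and its `Upvote` method are hypothetical names, not taken from the Yik Yak code base.

```go
package main

import "fmt"

// Message is a hypothetical type used only for illustration.
type Message struct {
	Body  string
	Votes int
}

// Upvote increments the vote count and returns the new total.
func (m *Message) Upvote() int {
	m.Votes++
	return m.Votes
}

func main() {
	m := &Message{Body: "hello"}

	// Uncommenting either line below makes the build fail,
	// so the mistake cannot reach production:
	//   m.Vote = 10 // misspelled field: Message has no field named Vote
	//   m.UpVote()  // stale or misspelled method name

	fmt.Println(m.Upvote()) // prints 1
}
```

The same check applies to every call site after a rename, which is the failure mode described in the cleanup attempts above.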

Go compiles quickly to machine code, so the edit/refresh cycle is relatively fast and yet fairly efficient machine code is produced. You don’t have the overhead at runtime of loading an interpreter which then has to parse a bunch of source code to even determine what should be executed. Because the output is an executable that can run as a long-lived server, you can cache information in memory between requests for efficiency.
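
A minimal sketch of that caching idea, assuming a hypothetical `loadSettings` function that stands in for a slow read from a persistent store: because the server process outlives any single request, the value can be loaded once and reused by later requests. All names and the `/settings` path are made up for illustration, and the requests are simulated in memory with `net/http/httptest` rather than sent over a real network.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

// loadSettings is a hypothetical stand-in for a slow read from a database or file.
func loadSettings() string { return "greeting=hello" }

// settingsCache lives as long as the server process, not just one request.
type settingsCache struct {
	mu     sync.Mutex
	loaded bool
	value  string
	loads  int // how many times the slow load actually ran
}

// get returns the cached value, performing the slow load on first use only.
func (c *settingsCache) get() string {
	c.mu.Lock()
	defer c.mu.Unlock()
	if !c.loaded {
		c.value = loadSettings()
		c.loaded = true
		c.loads++
	}
	return c.value
}

// handler serves the cached value over HTTP.
func handler(c *settingsCache) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, c.get())
	})
}

// simulate sends n in-memory requests and reports how many slow loads occurred.
func simulate(c *settingsCache, n int) int {
	h := handler(c)
	for i := 0; i < n; i++ {
		req := httptest.NewRequest("GET", "/settings", nil)
		h.ServeHTTP(httptest.NewRecorder(), req)
	}
	return c.loads
}

func main() {
	// Three requests share one load because the cache outlives each request.
	fmt.Println("loads after 3 requests:", simulate(&settingsCache{}, 3)) // prints "loads after 3 requests: 1"
}
```

Under a process-per-request model, the equivalent of `loadSettings` would run on every request unless an external shared cache were added.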

Go is designed to make it easy to write highly concurrent, networked programs.
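
A hedged sketch of what that looks like in practice: each unit of work runs in its own goroutine, which is much lighter than an OS process, and a `sync.WaitGroup` waits for all of them to finish. The `fetchScore` function is a hypothetical placeholder for a network call to another service.

```go
package main

import (
	"fmt"
	"sync"
)

// fetchScore is a hypothetical placeholder for a call to another service.
func fetchScore(id int) int { return id * 10 }

// fetchAll runs one goroutine per id and collects the results in input order.
func fetchAll(ids []int) []int {
	results := make([]int, len(ids))
	var wg sync.WaitGroup
	for i, id := range ids {
		wg.Add(1)
		go func(i, id int) {
			defer wg.Done()
			results[i] = fetchScore(id) // each goroutine writes only its own slot
		}(i, id)
	}
	wg.Wait() // block until every goroutine has finished
	return results
}

func main() {
	fmt.Println(fetchAll([]int{1, 2, 3})) // prints [10 20 30]
}
```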

Go has a lot of built-in infrastructure to support testing, which means you can easily define and test modules, which further strengthens engineering discipline.

Lastly, and this is important, it made our recruiting efforts easier. All else being equal, it is easier to sell someone on a startup if they will learn a new marketable skill as opposed to learning a skill that is somewhat antiquated.

Conclusion

If you are going to use a dynamically typed language like PHP at your startup, take the time to understand the subtleties and pitfalls of that environment. If you are doing something quick and dirty, then give it a go and see. If things start to get more serious, then invest time in tests to cover the functionality that you cannot afford to break.

Statically typed, compiled languages are not a panacea, but they can be used to prevent a whole class of problems from ever making it into production — invalid code will not compile.

If you are looking for a language to use at your startup, do consider Go. It is part of a new class of languages that retain the benefits of static typing and compilation, while feeling like a dynamically typed, interpreted language… and the number of engineers with Go experience is steadily growing.
