“The Future of Version Control”
Tom Preston-Werner talks about the early days of GitHub
The following interview is an excerpt from Unscalable: Unorthodox Startup Growth Stories.
Guo: So how did you and Chris [Wanstrath] first meet? Where did you get the idea for GitHub?
Preston-Werner: Well, we met through the Ruby users group that was here in San Francisco. We would go to conferences, and I saw him talk and we’d always talk about Git. This was back when Git was still very hard to use, but it was really bubbling in the air, around the Ruby on Rails community especially.
People were starting to pay attention to it, but nobody really knew what to do with it because it was so hard to use. And we were experimenting with it a little bit at Powerset. When I saw it, I thought, “This is obviously the future of version control.”
But there was only one website where you could share Git repositories, and it was quite bad, so I said, “I’m a web developer, why don’t I build a website to make it easier to share, with a central repository so that people don’t have to set up their own servers and create user accounts and all these kinds of things.”
At the time, Chris had also been thinking a lot about Git and how to use it, and we had already worked together on another project, a process monitor in Ruby. So we knew how bad the experience of sharing code via Git was because of that project.
I wanted to do it with Chris because I knew him through the Ruby community. So after one of the meetups — we met at a bar, as we often did after the Ruby meetups — I showed Chris a project I had started working on. It was a Git wrapper in Ruby that was called Grit. I showed it to him and I kind of pitched him my idea and said, “Are you interested?” And he said, “Yeah! I’m in.” So we just started hacking on it on nights and weekends.
I wanted him to do the Rails stuff. I thought I could do the visual design and back-end, and he could do the Rails app, and the two of us would make a really good team because of that complementary skill set.
Guo: When did you launch the product?
Preston-Werner: It took about three months before we started the private beta, so it wasn’t a long wait. It was a very, very basic product at first, as you can imagine. It was just two people working on it nights and weekends for three months.
Guo: Where did your first users come from?
Preston-Werner: We started showing it to people at Ruby meetups, since they were our friends. They were the people we would hang out with. In fact, back then a lot of people from Twitter, and Engine Yard, and Powerset were there. That kind of crew here in San Francisco.
So we’d show it around. We were like, “Hey, we’re doing this code sharing thing, and you can have Git repositories and put them up there.” We gave out invites to the people at the meet-ups and we told them that they could invite other people through our private invite system.
So that’s how we got our original users. They were just through the existing Ruby community that we knew. And that’s why GitHub started with the Ruby community, because it happened to be the community we were a part of.
And we were extremely lucky to have Rails come on board, to switch over from their existing version control system when we publicly launched. That was in April of 2008, which was about six months after we started coding. And that was huge for us. That alone maybe made GitHub possible at all.
Guo: How did you get DHH (David Heinemeier Hansson) and the Rails team to switch?
Preston-Werner: We emailed them and tried to explain why we thought they should switch to GitHub, probably four or five months in. And they responded and said, “No thanks!”
And then over the next month or two, or however long it was, they started seeing other people using GitHub. And actually it was Merb. Merb was the first prominent project, Ruby project, to switch over. It was Ezra [Zygmuntowicz]’s and Yehuda Katz’s project and they were two of our very first users.
So Merb switched over officially to Git, and that preceded Rails by a couple of months. And I think that was the first thing that got people to raise their eyebrows and say “Oh, interesting. An actual real project is using this.” For as much as Merb was an actual real project at the time. Nobody even knows what that is anymore, probably.
But at the time it was a lightweight alternative to Rails. And, so we had emailed the Rails guys and they had rejected us. But I think over the next couple of months they saw more and more people using it, enough stuff to where they started getting used to the idea of it. And probably started playing with it. And once you start playing with Git, especially in combination with GitHub, then there’s kind of no turning back from there. It’s just too good, it’s too awesome. Git is just too awesome.
That single event was really momentous in our adoption. Because what it did was, it made everyone in the Rails community almost overnight switch to using Git and GitHub as their version control.
Guo: Why was that?
Preston-Werner: People are going to use whatever their favorite projects use. You need to know the version control system that your tools are using, in order to contribute. And other people use that as a signal to know what they should be using for their packages, for their plug-ins, for things that are going to be useful for frameworks.
And because the Ruby on Rails community were such early adopters — it was in their very nature of having chosen to use Ruby on Rails at the time — it made it so that they were willing to try GitHub, to make a change without worrying about it too much. So that’s how we got our original set of users. It was just the Ruby community and then through Rails.
Guo: Did you try to recruit other projects to GitHub?
Preston-Werner: Yeah, there was a phase where we emailed a bunch of project maintainers and pitched them and asked them if they were interested in moving their projects to GitHub, but it wasn’t super successful. We tried jQuery; at one point one of us emailed John Resig about getting jQuery to switch over. And that was unsuccessful for several years!
And you know we would come in contact with him, and kind of joke with him about how he should switch over, and eventually they switched over all of the jQuery plug-ins, I think. There was something that they switched over, but they didn’t switch over the jQuery core for years. Mostly because they didn’t want to alienate their significant base of Subversion contributors. But eventually they relented.
Most of the project maintainers early on said, “Sorry, we can’t switch because of all our Subversion contributors.” And with the Ruby core itself, we talked to Matz and we talked to other people about switching. When I was at conferences with Matz or any of the Ruby core people, I would always bring it up. But what we found was that it’s impossible to convince project maintainers to switch. They have to convince themselves.
Guo: Was Subversion the most common roadblock?
Preston-Werner: Yeah, that was almost always the reason. With smaller projects, they would just switch over of their own accord, and you didn’t have to convince anyone. At least in the Ruby and JavaScript communities, you knew that would just happen naturally.
But for the ones that had a real reason not to switch to Git and GitHub, it was Subversion. Or in the case of something like Linux, it was that they already had an existing process with their workflow and it would be too disruptive for them to try to change. So with Linux, for instance, they have a very heavy mailing-list-based workflow. The way they accept patches is through the mailing list, and all the discussion happens there, and they have systems for dealing with that. The amount of infrastructure that they have built up is just too vast to switch to another system.
And obviously the Linux kernel has always used Git, since Git was created. So they are using Git, but they just couldn’t use GitHub because of workflow.
Guo: How has the signup process for GitHub changed since then?
Preston-Werner: Actually, it’s primarily unchanged. Really all you need to do is create a repository on GitHub, on your account, and then push to it from the command line or from whatever tool.
At the time there was only the command line client. There were no graphical interfaces for a long time. Now you can use any of the graphical interfaces that are available: GitHub for Mac, GitHub for Windows, Git Tower. There are a ton of them now. So those are new channels that you can use, but the way you create a repository on GitHub is essentially unchanged.
There is a difference in that now you can create a new project on GitHub that is already initialized and has a Readme, so that you can clone it. You haven’t always been able to do that. You’d have to create the repository on your local side and then push it. So, that’s a little bit handy, but it’s really not much different.
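The create-locally-then-push flow described above can be sketched as a short command sequence. This is a minimal illustration, not an official GitHub procedure: the function name `first_push_commands`, the remote URL, and the default branch name `main` are assumptions chosen for the example, and the sketch assumes an empty repository has already been created on the hosting side.

```python
import shlex


def first_push_commands(remote_url, branch="main", message="Initial commit"):
    """Return the git commands for publishing a new local repository.

    Hypothetical helper: it only builds the argument lists and does not
    run anything. The remote repository is assumed to exist and be empty.
    """
    return [
        ["git", "init"],                                # create the local repository
        ["git", "add", "."],                            # stage every file in the directory
        ["git", "commit", "-m", message],               # record the first commit
        ["git", "branch", "-M", branch],                # name the current branch
        ["git", "remote", "add", "origin", remote_url], # point "origin" at the hosted repo
        ["git", "push", "-u", "origin", branch],        # upload and set the upstream
    ]


if __name__ == "__main__":
    # Placeholder URL for illustration only.
    url = "git@github.com:example-user/example-repo.git"
    for argv in first_push_commands(url):
        print(shlex.join(argv))
```

To actually run the sequence, each argument list could be passed to `subprocess.run(argv, check=True)` from inside the project directory.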
I think that was the really big thing that allowed GitHub to become popular very easily. There was almost no barrier to entry to getting code on GitHub. Whereas with other things like SourceForge you’d have to go through an application and approval process to even get a place where you could put your code. It was a very high barrier to entry to put your code online with them.
And then places like Google Code were okay. You didn’t have to get permission, but it was still a single namespace, a global namespace, and it was project based. You also had to choose an open source license, which was another barrier. Because when a lot of people put up code, they’re not in the mindset to choose licenses and things. And I don’t think that you should force someone to choose a license just because they’re putting some code online. I don’t think that’s appropriate.
So we’ve always taken the stance that you don’t have to choose a license if you don’t want to. We want to make it easy for you to do that, because we think it’s best practice to choose a license, so that others are clear about what that license is and how to use your code. But we never wanted to force that on people and by reducing the amount of choice there was, we made it easier for people to do it. We tried to reduce how much they had to think about it. It was like just push whatever code you have onto GitHub and you know you’ll never lose it again. You can just share it when you want to share it.
Guo: What sorts of things did you do manually for the company?
Preston-Werner: Well, the company was technology to serve people building technology. And the people using it understood what that meant, and everyone who was using it in the early days had to be sophisticated enough to know what it was in the first place, which meant that they were probably sophisticated enough to know how to run the basic commands and to get their code up on there.
The things we had to do manually were in tech support, which we did as a team. We all did support. We all did support for many, many years. We got our first full-time support person probably two years in.
You have to also realize though that the GitHub team grew extremely slowly in the beginning. After three years, we were a team of six people. We weren’t one of those heavily funded start-ups that grows to 100 people in year one. That just wasn’t our path.
We were bootstrapped, and we were letting people adapt to Git at a natural pace. We couldn’t really force people to use Git, it was impossible. People had to desire to learn it on their own, and had to desire to use it on their own. And so, we didn’t do a ton of marketing. Everything we did in marketing was unscalable, as the marketing that we did do was basically going to conference talks. We did a lot of conference talks.
And that included us founders. We would spend a lot of time traveling, especially internationally. We gave talks about Git to help people learn how to use it and to talk about us as a company and what we were doing. And that was definitely unscalable from a founder perspective.
We did keep the value that we wanted GitHubbers to be able to travel and give conference talks. So we always had a policy that if you ever got a speaking slot at a conference, GitHub would pay for you to go there and give a talk. So we scaled out the conference talks by distributing that out to the entire company.
We also did drink-ups, which were essentially meetups at bars. It came from the idea that, one of our favorite parts of the Ruby meetups was the meeting with people afterwards at a bar to drink and talk about technology and it was a place where people could just share ideas freely. During the meet-up itself it was much more structured. People could get quite impatient and it was usually one person talking to a lot of people versus everyone talking to each other.
So, we always really enjoyed the part of the meet-ups where you’d be able to talk to everyone. And that was either afterwards at a bar or some other venue, and we’d get a couple of kegs or whatever, and we’d sit wherever the meet-up was and we’d hang out. Until we thought, “Well okay, maybe we can replicate that same experience of people sharing ideas with each other freely,” just by going to the bar and inviting a bunch of developers that are using GitHub to come and share their ideas.
So it was a way for us to do what I called “creating super-fans,” which meant going above and beyond what was expected from the company in order to surprise and delight the customers. We wanted to increase loyalty, to help them understand that we’re real humans, that we care about what they care about. And we wanted them to know that they had a venue where they could interact with us and let us know how the product was working for them, and what features they wanted and what they were doing with it.
So, it was an awesome channel. And I went to every drink-up that we had locally, or wherever I was where there was a drink-up. I went to every single one, except maybe one or two when I was sick, for probably three years. And they would be held here in San Francisco. In the early days, when we first started them, we did them every week at first. And then we switched to every two weeks and eventually we switched to monthly. But I went to an incredible number of the drink-ups to be able to meet our users.
I think that could probably count as something that didn’t scale in the long run, although I managed to pull it off for quite a while.
Guo: What did the growth curve look like at the time?
Preston-Werner: Growth was always pretty steady. There was never a moment in time where there was a crazy inflection point. It was just a steady, long-term, exponential growth curve. And I think that was because it took a lot for people to switch to using Git.
It’s not a trivial thing to ask a company to switch their version control system. Most companies have a lot of infrastructure built around their deployment system and CI and all of their developers are trained to use something. To ask them to switch that, and all their projects, and everything to a new version control system is very high cost. So the larger the company the longer it takes them to do it, because of the inherent risk in changing that.
That’s a change that they are willing to consider doing maybe once every decade. And so that’s why I think the growth was slow and steady like that, because it was such a high-impact change to a lot of people. It was great, though, that as new start-ups were created and new open source projects were started, they could just choose GitHub from the beginning. And that became easier and easier as time went on.
And then you’d have projects like CoffeeScript or Homebrew that were born on GitHub. These are things that exist in perfect symbiosis with GitHub from their very birth. And we started seeing those and that was pretty amazing. Right, things like Homebrew, they manage all of their recipes through GitHub. Everything they do is on GitHub and handled through GitHub. And that’s built into the core mechanism of their package management. And that was very cool to see.
Guo: What scaling challenges did you have on the technical side?
Preston-Werner: We were hosted on Engine Yard originally, for many years. And they were amazing, they really helped us succeed and we had a partnership with them. We put their logo in the footer and they gave us unlimited hosting and that allowed us to work really well in a bootstrap fashion because we were not paying as much for hosting.
So we used them for a long time, but at the time they used Red Hat GFS, the Global File System, so all the Rails servers would basically mount GFS as a drive and all of the Git repositories would be accessible through that one drive. So every front-end server didn’t have to care that the storage was not local. It would just make-believe that it was local.
The problem, though, was once we got enough front-ends there started to be a lot of contention for those files and the lock contention started to cause performance problems.
Somewhere between the second and third year, we started bottlenecking a lot because of that architecture. And so at that point, I spent a lot of time, probably around six months, re-architecting the entire site to be able to run on a distributed system where the storage was no longer mounted as if it were local and was instead accessed through an RPC mechanism.
There’s a blog post I wrote called “How We Made GitHub Fast” that you can read, which is all about how I went about doing that. I completely re-architected how the back-end works to allow us to use commodity Linux servers as the storage nodes along with separate servers to act as front-ends and that allowed any front-end to contact any back-end and be able to find where the repositories were stored.
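The idea of any front-end being able to find and contact the back-end that stores a repository can be sketched in a few lines. This is a toy, in-process illustration under stated assumptions, not the architecture GitHub actually built: the class names `StorageNode` and `Router`, the method names, and the "fewest repositories" placement rule are all hypothetical, and a plain method call stands in for the network RPC.

```python
class StorageNode:
    """Stands in for one storage server that keeps repositories on local disk."""

    def __init__(self, name):
        self.name = name
        self.repos = {}  # repository path -> list of commit ids

    def handle(self, method, repo, *args):
        """Dispatch one RPC-style request (a real system would receive it over the network)."""
        if method == "create":
            self.repos[repo] = []
            return True
        if method == "append_commit":
            self.repos[repo].append(args[0])
            return len(self.repos[repo])
        if method == "log":
            return list(self.repos[repo])
        raise ValueError(f"unknown method: {method}")


class Router:
    """Lookup table a front-end consults to find which node stores a repository."""

    def __init__(self, nodes):
        self.nodes = {node.name: node for node in nodes}
        self.location = {}  # repository path -> node name

    def create(self, repo):
        # Assumed placement rule: pick the node currently holding the fewest repositories.
        node = min(self.nodes.values(), key=lambda n: len(n.repos))
        self.location[repo] = node.name
        node.handle("create", repo)
        return node.name

    def call(self, repo, method, *args):
        # Route the request to whichever node holds the repository.
        node = self.nodes[self.location[repo]]
        return node.handle(method, repo, *args)


if __name__ == "__main__":
    router = Router([StorageNode("fs1"), StorageNode("fs2")])
    print(router.create("alice/project"))
    print(router.create("bob/tool"))
    router.call("alice/project", "append_commit", "c1")
    print(router.call("alice/project", "log"))
```

Because the front-end holds only the routing table and no repository data, front-ends and storage nodes can be added independently, which is the property the shared-drive setup lacked.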
We deployed that on Rackspace, and we also had a partnership with them to get discounted storage for putting their logo in the footer. We did that for many years, but eventually we stopped doing that. But, I mean, that’s another thing that I think you could consider kind of non-scalable, in a way. We did those partnerships early on because it was a win-win. These companies got advertising and we got cheaper hosting.
But yes, there was a massive re-architect that had to happen for this site to continue or it would’ve gone out of business. Reliability started to suffer, I mean anyone who was around at the time would remember that phase. It was not awesome. But we were working our asses off in order to get the new architecture in place, which was a significant amount of work. And I wrote probably a half dozen open source projects to get it done.
Guo: What other serious hurdles did the company face early on?
Preston-Werner: Performance was the primary one. There’s also trust. If we were to have a serious security breach that could be an existential crisis for the company. Those were big ones for us.
We were lucky in the early days of GitHub that there were no regulations that we were up against. There weren’t incumbents that had to be battled in that way.
Interested in the rest of Unscalable? You can preview more chapters and pre-order the book at https://www.inkshares.com/projects/unscalable/