Quantifying Docker Adoption with LinkedIn

By the middle of 2014, it was clear to me that Docker was a genuine developer phenomenon. Bursting out of the confines of the San Francisco startup scene, Docker meetups started popping up in New York, Berlin, London and everywhere else that software was being developed, and the company started raising capital at impressive valuations. And the hype seemed justified: not since the breakout of Ruby on Rails in 2005 had a developer technology emerged on the scene with such enormous momentum.

And the Ruby on Rails analogy did raise a big question. From 2006 to 2009, Ruby on Rails had the fastest takeoff of any developer technology of the 2000s, but ultimately it peaked at approximately 5% of the total developer population. Was Docker primarily a developer technology, an operations technology, or both? Would Docker adoption cross over into “the Enterprise”? Or would Docker end up like Ruby on Rails: popular among a relatively small segment of developers doing page-based web apps?

Starting last April, I set out on a project to track Docker adoption in large companies. After considering the available data sources, I settled on using LinkedIn data. Although LinkedIn is very definitely a lagging indicator (many people update their profiles only when changing jobs), it is a unique and deeply insightful source of trend and comparative data. In this article, I’ll cover some of the top-level findings from my analysis of LinkedIn data.

Topline Docker Trends (TL;DR)

From early 2015 to now, the number of LinkedIn profiles that match a “docker” keyword search has grown more than fourfold in 15 months: from just over 14,000 results in March of 2015 to over 60,000 results in June of 2016. That works out to roughly a tripling per year.
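As a rough check on the annualized pace, the two observations above can be converted into a per-year multiple. This is a minimal sketch using the rounded counts quoted in the text (the exact counts were slightly higher on both ends); the function name `annualized_multiple` is mine, for illustration only.

```python
from datetime import date


def annualized_multiple(start_count, end_count, start, end):
    """Return the per-year growth multiple implied by two dated observations."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if months <= 0:
        raise ValueError("end must be at least one month after start")
    # Compound growth: scale the total multiple to a 12-month period.
    return (end_count / start_count) ** (12 / months)


# Rounded figures from the text: March 2015 and June 2016 keyword-search results.
total_multiple = 60_000 / 14_000
yearly_multiple = annualized_multiple(
    14_000, 60_000, date(2015, 3, 1), date(2016, 6, 1)
)

print(f"total multiple over the period: {total_multiple:.2f}x")
print(f"annualized multiple: {yearly_multiple:.2f}x")
```

With these rounded inputs the annualized multiple comes out at a little over 3x per year.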

While this is very impressive growth, and even allowing for the fact that LinkedIn is a lagging indicator, it is still early days for adoption overall. Compared to the number of profiles returned for four other well-penetrated developer/ops technologies and skills (Linux, .NET, Java and ITIL), Docker is still in the first few percent of market penetration.

(For those of you who think Docker is more comparable to virtualization than to any of these terms: the keyword “virtualization” returns 1.2M profiles.)

To answer the question of whether Docker is an Enterprise or a long-tail technology, I looked at the cumulative percentage distribution of docker profiles across the company size segments tracked by LinkedIn. Docker is well represented across companies of all sizes: about 43% of docker profiles are from companies with fewer than 100 employees, but about a quarter are from companies with more than 10,000 employees.

Further, when we compare this distribution with the distributions for Java (stereotyped as an Enterprise technology) and Ruby on Rails (a long-tail developer technology), Docker falls somewhere between the two. The median company size for a Ruby on Rails profile is 51–100 employees; for a Java profile, 1,000–5,000; and for a Docker profile, 101–500. (Incidentally, a side note from this data: Java and Ruby on Rails both have plenty of employee presence in companies large and small; most developers are multi-skilled.)
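For readers who want to reproduce this kind of comparison, here is a minimal sketch of how a cumulative distribution and a median size bucket can be derived from per-bucket profile counts. The bucket labels and counts below are placeholders I made up so the example runs; they are not the data behind this article, and the function names are illustrative.

```python
from itertools import accumulate


def cumulative_shares(counts):
    """Return the running share of profiles at or below each size bucket."""
    total = sum(counts)
    if total == 0:
        raise ValueError("counts must contain at least one profile")
    return [running / total for running in accumulate(counts)]


def median_bucket(labels, counts):
    """Return the first bucket whose cumulative share reaches 50%."""
    for label, share in zip(labels, cumulative_shares(counts)):
        if share >= 0.5:
            return label
    raise ValueError("labels and counts must have the same length")


# Hypothetical size buckets (smallest to largest) and placeholder counts.
labels = ["1-10", "11-50", "51-100", "101-500",
          "501-1,000", "1,000-5,000", "5,001-10,000", "10,000+"]
counts = [120, 180, 130, 150, 60, 80, 40, 240]

for label, share in zip(labels, cumulative_shares(counts)):
    print(f"{label:>13}: {share:6.1%} cumulative")
print("median bucket:", median_bucket(labels, counts))
```

Running the same two functions over the counts for each keyword (Docker, Java, Ruby on Rails) gives curves and medians that can be compared side by side.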

Massive Quantities of Granular Detail: The Fortune 100

But the very high end of the market has been slow to adopt Docker. Of the 65,000 LinkedIn profiles returned from a docker keyword search, only 3,338 were found in the Fortune 100. And within the Fortune 100, the top ten adopters accounted for over 80% of that number. Twenty-four of the Fortune 100 had zero docker profiles and another 31 had five or fewer. Docker is still clearly in early-adopter territory. But, again, the growth is impressive. Here is the growth in docker profiles among the top ten adopters in the Fortune 100.
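The concentration described above follows from simple arithmetic on the quoted figures. The sketch below uses only numbers stated in the text; the variable names are mine.

```python
# Figures quoted in the text.
all_docker_profiles = 65_000
fortune100_docker_profiles = 3_338
companies_with_zero = 24
companies_with_one_to_five = 31

# Share of all docker profiles that sit inside the Fortune 100.
fortune100_share = fortune100_docker_profiles / all_docker_profiles

# "Over 80%" in the top ten means at least this many profiles.
top_ten_floor = 0.80 * fortune100_docker_profiles

# Companies with five or fewer docker profiles, and the most they can contribute.
low_adopters = companies_with_zero + companies_with_one_to_five
low_adopter_ceiling = companies_with_one_to_five * 5

print(f"Fortune 100 share of docker profiles: {fortune100_share:.1%}")
print(f"top ten adopters hold more than {top_ten_floor:.0f} profiles")
print(f"{low_adopters} companies hold at most {low_adopter_ceiling} profiles combined")
```

In other words, more than half of the Fortune 100 together account for at most a few percent of the Fortune 100 total.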

As you can see, IBM is the most aggressive adopter of Docker by raw profile count, followed by Cisco and HP. However, because IBM Services is an enormous division, it’s helpful to scale adoption by the size of the technical workforce. I tested a number of different scaling factors for reasonableness and ultimately went with the average of the ITIL, Linux, Java and .NET profile counts as the scaling factor. I call these “technical profiles”. After accounting for this scale, we can get a sense of the relative penetration of Docker, and here Cisco climbs to the top of the rankings, with IBM falling quite a few places.
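The scaling step can be sketched as follows. The company names and counts are hypothetical placeholders, chosen only to show how a ranking by raw docker profiles can flip once the counts are divided by “technical profiles”; the function names are illustrative.

```python
BASELINE_KEYWORDS = ("ITIL", "Linux", "Java", ".NET")


def technical_profiles(keyword_counts):
    """Average the profile counts for the four baseline keywords."""
    return sum(keyword_counts[k] for k in BASELINE_KEYWORDS) / len(BASELINE_KEYWORDS)


def docker_penetration(docker_profiles, keyword_counts):
    """Docker profiles as a fraction of a company's technical profiles."""
    return docker_profiles / technical_profiles(keyword_counts)


# Hypothetical companies with made-up counts (not data from this article).
companies = {
    "BigServicesCo": {
        "docker": 900,
        "keywords": {"ITIL": 40_000, "Linux": 60_000, "Java": 70_000, ".NET": 30_000},
    },
    "NetworkCo": {
        "docker": 500,
        "keywords": {"ITIL": 5_000, "Linux": 15_000, "Java": 12_000, ".NET": 8_000},
    },
}


def penetration_of(name):
    entry = companies[name]
    return docker_penetration(entry["docker"], entry["keywords"])


by_raw_count = sorted(companies, key=lambda n: companies[n]["docker"], reverse=True)
by_penetration = sorted(companies, key=penetration_of, reverse=True)

print("ranked by raw docker profiles:", by_raw_count)
print("ranked by docker penetration: ", by_penetration)
```

Averaging four keywords, rather than relying on any single one, dampens the effect of a company that happens to be unusually heavy in one technology.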

All Fortune 100 companies with >10 docker profiles and docker % > 2%

In many ways, this list shouldn’t be a surprise to anyone who’s worked in Silicon Valley Enterprise sales or marketing. These are the usual suspects for almost *any* new infrastructure technology.

The Fortune 100 Scattergram

We can also look at Docker adoption within the context of how many technical software employees the company has, as measured by the number of “technical profiles” returned (see above for the definition). The scattergram below plots a company’s Docker adoption scaled by the number of technical profiles vs. the number of technical profiles scaled by the total number of company profiles on LinkedIn. One way of looking at this second axis is as the “software intensity” of the workforce. (This feels like a measure that could do with some refinement, though.)

The standout company, again, is Cisco: 9% of Cisco employee profiles match our technical keywords, and 5% of these technical profiles match Docker. We can also see companies like Disney that have a relatively small proportion of technical profiles but a relatively high share of them matching Docker.
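Each point on the scattergram is just a pair of ratios. The sketch below computes them for a hypothetical company whose counts were picked to give a 9% software intensity and a 5% Docker penetration; the counts themselves are invented for illustration, and the class and field names are mine.

```python
from dataclasses import dataclass


@dataclass
class CompanyCounts:
    name: str
    total_profiles: int        # all LinkedIn profiles for the company
    technical_profiles: float  # average of the four baseline keyword counts
    docker_profiles: int       # profiles matching the "docker" keyword

    @property
    def software_intensity(self):
        """Technical profiles as a share of all company profiles."""
        return self.technical_profiles / self.total_profiles

    @property
    def docker_penetration(self):
        """Docker profiles as a share of technical profiles."""
        return self.docker_profiles / self.technical_profiles


# Hypothetical counts, not taken from the dataset.
example = CompanyCounts("ExampleCo", total_profiles=200_000,
                        technical_profiles=18_000, docker_profiles=900)

print(f"software intensity: {example.software_intensity:.1%}")
print(f"docker penetration: {example.docker_penetration:.1%}")

# Multiplying the two ratios gives docker profiles as a share of all profiles.
overall_share = example.software_intensity * example.docker_penetration
print(f"docker share of all profiles: {overall_share:.2%}")
```

Note that multiplying the two coordinates recovers the share of all employee profiles that mention Docker, which for a 9% and 5% pair is well under 1%.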

Docker: A Great Start, but Lots to Do

Acknowledging the limitations of LinkedIn profile data, we can still see that Docker is an extremely rapidly growing technology overall, making good progress in all segments of the market. And while its overall Enterprise penetration is low, it’s making good progress there too, primarily in Cloud and Systems Integrator companies.

I’ve also taken a look at Docker penetration among SaaS companies; that’s a dataset I’ll explore in a follow-up article. (Update: “Quantifying Docker Adoption in B2B Saas”)

Appendix: Methodology & Data Details

When deciding what information base to use to track Docker adoption, I considered all the places where a popular software project leaves internet fingerprints. Meetup, Stack Overflow, GitHub, Indeed.com and LinkedIn were the top candidates. Although Stack Overflow has a nice query tool, in the end I decided to use LinkedIn. Why?

  • Practicality: I had a premium subscription that gave me access to search results from the entire population base.
  • Coverage: compared to other community sites like Stack Overflow, Meetup and GitHub, LinkedIn has high penetration among professional employees. My back-of-the-envelope tests suggested that between 70% and 90% of large-company employees had an entry on LinkedIn.

The downside of LinkedIn is that it is a lagging indicator of adoption. Although some members are diligent about adding new skills as they learn them, most don’t bother updating their profile until they look for a new job. Only 25% of LinkedIn members are active in any given month. In addition, many people don’t consider Docker to be an important skill to add to their profile, or simply don’t bother adding that level of detail. However, even if the absolute numbers were understated, tracking the relative presence of Docker as a skill over time and across companies on a consistent basis would be very valuable.

Happily, my preliminary tests of “docker” as a LinkedIn keyword search term yielded good-quality results. When searching for docker as a keyword, a profile matches if the word docker is found anywhere in the profile, including in endorsement tags and free text. Luckily, I found only two confounding usages of “docker”, and those were relatively rare. “Dockers”, the apparel brand, occasionally showed up in retail and supply-chain-related profiles. And “docker”, meaning dock worker or dock loader, appeared occasionally in profiles from logistics and shipping companies. But the prevalence of these usages was quite low (less than 1%), so the signal-to-noise ratio seemed excellent.

Since I wanted to track Docker prevalence in Enterprises, my choices were to use LinkedIn’s Fortune 100/500 etc. filters or to use company size. I chose the Fortune ranking filters. Although it would take more time, I also decided to look not only at Docker adoption among current employees of the Fortune 500 as a whole, but also at every single Fortune 100 and top 50 B2B SaaS company individually. It was lucky that I decided to do both, because LinkedIn removed its “Fortune XXX” filters in late 2015, leaving me unable to track the total number of profiles in the Fortune 500 consistently.

It turned out that using the Fortune 100 as the sample set had its drawbacks. For one, a number of the Fortune 100 are conglomerates that don’t have many employees working directly for the named Fortune entity. International Assets Holding, Enterprise GP Holdings and Berkshire Hathaway are good examples of this. In addition, the Fortune 100 list often retained the older names of companies that had renamed themselves, so LinkedIn results were split between the old and new names. Sometimes LinkedIn was smart enough to know that a new company was the same as the old, sometimes not.

Equally important, a non-trivial number of Fortune 100 companies have outsourced their technology infrastructure and their application development. Many large companies (e.g., Kraft) have few to no employees with system administrator, software engineer or developer titles. However, in many cases the outsourcer is IBM or HP, both members of the Fortune 100, so we capture the presence of Docker at those customers when we look at the prevalence of docker at IBM and HP.

The datasets were collected manually, by searching on individual companies whenever I had a few free hours. Every time slice was completed within a few days of starting, so there may be small inconsistencies in the data; within those constraints, though, I’m fairly confident in its overall accuracy.