DevOps @ SEEK — A 4 Year Evolution

Andrew Hatch
Published in SEEK blog · 4 min read · Aug 8, 2016

Four years ago, deploying a change to the SEEK website was a significant event on the IT calendar. It usually happened late on a Saturday night, when the entire site was brought down to complete the release. The work took a couple of hours (plus a few beers), and at some point in the early hours of the morning the site would come back online. These deployments happened, on average, twice a month.

Four years later, SEEK is very different. Just this year we hit a record of 400 deployments in a single month. Our biggest release management challenge is no longer trying to re-align the planets, block out calendars and put everyone with a pager on high alert; it is the daily task of coordinating deployments so they do not run over each other. All of this is a great story, one we’re proud to talk about, yet we know we can do better. A lot better. Because once you start this transformation, the pursuit of continually improving everything you do becomes addictive.

This will be the first in a series of blog posts about the SEEK DevOps journey, from its beginnings mired in technical debt and reactive decision making, to our headfirst dive into the cloud, the pain and lessons learned along the way, how we managed growth and what we think we need to do next.

The world of 2012 — Living and breathing Conway’s Law

While it may be customary to imagine an IT transformation journey beginning from a dark place, that was not the case here. In 2012 we were very much at the forefront of software delivery process and practice in Melbourne, at least for a business of our size. We were very Agile, and we had built cross-functional teams with Product and Delivery people (architects, developers, BAs, UX, QA) all sitting together. This was very different to how many other organisations were working at the time, still mired in waterfall or iterative-waterfall delivery practices with segmented teams that sat on their own.

But there was a fundamental issue with how the teams worked. Although seated next to each other, delivery streams largely worked on their own, sharing check-ins and builds on a single, monolithically designed source control and build system. Away in other areas of the building sat the Operations teams, who were responsible for deploying builds and maintaining that monolithic source control and build system. This tyranny of distance between the two groups made communication and collaboration very difficult. If you were to picture it, the scene looked a bit like this:

You’re free to build and test your code, but not to deploy or support it

Product and Delivery lived on their own island, Operations on theirs, with a fair amount of distance between them. Crossing over to the other side to collaborate or work on shared change initiatives was very difficult.

This team structure, combined with heavyweight legacy software delivery processes, created a number of problems typical of many organisations. At SEEK, the three most obvious were:

  1. Monolithic Architecture — like our team structures and processes, it exhibited a high degree of tight coupling and complex call-chain logic.
  2. Monolithic Infrastructure — Cost-driven approaches to maximising compute and storage usage led infrastructure teams to consolidate our systems onto the same virtualised machine instances. This gave us more resource capacity but massively increased the cost of maintaining and supporting our servers.
  3. Monolithic Data Models — Databases were highly complicated and featured large amounts of cross-referencing between tables. In addition, redundant data required a large amount of support to ensure indexing and cleanup procedures ran efficiently and did not impact systems.

In summary, our monolithic team structures and processes led to the creation of monolithic enterprise systems: Conway’s Law in action.

If you were an outside observer during this time you would have noticed other outcomes and behaviours typical of organisations that are structured and work in this way:

  1. Hero culture being rewarded within Software Delivery
  2. Reactive fire-fighting culture being rewarded in Operations
  3. Lack of trust between Development and Operations
  4. Time-consuming and combative change management board (CMB) processes
  5. Lack of innovation or willingness to go beyond known technology stacks
  6. Production environment locked away from all but a handful of people
  7. Development and Test environments constantly failing and breaking
  8. On-call and incident response time blowouts

All this sounds like a pretty sad and sorry state of affairs, especially looking back on it four years later. However, it is a typical outcome for any IT organisation that focuses too heavily on costs, prizes stability and resiliency over innovation, and is structured around command-and-control management. Fortunately we have always had a good workplace culture, so while times could be very challenging, people were supportive of each other and always tried their best to work well together.

So what did we do next?

In the next post we will explore the foundations of creating a DevOps team, how we moved Operations into Delivery and what we did when Amazon launched their Sydney region in 2013.
