Avoiding Business Stasis by Modernizing Ops, Architecture & More

By George Loyer, Director of Technical Operations

Like any e-commerce provider, we at StubHub — Stubbers, as we’re called inside the building — know that nothing ever stands still.

We bring over 100 million unique visitors to the site every year and sell one ticket every 1.3 seconds (on average) across 48 countries and languages. But if we don’t keep improving the experience, and bringing fans back again and again and again, then we’ve gone static. And businesses don’t do static!

In our effort to up our game, we have to:

  1. Constantly change the way we deliver products to our fans
  2. Increase our use of data to further personalize the fan experience
  3. Radically expand the number of experiments we conduct
  4. Double-down on our focus on the fan journey
  5. And accelerate everything — our use of data, our frequency of experimentation, and most importantly, our move to the public and private cloud

Accomplishing these goals means tackling the work at a measured pace. Given the size and ambition of our modernization effort, we’ve planned a 2-to-3-year journey in three phases: crawl, walk and run.

Crawl

● Reduce risk by delivering the platform and the first apps as demonstrations

● Move into the private cloud first in a familiar environment, then to Google Cloud Platform (GCP) as experience is gained

● Uncover challenges as we deliver the first apps using fully automated CI/CD and testing

Walk

● Ramp up more teams after sequencing the apps that will be modernized from an existing services architecture

● Early apps could include new inventory search and a modernized front-to-back checkout stack

● Learn more about running apps on the platform in production

Run

● Full team activation on modernization across all bounded contexts

● Hybrid deployment enabling flexibility for incremental transformation of the architecture

What self-service, transparent deployment looks like.

Modernization Solutions

So, what exactly are the solutions we’ve found in the modernization journey? For each respective team, we’ve found that they look like this:

Ops

● Using the Pivotal Cloud Foundry (PCF) platform as a service (PaaS) in our private data center

● Using the PCF platform on the Google Cloud Platform (GCP) as a hybrid cloud deployment strategy

● Auto-scaling microservices clusters to drive higher utilization

● Making everything possible self-service so that there are no tickets to get to production

So far, we’ve learned that automating ops with PCF and BOSH has freed up capital spend and helped us identify where ops was over-deploying to reduce risk.
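
To make the auto-scaling and over-deployment point concrete, here is a minimal sketch of a utilization-based scaling decision. It is not how PCF's autoscaler is implemented; the function name, the 60% CPU target and the instance limits are all assumptions chosen for illustration.

```python
def desired_instances(current, avg_cpu_pct, target_cpu_pct=60,
                      min_instances=2, max_instances=100):
    """Suggest an instance count that brings average CPU near the target.

    All thresholds here are hypothetical; percentages are whole numbers
    so the arithmetic stays exact.
    """
    # Total work the cluster is doing, in "instance-percent" units.
    load = current * avg_cpu_pct
    # Ceiling division: smallest count that keeps average CPU <= target.
    wanted = -(-load // target_cpu_pct)
    # Never scale below the floor or above the ceiling.
    return max(min_instances, min(max_instances, wanted))


if __name__ == "__main__":
    # A busy cluster grows, an idle one shrinks.
    print(desired_instances(30, 90))
    print(desired_instances(100, 15))
```

A cluster that is statically sized for its worst day sits at the top of this range all the time; letting a rule like this shrink it when load drops is where the utilization gain comes from.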

Architecture

● Application transformation best practices to refactor services to be cloud native

● Application dojo engagement to implement greenfield cloud-native apps

● Enterprise architecture engagements to define clean bounded contexts

● Modernization forum led by engineering to prioritize and sequence application targets

● Hybrid deployment to solve wide area network (WAN) latency issues during early phases

We’ve learned that self-service operations and auto-scaled infrastructure create value in the form of increased developer velocity, which in turn helps pay for re-architecture. And re-architecture contributes value back to the teams by reducing the coordination required between them.

Developers

● Full automation of the software development life cycle (SDLC) using Concourse pipeline continuous integration/continuous deployment (CI/CD) that incorporates full test automation

● Adding self-service tools that support InfoSec and other governance goals, putting control (and responsibility for outcomes) back in the hands of developers
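
The idea behind the two bullets above, automated gates instead of tickets, can be sketched in a few lines. This is plain Python rather than Concourse configuration, and the stage names and the shape of the `commit` dictionary are hypothetical.

```python
def run_pipeline(stages, commit):
    """Run each stage in order and stop at the first failure.

    Returns (succeeded, results), where results lists every stage
    that actually ran along with whether it passed.
    """
    results = []
    for name, check in stages:
        passed = bool(check(commit))
        results.append((name, passed))
        if not passed:
            return False, results
    return True, results


# Hypothetical stages: tests and a security gate run before any deploy.
STAGES = [
    ("unit-tests", lambda c: c["tests_pass"]),
    ("security-scan", lambda c: not c["known_vulnerabilities"]),
    ("deploy", lambda c: True),
]

if __name__ == "__main__":
    ok, results = run_pipeline(
        STAGES, {"tests_pass": True, "known_vulnerabilities": []}
    )
    print(ok, results)
```

Because the security check is just another stage, a developer gets the governance answer from the pipeline itself rather than from a queue.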

How automation and self-service tools build an SDLC that is continuous in integration, deployment, and maintenance.

Hybrid deployment allows services to incrementally move to microservices. These changes have also helped us find a way to transition to the new architecture without a “big bang.”
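
One way to picture this incremental move is a route table that sends each path prefix to whichever platform currently serves it. The sketch below is an illustration only: the prefixes, the backend labels and the helper names are assumptions, not a description of StubHub's actual routing.

```python
# Hypothetical route table: path prefix -> platform currently serving it.
ROUTES = {
    "/search": "cloud-native",
    "/checkout": "legacy",
}
DEFAULT_BACKEND = "legacy"


def backend_for(path, routes=ROUTES):
    """Pick a backend by longest matching prefix; unmigrated paths stay on legacy."""
    matches = [p for p in routes if path == p or path.startswith(p + "/")]
    if not matches:
        return DEFAULT_BACKEND
    return routes[max(matches, key=len)]


def migrate(routes, prefix):
    """Return a new table with one more prefix served by the new platform."""
    updated = dict(routes)
    updated[prefix] = "cloud-native"
    return updated


if __name__ == "__main__":
    print(backend_for("/checkout/pay"))
    print(backend_for("/checkout/pay", migrate(ROUTES, "/checkout")))
```

Each call to `migrate` moves a single service, so the old and new stacks can run side by side for as long as the transition takes.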

Business

● Bounded context and cloud native design cut cross-team coordination, shortening the time from an idea to value in production

● Pipeline, test and compliance automation eliminate the ticket-queuing tie-up

● Shorter time from idea to production allows more experiments

● Blue-green deploy per microservices cluster reduces cross-team coordination and reduces risk of change in production
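
The blue-green bullet above is the most mechanical item in this list, so here is a minimal sketch of the pattern for a single cluster. The class, its method names and the health-check callback are hypothetical; a real platform handles routing and health checks for you.

```python
class BlueGreenCluster:
    """Two slots per cluster: one takes live traffic, the other stages releases."""

    def __init__(self, live_version):
        self.versions = {"blue": live_version, "green": None}
        self.live = "blue"

    @property
    def idle(self):
        return "green" if self.live == "blue" else "blue"

    def deploy(self, version, health_check):
        """Stage a release on the idle slot; switch traffic only if it is healthy."""
        target = self.idle
        previous = self.versions[target]
        self.versions[target] = version
        if not health_check(version):
            # Discard the failed release; live traffic was never touched.
            self.versions[target] = previous
            return False
        self.live = target
        return True

    def rollback(self):
        """Point traffic back at the release held in the idle slot."""
        if self.versions[self.idle] is None:
            raise RuntimeError("no previous release to roll back to")
        self.live = self.idle


if __name__ == "__main__":
    cluster = BlueGreenCluster("1.0")
    cluster.deploy("1.1", health_check=lambda version: True)
    print(cluster.live, cluster.versions[cluster.live])
```

Because each cluster carries its own pair of slots, one team can release or roll back without asking any other team to coordinate.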

Our modernization is still in progress, but it already gives us a view of the value ahead, and we believe the gains will be significant.

The potential is tangible. Here’s a load test where we autoscaled up to 100 cluster members, then back down to 30, and finally to 25.

Combating Fear

Fear is inevitable during any modernization growth spurt. For instance, the operations team may fear that an increase in automation will lead to the loss of human expertise. Developers may perceive re-architecting the software as a threat to well-defined traditional team scopes and organizations. For the business owner, the fear is that a poorly executed modernization takes away resources without improving agility.

The concern many folks voice when they don’t know how to run or create a platform is that they don’t know what their place will be in the new organization. But what has started to become clear to those participating in our modernization effort is that their skills are being expanded — not replaced. And that enables them to take on new roles in the organization.

The Opportunities

One of the fundamental things that’s happening at StubHub is a complete change in the way we think about new ideas. The change in our stack allows us to work in any language, and because we fully expect to move beyond Java into Go, Ruby and Node.js, we can innovate and rethink our future in more ways than ever before.

There are plenty of opportunities to take on new technologies. We’ll see the organization increase resources in both Pivotal and GCP that allow us to run a function that appears only when you ask a question, gives the result, and then goes away.

All that infrastructure is being enabled right here, with all the cutting-edge technology available right now, in the context of transformation. The value being delivered by developers is critical to the continuing success of the company and that means scaling a multi-billion-dollar business in a very competitive marketplace.

It’s exciting because the impact is immediate. We’re not simply testing a new product or building a new business; we’re helping fans around the world do what they like best, at scale. Existing and future Stubbers know they are instrumental to our ownership of the marketplace. And we know that the best Stubbers, now and in the future, are those who are able to grow with and guide our company: new challenges, new skills, new successes and new failures, all in the pursuit of impact at scale.