Deep Stacks Versus Deep Systems

Ben Sigelman
Published in LightstepHQ, Mar 27, 2021

Originally posted as a Twitter thread.

0/ Deep systems have come to the fore in recent years, largely due to the industry-wide migration to microservices.

But weren’t monoliths “deep”, too? Well, yes and no.

And this is all related to tracing, observability, and the slow death of APM.

Thread:

1/ First, let’s start with monoliths. Of course they’ve been around for a while, and they’re where most of us started. There is plenty of depth and complexity from a monolithic-codebase standpoint, but operationally a monolith is just one big (and often brittle) binary.

2/ Hundreds of developers work across dozens of teams to develop countless packages that are (slowly) tested and compiled into *a single monolithic binary*, pictured here.

3/ These packages communicate via *plain old function calls* (new acronym: “POFC”??), pictured here in purple.

And those function call stacks can get very deep. So, are monoliths deep systems?

4/ Well: a deep system (per https://lightstep.com/deep-systems) has ≥ 4 layers of *independently operated* services. The call stacks in monoliths may be deep, but they are not “independently operated.” This is why you have the luxury of observing the whole thing via a single binary.
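To make the "≥ 4 layers" threshold concrete, here is a minimal sketch that measures the longest chain of services in a call graph. The service names and the `CALLS` graph are hypothetical, the graph is assumed to be acyclic, and every node is assumed to be independently operated (the graph alone cannot tell you that).

```python
from functools import lru_cache

# Hypothetical call graph: each service maps to the services it calls.
CALLS = {
    "web-frontend": ["api-gateway"],
    "api-gateway": ["checkout", "search"],
    "checkout": ["payments", "inventory"],
    "payments": ["ledger-db"],
    "search": [],
    "inventory": [],
    "ledger-db": [],
}


def system_depth(calls, root):
    """Count the services on the longest call chain starting at root.

    Assumes the call graph has no cycles.
    """

    @lru_cache(maxsize=None)
    def depth(service):
        downstream = calls.get(service, [])
        return 1 + max((depth(d) for d in downstream), default=0)

    return depth(root)


def is_deep_system(calls, root, threshold=4):
    """Apply the 'at least 4 layers' rule of thumb from the definition above."""
    return system_depth(calls, root) >= threshold


if __name__ == "__main__":
    print(system_depth(CALLS, "web-frontend"))
    print(is_deep_system(CALLS, "web-frontend"))
    # A monolith is a single operated unit, however deep its call stacks are.
    print(is_deep_system({"monolith": []}, "monolith"))
```

Note that a monolith scores a depth of 1 here no matter how many function calls are nested inside it, which is the distinction the definition is drawing.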

5/ Monolith deploys are *notoriously* painful: releases take days or weeks to test and deploy given the vast numbers of developers and commits involved.

(This pain is so severe that organizations have to throw out the entire architecture — hence microservices!)

6/ Once deployed, ops uses an APM and/or metrics tool to observe the function calls and basic infra stats streaming from the singular, monolithic binary. And for pure monoliths, this can be powerful and valuable. But for *microservices*, it’s a different story… (onwards)

7/ So how — and why — does this change with microservices? It all hinges on the differences between deep call stacks and deep systems.

Let’s dig in…

8/ With microservices, hundreds of devs work across dozens of teams to develop countless services that are tested and built quickly and independently. We see clear similarities between the “services” here and the “packages” of monoliths — so what’s different?

9/ Well, first of all, these microservices communicate via network calls, not function calls.

And given that they are independently operated, a true deep system emerges from these many layers of abstraction and indirection.
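As a rough illustration of the function-call versus network-call difference, the sketch below exposes the same logic two ways: as a plain function and behind a tiny HTTP endpoint on localhost. The pricing function, the `/price/<sku>` path, and the timeout value are all made up for the example; only Python's standard library is used.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


def price_in_process(sku):
    """Monolith style: a plain function call inside one binary."""
    return {"sku": sku, "price_cents": 1299}


class PriceHandler(BaseHTTPRequestHandler):
    """Microservice style: the same logic behind an HTTP endpoint."""

    def do_GET(self):
        sku = self.path.rsplit("/", 1)[-1]
        body = json.dumps(price_in_process(sku)).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence the default per-request logging


def start_price_service():
    """Start the hypothetical price service on a free localhost port."""
    server = HTTPServer(("127.0.0.1", 0), PriceHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def price_over_network(base_url, sku, timeout=2.0):
    """The caller now needs a timeout and has to expect network failures."""
    # An empty ProxyHandler keeps localhost traffic away from any system proxy.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(f"{base_url}/price/{sku}", timeout=timeout) as resp:
        return json.loads(resp.read())
```

The answer is the same either way, but the networked version adds serialization, a timeout, and a second process that can be deployed (or fail) on its own schedule.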

10/ Of course, the “point” of microservices was to substantially increase release velocity.

But all of these deploys create an environment of continuous change and risk, affecting every layer of the deep system.

What happens if we try to observe this with monolith-era tools?

11/ APMs and metrics tools were designed to explain *individual* binaries in production. But inspecting individual pieces in parallel does little to explain the behavior of the overarching system.

In deep systems, APMs and metrics lack vital context about the larger application.

12/ In deep systems, distributed traces *are* that vital context.

And that’s why only opinionated, use-case-driven observability with a distributed tracing backbone can explain the behavior of these deep systems, and ultimately the apps and businesses that depend upon them.
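To show what "context" means mechanically, here is a toy sketch of trace propagation: each service records a span that carries a shared trace ID and the ID of its parent span, and a backend can then stitch the spans back into one end-to-end path. This is not the OpenTelemetry API; the `Span` class, the `COLLECTED` list, and the service names are hypothetical, and the example assumes a simple linear chain of calls. In a real deployment the context would typically travel between services in request headers.

```python
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    trace_id: str             # shared by every span in one request
    span_id: str              # unique to this unit of work
    parent_id: Optional[str]  # links this span to its caller
    service: str
    operation: str


COLLECTED = []  # stands in for a tracing backend


def start_span(service, operation, parent=None):
    """Record a span, inheriting the trace ID from the caller if there is one."""
    span = Span(
        trace_id=parent.trace_id if parent else uuid.uuid4().hex,
        span_id=uuid.uuid4().hex[:16],
        parent_id=parent.span_id if parent else None,
        service=service,
        operation=operation,
    )
    COLLECTED.append(span)
    return span


# Four hypothetical services; plain calls stand in for network hops.
def ledger(ctx):
    start_span("ledger", "write", ctx)


def payments(ctx):
    ledger(start_span("payments", "charge", ctx))


def checkout(ctx):
    payments(start_span("checkout", "place_order", ctx))


def gateway():
    root = start_span("api-gateway", "POST /orders")
    checkout(root)
    return root


def call_path(spans, trace_id):
    """Rebuild the root-to-leaf chain of services for one trace.

    Assumes each span has at most one child (a linear chain).
    """
    by_parent = {s.parent_id: s for s in spans if s.trace_id == trace_id}
    path, current = [], by_parent.get(None)
    while current:
        path.append(current.service)
        current = by_parent.get(current.span_id)
    return path
```

Calling `gateway()` and then `call_path(COLLECTED, root.trace_id)` recovers the full chain of services for that one request, which is exactly the cross-service view that per-binary tooling cannot assemble by itself.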

13/ So while APM and metrics can be useful for monoliths, they are overwhelmed by the many independently-operated layers of deep systems.

So, finally, here is an illustration of the evolution — I’m fascinated to hear other perspectives here, please add to this thread!

For more threads like this one, please follow me here on Medium or as el_bhs on Twitter!

Ben Sigelman
LightstepHQ

Co-founder and CEO at Lightstep, co-creator of @OpenTelemetry and @OpenTracing, built Dapper (Google’s tracing system).