EA Principles Series: Steward our Technology Portfolio and Minimize Long-Term Technical Debt

Published in chick-fil-atech, Jan 17, 2023

In Part 1 of this series, we unpacked our “Maximize Cloud First” principle. In Part 2, we move on to how we think about the stewardship of our technology portfolio and how we manage and minimize long-term technical debt.

A great way to create organizational drag, high cost-of-ownership, and slow responsiveness to business opportunities is to ignore technical debt or fail to manage it.

Photo Credit: https://www.cnet.com/health/sleep/best-alarm-clock/

What is Technical Debt?

Most people are probably familiar with technical debt at an application level, where it often accrues, resulting in something needing to be re-written, re-factored, or re-architected. We are big fans of good architectural decisions that help minimize this type of debt at a product level.

Beyond technical debt at the application level, I believe there is enterprise debt that can be accrued as well. This “enterprise debt” can be found even when application-level debt is minimized but an enterprise-wide, organizational mindset is absent.

When we architect our systems without thinking about the business ecosystem they exist within, we create a hidden but costly form of debt.

For example, a team could build an application that is feature-complete but never consider the interfaces (APIs, events, etc.) that other areas need in order to interact with it. The team may not have thought about features that are needed beyond the scope of its primary stakeholders. It may not have considered how to share its data with analysts, or how that data might be useful to another area that is building a machine learning model.

When we think about technical debt, we think of both of these cases: application debt and enterprise debt.

Principle: Steward our Technology Portfolio and Minimize Long-term Technical Debt

Here is our principle, verbatim.

Business growth will drive us to continue to add new technologies to our portfolio, but we must be careful to avoid the “undisciplined pursuit of more”. We must be disciplined in our rate of technology portfolio growth while also being intentional about retiring legacy technologies as we go.
We should reuse APIs, events, and tools as much as possible but be careful about the re-use of business systems as that can potentially lead to monolithic, multi-business domain solutions that create cross-team constraints and limit our organizational agility.
We should minimize the introduction of technical debt when implementing new applications, and make intentional plans to “pay off” historical debts by retiring legacy systems and services or refactoring sub-ideal architectures.
Why? All technologies come with both visible (money, time to implement) and hidden (opportunity, support) costs to the organization. Poor stewardship creates long-term organizational drag, higher cost-of-ownership, and stifles business agility.

Outcomes and Tradeoffs

What are the goals behind this principle?

Outcome 1: Achieve a disciplined approach to technology growth

At Chick-fil-A we are seeing an exponential increase in demand for technology to support and enable new business outcomes, business model changes, and strategic goals. That is exciting.

In parallel, the technology landscape has exploded with amazing new open source projects, software vendors, and technology capabilities. That, too, is exciting.

If we had unlimited resources, it would be very easy to continue to add each new amazing product that is created to our technology portfolio. It would probably help us achieve some business outcomes rather quickly.

In one of the admittedly less-fun parts of Enterprise Architecture, we have to be a guardian of our technology portfolio and advocate for healthy growth. As our principle mentions in a nod to business author Jim Collins, we want to make sure we do not get captured by an “undisciplined pursuit of more” technology. This means doing the difficult work of assessing new technologies against our existing capabilities and saying “no” to some great opportunities because they offer incremental value instead of transformational value. This ensures we do not allow our portfolio to become overly complicated, which has all kinds of costs (financial, cognitive load on people, increased difficulty to reason about, etc.).

If we are going to introduce new technologies, we have to make sure we do the difficult work of retiring the old technologies they overlap with. Retiring old things is time-consuming and frustrating, but very important to maintaining a healthy technology portfolio. It is also important to the cognitive health of our organization — we need to have simple answers for how to get things done vs. requiring a highly-nuanced understanding of a massive technology landscape. A simple request like “I want to monitor my application, how do I do that?” cannot require 15 meetings to talk about the 12 tools available. It needs to be fairly simple and easy to grasp. We have work to do in this area, but it is nevertheless an important principle we advocate for.

Outcome 2: Promote re-use of the right things while maintaining separation of concerns and preserving organizational agility

We like to see services, enterprise messages / business events, platforms and tools be re-used across teams, business domains and departments. This means less duplication of efforts, (hopefully) achieving business results faster, and a less confusing technology environment (it is pretty confusing when there are 5 services that all do roughly the same thing or 5 datasets that are only subtly different). Re-use is often good.

In other cases, re-use may not be the best option. In general, we push teams to avoid building multiple business applications inside of the same instance of a product. The goal here is to avoid monoliths: we don’t like to see shared databases, shared deployment tiers, shared … unless the product has been specifically architected to mitigate the risks of doing so while also preserving a team’s ability to move at its own pace with a high degree of autonomy (no outside intervention needed).

We also want a product team to be able to build its application(s) without any understanding of the intricacies of other teams’ implementations. The team should only need to be knowledgeable about the interfaces provided (REST API, Kafka message, etc.). It should also be able to deploy any time it desires to any of its environments without concern over impacting or being impacted by other teams’ applications. It should also be free from concerns over another application degrading its application’s performance.
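As a minimal sketch of what "only knowing the interface" can look like for a consumer, the Python below depends on nothing but a published message contract. The event name `OrderPlaced`, its fields, and the handler are hypothetical names invented for illustration; they are not taken from any real Chick-fil-A system.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderPlaced:
    """Hypothetical event contract: the only thing a consumer depends on."""
    order_id: str
    restaurant_id: str
    total_cents: int

    @classmethod
    def from_message(cls, raw: bytes) -> "OrderPlaced":
        data = json.loads(raw)
        # Read only the fields in the contract and ignore anything else,
        # so the producer can add fields without breaking this consumer.
        return cls(
            order_id=data["order_id"],
            restaurant_id=data["restaurant_id"],
            total_cents=int(data["total_cents"]),
        )


def handle_order_placed(raw: bytes, totals: dict) -> None:
    """Consumer logic: accumulate sales per restaurant from the event alone."""
    event = OrderPlaced.from_message(raw)
    totals[event.restaurant_id] = totals.get(event.restaurant_id, 0) + event.total_cents
```

The consumer never imports the producing team's code or reads its database, so either side can change its internals and deploy on its own schedule as long as the contract holds.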

There is a lot of nuance here, so let’s unpack it.

Should we pack multiple applications or APIs into the same deployment artifact? No. This will result in tightly coupled deployments that limit agility.

Should we build multiple business applications inside of the same platform (meaning the same app tier, database, etc.)? No. This will likely result in things like 1) one application team having to understand the nuances of another team’s application, 2) one team’s deployment impacting another team’s, and 3) load on one application causing performance issues with another.

Should we pack multiple applications into the same Kubernetes cluster? Yes. Why is this okay? Kubernetes is designed for this purpose with many controls to ensure workloads run successfully and independently. Teams are still able to deploy their application whenever they want, and do not need to understand what other applications reside in the environment with them or how they work.

Why does all of this matter? At the end of the day, our goal is to maximize re-use, minimize concerns a team has to account for, and preserve the autonomy of teams to move at their own pace of delivery (in short, DevOps).

Outcome 3: Pay back our technical debt

Technical debt will accrue. On occasion, it can even be a good business decision to accrue it.

Just like in the financial world, debt is a lever. In software, a team can pull that lever when they want to go faster now (akin to wanting money now to, say, finance a home) to get an outcome they believe is more valuable now (akin to a return/utility of the home that exceeds the interest being paid on the money) in exchange for a higher cost of repayment later (repayment of principal + interest).

To state the obvious, an endless cycle of adding debt without paying any of it back is the path to eventual destruction… monthly “payments” on past debt in the form of rework, support tasks, etc. might begin to consume all of the resources of a team (or just make it too complicated to make changes rapidly).

As Enterprise Architects, the main reason we want to name technical debt in the first place is to ensure we go in with “eyes wide open” about the situation we are creating for ourselves, understand the long-term consequences of that decision, and can plan to manage the associated risks. By doing so and logging the debts as they occur, we essentially have a balance sheet where we can see the debts we have and whether we are getting out of balance organizationally. We are young in our debt-logging journey but expect it to pay dividends in time.
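To make the balance-sheet idea concrete, here is a small, purely illustrative Python sketch of a debt log. Every field name, the idea of measuring "principal" in weeks of payoff effort and "interest" in weekly rework hours, and the 25% threshold are assumptions made for the example, not a description of how any real debt log works.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DebtEntry:
    """One logged technical-debt decision (all fields are illustrative)."""
    team: str
    system: str
    description: str
    logged_on: date
    payoff_weeks: float           # estimated effort to retire the debt ("principal")
    weekly_interest_hours: float  # recurring rework/support cost while unpaid


def balance_sheet(entries):
    """Sum principal and recurring interest per team."""
    totals = defaultdict(
        lambda: {"principal_weeks": 0.0, "interest_hours_per_week": 0.0}
    )
    for e in entries:
        totals[e.team]["principal_weeks"] += e.payoff_weeks
        totals[e.team]["interest_hours_per_week"] += e.weekly_interest_hours
    return dict(totals)


def out_of_balance(entries, capacity_hours_per_week, threshold=0.25):
    """Teams whose weekly interest exceeds `threshold` of their weekly capacity."""
    sheet = balance_sheet(entries)
    return sorted(
        team
        for team, t in sheet.items()
        if t["interest_hours_per_week"] > threshold * capacity_hours_per_week[team]
    )
```

Even a log this simple surfaces the question that matters: how much of a team's week is already spoken for by past decisions.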

It may feel like some technical debt decisions have a 0% interest rate so we don’t even need to worry about them… but we have to remember that they are adjustable-rate and at some point the rate is going to start rising.

Outcome 4: Achieve portfolio visibility and use data to inform investment decisions

This one is very new to us and therefore cannot be considered a success story. We are just beginning to implement LeanIX to help us model and record our business capabilities and map them to technical systems, interfaces (APIs, business events, Data Lake datasets, etc.) and active projects. A business capability, in our definition, is something that we have the capacity to do or want to have the capacity to do as a business at Chick-fil-A or one of our subsidiaries.

The goal here is simple: develop an understanding of what our business does, determine how well we support it with technology, and identify where the big investments are needed. Investments may be about maturity, enterprise impact, or portfolio simplification.
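The kind of question a capability map can answer might be sketched as follows. This is a hypothetical in-memory model written for illustration only: it does not reflect the LeanIX data model or API, and the capability and system names in the usage note are invented.

```python
from dataclasses import dataclass, field


@dataclass
class Capability:
    """A business capability and the technology mapped to it (illustrative)."""
    name: str
    systems: list = field(default_factory=list)
    interfaces: list = field(default_factory=list)


def unsupported(capabilities):
    """Capabilities with no supporting system: possible investment gaps."""
    return [c.name for c in capabilities if not c.systems]


def overlap_candidates(capabilities, max_systems=1):
    """Capabilities served by more systems than expected: simplification candidates."""
    return {
        c.name: sorted(c.systems)
        for c in capabilities
        if len(c.systems) > max_systems
    }
```

For example, a capability such as "Application monitoring" mapped to three tools would show up in `overlap_candidates`, while a desired capability with no systems yet would show up in `unsupported`.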

In the past, we were a small enough organization to grow well by doing these activities organically, but as our business grows that has become increasingly difficult. Investing in tooling will help us better visualize our technology landscape and help product owners, builders, and buyers make even better investment decisions in the future. We will share more about this once we have some experience-based learnings.

Conclusion

At the end of the day, we are not the owners of systems, so our role as EAs is to be advocates for these healthy activities related to portfolio management and technical/enterprise debt.

We want to see teams thrive, deliver amazing business-transforming software, and have fun doing it. All of these things can happen in the short-term by adding technical debt, but none of them become possible in the long-term if we leave that debt unpaid.