What We Learned from Launching Edge Compute from Enterprise Architecture

Brian Chambers
Published in chick-fil-atech
Feb 14, 2023

It has been about four years since we launched our Enterprise Restaurant Compute Edge K3s platform to support IoT and other use cases at Chick-fil-A. Today, we will take a look back at what we have learned from the experience of building an enterprise platform and how it shaped our Enterprise Architecture practice for better and worse. Most importantly, what have we learned from the experience and how is it shaping our future?

Photo Credit: Me. Yes, I ate the sandwich shortly after this picture was taken. It was delish.

Background

First, it is important to know that we launched our Edge Compute work from Enterprise Architecture, not from a well-staffed, well-resourced, dedicated product engineering team (as we do for many new things today). For many years, I actually played the role of “Product Owner” myself. There were just a few of us who worked on the platform in the early days… it was fast and scrappy for sure. That is often true for innovation efforts: you have to prove some value before you can really go “all in” and get support for hiring staff or getting a bigger budget.

About 12–18 months before our Edge platform work began, we also launched an initiative to do all new development in the cloud, which we called “Cloud First.” Around the same time, we were also moving towards a modern approach to cloud-based analytics, employing columnar data warehouses like Amazon Redshift and building a Data Lake infrastructure. It was a busy time at Chick-fil-A and on the EA team.

How we got started with IoT

How did this Edge Compute effort start? In our case, we identified that our restaurants needed an Internet of Things (IoT) capability to support our organizational goals around Restaurant Capacity and Customer Digital Experience. This was partly to address some current needs, but more in anticipation of where we needed to go in the future to support the business. Connected things (fryers, grills, tablet screens, POS registers, etc.) would play a big role in helping us optimize our restaurants, maintain high food quality and safety, and improve our customer service experiences. We suspected that being data-centric was going to be critical to future operations, so we began to connect things and unlock the data they could provide to us (which usually meant working with the owner of the device to embed an SDK to onboard to our platform, and having them write some software themselves to share data payloads).

As part of this IoT capability, we wanted to 1) host a resilient MQTT broker at the edge to enable normal data collection and operations in WAN-down scenarios and 2) enable teams to deploy applications to the restaurant to interact with restaurant “things” (kitchen equipment, etc.). Thus was born the idea of an Edge Compute infrastructure.
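To make the WAN-down goal concrete, here is a minimal store-and-forward sketch in Python. It is an illustration of the general pattern only, not our actual broker or SDK: the class name, the `send_upstream` callback, and the queue limit are all hypothetical. Messages are queued locally, forwarded oldest-first when the upstream link accepts them, and held when it does not.

```python
from collections import deque
from typing import Callable, Deque, Tuple

Message = Tuple[str, bytes]  # (topic, payload)


class StoreAndForwardBuffer:
    """Hold messages locally while the WAN is down and forward them in order later."""

    def __init__(self, send_upstream: Callable[[str, bytes], bool], max_pending: int = 1000):
        # send_upstream is a hypothetical callback that returns True only when
        # the upstream (cloud) side accepted the message.
        self._send_upstream = send_upstream
        # A bounded deque silently drops the oldest message once it is full.
        self._pending: Deque[Message] = deque(maxlen=max_pending)

    def publish(self, topic: str, payload: bytes) -> None:
        """Queue the message, then try to drain the queue."""
        self._pending.append((topic, payload))
        self.flush()

    def flush(self) -> int:
        """Send queued messages oldest-first, stopping at the first failure.

        Returns the number of messages forwarded by this call.
        """
        sent = 0
        while self._pending:
            topic, payload = self._pending[0]
            if not self._send_upstream(topic, payload):
                break  # link still down: keep the message for the next attempt
            self._pending.popleft()
            sent += 1
        return sent

    @property
    def pending(self) -> int:
        """Number of messages still waiting to be forwarded."""
        return len(self._pending)
```

Local consumers in the restaurant keep working from the edge broker either way; a buffer like this only governs what eventually reaches the cloud. Dropping the oldest message when the queue is full is one possible policy, and a real system might prefer to persist messages to disk instead.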

To do this work, we used our internal innovation process at Chick-fil-A which is…

  • Understand — explore the problem and understand what we want to transform, fix, or make work differently
  • Imagine — what possible solutions might make sense for this problem. “How might we?”…
  • Prototype — get something working to see if some of our “Imagine” ideas are feasible. Ideally get this to real users and get real feedback ASAP.
  • Validate — does the idea work? What do users think? Does it make financial and business sense? Does it fit with our organization’s culture and ethos? Does the idea fit our current standards and approaches well enough to move forward? Do we need to adjust?
  • Launch — go to production and scale

In the case of our IoT problem, we worked through those first three phases with some small seed funding and a tiny team, and built a prototype that we started showcasing to users. This included:

  • Identity management via OAuth flows for device on-boarding in restaurants
  • MQTT broker and lightweight client SDK to onboard into our environment securely
  • And yes, an Edge Compute cluster (first running Docker Swarm and later Rancher and then K3s)

We were able to take that forward to validate and eventually launch a product.

How we validated the idea and built support

Simply put, we built momentum. We refined our prototype and told the story less about all the technical wizardry that was taking place (because the business doesn’t care about technical wins), and more about how this was going to be critical to our needs in a smarter kitchen to help with restaurant capacity (the business cares about business wins).

We also owe huge kudos to our first customer, the Aha project, who walked alongside us, gave us invaluable feedback, and played the role of the initial “customer” we could serve in the field as we iterated and learned. I am sure we caused them plenty of challenges, but they were nothing but gracious along the way.

One thing we did well in our validation process is that we acknowledged our constraints and actively managed them. This meant solving some very technical problems for a fast food company at times… but we had to work with what we had and prove the value of our idea to the business.

Our constraints were things like…

  • Physical space limitations — there is no room for a server rack, blade servers, etc.
  • Power outlet constraints — in a lot of restaurants this had to go in a wall closet in the back office, which didn’t always have the space we needed for power.
  • Switch port density constraints — no room for IPMI, a PXE boot Raspberry Pi, etc.
  • Network service constraints — no control over local DNS, DHCP reservations, static IPs, etc.
  • People / Money — we had a small team and small budget at the start and had to work to prove viability before we got enough funding to take our vision to the entire business.

I think this is important. We didn’t complain about these or decide to stop the project until they could be “fixed.” We actively managed them, solutioned around them, and worked with the resources we had (technical, budget, and people).

Despite these constraints, we were able to deliver a product and tell a pretty compelling story about how we were going to bring technology to every restaurant that would enable the business outcomes that everyone knew were important to our next five years as a business. MVP and iterate.

After Action Report: What we learned

Now we move to my “After Action Report” or AAR. What went well? What didn’t go well? And what did we learn?

What went well?

  • The platform works — we built something pretty cool with a very small team in spite of the constraints we faced. To get there, we iterated through a lot of ideas in a short time and had tight feedback loops with the areas we were working with. It wasn’t all perfect all the time but it was a good approach to make sure we were building the right thing.
  • EA had a proving ground for other technologies — we got close to Kubernetes, K3s, AWS cloud services, Golang, Prometheus, Grafana, GitOps, and much more. Having a product we owned and built and could also use as a proving ground for innovations that could help the organization was imperative. This product was the first at Chick-fil-A to use Kubernetes, whether cloud or edge (we did both) and today nearly all of our internal homegrown products run on it. Many of these tools and practices we incubated in our Edge project are now commonplace across all of our engineering teams.
  • API First + “Just enough automation” — API First was one of our key principles at the time and applying it really helped us be successful in scaling a cool idea into a real enterprise system. We employed just-enough automation to lessen the toil on the team and enable more feature building and scaling while not getting wrapped up in automating everything perfectly just because we felt like we had to.
  • We made lots of friends — the K8s story connected our team with many others who we have learned a ton from. Never underestimate the value of being open and sharing and how it can help you get way more value than you really brought by sharing your ideas.

What didn’t go well?

  • We were too small and underfunded at the start and that led to burnout for the team, especially me and one other team member who was doing a lot of the work at launch time. It is not our culture to do so… but we wanted to get this thing done, so we were often working 80+ hour weeks. For me, that meant doing my day job as an EA and writing APIs and edge services during the “night shift.”
  • Our team wore a lot of “hats” that aren’t the sweet spot for EA, such as software engineer, tester, product owner, and third-level support. We had no other option at the time, but this distracted us from other things that we probably should have been moving forward at the same time.
  • It was hard to ask for more resources because we succeeded with a small team and budget. It took a few years to get to the well-staffed team that we have today.

What we learned

  • We tied the building of this technology capability to real business challenges effectively. That built credibility for our team because we were based in the reality of our business and we delivered technology that aligned with real business needs.
  • It’s hard to find the sweet spot between incubating an idea and achieving an MVP level of maturity… and then handing it off to an operational team that is going to run it long-term and make incremental changes. In our case, EA held the Edge platform for far too long, which leads to the next point which is...
  • There is so much we need to do in EA. If we are too focused on a given thing it can be to our detriment. We were too focused on “technical architecture” for a long time, and missed a lot of opportunities where we could have added value in the business architecture practice. We ended up having to chase these down later and put way more effort in than we would have if we had been attentive to them earlier.
  • Since then we have also incubated other platforms and let them go much earlier, trying to apply this lesson learned. We have now also experienced letting go too early. I am not sure where the sweet spot is, but it feels like we are getting close(r).
  • Sometimes doing something manually is okay for a season. When we first embraced Kubernetes in the cloud to run our cloud workloads in support of the restaurant IoT environment, I remember doing production pushes manually (applying config that was in git). Oh no, manual deployments?!?! It’s not ideal and we don’t do it anymore since we have matured, but when you are getting started and are trying to prove value, have a small team, and have complete context for the problem you’re working on, I believe there are times where this makes more sense than investing weeks in getting a beautiful deployment pipeline running. Yes… it pays off later to make that investment. Yes, I am all for it. But there is a season for everything. When you are in the early stages you have to be scrappy.
  • EA needs to own something so that we have a place to incubate in the real world. Absent a real product, we lose the ability to apply new technologies to the real world without massive coordination, risk management and stakeholder communication. At this very moment, we do not own any products, so we are working to figure out how to remedy this situation. The most likely avenue is by coming alongside other business areas’ innovation work and helping them succeed while also incubating our technologies to help them.
  • During our cloud, Edge, API, and Analytics capability work, it became really clear that technology capabilities were a sweet spot for EA, and that was recognized organizationally. That was good, but also bad. As an EA team, be careful about getting branded with one specialty… EA is a broad discipline with lots of areas that need attention from business architecture to platform architecture to solution architecture and so on. Be careful not to get pigeon-holed.
  • Sharing your story externally is very valuable. If you can explain your problem and solution to someone outside your business, you can likely explain it to anyone inside your business with ease. It can also connect you with a great community of others who enrich your thinking about your problem, technology and/or life.
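As an illustration of the “manual for a season” point above, a manual production push of config kept in git can be as small as a helper that builds one `kubectl apply` command over a checked-out directory. This is a sketch under assumptions, not the process we actually used: the function name, the `manifests/` directory, and the `prod` context name are hypothetical.

```python
from typing import List


def kubectl_apply_command(manifest_dir: str, context: str, dry_run: bool = True) -> List[str]:
    """Build (but do not run) a kubectl command that applies every manifest under manifest_dir.

    manifest_dir is assumed to be a directory inside a local git checkout.
    """
    cmd = ["kubectl", "--context", context, "apply", "--recursive", "-f", manifest_dir]
    if dry_run:
        # Ask the API server to validate the change without persisting it.
        cmd.append("--dry-run=server")
    return cmd


# Hypothetical usage: review the dry run first, then build the real command.
preview = kubectl_apply_command("manifests/", context="prod")
real = kubectl_apply_command("manifests/", context="prod", dry_run=False)
print(" ".join(preview))
```

Passing either list to `subprocess.run(cmd, check=True)` would execute it. Keeping the dry run as the default is a cheap guardrail for a small team that has not yet invested in a full deployment pipeline.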
