For those who have no idea what the title says, I offer my apology here. Finance by itself is a deep topic with many trade-offs, similar to the world of engineering. According to a greate economists Thomas Sowell

There are no solutions, only trade-offs

In the DeFi (Decentralized Finance) sense, as much as we all love the high interest rates, the trade-offs aren’t always clear in this nascent industry. The underlying concept is really simple and sound. It goes something like this:

All right folks, I intend to keep this one short and that’s what I will do. I mean, it’s supposed to be easy but the official documentation(1, 2) makes it unnecessarily confusing. So I think maybe I can help to fill in the gap.

I will be using one of our business requirements at Buffer in this project, as an example for this blog post.

Quick recap

So, we need a few nodes that are dedicated to running cronjobs, and nothing else. At the same time, we want to make sure the cornjobs are scheduled to these nodes, and nowhere else. …

A mystery

So, it all started on September 1st, right after our cluster upgrade from 1.11 to 1.12. Almost on the next day, we began to see alerts on kubelet reported by Datadog. On some days we would get a few (3 - 5) of them, other days we would get more than 10 in a single day. The alert monitor is based on a Datadog check - kubernetes.kubelet.check, and it's triggered whenever the kubelet process is down in a node.

At Buffer, we utilize kube2iam for AWS permissions. It’s deployed, upgraded and managed with the stable Helm Chart repository.

It has been working great, until last week due to my oversight. The situation was resolved in a matter of 30 minutes and no users were affected. Nonetheless, I’d love to share my learnings with the community.

To upgrade kube2iam I have been using this command

helm upgrade kube2iam --install stable/kube2iam --namespace default -f ./values.yaml

Where the values.yaml file contains a bunch of overridden variables. In our case, it looks like

The command would upgrade kube2iam to the latest Chart…

(Related article on Upgrading Kubernetes with Kops)

For those who have been using kops for a while should know the upgrade from 1.11 to 1.12 poses a greater risk, as it will upgrade etcd2 to etcd3.

Since this upgrade is disruptive to the control plane (master nodes), although brief, it’s still something we take very seriously because nearly all the Buffer production services are running on this single cluster. We felt a more thorough backup process than the currently implemented Heptio Velero was needed.

To my surprises, my Google searches didn’t yield any useful result on how to carry out…

Alright! I’d like to apologize for the inactivity for over a year. Very embarrassingly, I totally dropped the good habit. Anyways, today I’d like to share a not so advanced and much shorter walkthrough on how to upgrade Kubernetes with kops.

At Buffer, we host our own k8s (Kubernetes for short) cluster on AWS EC2 instances since we started our journey before AWS EKS. To do this effectively, we use kops. It’s an amazing tool that manages pretty much all aspects of cluster management from creation, upgrade, updates and deletions. It never failed us.

How to start?

Okay, upgrading a cluster always makes…

At Buffer we invest heavily on Kubernetes. Since we use AWS as our cloud provider, getting services the right permission usually means having AWS keys/secrets in Kubernetes manifest files. In this SnackChat Steven walks through the steps of using kube2iam to eliminate the exposure of AWS keys/secrets.


Originally written May 14, 2018. Last updated Aug 13, 2018

Originally published at on May 14, 2018.

Click here to share this article on LinkedIn »

Kubernetes is great! It helps many engineering teams to realize the dream of SOA (Service Oriented Architecture). For the longest time, we build our applications around the concept of monolith mindset, which is essentially having a large computational instance running all services provided in an application. Things like account management, billing, report generation are all running from a shared resource. This worked pretty well until SOA came along and promised us a much brighter future. By breaking down applications to smaller components, and having them to talk to each other using…

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store