Murphy, you bastard — I love you
There is an old adage — don’t go live on a Friday. It’s a good rule of thumb to follow, and many old devs live with the scars of the Friday deploy.
At Full Stack we have adopted Scrum and Agile with a discipline I’ve never seen before, so we tend to deploy or go live on Fridays a lot. These Friday deploys tend to be rather underwhelming: what worked on Tuesday, Wednesday and Thursday simply gets rolled to a different location on the Friday, and then it’s off for beers at 4.
Yesterday I was reminded that even after 18 years of coding commercially, a new wave of interesting wrinkles can still push a practitioner to the next level of efficiency and adaptability.
It is now apparent that the web is under what may be the first IoT (Internet of Things) DDoS attack, driven by compromised devices that have been rooted with Mirai.
Mirai has been causing issues for some time (i.e. less than a month), and with the recent release of its source code (https://krebsonsecurity.com/2016/10/source-code-for-iot-botnet-mirai-released/) it had become inevitable that an attack like this would happen.
This go-live day was like any other, but it started with a couple of critical events:
- Atlassian JIRA experienced a litany of critical errors including decoupling boards from projects.
- An iOS certificate was revoked by another vendor, requiring a new set of creds; why do Apple insist on only 3 key pairs per deployment account? Seems unnecessary.
- A massive DDoS attack across the internet, bringing a variety of services under threat.
In spite of all of this, there were a few interesting core lessons for me once the day was done:
- Using a proper cloud provider is not enough; you win when your cloud architecture is ready for shenanigans. For the most part ours was, and I was proud that our DevOps team wasn’t affected: the architecture held out.
- “’n Boer maak ’n plan” (Afrikaans: “a farmer makes a plan”): the analysts were able to sidestep the issues at Atlassian by reverting to tried and true tools like Email and Excel.
- You can only manage what you can measure. The developers were able to work through the issues and isolate them using comprehensive logging infrastructure, which gave exception-level detail on items like card failure causes.
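To make that last point concrete, here is a minimal sketch of what exception-level logging around a card payment could look like. It is not our production code: the logger name, the `CardDeclinedError` class, the `charge` function and the made-up limit are all hypothetical and exist only for illustration. Only Python's standard `logging` module is assumed.

```python
import logging

# Hypothetical logger name; a real service would configure handlers elsewhere.
logger = logging.getLogger("payments")


class CardDeclinedError(Exception):
    """Hypothetical error that carries the reason a card payment failed."""

    def __init__(self, reason: str, card_last4: str) -> None:
        super().__init__(f"card ending {card_last4} declined: {reason}")
        self.reason = reason
        self.card_last4 = card_last4


def charge(card_last4: str, amount_cents: int) -> bool:
    """Stand-in for a payment call; fails above an invented limit."""
    if amount_cents > 100_000:
        raise CardDeclinedError("limit_exceeded", card_last4)
    return True


def charge_with_logging(card_last4: str, amount_cents: int) -> bool:
    """Charge a card and log the failure cause instead of losing it."""
    try:
        return charge(card_last4, amount_cents)
    except CardDeclinedError as err:
        # logger.exception logs at ERROR level and appends the traceback,
        # so the failure cause and its origin land in the same log record.
        logger.exception(
            "card charge failed reason=%s last4=%s amount_cents=%d",
            err.reason,
            err.card_last4,
            amount_cents,
        )
        return False
```

The idea is simply that the failure reason travels with the exception and is written out where it is caught, so whoever is isolating an issue on go-live day can read the cause straight from the logs.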
There were a couple of points when it felt like the whole web had become something of a burning platform, and therefore scrubbing the release would be a good idea. We didn’t do that, and I’m glad we didn’t.
Utilising an appropriate microservice architecture; having a deep set of well-worn code prepared by an agile team who have owned the dev challenge for 7 sprints over 12 weeks; not being beholden to analytical tools: all of this gave us the ability to adjust well to the impediments of this Friday.
There is tremendous excitement (as always) around many technologies that are rising. The 21st of October reminded me once again that these technologies work well when people work well. The principles and practices of adjusting to adversity, staying calm under pressure, and managing changes that scramble the plan all make the act of going live a fun one, even if only to a closed beta.
On a day when much of the internet showed its fragility, it offered my team an opportunity to demonstrate that it isn’t when things are going well that the men are separated from the boys — rather it is in the fog of everything going wrong that the refined practices of design consideration come to the fore.
That is deeper than the aesthetic. The design of the network topology. The design of the roll-out methods. The design of the compilation pipeline. The patterns and anti-patterns of the load testing and checklists. The design of the test pack. The design of the release notes. The design of the error logging and handling.
It is in designing what goes unseen by the vast majority of users that developers, product owners and broader scrum teams can reveal the strength of their sinew in the face of adversity.
If we hadn’t had a day like yesterday, we wouldn’t have had the opportunity to show that our process, our people and our products are as resilient as we designed them to be.