Rollback and its importance in a DevOps world

B
Oct 2, 2017


There is often fear in the DevOps community when we talk about rollbacks, and operations teams (who now have the skill sets of both dev and ops) often don’t fully understand the need to be able to roll back failed software deployments.

They talk about the fear and difficulty of rolling back, and insist that the answer to an error is always to roll forward. Rolling forward is a good way to proceed and has its advantages, but the risk of not being able to roll back becomes costly over time as more integrations and features are made available to customers.

Let’s think about a scenario where a company implements only roll forward and gives no thought to rollback. Money and time are key, so that can look like a good business case, but the “what if” is a big, costly question and can take many forms. I’ve personally written Java code that was perfect in the test environment and passed QA, but as soon as it was deployed to production and customers started using it we found a costly bug, and there was absolute panic. Do we really want that to happen to us when we’re the customer? I bet the answer is no.

In my experience it’s always been easier to roll back to a known state than to roll forward (and trust me, there are technical aspects of rolling back, and other business cases, which might need more attention). It’s much easier to test and gain confidence in a rollback than in a roll forward. I think a lot of the misconceptions people have come from the fact that development teams simply do not give rollback enough importance and focus in their development and release management process.

With a rollback strategy you can simply hit the button at the first sign of trouble and have the system back up and running while you look into the situation. It may be that you’ve overreacted in pulling the release, but that’s generally better than breaking something.

Enough about the importance of rollback. So what do we need for a good rollback strategy?

Mindset and thought process. People usually don’t realize how important it is to be in a specific mindset to be able to articulate and carry out a rollback. The most important thing, in my opinion, is to implement an architecture that supports the need to roll back. For example, modular, componentized, service-based architectures lend themselves well to this. Persistent message queues and asynchronous services allow you to bring components down for rollback without impacting the main user base. So think big and work in simple, incremental ways to get there. The Blue-Green release pattern has been a really good friend here and is a proven strategy; I’ll be writing more about it in future posts.
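To make the Blue-Green idea a bit more concrete, here is a minimal sketch in Python. It is only a toy model: the `BlueGreenRouter` class and its method names are made up for this post, and a real setup would flip a load balancer or DNS entry rather than a field in memory. The point it shows is that the old release stays deployed on the idle side, so rolling back is the same cheap switch as releasing.

```python
from dataclasses import dataclass, field


@dataclass
class BlueGreenRouter:
    """Toy model (hypothetical): tracks which of two environments is live."""
    versions: dict = field(default_factory=lambda: {"blue": None, "green": None})
    live: str = "blue"

    @property
    def idle(self) -> str:
        # The environment that is not receiving customer traffic.
        return "green" if self.live == "blue" else "blue"

    def deploy(self, version: str) -> None:
        # New releases always go to the idle side; live traffic is untouched.
        self.versions[self.idle] = version

    def switch(self) -> None:
        # Cut traffic over to the idle side. Calling it again is the rollback.
        if self.versions[self.idle] is None:
            raise RuntimeError("idle environment has nothing deployed")
        self.live = self.idle

    def live_version(self):
        return self.versions[self.live]


router = BlueGreenRouter()
router.deploy("1.0")   # lands on green, the idle side
router.switch()        # green is live with 1.0
router.deploy("1.1")   # lands on blue, now the idle side
router.switch()        # blue is live with 1.1
router.switch()        # trouble: switch back, green is live with 1.0 again
```

Nothing was redeployed for the last step, which is why this pattern makes “hit the button” realistic.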

Embracing change is good, but the route back to a known working system is much more important.

Design both rollbacks and roll forwards to be idempotent, so that running either one more than once leaves the system in the same state.
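As a rough illustration of what idempotent means here, the sketch below uses Python’s built-in `sqlite3` module with an in-memory database. The table name `customer_notes` and the function names are assumptions for the example only. Each step checks the current state through `IF NOT EXISTS` / `IF EXISTS`, so running it twice (say, after a half-finished deploy is retried) does no harm.

```python
import sqlite3


def roll_forward(conn: sqlite3.Connection) -> None:
    # Safe to re-run: the table is only created if it is missing.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customer_notes "
        "(id INTEGER PRIMARY KEY, note TEXT)"
    )
    conn.commit()


def roll_back(conn: sqlite3.Connection) -> None:
    # Safe to re-run: dropping a table that is already gone is a no-op.
    # Note that this discards any rows written since the roll forward.
    conn.execute("DROP TABLE IF EXISTS customer_notes")
    conn.commit()


def table_exists(conn: sqlite3.Connection, name: str) -> bool:
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (name,),
    ).fetchone()
    return row is not None


conn = sqlite3.connect(":memory:")

roll_forward(conn)
roll_forward(conn)  # second run changes nothing and raises no error
forward_ok = table_exists(conn, "customer_notes")

roll_back(conn)
roll_back(conn)     # same for the rollback
rollback_ok = not table_exists(conn, "customer_notes")
```

A real rollback script also has to decide what happens to data created by the new release, which is one of the technical aspects that needs attention before release day.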

Have your QA team test the rollout as part of the release process, but have them test the rollback as well. That builds the experience and much-needed confidence which, in my opinion, is a stepping stone to changing the mindset around you.

Document the rollback procedures. It’s likely there will be a degree of stress involved if we need to roll back the production system, so take the time before the release to write up how to run the rollback process, what to check, and what to do in potential failure situations. For example: if the database deploy fails, run script abc.sql and check the appropriate customer-facing applications.

Work on the architecture so that components can be upgraded individually rather than in parallel.
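Upgrading components one at a time can be sketched like this. The function `upgrade_one_at_a_time`, the service names, and the health check are all hypothetical stand-ins; in practice the health check would be a smoke test or monitoring query. The idea is that when one component fails, only that component is rolled back and the rollout stops, so the blast radius stays small.

```python
def upgrade_one_at_a_time(components, target, is_healthy):
    """Upgrade each component in turn; stop at the first unhealthy one.

    components: dict of name -> current version (updated in place).
    is_healthy: callable (name, version) -> bool, a stand-in for a real check.
    Returns the name of the component that failed, or None if all succeeded.
    """
    for name in list(components):
        previous = components[name]
        components[name] = target
        if not is_healthy(name, target):
            components[name] = previous  # roll back only this component
            return name                  # later components are never touched
    return None


services = {"billing": "1.0", "search": "1.0", "email": "1.0"}

# Pretend the new release breaks only the search service.
failed = upgrade_one_at_a_time(
    services, "1.1", lambda name, version: name != "search"
)
```

After this run, billing is on the new version, search is back on the old one, and email was never changed, so there is exactly one component to investigate.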
