More #OpsEmpathy In Software Development

After years of doing DevOps, I’m now in a situation where I work with several operations teams.

A system administrator’s job is hard. Admins have to make sure everything runs smoothly all the time, including the complex legacy system that nobody fully understands. Such work is often invisible and goes unappreciated.

Do The Dirty Work Yourself First

For the last few days I’ve put myself in the shoes of such an operations person, trying to install and configure software that I developed myself. It was a painful and frustrating experience trying to debug why the remote syslog did not receive messages over UDP. I can only imagine how much more difficult the same task would be without inside knowledge of the application and how everything is wired together.
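A check like the following might have shortened that syslog hunt. It is only a minimal sketch, not the setup I actually debugged: it sends one message through Python's standard `logging.handlers.SysLogHandler` (which uses UDP by default) and listens for it on a loopback socket. The function names, the logger name and the message text are made up for illustration. If the loopback check passes but the real server sees nothing, the problem is more likely in the network path or the server configuration than in the application.

```python
import logging
import logging.handlers
import socket


def send_test_message(host, port, text):
    """Send one INFO message to a syslog endpoint over UDP."""
    # SysLogHandler uses a UDP (SOCK_DGRAM) socket unless told otherwise.
    handler = logging.handlers.SysLogHandler(address=(host, port))
    logger = logging.getLogger("syslog-smoke-test")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    try:
        logger.info(text)
    finally:
        logger.removeHandler(handler)
        handler.close()


def receive_one(sock, timeout=2.0):
    """Return one datagram as text, or None if nothing arrives in time."""
    sock.settimeout(timeout)
    try:
        data, _sender = sock.recvfrom(4096)
    except socket.timeout:
        return None
    return data.decode("utf-8", errors="replace")


def loopback_check(text="hello from the app"):
    """Send a message to a throwaway local UDP listener and read it back."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    port = sock.getsockname()[1]
    try:
        send_test_message("127.0.0.1", port, text)
        return receive_one(sock)
    finally:
        sock.close()


if __name__ == "__main__":
    print(repr(loopback_check()))
```

UDP gives the sender no delivery confirmation, so the application cannot notice a dropped message by itself. That is why the sketch has to listen on the receiving side.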

That experience led me to an insight: have more empathy for the people who maintain your software. It does not matter whether you have a separate operations team or do DevOps. The person taking care of your app will be grateful. Also keep in mind that this person could be future you: the future you who did not sleep well and has a huge hangover (been there, done that).

The success of your software rests on the shoulders of giants!

Try Docker To Make Yourself Uncomfortable

Using tools like Docker makes this work much simpler. It is useful even if the operations team is not using containers, as in my case. If the whole system works in an environment as hostile as Docker, it won’t have any problem being deployed to a virtual machine or bare metal.

With Docker you can simulate a cluster on your laptop with all the warts and shortcomings of a production environment: no easy access to files, limited ways to peek at how the application is performing (trace, debug), and network access restrictions between containers. The default Docker configuration is pretty restrictive, just as any reasonable system administrator would set it up.

Put Yourself In The Shoes Of The Maintainer

Look at your logs as if you had to do application support. How quickly can you determine what the problem is by looking only at a log message or the attached stack trace? Is it obvious what is wrong and how severe the problem is? From the support person’s perspective: is it clear whom to call for help to debug the issue further?
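One way to make a log line answer those questions is to make the suggested action and the contact part of the format. This is a rough sketch using Python's standard `logging` module. The `action` and `contact` fields, the helper names, the logger name and the address in the usage example are all hypothetical, not a convention from any particular project.

```python
import io
import logging

# Every record must carry "action" and "contact"; they are supplied via `extra`.
SUPPORT_FORMAT = (
    "%(asctime)s %(levelname)s %(name)s: %(message)s"
    " | action=%(action)s | contact=%(contact)s"
)


def build_support_logger(name, stream):
    """Create a logger whose lines say what broke, what to do, and who to call."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep the example output in one place
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(SUPPORT_FORMAT))
    logger.handlers = [handler]
    return logger


def report(logger, level, message, action, contact):
    """Log a message together with the fields a support person needs."""
    logger.log(level, message, extra={"action": action, "contact": contact})


if __name__ == "__main__":
    out = io.StringIO()
    log = build_support_logger("billing.export", out)  # hypothetical component
    report(
        log,
        logging.ERROR,
        "Cannot reach syslog at 10.0.0.5:514 over UDP",  # made-up address
        action="check the firewall rule for UDP port 514",
        contact="billing team on-call",
    )
    print(out.getvalue(), end="")
```

Routing every call through `report` is a deliberate restriction: a record logged without the two extra fields would fail to format, so the helper keeps them mandatory.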

Rise up, leave the comfort of your IDE, and try to pin down the root cause of a bug using the toolset available to admins. Instead of an interactive debugger, try to process application logs with grep and awk. If that turns out to be quite a challenge, the logs need improvement. Bear in mind: setting the log level to DEBUG is cheating!
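To get a feeling for what that exercise asks of your logs, here is a small sketch that does in Python roughly what a `grep ERROR app.log | awk '{print $4}' | sort | uniq -c` pipeline would do. It assumes a hypothetical line layout of date, time, level, component and message separated by single spaces. If your own logs cannot be parsed this simply, that is already a finding.

```python
import re
from collections import Counter

# Assumed layout (hypothetical): "<date> <time> <LEVEL> <component> <message>"
LINE_RE = re.compile(
    r"^(\S+) (\S+) (?P<level>[A-Z]+) (?P<component>\S+) (?P<message>.*)$"
)


def count_by_component(lines, level="ERROR"):
    """Count lines of the given level per component, skipping unparsable lines."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if match and match.group("level") == level:
            counts[match.group("component")] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "2024-01-01 12:00:00 INFO web started",
        "2024-01-01 12:00:05 ERROR db connection refused",
        "2024-01-01 12:00:06 ERROR db connection refused",
        "2024-01-01 12:00:07 ERROR web upstream timeout",
        "    at some.stack.Frame(File.java:42)",  # stack trace line, ignored
    ]
    for component, count in count_by_component(sample).most_common():
        print(count, component)
```

Lines that do not match the layout, such as stack trace continuations, are silently skipped here. A one-liner in the terminal does the same, so multi-line messages are easy to miss in both cases.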