Why Root Cause Analysis is King

Jacob Jones
Published in Think|Stack
Jun 1, 2018
Photo by Robb Leahy on Unsplash

All of us have probably dealt with unwanted plants in our yards at one point or another. Some are easy to get rid of: all you must do is spray them with poison or cut them down. Easy peasy.

Then there are the others…

Some plants just won’t die. No matter how hard you try with shears, poison, stomping, spitting, and cursing, nothing seems to work. Then one day, after you have finally had enough of this eyesore in your yard, you open the armory.

In the armory you find the sharpest of shears, but they just won’t do. Next to those, you see your unused garden rake with the “unbreakable” handle. You remember the one, because you paid WAAAAY too much for it. Then, in the back corner… right next to the cigars your wife doesn’t want in the house, you see it. The shovel.

Unsheathing your weapon in a manic fit of rage, you plunge it into the earth. As it cuts through the dirt, beneath the roots, you use the handle as a lever to pry this abomination up from your yard. Nothing short of the biggest grin moves across your face as you see the entire problem lying there, roots and all.

In IT, the issues we run into are like weeds. If you don’t pull them out by the roots, they will grow right back.

Some of us are at the mercy of technicians who have “fixed” our problem week after week, only for it to keep coming back. It’s maddening.

Others may be that technician who wants nothing more than to remove this blight from the earth and close that damn ticket.

The former position is quite frustrating, which is why in the IT industry we must be at the top of our game when identifying the “root cause”. We cannot simply prune the problem or spray a little weed killer; we must tear it out, roots and all.

It’s not always easy to identify the root cause, but here are a few things that will help:

1. Understand the symptoms

It is extremely important to clearly and accurately record the symptoms of any incident. Without a clear understanding of the symptoms, the best you can do is paint with a broad brush and hope the problem is fixed.
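If it helps to picture what a “clear and accurate” record might look like, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `SymptomRecord` name, its fields, and the `is_actionable` check are hypothetical, not part of any ticketing system.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass
class SymptomRecord:
    """One observed symptom of an incident (all names here are hypothetical)."""
    description: str   # what was observed, in plain words
    affected: str      # the user, host, or service that showed the symptom
    observed_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )
    reproducible: bool = False                      # can it be triggered on demand?
    steps: List[str] = field(default_factory=list)  # steps that trigger it, if known

    def is_actionable(self) -> bool:
        """A record only helps root cause work if it says what happened and where."""
        return bool(self.description.strip()) and bool(self.affected.strip())


record = SymptomRecord(
    description="Shared drive disconnects after the laptop wakes from sleep",
    affected="front-desk laptop",
    reproducible=True,
    steps=["Map the drive", "Close the lid for ten minutes", "Open a file"],
)
print(record.is_actionable())
```

The point is not the code itself but the habit: write down what happened, where, when, and how to make it happen again.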

2. Don’t just fix the problem. Understand the cause.

Something easily forgotten in the trenches is that you want to treat the cause, not the symptoms. I understand there are times to manage symptoms while you figure out what is going on. However, you should never leave the cause of the problem unresolved, just as a doctor should never manage ONLY your symptoms and send you on your way.

3. Don’t wait so long to call vendors.

There are times when you must get the vendor involved. With both hardware and software, you will run into bugs, or you may just need someone with a deeper understanding. For the sanity of both technician and client, call your vendor as soon as possible.

4. Test, test, test

NEVER resolve an incident without testing and ensuring you cannot replicate the problem.
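As a rough sketch of that rule, assuming you can script an attempt to trigger the problem, a check like the one below only gives the green light when every attempt comes up clean. The `safe_to_resolve` function and the `reproduce` callable are hypothetical names used for illustration.

```python
from typing import Callable


def safe_to_resolve(reproduce: Callable[[], bool], attempts: int = 3) -> bool:
    """Return True only if no attempt manages to reproduce the problem.

    `reproduce` is a hypothetical callable that tries to trigger the
    problem and returns True when the problem shows up again.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for _ in range(attempts):
        if reproduce():
            return False  # the problem came back, so the ticket stays open
    return True
```

Trying more than once matters for intermittent problems: a single clean run proves very little.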

5. Understand you cannot fix everything

You simply cannot fix everything. There are going to be times when vendors have bugs that cannot be fixed in the short term. Sometimes the fix has to wait for an update or software release, and other times the vendor might just accept a “bug report”.

You won’t fix everything. There is a big difference between excellent performance and perfection. The latter simply does not exist in IT.

6. Know when it should be a project

Every problem you come across is something that you will find the root cause of, but not necessarily something you will resolve. Sometimes there are a lot more moving parts than expected. Scope and hours of work to resolve the issue must be taken into consideration. However, identifying the root cause and completing a discovery will provide key information to create the scope of a project, so it is imperative.

Sometimes identifying the root cause will take a greater initial investment, but by doing so you are making an investment that will save precious time. So, the next time you think “almost” is “good enough”, remember, you will probably be working on the incident again soon.
