Secrets of Simon Building — 6

Condition Based Maintenance (CBM) involves regular collection of data, often at predetermined intervals.

Clearly there is a risk involved. If we collect data at long intervals, we may miss the birth of an incipient failure. If we collect data too often, we end up handling too much data, most of which is unnecessary, and we waste effort collecting it.

So the question is — how might one decide an appropriate frequency of monitoring?

This is the question I once asked Prof Henry.

To which, in his characteristic laconic style, he replied “MTBF by 5.”

(MTBF is a widely used acronym in Reliability Engineering; it stands for “Mean Time Between Failures.”)

So that was the formula (MTBF/5) for finding the appropriate frequency of monitoring.
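As a quick illustration of the rule, here is a minimal Python sketch. The function name `monitoring_interval` and the sample MTBF value are assumptions made for this example; only the divisor of 5 comes from Prof Henry's reply.

```python
def monitoring_interval(mtbf, divisor=5.0):
    """Return the monitoring interval suggested by the MTBF/5 rule of thumb.

    mtbf may be in any time unit; the result is in the same unit.
    """
    if mtbf <= 0:
        raise ValueError("mtbf must be positive")
    return mtbf / divisor


# Illustrative value only: a system with an MTBF of 10 months.
print(monitoring_interval(10.0))  # prints 2.0, i.e. monitor every 2 months
```

The divisor is left as a parameter so that a plant with its own experience can try a different ratio, but the default follows the rule as stated.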

What is the basis of this formula? I decided to crack the question myself.

So, my reasoning went something like this.

First, let us consider the famous Bath Tub Curve — a stylized representation of Failure rate vs Time. It is depicted as follows:

Figure 1

As we can see, there are three distinct zones:

  1. Early failure zone
  2. Random failure zone
  3. Wear failure zone

As we know, the random failure zone is often the most troublesome zone for Maintenance and Reliability engineers. This is because more than 68% of all failures in a plant are random in nature. To me, 68% is a rather modest figure: I have seen that in most industrial plants random failures make up more than 80% of all failures.

The point is that when failures are random in nature, it is absurdly difficult to maintain such a system through Preventive Maintenance (PM), also known as Time Based Maintenance (TBM). Such schedules focus on time based replacement of parts, components and sub-assemblies of a system, so as to minimize the risk of failure until the next scheduled replacement comes up. Evidently, PM or TBM proves to be an adequate strategy for the Wear Failure Zone (WFZ) depicted in the diagram. However, the only sensible way to overcome the problem posed by the Random Failure Zone is to firmly establish a Condition Based Maintenance (CBM) strategy.

Establishing a CBM strategy is a detailed and often painstaking task, and one of the most important issues is deciding on the right monitoring interval.

Now, let us assume that the MTBF of a system in the Random Failure Zone (RFZ) is 10 months. That means that, on average, it fails after about 10 months, though not exactly in the 10th month. Sometimes it might fail in the 8th month, or the 9th, or the 11th, or the 12th, or at times even in the 10th month. At the same time, it would be quite unlikely for the system to fail in the 49th month after it started running as a brand new system.
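To put a rough number on how unlikely such a late failure is, the sketch below assumes a constant failure rate, that is, exponentially distributed failure times. That model is an assumption made here as one common reading of the flat part of the Bath Tub Curve; it is not stated in the text, and the function names are illustrative.

```python
import math

MTBF_MONTHS = 10.0  # the value assumed in the text


def survival(t, mtbf=MTBF_MONTHS):
    """Probability the system is still running at time t.

    Assumes a constant failure rate of 1/mtbf (exponential model).
    """
    return math.exp(-t / mtbf)


def prob_fail_between(t1, t2, mtbf=MTBF_MONTHS):
    """Probability that the first failure falls between t1 and t2."""
    return survival(t1, mtbf) - survival(t2, mtbf)


# Chance of running 48 months without a failure (under 1%).
print(round(survival(48), 4))

# Chance that the first failure lands in the 49th month (under 0.1%).
print(round(prob_fail_between(48, 49), 4))
```

Under this assumed model, very few systems with a 10-month MTBF would still be running failure-free after four years, which matches the intuition above.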

Engineers have different ways of estimating the reliability of systems prone to random failure. One good way is Weibull analysis, which usually gives a fairly accurate picture of a system’s behaviour over time. But Weibull analysis can be quite involved, since it needs accurate failure records, which may not be readily available in many industrial plants.
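For readers who have not met it, a minimal sketch of the two-parameter Weibull model follows. The shape parameter `beta` and scale parameter `eta` are the standard names; the numeric values are hypothetical and chosen only to show how the shape relates to the three zones of the Bath Tub Curve (shape below 1 gives a falling failure rate, shape equal to 1 a constant one, shape above 1 a rising one).

```python
import math


def weibull_reliability(t, beta, eta):
    """Probability of surviving to time t: R(t) = exp(-(t/eta)**beta)."""
    return math.exp(-((t / eta) ** beta))


def weibull_hazard(t, beta, eta):
    """Failure rate at time t > 0: h(t) = (beta/eta) * (t/eta)**(beta - 1)."""
    return (beta / eta) * (t / eta) ** (beta - 1)


# Hypothetical parameters: eta = 10 months, three different shapes.
for beta in (0.5, 1.0, 3.0):
    rates = [round(weibull_hazard(t, beta, 10.0), 3) for t in (2.0, 10.0, 20.0)]
    print(beta, rates)
```

In practice `beta` and `eta` have to be estimated from recorded failure times, which is exactly where the need for accurate records comes in.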