Functional Failure: The Primary Causes…
… and how to identify them
Military leaders study their enemies so that they can understand their decisions, anticipate their strategy and, ultimately, defeat them. In maintenance, we also need to know our enemy. By understanding the primary influences on, and common causes of, functional failure, we can act quickly to preserve functions by eliminating those causes.
Identifying primary causes
We can start this process by speaking to operations and maintenance staff to uncover the knowledge embedded within the organisation. We can initiate rapid improvement by starting with the major influences on failure, and using simple Pareto analysis to determine the most common causes of failure.
To defeat the enemy, we must eliminate the common causes of defects. If we succeed, there should be significant improvements in reliability and a reduction in the effort needed to deal with unexpected failures.
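As a minimal sketch of the simple Pareto analysis mentioned above, the snippet below counts how often each cause appears in a failure log and keeps the "vital few" causes that together account for a chosen share of all failures. The cause names, the counts and the 80% threshold are illustrative assumptions, not data from a real plant.

```python
from collections import Counter


def pareto(causes, threshold=0.8):
    """Return (cause, count, cumulative_share) rows, most common first,
    stopping once the cumulative share reaches `threshold`."""
    counts = Counter(causes)
    total = sum(counts.values())
    rows = []
    cumulative = 0
    for cause, count in counts.most_common():
        cumulative += count
        rows.append((cause, count, cumulative / total))
        if cumulative / total >= threshold:
            break
    return rows


# Hypothetical failure causes gathered from work orders and staff interviews.
failure_log = (
    ["misalignment"] * 9
    + ["lubrication"] * 6
    + ["operator overload"] * 3
    + ["contaminated fuel"] * 1
    + ["electrical supply noise"] * 1
)

vital_few = pareto(failure_log)
for cause, count, share in vital_few:
    print(f"{cause:<20} {count:>3}  {share:.0%}")
```

With this made-up log, three of the five causes account for 90% of the recorded failures, which is where defect elimination effort would go first.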
What are the major influences of functional failure?
First, let’s cover design and manufacturing. Unreliability may be caused by design flaws, the manufacturing process, or suboptimal materials. A major realisation is this: design and manufacturing set a ceiling on reliability that no amount of maintenance can raise. The only way intrinsic reliability can increase is if the asset is modified.
Design also has a major influence on maintainability: how easy maintenance tasks are to complete. Maintainability is concerned with aspects such as access to the components that need maintenance, and the space people need to work comfortably and safely. The overall aim is to reduce turnaround time for maintenance tasks.
Specialised tool requirements should be minimised. Modularisation, and designing line replaceable units that can be easily swapped out for new or refurbished ones, has maintainability advantages but may also introduce No-Fault-Found issues, which we will describe in a future blog.
How invasive maintenance is, in terms of breaking into systems and opening up containment boundaries, also matters, as invasive work may make downstream failures more likely. Less than ideal maintainability in design may therefore cause premature failures after maintenance is completed, which can significantly drive up through-life costs.
But how prevalent is this influence on failure?
If the manufacturer has widely used products, and understands the majority of the operating contexts and environments their products are used in, then the intrinsic reliability should be high.
Problems may arise if the designer substantially modifies their product with new technology or the manufacturer has problems with materials (including a change in supplier) or assembly. Logistics and warehousing may also introduce problems if handling and packaging are badly done.
The goal of operations and maintenance should be to achieve reliability as close to the intrinsic reliability level as is economically viable. However, other influences can severely decrease reliability…
Duty cycles and usage
Within the operating context, we need to think about duty cycles and usage — especially where assets are worked hard and possibly operated beyond their design intent.
Ideally, we should have an asset register which identifies all the equipment and the manufacturers' specifications as part of our master data. We should also record our functional requirements and their associated performance standards against each item.
If we do this, we should be able to avoid running machinery beyond its design limits. If we do not record this data, in some fixed plant modification scenarios, the legacy machinery may become underspecified, and then be overloaded with the higher throughput required by a new plant.
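To illustrate the kind of check such a register makes possible, here is a small sketch that compares the duty we require from each asset against the manufacturer's design limit. The field names, asset tags and figures are all hypothetical; a real register would hold far more master data.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    tag: str               # asset register identifier (hypothetical)
    description: str
    design_limit: float    # manufacturer's rated capacity
    required_duty: float   # performance standard the function demands
    unit: str


def overloaded(register, margin=0.0):
    """Return assets whose required duty exceeds the design limit,
    reduced by an optional margin (0.1 keeps 10% in reserve)."""
    return [a for a in register if a.required_duty > a.design_limit * (1 - margin)]


# Illustrative register entries, not real equipment data.
register = [
    Asset("P-101", "Feed pump", design_limit=120.0, required_duty=100.0, unit="m3/h"),
    Asset("C-201", "Legacy conveyor", design_limit=400.0, required_duty=450.0, unit="t/h"),
    Asset("F-301", "Extraction fan", design_limit=50.0, required_duty=48.0, unit="kW"),
]

for asset in overloaded(register):
    print(f"{asset.tag} {asset.description}: requires {asset.required_duty} {asset.unit}, "
          f"designed for {asset.design_limit} {asset.unit}")
```

In this example only the legacy conveyor is flagged. Calling `overloaded(register, margin=0.1)` would also flag the extraction fan, which is running within 10% of its rating.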
Low-quality maintenance may cause failure, whilst over-maintaining, especially where the maintenance is invasive, can significantly decrease achievable reliability.
One example from my own maintenance experience was the realignment of motors to fans or pumps. Small misalignment errors can reduce the life of bearings and couplings by up to 50%, as well as being a source of gross inefficiency. Back in the day, we used dial indicators and feeler gauges to do this job, which needed skill and experience to be effective. Large improvements in reliability were evident after laser alignment kits became available and misalignment errors were largely eradicated.
Mal-operation and mal-maintenance are likely to be the largest influences driving availability and reliability below the intrinsic level. This is probably where the most benefit from eliminating common causes of functional failure may be found. Other factors that may play a part in unavailability include delays caused by capacity, spares supply and tool availability, which can be addressed by process improvement.
Another influence may lie at the boundaries of the machinery. These may be discovered by investigating the quality of the machinery’s inputs and outputs. For example, low-quality fuel or noisy electrical supplies may drive unreliability.
A large operational concern is the quality of the system’s outputs. Poor quality may drive rejections or customer dissatisfaction. The machine’s output may be the input to other machines within your organisation, leading to further reliability losses. For example, your electrical generator might supply your electric motors. Having appropriate measurements, quality specifications and tolerances may help eliminate these causes of failure.
The remaining major influence is the operating environment, where factors may initiate or accelerate functional failure. If machinery is used at the extremes of its operating context, for example in environments with dust, salt, large temperature swings or moisture extremes, reliability may be severely impacted unless the machinery has been specifically adapted by design.
Take helicopter reliability for military forces in Afghanistan. The use of IEDs and ambushes increased the risk of moving forces on the ground and, as a result, the demand for helicopter use increased. The mountainous terrain, dust and hard flying through combat zones stressed engines and airframes, leading to increased unreliability. The situation was an order of magnitude worse than for the same assets operating in Iraq. This was not fully appreciated by the armed forces’ logistics department, and it strained supply to the limits.
More work can be done to focus effort, for example by compiling an inventory of assets and classifying them by criticality. Identifying where to focus your defect elimination work should be simple: look for the largest set of common causes of failure and the highest criticality machinery, and use simple Pareto analysis. We will cover how to exploit data, and detailed criticality analysis, in another blog.
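One rough way to combine the two ideas above, failure counts and asset criticality, is to weight each count by a criticality score and rank the results. This is only a sketch under stated assumptions: the three-level weights, the choice to multiply count by weight, and the example records are all invented for illustration, and a proper criticality analysis would be more involved.

```python
# Hypothetical criticality weights; a real scheme would come from your own analysis.
CRITICALITY_WEIGHT = {"high": 3, "medium": 2, "low": 1}


def rank_targets(records):
    """Rank (asset, cause) pairs by failure count times criticality weight."""
    scored = [
        (asset, cause, count * CRITICALITY_WEIGHT[criticality])
        for asset, criticality, cause, count in records
    ]
    return sorted(scored, key=lambda row: row[2], reverse=True)


# Illustrative records: (asset, criticality, cause, failures in the period).
records = [
    ("Cooling pump", "high", "misalignment", 4),
    ("Workshop fan", "low", "misalignment", 9),
    ("Generator", "high", "contaminated fuel", 2),
    ("Conveyor", "medium", "overload", 5),
]

for asset, cause, score in rank_targets(records):
    print(f"{asset:<14} {cause:<18} score {score}")
```

Here the cooling pump ranks first even though the workshop fan fails more often, because a failure on a high criticality asset is given more weight.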
In conclusion, we can achieve a lot with minimal data and the existing expert knowledge of our experienced operators and maintainers. By identifying the major common causes of failure, we can proactively prevent failure, as opposed to treating it.
Have you noticed these influences contributing to functional failure within your organisation? I would love to hear your stories in the comments.
In the next blog, we will discuss the effects of functional failure and how they help us conduct a deeper machinery criticality analysis, and focus efforts.
If you missed our previous blog, “What is Maintenance? The strive for holistic thinking”, you can read it here.