What is Root Cause Analysis?
Root cause analysis (RCA) is the process of identifying the underlying reason for an event. The term covers a wide range of techniques for getting to the bottom of a problem, whatever the field.
In the context of equipment failure analysis, RCA helps find the root cause of issues like frequent machine malfunctions or significant breakdowns.
Why conduct a root cause analysis?
Whatever the context, RCA aims to determine:
- What happened
- Why it happened
- How to prevent it from happening again
Performing a root cause analysis is a reactive process, meaning it is performed after an event occurs. Once you’ve completed your RCA, however, you’ll have the information you need to proactively predict and address future events like it.
RCA ensures you fix the actual cause of a problem rather than the symptoms alone. For example, if you replace a broken belt without also correcting the misaligned part that caused the belt to overheat and break in the first place, you shouldn’t be surprised when the belt fails again.
RCA applies a structured problem-solving methodology to trace a chain of problems back to the underlying cause, or causes, driving the rest.
The RCA Process
There is no universal process for conducting a root cause analysis. The incident investigation process typically involves collecting data and applying various analysis methods to draw a conclusion about the root cause of a problem.
The results of your analysis won’t always be obvious. Sometimes you’ll identify a range of potential causes and contributing factors for a problem. In these instances, you’ll need to carefully review the data and apply root cause analysis tools alongside experts and team members to determine the appropriate corrective actions.
At its simplest, root cause analysis typically follows a basic four-step structure:
- Define the problem: A clearly defined problem statement helps you begin mapping out a solution.
- Collect data: Gathering plenty of data arms you with everything you need to better understand and correct incidents.
- Establish a timeline: Mapping out the events that led to your incident will help you identify the factors worth investigating further.
- Solve the problem: Once you’ve identified the root cause, take the appropriate steps to fix it. Then put systems in place to prevent the problem from recurring and to simplify correction if it does.
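One way to keep these four steps together is a small record that holds the problem statement, the collected events, and the eventual findings. The sketch below is illustrative only: the `Event` and `RcaRecord` names, the fields, and the sample incident are assumptions, not part of any particular maintenance system.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class Event:
    timestamp: datetime
    description: str


@dataclass
class RcaRecord:
    problem_statement: str                                       # step 1: define the problem
    events: List[Event] = field(default_factory=list)            # step 2: collected data
    root_cause: Optional[str] = None                             # unknown until the analysis is done
    corrective_actions: List[str] = field(default_factory=list)  # step 4: fixes and safeguards

    def timeline(self) -> List[Event]:
        """Step 3: order the collected events chronologically."""
        return sorted(self.events, key=lambda event: event.timestamp)


# Hypothetical incident; timestamps and wording are invented for illustration.
record = RcaRecord(problem_statement="Drive belt broke twice in one month")
record.events.append(Event(datetime(2024, 3, 12, 14, 5), "Belt broke"))
record.events.append(Event(datetime(2024, 3, 12, 9, 30), "Operator reported burning smell"))
record.events.append(Event(datetime(2024, 3, 10, 16, 0), "Pulley replaced during maintenance"))

for event in record.timeline():
    print(event.timestamp.isoformat(), event.description)
```

Sorting by timestamp puts the earlier maintenance work ahead of the failure, which is the kind of sequence a timeline is meant to expose.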
Successful RCA calls for specialized knowledge and hands-on experience. Without the right know-how and tools, you’ll only manage to offer temporary fixes for adverse events. Hasty decision-making could even make a bad situation worse.
Despite these complications and limitations, RCA remains a powerful tool for understanding your systems and enabling preventive asset maintenance.
Different types of RCA
RCA comes in different forms depending on the problem you’re trying to solve:
- Safety-based RCA comes from the occupational safety and healthcare world. It is used to determine the possible causes of workplace injuries and accidents like falls or cuts.
- Production-based RCA is used by manufacturers to ensure quality control. You might use it to find out why injection-molded plastic parts are coming off the line warped, for example.
- Process-based RCA is used in business and manufacturing to identify a fault in a process or a system. An accountant might use it to determine why their organization’s vendors aren’t getting paid on time.
- Failure-based RCA is used in engineering and maintenance to determine the root cause of any type of equipment failure.
Systems-based RCA originated as a combination of the root cause analysis techniques listed above. It draws on two or more of these methods and can be used in a wide variety of industries.
Root cause analysis tools and methodologies
Though there’s no single template for carrying out a root cause analysis, organizations leverage a number of popular tools and approaches for analyzing faults, disruptions, and other issues.
A fishbone diagram looks at the various factors that contribute to problems
- Fishbone diagrams: Also known as Ishikawa diagrams, these encourage organizations to analyze issues with an emphasis on their numerous contributing factors. Organizations sometimes call these factors the 6 Ms, approaching problems by focusing on Manpower (or Personnel), Machines, Measurements, Methods, Materials, and Mother Nature (Environment). This approach is popular among proponents of the Toyota Production System.
- Pareto analysis: Named for economist Vilfredo Pareto, the Pareto principle (or the 80-20 rule) holds that roughly 80% of effects stem from about 20% of causes. Applied to maintenance, it suggests that a small share of potential causes accounts for most failures. By applying this principle to root cause analysis, you can pinpoint the underlying factors most likely to cause trouble.
- Fault tree analysis: This method maps out the relationship between machines, subsystems, and various types of faults using Boolean logic. It helps engineers and project managers calculate the probability of issues and better understand their contributing factors.
Fault tree analysis example. Source: Six Sigma Study Guide
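The Boolean logic behind fault tree analysis can be sketched in a few lines. This is an illustrative sketch, not drawn from the figure's source. It assumes the basic events are statistically independent, and the tree, event names, and probabilities are hypothetical.

```python
from dataclasses import dataclass
from math import prod
from typing import List, Union


@dataclass
class BasicEvent:
    name: str
    probability: float  # assumed probability over the period being studied


@dataclass
class Gate:
    name: str
    kind: str  # "AND" or "OR"
    inputs: List[Union["Gate", BasicEvent]]


def probability(node: Union[Gate, BasicEvent]) -> float:
    """Probability of a node, assuming all basic events are independent."""
    if isinstance(node, BasicEvent):
        return node.probability
    child_probs = [probability(child) for child in node.inputs]
    if node.kind == "AND":
        # Every input must occur.
        return prod(child_probs)
    if node.kind == "OR":
        # At least one input occurs: one minus the chance that none do.
        return 1.0 - prod(1.0 - p for p in child_probs)
    raise ValueError(f"unknown gate kind: {node.kind!r}")


# Hypothetical tree: the bearing goes unlubricated if both pumps fail
# or if the lubricant line is blocked.
both_pumps_fail = Gate("Both pumps fail", "AND", [
    BasicEvent("Primary pump fails", 0.10),
    BasicEvent("Backup pump fails", 0.20),
])
top_event = Gate("Bearing not lubricated", "OR", [
    both_pumps_fail,
    BasicEvent("Lubricant line blocked", 0.05),
])

print(f"{top_event.name}: {probability(top_event):.3f}")  # approximately 0.069
```

If the basic events share a common cause, the independence assumption no longer holds and the simple products above will misstate the top-event probability.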
Root Cause Analysis Examples
These case studies offer a look at RCA in action.
RCA example #1: The case of the faulty parts
Let’s say you’ve observed a high incidence rate of faulty products from your injection molding machine. It’s costing you money and you need to get to the bottom of it.
First, you’ll need to define the problem. This includes explaining the exact defect you are observing. In this example, the defect is part distortion.
Write down the problem, taking care to include the number of defects as a percentage of total output. Then, collect all the available data. This should involve pulling reports from your maintenance management system and reviewing manuals from the original equipment manufacturer.
After collecting information on the defective asset, measure its deviation from typical specifications. In this case, you’ll want to take the heat signature of products once they’ve left the mold and measure the temperature of molten plastic in the barrel. These measurements will confirm whether temperature deviations are causing the defects. Based on the data you’ve collected, you can investigate further to pinpoint where exactly the trouble is starting.
The problem may, for example, result from poorly aligned cooling liquid conduits. By correcting the conduit arrangement to better fit your molds, you’ll solve the part distortion problem.
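The two measurements in this example, the defect rate as a percentage of total output and the deviation from specification, can be sketched as follows. All figures here (part counts, temperature readings, and the 220 to 240 °C window) are hypothetical placeholders. Real limits would come from the equipment manufacturer's documentation.

```python
from typing import List, Tuple


def defect_rate_percent(defective: int, total: int) -> float:
    """Defects as a percentage of total output."""
    if total <= 0:
        raise ValueError("total output must be positive")
    return 100.0 * defective / total


def out_of_spec(readings: List[float], low: float, high: float) -> List[Tuple[int, float]]:
    """Return (index, value) for every reading outside the [low, high] window."""
    return [(i, r) for i, r in enumerate(readings) if not low <= r <= high]


# Hypothetical production run: 38 distorted parts out of 950.
rate = defect_rate_percent(38, 950)
print(f"Defect rate: {rate:.1f}%")

# Hypothetical barrel temperatures in degrees Celsius, checked against an assumed window.
barrel_temps = [228.0, 231.5, 246.0, 229.0, 251.5]
for index, value in out_of_spec(barrel_temps, low=220.0, high=240.0):
    print(f"Reading {index} out of spec: {value} C")
```

Readings that fall outside the window point to where further investigation is worthwhile. They do not by themselves prove that temperature is the cause.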
RCA example #2: The mystery of the blown fuse
Next, let’s say a machine in your shop suddenly stopped working because it overloaded and the fuse blew.
An investigation shows you that the malfunction occurred because one of the machine’s bearings was improperly lubricated. As your investigation continues, you find that the automatic lubrication mechanism has a pump that isn’t working efficiently. Closer inspection of the pump reveals a worn shaft.
How did the shaft get worn? Dig deeper and you’ll discover that you don’t have a mechanism for stopping metal scraps from getting into the pump. Over time, scraps have damaged it.
Asking ‘why’ helps organizations dig deeper to address the root causes of faults and failures.
After all this analysis, you finally understand the root cause of the problem: metal scraps contaminating the lubrication system and causing damage. Now you can avoid focusing on symptoms alone and instead stop the whole sequence of events from occurring again.
Compare this to a more surface-level investigation that fails to identify the real causal factor. You might have simply replaced the fuse, the bearing, or the lubrication pump. This may have made the machine operational again, but it wouldn’t have eliminated the problem altogether. Before long, you’d have another breakdown to deal with.
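The chain of "why" questions in this example can be written down as a simple mapping from each finding to its cause, then followed until no deeper cause is recorded. The sketch below is a minimal illustration. The `trace_causes` helper and the wording of each finding are assumptions based on the narrative above.

```python
from typing import Dict, List


def trace_causes(symptom: str, caused_by: Dict[str, str]) -> List[str]:
    """Follow effect -> cause links from a symptom down to the deepest known cause."""
    chain = [symptom]
    seen = {symptom}
    while chain[-1] in caused_by:
        cause = caused_by[chain[-1]]
        if cause in seen:
            raise ValueError("circular cause chain")
        chain.append(cause)
        seen.add(cause)
    return chain


# Each key is a finding; each value is the answer to "why did that happen?"
caused_by = {
    "machine stopped": "overload blew the fuse",
    "overload blew the fuse": "bearing improperly lubricated",
    "bearing improperly lubricated": "lubrication pump not working efficiently",
    "lubrication pump not working efficiently": "worn pump shaft",
    "worn pump shaft": "metal scraps entering the pump",
}

chain = trace_causes("machine stopped", caused_by)
for depth, finding in enumerate(chain):
    print("  " * depth + finding)
print("Root cause:", chain[-1])
```

Stopping the trace early, at the fuse or the bearing, corresponds to the surface-level fixes described above.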
When to perform a root cause analysis?
Implementing processes for RCA requires a significant investment and will likely cause additional disruptions. For that reason, a full RCA may not be worthwhile for every fault in your system.
Unfortunately, there is no cut-and-dried rule for when to run an RCA and when not to. Assessing both the likelihood and the potential impact of failure can help you prioritize effectively. These types of faults and failures typically warrant an investigation to determine the root cause:
- Recurring failures: If the same faults occur again and again, issues under the surface probably need your attention.
- Systemic failures: Some faults and failures cause ripple effects that disrupt crucial processes. When a failed component threatens to upset an entire system, it calls for root cause analysis.
- Critical failures: When a failure would mean catastrophic consequences, it’s time to fix the problem at its source before it’s too late.
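One way to put this prioritization into practice is a Pareto-style ranking: total up the impact of each failure mode (which reflects both how often it occurs and how costly each occurrence is), sort, and pick the few that account for most of the total. The sketch below assumes impact is measured in downtime hours. The failure modes, the hours, and the 80% threshold are hypothetical.

```python
from typing import Dict, List, Tuple


def pareto_vital_few(
    impact_by_cause: Dict[str, float], threshold: float = 0.8
) -> List[Tuple[str, float]]:
    """Return the highest-impact causes, with cumulative share, up to the threshold."""
    total = sum(impact_by_cause.values())
    if total <= 0:
        raise ValueError("total impact must be positive")
    ranked = sorted(impact_by_cause.items(), key=lambda item: item[1], reverse=True)
    selected = []
    cumulative = 0.0
    for cause, impact in ranked:
        cumulative += impact
        selected.append((cause, cumulative / total))
        if cumulative / total >= threshold:
            break
    return selected


# Hypothetical downtime hours per failure mode over one quarter.
downtime_hours = {
    "bearing failure": 42,
    "belt failure": 30,
    "sensor fault": 8,
    "blown fuse": 6,
    "operator error": 4,
    "software fault": 10,
}

for cause, share in pareto_vital_few(downtime_hours):
    print(f"{cause}: cumulative share {share:.0%}")
```

With these invented numbers, three of the six failure modes account for just over 80% of downtime, so they would be the first candidates for a full investigation. Critical failures with catastrophic consequences still deserve attention even if they rank low on a downtime chart.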
Now is not the time to cut corners
Remember, getting to the root of critical business problems takes time and effort. You might see short-term benefits from quick fixes, but your assets and your entire organization will suffer over time. Invest in thorough analyses and you’ll enable predictive maintenance while building a culture of continuous improvement.