You’ve probably heard the term benchmark in relation to stock prices, but benchmarking is also a great way to evaluate and improve your business. In this article, we’ll break down what maintenance benchmarks are and how they can help you make your maintenance department more efficient.
What is a benchmark?
A benchmark is a standard of comparison. A benchmark can be used as a reference point for measuring progress and comparing performance with other organizations. Benchmarking is a way to get an objective measure of your organization’s performance so you can see how well it stacks up against other companies in the same industry or sector.
In business, there are many types of benchmarks:
Financial measures such as profit margins or return on investment (ROI)
Non-financial measures such as customer satisfaction surveys
How do you know what’s suitable as a maintenance benchmark?
In maintenance, a good benchmark should have the following qualities:
Relevant: You’re measuring something that will help you improve your business. For example, measuring how long it takes someone in your company to perform maintenance on an asset is relevant if you want to reduce downtime. If, on the other hand, you want to increase sales by producing more (a common goal), then the time your assets spend creating the product is the more relevant measure.
Accurate: The measurements must accurately reflect what they’re supposed to. For example, a downtime benchmark is not accurate if you track only one asset instead of all the assets at your company that may be down within a given 24 hours.
What are some examples of maintenance benchmarks?
Below are a few examples of maintenance activities that can be benchmarked:
Equipment downtime (annual or monthly), mean time between failures (MTBF), time to failure after installation
Average repair time (annual), mean time to repair (MTTR)
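As a rough illustration, MTBF and MTTR can be computed from data most teams already log. The sketch below assumes you know an asset's operating hours, its number of failures, and the duration of each repair; the function names and the sample figures are hypothetical.

```python
def mtbf_hours(operating_hours: float, failures: int) -> float:
    """Mean time between failures: operating time divided by the failure count."""
    if failures <= 0:
        raise ValueError("need at least one failure to compute MTBF")
    return operating_hours / failures


def mttr_hours(repair_hours: list[float]) -> float:
    """Mean time to repair: average duration of the recorded repairs."""
    if not repair_hours:
        raise ValueError("need at least one repair to compute MTTR")
    return sum(repair_hours) / len(repair_hours)


# Hypothetical month for one asset: 700 operating hours, 4 failures.
repairs = [2.0, 3.5, 1.5, 3.0]
print(mtbf_hours(700, 4))   # 175.0
print(mttr_hours(repairs))  # 2.5
```

Tracking both numbers monthly or annually gives you a baseline to compare against before you look at anyone else's benchmarks.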
Ramesh Gulati, the author of Maintenance and Reliability: Best Practices, examines key performance indicators that can be easily measured and compared across businesses and industries. His book provides the metrics of well-performing companies. Most importantly, he outlines the performance of world-class companies so that you can compare your own against it.
Here are some common maintenance KPIs and both average and world-class benchmarks according to Gulati:
Two things stand out in these benchmarks. First, performance is always measured as a percentage. This normalizes the numbers to account for the size of the machine, the total work orders being completed, or the total production cost, so your data is more likely to reflect your actual performance.
Second, even world-class companies aren’t perfect. They make mistakes. Unplanned breakdowns happen despite the best intentions, equipment, and training of a top company. While you should aim to achieve zero breakdowns, you should always have a preventive maintenance plan to help you cope when something goes wrong.
What are the steps to benchmark maintenance activities?
1. Define the problem before starting on the solution. For example, perhaps your team is doing too much reactive maintenance and, as a result, isn’t hitting production targets due to unplanned downtime.
2. Set goals that are ambitious but realistic. If your team is doing too much reactive maintenance, create a process for the team to work through preventive maintenance strategies together for each asset. From there, you can set up new production targets for the team to achieve now that this new process is in place.
3. Have a plan for when things go wrong. Let’s say your asset breaks down completely, and it’s not a simple fix with a part replacement. Having a backup plan for the worst-case scenario is always a good idea.
4. Be patient and give yourself time to achieve your goal. Set up a meeting with your team to check on your progress. Some teams set these up at 30, 60, and 90-day intervals and compare the current 30 days with the previous 30.
Holding your team accountable to the benchmark
To hold your maintenance team accountable, you must set goals for them (see step 2 listed above). You can do this by assessing your current performance against the benchmark and then setting goals that are slightly higher than what you’re currently doing. For example, suppose your organization has been performing an average of 10 preventive maintenance activities per month on machines A and B over the last six months. In that case, one goal could be 11 PMs per month for these two machines. We can also convert benchmarks to PM percentage, so for example, out of all maintenance activities on asset A, the goal is for 80% of it to be planned and scheduled.
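To make the PM percentage concrete, here is a minimal sketch. It assumes you can count planned work orders and total work orders for an asset over a period; the function names are hypothetical, and the 80% default simply mirrors the goal in the example above.

```python
def planned_percentage(planned: int, total: int) -> float:
    """Share of maintenance work orders that were planned, as a percent."""
    if total <= 0:
        raise ValueError("total work orders must be positive")
    return 100.0 * planned / total


def meets_goal(planned: int, total: int, goal_pct: float = 80.0) -> bool:
    """True when the planned share reaches the goal (80% is the example goal)."""
    return planned_percentage(planned, total) >= goal_pct


# Hypothetical asset A: 10 planned work orders out of 14 in total.
print(round(planned_percentage(10, 14), 1))  # 71.4
print(meets_goal(10, 14))                    # False
print(meets_goal(12, 15))                    # True
```

Expressing the goal as a percentage rather than a raw count keeps it comparable across assets with very different workloads.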
Benchmarking works because it allows you to understand where you are relative to others
Benchmarking is a great way to improve your maintenance activities. It allows you to understand where you are relative to others and helps identify areas where you need improvement. Benchmarking works because it gives everyone on the team a common goal they can work towards, increasing motivation and productivity in the long run.
A technical postmortem is a retrospective of a failure. It’s a preventative step that can help you quickly identify and address issues with your assets, systems, or other technology platforms so they don’t happen again. Postmortems are commonly used in maintenance but also have applications in software development and design.
What is a technical postmortem?
A technical postmortem is a retrospective analysis of events that resulted in a technical failure.
The purpose of a technical postmortem is to:
Find out what went wrong and why
Identify trouble areas
Determine what can be done to prevent future failures
Create best practices for your business
Inform process improvements, mitigate future risks, and promote iterative best practices
4 questions to ask during a technical postmortem
This postmortem outline is not meant to be comprehensive but to serve as a starting point for your technical postmortem. These questions generate discussion about what went well, what the team struggled with during the failure, and what the team would do differently moving forward.
Here’s what you and your team should be asking during a technical postmortem:
1. What happened?
You can’t analyze what you don’t understand, so establishing a clear understanding of what went wrong is crucial.
2. Why did it happen?
Identify the major events that led to the failure and try isolating the root causes for the failure. Determine if the events are the underlying causes of the failure, or if they initiate a process that leads to the technical failure. Some underlying causes can include defects in design, process, or poor maintenance practices.
Don’t look strictly at the technical causes of the failure; also examine the underlying management and team environment. Sometimes team members ignore warning signs of impending failure due to the organizational culture, time crunches, and budget pressure.
3. How did we respond and recover?
How your team responds to failure can determine how quickly you identify the root cause and fix it. A major technical failure can have a direct impact on shareholder value, revenues, market share, and brand equity, so a quick recovery is paramount.
A useful technical postmortem requires a reasonable level of honesty, insight, and cooperation from the organization. The outcome of the postmortem should be to recognize what worked and fix the processes that didn’t. Remember, the idea is to learn from your successes and failures, not just to document them.
4. How can we prevent similar unexpected issues from occurring again?
Unexpected technical issues do arise in mission-critical or complex hardware systems. However, the key to prevention is technical planning to prevent problems from affecting the entire system. Each of the failures uncovered in step two represents a risk going forward, so schedule regular inspections or system checks in your maintenance management software.
When a risk is detected, certain actions should be triggered immediately to prevent similar failures. Planning must also consider the business process and management responses the team initiates when a failure occurs. A complete postmortem addresses both technical and management issues.
Don’t turn your postmortem into a blame game. Instead, management has to develop a reputation for listening openly to input and not punishing people for being honest. A well-run postmortem can help a maintenance team create a culture of continuous improvement.
The benefits of conducting a technical postmortem
A technical postmortem has several benefits, starting with a detailed analysis of why an asset failed. It can help you avoid future problems by identifying issues that are present before any kind of launch. Other benefits include:
Improving the way your team approaches new projects
Learning from mistakes so they don’t happen again
Gaining insights into how other teams have handled similar situations
Some next steps after your technical postmortem is completed
After a technical postmortem is conducted and the project is concluded, there is a postmortem meeting. This meeting is intended to review the project from start to finish and determine what can be optimized and improved for the next postmortem. Generally, the project manager and team attend these meetings, but it’s open to anyone who was part of the project.
Tips and tricks to keep in mind during and after your technical postmortem
A postmortem can help you become more effective by learning from mistakes and focusing on what worked best, but it’s up to you to structure the meeting to get the most out of it. A way to structure your meeting is by setting a clear agenda, beginning with a recap of the project objectives, reviewing the results and whether or not the project met the set objectives, and lastly, analyzing the successes and failures and why they occurred.
You can ensure that your technical postmortem is successful by carefully preparing in advance, analyzing the failure systematically, producing actionable findings, and actively sharing the results.
Don’t let the momentum fade with your team. Schedule the postmortem right after the end of the project. A technical postmortem should occur within one to two weeks of the technical failure.
Make sure to store your postmortems in the asset record in a CMMS so they can be easily found in the future to prevent similar failures going forward.
A technical postmortem is an important tool for maintaining and improving your systems
A technical postmortem is a tool that allows you to learn from mistakes, identify the root cause of a problem, and improve your systems. It may sound like an abstract concept, but it’s actually quite simple: you document what went wrong and use that information to prevent the same issue from happening again.
Maintenance involves a lot of moving parts, which means more chances for something to go wrong. And when problems arise, you want to tackle them with as much information as possible. In other words, you want problem-solving to be predictable. Data is a key ingredient in achieving this goal.
We look at 5 ways to use data to solve common maintenance issues and lead your team to success.
Future of analytics and data
This article walks you through what data to use and how to use it. While you can follow along if your data is in spreadsheets or file cabinets, we’re using the Fiix analytics tool to illustrate the process. Fiix analytics is visual and interactive so you can get a clear view of how to drill into your data and find the answers to your biggest questions.
1. How do I make sure the right maintenance is being done at the right time?
The average facility manages 45 work orders a week. With so much to do (and so little time to do it in), you know how important it is to focus your team’s efforts in the right place. So, this question really has three sub-questions—am I doing too much maintenance, not enough maintenance, or the right amount of maintenance on an asset?
The first step to answering these questions is to identify the assets with lots of work orders associated with them. Then, filter these work orders by asset and maintenance type.
First, look for assets with few or no corrective work orders associated with them. This means you’re probably doing PMs too frequently on these assets and can cut the frequency of scheduled maintenance.
Assets with not enough preventive maintenance will have lots of emergency work associated with them. Also, look for assets with lower maintenance costs compared to assets of a similar type as that is often a sign that they aren’t getting enough maintenance. Increase the frequency of PMs on these assets.
Assets getting the right amount of maintenance fall in between: regular preventive work orders, occasional corrective work orders, and little emergency work.
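If your work orders live in a spreadsheet export rather than an analytics tool, the same filtering can be approximated in a few lines. This is only a sketch: the work order type labels and the two thresholds are assumptions chosen for illustration, not values from any standard.

```python
from collections import Counter


def classify_asset(work_order_types: list[str],
                   low_reactive_share: float = 0.05,
                   high_emergency_share: float = 0.30) -> str:
    """Label an asset from the mix of its work order types.

    Types are assumed to be 'preventive', 'corrective', or 'emergency'.
    Both thresholds are illustrative assumptions.
    """
    total = len(work_order_types)
    if total == 0:
        return "no data"
    counts = Counter(work_order_types)
    reactive = counts["corrective"] + counts["emergency"]
    if reactive / total <= low_reactive_share:
        return "possibly over-maintained"
    if counts["emergency"] / total >= high_emergency_share:
        return "possibly under-maintained"
    return "roughly balanced"


print(classify_asset(["preventive"] * 10))
print(classify_asset(["preventive"] * 4 + ["emergency"] * 4))
print(classify_asset(["preventive"] * 8 + ["corrective"] * 2))
```

Treat the labels as prompts for a closer look at the asset, not as a verdict.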
2. How is maintenance affecting the performance of equipment?
To get a picture of how maintenance is impacting equipment performance, start by collecting information on assets with associated downtime. Next, filter those assets into two categories – planned and unplanned downtime. Rank those assets by unplanned downtime. Assets with more unplanned downtime are the ones you want to tackle first as they have the biggest negative impact on your company and the most opportunity for improvement. You can further filter those assets by maintenance costs associated with them. The assets with the most downtime and highest costs are where to begin adjusting your strategy.
The next step is to dive into the notes on the emergency work orders attached to those assets. Find out what the most common problems and causes were, and make changes to address them. For example, has a bearing continually failed because of improper lubrication? A simple change might be to increase the frequency of lubrication and specify the proper amount of lubrication needed in each instance.
Revisit this report to see if your adjustments have made a difference. If unplanned downtime and maintenance costs drop across 30, 60, and 90 days, you now have data to support your decisions and show how they impact production.
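The ranking described in this section is easy to reproduce outside a reporting tool. The sketch below assumes one record per asset with hypothetical field names and made-up figures; it sorts by unplanned downtime first and uses maintenance cost as the tiebreaker.

```python
def rank_assets(records: list[dict]) -> list[dict]:
    """Sort assets by unplanned downtime, then maintenance cost, both descending."""
    return sorted(
        records,
        key=lambda r: (r["unplanned_hours"], r["maintenance_cost"]),
        reverse=True,
    )


# Hypothetical downtime report for three assets.
downtime = [
    {"asset": "Conveyor 1", "unplanned_hours": 40, "maintenance_cost": 18_000},
    {"asset": "Drill 3", "unplanned_hours": 5, "maintenance_cost": 4_000},
    {"asset": "Press 2", "unplanned_hours": 40, "maintenance_cost": 25_000},
]

for row in rank_assets(downtime):
    print(row["asset"], row["unplanned_hours"], row["maintenance_cost"])
```

Running the same report at 30, 60, and 90 days and comparing the top of the list is a simple way to see whether your changes are working.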
3. How can my facility organize our storeroom so parts are easily accessible?
An unorganized storeroom can pose more problems than just being messy. It makes it hard for technicians to access parts when they need them most, leading to delays and potential breakdowns.
To tackle this problem head-on, collect data on assets with the most emergency work orders attached to them.
Take note of what parts are associated most with that emergency work and the equipment they’re needed for. Once that has been determined, you can kit those parts together. Parts kitting makes getting parts easier and more accessible when emergency work is triggered.
For this to work in the first place, this data needs to be tracked and updated frequently. Each time a tech reaches for a spare part, that data should be updated. It gives you an accurate sign of which parts are used frequently and how often they are attached to reactive work.
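As a sketch of how that usage data can point to kit candidates, the snippet below counts which parts show up repeatedly on emergency work orders for each asset. The record layout, the type labels, and the `min_uses` cutoff are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def kit_candidates(work_orders: list[dict], min_uses: int = 2) -> dict[str, list[str]]:
    """Return, per asset, the parts used on at least `min_uses` emergency work orders."""
    usage: dict[str, Counter] = defaultdict(Counter)
    for wo in work_orders:
        if wo["type"] == "emergency":
            # set() so a part listed twice on one work order counts once
            usage[wo["asset"]].update(set(wo["parts"]))
    kits = {}
    for asset, counts in usage.items():
        parts = sorted(p for p, n in counts.items() if n >= min_uses)
        if parts:
            kits[asset] = parts
    return kits


# Hypothetical work order history.
history = [
    {"asset": "Pump A", "type": "emergency", "parts": ["seal", "bearing"]},
    {"asset": "Pump A", "type": "emergency", "parts": ["bearing"]},
    {"asset": "Pump A", "type": "preventive", "parts": ["filter"]},
    {"asset": "Fan B", "type": "emergency", "parts": ["belt"]},
]
print(kit_candidates(history))  # {'Pump A': ['bearing']}
```

The output is only as good as the parts data behind it, which is why logging every spare part withdrawal matters.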
4. Where should I be allocating my maintenance budget?
Figuring out where to spend your maintenance budget can be a headache, and justifying that spending can be even harder.
Let’s say that increasing your team’s headcount would help clear some of the facility’s backlogged maintenance. That decision comes down to one question: do you hire more in-house employees or more contractors? A big budget decision like that is hard to justify without proof.
To begin making your case, collect all the information you can about work done in the last quarter to a year. Was it done mostly by internal employees or contractors?
For each category, add up the total spend. Take into account costs like employee salary and benefits, contractors’ hourly pay, and training. Each option has its cost benefits and disadvantages.
Based on those costs, you can make a pretty clear case to your department, based on dollar value, if it’s more cost-effective to hire internal employees or more contractors. Those stats can help justify why spending on additional hires is necessary.
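A back-of-the-envelope version of that comparison might look like the sketch below. Every figure in it is hypothetical, and the cost model (salary plus a benefits rate plus training, versus hourly rate times hours) is a deliberate simplification; substitute your own numbers and cost categories.

```python
def annual_employee_cost(salary: float, benefits_rate: float, training: float) -> float:
    """Salary plus benefits (as a fraction of salary) plus training spend."""
    return salary * (1 + benefits_rate) + training


def annual_contractor_cost(hourly_rate: float, hours: float) -> float:
    """Contractor spend for the hours you expect to buy in a year."""
    return hourly_rate * hours


# Hypothetical figures for one additional technician's worth of work.
employee = annual_employee_cost(salary=60_000, benefits_rate=0.25, training=2_000)
contractor = annual_contractor_cost(hourly_rate=55, hours=1_500)

print(employee)    # 77000.0
print(contractor)  # 82500
print("in-house" if employee < contractor else "contractor", "is cheaper here")
```

The point is less the arithmetic than having a dollar figure for each option on the same page when you make your case.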
5. What obstacles are our technicians facing?
It’s easy for technicians to get caught up in their workload when things get busy. Completion notes aren’t updated or information is missed on work orders. It may not seem like a big deal the first time, but once it becomes a habit, it can become an obstacle for other technicians.
As a maintenance manager, you can help enforce the importance of having complete information. One of the ways you can tackle this obstacle is by conducting bi-weekly checks to find work orders with missing information or incomplete notes.
Look for trends in those work orders. Was it done by the same technician? Is it the same type of information being missed? Consider looking at the type of maintenance associated with these work orders. Consider having a department-wide info session on the importance and benefit of filling out work order completion notes.
If it’s the same technician, take a look at their logged hours. If they are logging more hours than the average, they might be overworked.
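The bi-weekly check can be partly automated. The sketch below flags work orders with empty fields and counts them per technician; the list of required fields and the record layout are assumptions, so adapt them to whatever your work order form actually captures.

```python
from collections import Counter

# Assumed set of fields a completed work order should have filled in.
REQUIRED_FIELDS = ("completion_notes", "failure_code", "labor_hours")


def missing_fields(work_order: dict) -> list[str]:
    """List the required fields that are absent or empty on a work order."""
    return [f for f in REQUIRED_FIELDS if work_order.get(f) in (None, "")]


def incomplete_by_technician(work_orders: list[dict]) -> Counter:
    """Count incomplete work orders per technician."""
    counts: Counter = Counter()
    for wo in work_orders:
        if missing_fields(wo):
            counts[wo["technician"]] += 1
    return counts


# Hypothetical closed work orders.
closed = [
    {"technician": "Ana", "completion_notes": "Replaced belt", "failure_code": "WEAR", "labor_hours": 1.5},
    {"technician": "Ben", "completion_notes": "", "failure_code": "MISALIGN", "labor_hours": 2.0},
    {"technician": "Ben", "completion_notes": "Greased bearing", "failure_code": None, "labor_hours": 0.5},
]
print(incomplete_by_technician(closed))  # Counter({'Ben': 2})
```

Grouping by the missing field instead of by technician shows whether the problem is one person or one confusing part of the form.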
Making it a habit to check for these inconsistencies on a regular basis might make a big difference in the performance of your employees and your facility.
Seeing the bigger picture leads to bigger gains
Your facility has lots of moving parts and keeping track of them all manually can be time-consuming. Using an analytics reporting tool provides a visual representation of your facility’s moving parts. In addition, it gives the power back to the maintenance department, allowing them to tackle problems as they arise and lead their team to a solution-oriented work culture.
Maintenance troubleshooting can be both an art and a science. A common problem is that, while art can be beautiful, it isn’t known for its efficiency. When taken to the next level, maintenance troubleshooting can ditch the trial-and-error moniker and become a purely scientific endeavor. This helps maintenance technicians find the right problems and solutions more quickly. When troubleshooting is done correctly, your whole maintenance operation can overcome backlog, lost production, and compliance issues much more efficiently.
In this troubleshooting guide, we’ll take a look at what it actually is, why it matters to maintenance professionals, and how your team can fine-tune its approach.
What is maintenance troubleshooting?
Systems break down. That’s just a fact of life. Whether it’s a conveyor belt or an industrial drill, we’ve all run across a piece of equipment that is unresponsive, faulty, or acting abnormally for seemingly no reason at all. It can be downright frustrating.
Maintenance troubleshooting is the process of identifying what is wrong with these faulty components and systems when the problem is not immediately obvious. Maintenance troubleshooting usually follows a systematic, four-step approach: identify the problem, plan a response, test the solution, and resolve the problem. Steps one to three are often repeated multiple times before a resolution is reached.
Identify the problem
Plan a response
Test the solution
Repeat until problem is resolved
Think about it this way: When a conveyor belt breaks down, you may try a few different methods to fix it. First, you identify which part of the conveyor belt isn’t working. Once you’ve identified the problem area, you plan a response and test it, such as realigning or lubricating a part. If this fails to fix the problem, you might replace the part, which makes the conveyor belt work again. This is troubleshooting.
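The plan, test, repeat cycle can be written down as a simple loop. This is only a toy sketch: the fixes and the check are hypothetical stand-ins for real work on the conveyor belt, and the `troubleshoot` function is an illustration rather than part of any maintenance software.

```python
from typing import Callable, Optional


def troubleshoot(candidate_fixes: list[tuple[str, Callable[[], None]]],
                 problem_resolved: Callable[[], bool]) -> Optional[str]:
    """Apply each planned fix in order and test after each one.

    Returns the name of the fix that resolved the problem, or None if
    every candidate was tried without success.
    """
    for name, apply_fix in candidate_fixes:
        apply_fix()              # plan a response and carry it out
        if problem_resolved():   # test the solution
            return name
    return None


# Toy conveyor belt: only replacing the part gets it running again.
conveyor = {"running": False}


def realign_part() -> None:
    pass  # hypothetical fix that does not help in this scenario


def replace_part() -> None:
    conveyor["running"] = True


result = troubleshoot(
    [("realign part", realign_part), ("replace part", replace_part)],
    lambda: conveyor["running"],
)
print(result)  # replace part
```

Ordering the candidate fixes from most to least likely is where asset data earns its keep.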
How is maintenance troubleshooting usually done?
Stop us if you’ve heard this story before. An asset breaks down and no one knows why. You talk to the operator, read some manuals, and check your notes about the asset. You try a couple of things to get the machine up and working again with no luck. Before you can try a third or fourth possible solution, you get called away to another emergency, with the asset still out of commission.
This is often how the process happens when performing maintenance troubleshooting, especially when a facility relies on paper records or Excel spreadsheets. The process is based on collecting as much information as possible from as many sources as possible to identify the most likely cause of the unexpected breakdown. You can never go wrong when you gather information, but it’s the way that information is gathered that can turn troubleshooting from a necessity to a nightmare.
Why does maintenance troubleshooting matter?
Unexpected equipment failure is the entire reason maintenance troubleshooting exists. If assets never broke down without any clear signs of imminent failure, there would be no need to troubleshoot the problem. But we know that’s just not the case.
Machinery failure doesn’t always follow a predictable pattern. Yes, maintenance teams can use preventive maintenance and condition-based maintenance to reduce the likelihood of unplanned downtime. However, you can never eliminate it entirely. What you can do is put processes in place to reduce failure as much as possible and fix it as soon as possible when it does occur. This is where strong maintenance troubleshooting techniques come in handy.
Because troubleshooting will always be part of the maintenance equation, humans will also always have a role. Maintenance technology does not erase the need for a human touch in troubleshooting; it simply makes the process much more efficient. When troubleshooting isn’t refined, it could lead to time wasted tracking down information, a substantial loss of production, an unsafe working environment, and more frequent failures. In short, knowing some maintenance troubleshooting techniques could be the difference between an overwhelming backlog and a stable maintenance program.
Maintenance troubleshooting tips
The following are just a few ways your operation can improve its troubleshooting techniques to conquer chaos and take control of its maintenance.
1. Quantify asset performance and understand how to use the results
It probably goes without saying, but the more deeply you know an asset, the better equipped you’ll be to diagnose a problem. Years of working with a certain asset can help you recognize when it’s not working quite right. But exceptional troubleshooting isn’t just about knowing the normal sounds, speeds, or odours of a particular machine. Instead, it’s about knowing how to analyze asset performance at a deeper level, which is where advanced reporting factors in.
When operators and technicians rely solely on their own past experience with a piece of equipment, it leaves them with huge gaps in knowledge that hurt the maintenance troubleshooting process. For example, it leaves too much room for recency bias to affect decision-making, which means that technicians are most likely to try the last thing that fixed a particular problem without considering other options or delving further into the root cause. Also, if maintenance troubleshooting relies on the proprietary knowledge of a few technicians, it means repairs will have to wait until those particular maintenance personnel are available.
Maintenance staff should have the know-how to conduct an in-depth analysis of an asset’s performance. For example, technicians should understand how to run reports and understand KPIs for critical equipment, such as mean time between failure and overall equipment effectiveness. If using condition-based maintenance, the maintenance team should also know the P-F curve for each asset and what different sensor readings mean. When technicians are equipped with a deeper understanding of an asset, it will be easier for them to pinpoint where a problem occurred and how to fix it, both in the short and long-term.
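For reference, overall equipment effectiveness is commonly calculated as availability times performance times quality, each expressed as a fraction between 0 and 1. The sketch below uses that common definition with hypothetical inputs.

```python
def oee(availability: float, performance: float, quality: float) -> float:
    """Overall equipment effectiveness as the product of its three factors."""
    for name, value in (("availability", availability),
                        ("performance", performance),
                        ("quality", quality)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    return availability * performance * quality


# Hypothetical shift: 90% availability, 95% performance, 99% quality.
print(round(oee(0.90, 0.95, 0.99), 3))  # 0.846
```

Because the factors multiply, a modest shortfall in each one compounds into a much lower overall score.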
2. Create in-depth asset histories
Information is the fuel that powers exceptional maintenance troubleshooting. Knowing how a particular asset has worked and failed for hundreds of others is a good place to start a repair. That’s why manuals are a useful tool when implementing maintenance troubleshooting techniques. However, each asset, facility, and operation is different, which means machine failure doesn’t always follow the script. Detailed notes on an asset’s history can open up a dead end and lead you to a solution much more quickly.
A detailed asset history can give you an edge in maintenance troubleshooting in a variety of ways. It offers a simple method for cross-referencing symptoms of the current issue with elements of past problems. For example, a technician can see if a certain type of material was being handled by a machine or if there were any early warning signs identified for a previous failure. The more a present situation aligns with a past scenario, the more likely it is to need the same fix. Solutions can be prioritized this way, leading to fewer misses, less downtime, fewer unnecessary spare parts being used, and more.
When creating detailed asset histories to help with maintenance troubleshooting (as well as preventive maintenance), it’s important to include as much information as possible. Make sure to record the time and dates of any notable actions taken on an asset or piece of equipment. This can include breakdowns, PMs, inspections, part replacement, production schedules, and abnormal behavior, such as smoke or unusual sounds. Next, document the steps taken during maintenance, including PMs or repairs. Lastly, highlight the successful solution and what was needed to accomplish it, such as necessary parts, labor and safety equipment. Make sure to add any relevant metrics and reports to the asset history as well.
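One way to picture cross-referencing symptoms against an asset history is the sketch below. The `HistoryEntry` fields are a hypothetical subset of what you might record, and scoring by the share of overlapping symptoms is just one simple choice, not an established method.

```python
from dataclasses import dataclass


@dataclass
class HistoryEntry:
    date: str           # when the event was recorded
    symptoms: set[str]  # abnormal behavior noted at the time
    fix: str            # the solution that worked


def rank_past_fixes(current: set[str],
                    history: list[HistoryEntry]) -> list[tuple[float, HistoryEntry]]:
    """Score each past entry by symptom overlap (shared / combined), best first."""
    scored = []
    for entry in history:
        combined = current | entry.symptoms
        score = len(current & entry.symptoms) / len(combined) if combined else 0.0
        scored.append((score, entry))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical history for one asset.
history = [
    HistoryEntry("2023-02-10", {"vibration"}, "Tightened mounting bolts"),
    HistoryEntry("2023-06-04", {"smoke", "unusual sound", "overheating"}, "Replaced bearing"),
]
best_score, best_entry = rank_past_fixes({"smoke", "unusual sound"}, history)[0]
print(best_entry.fix)  # Replaced bearing
```

The closer the current symptoms are to a past entry, the earlier that entry's fix should appear on the list of things to try.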
3. Use root cause analysis and failure codes
Effective maintenance troubleshooting starts with eliminating ambiguity and short-term solutions. Finding the root of an issue quickly, solving it effectively and ensuring it stays solved is a winning formula. Root cause analysis and failure codes are a couple of tools that will help you achieve this goal.
Root cause analysis is a maintenance troubleshooting technique that allows you to pinpoint the reason behind a failure. The method consists of asking “why” until you get to the heart of the problem. For example:
Why did the equipment fail? Because a bearing wore out.
Why did the bearing wear out? Because a coupling was misaligned.
Why was the coupling misaligned? Because it was not serviced recently.
Why was the coupling not serviced? Because maintenance was not scheduled.
Why was maintenance not scheduled? Because we weren’t sure how often it should be scheduled.
This process has two benefits for maintenance troubleshooting. First, it allows you to identify the immediate cause of failure and fix it quickly. Second, it leads you to the core of the issue and a long-term solution. In the example above, it’s clear a better preventive maintenance program is required to improve asset management and reduce unplanned downtime.
Failure codes provide a consistent method to describe why an asset failed. Failure codes are built on three actions: Listing all possible problems, all possible causes, and all possible solutions. This process records key aspects of a failure according to predefined categories, like misalignment or corrosion.
Failure codes are useful in maintenance troubleshooting because technicians can immediately see common failure codes, determine the best solution, and implement it quickly. Failure codes can also be used to uncover a common problem among a group of assets and determine a long-term solution.
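A failure code can be modeled as a small record of problem, cause, and solution, which makes it easy to count which ones recur across a group of assets. The sketch below is a hypothetical structure for illustration, not the format of any particular CMMS.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen so instances are hashable and can be counted
class FailureCode:
    problem: str
    cause: str
    solution: str


def most_common_codes(logged: list[FailureCode], top: int = 3) -> list[tuple[FailureCode, int]]:
    """Return the most frequently logged failure codes with their counts."""
    return Counter(logged).most_common(top)


# Hypothetical codes logged against a group of similar pumps.
worn_bearing = FailureCode("Bearing failure", "Misalignment", "Realign coupling")
corroded_seal = FailureCode("Seal leak", "Corrosion", "Replace seal")
logged = [worn_bearing, corroded_seal, worn_bearing, worn_bearing]

code, count = most_common_codes(logged, top=1)[0]
print(code.problem, count)  # Bearing failure 3
```

Keeping the categories predefined, rather than free text, is what makes this kind of counting possible.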
4. Build detailed task lists
Exceptional maintenance troubleshooting requires solid planning and foresight. Clear processes provide a blueprint for technicians so they can quickly identify problems and implement more effective solutions. Creating detailed task lists is one way to bolster your planning and avoid headaches down the road. This could also be incorporated into routine maintenance.
A task list outlines a series of tasks that need to be completed to finish a larger job. They ensure crucial steps aren’t missed when performing inspections, audits or PMs. For example, the larger job may be conducting a routine inspection of your facility’s defibrillators. This job is broken down into a list of smaller tasks, such as “Verify battery installation,” and “Inspect exterior components for cracks.”
Detailed task lists are extremely important when conducting maintenance troubleshooting. They act as a guide when testing possible solutions so technicians can either fix the issue or disqualify a diagnosis as quickly as possible. The more explicit the task list, the more thorough the job and the less likely a technician is to make a mistake. Comprehensive task lists can also offer valuable data when failure occurs. They provide insight into the type of work recently done on an asset so you can determine whether any corrective actions were missed and if this was the source of the problem.
There are a few best practices for building detailed task lists. First, include all individual actions that make up a task. For example, instead of instructing someone to “Inspect the cooling fan,” include the steps that comprise that inspection, such as “Check for any visible cracks,” and “Inspect for loose parts.” Second, organize all steps in the order they should be done. Lastly, include any additional information that may be helpful in completing the tasks, including necessary supplies, resources (e.g., manuals), and PPE.
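Those practices map naturally onto a small data structure: an ordered list of explicit steps plus the supporting information. The `TaskList` class below is a hypothetical sketch, and the supplies and PPE in the example are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class TaskList:
    job: str
    steps: list[str]                                     # in the order they should be done
    supplies: list[str] = field(default_factory=list)
    resources: list[str] = field(default_factory=list)   # e.g., manuals
    ppe: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Format the task list as numbered plain text."""
        lines = [self.job]
        lines += [f"{i}. {step}" for i, step in enumerate(self.steps, start=1)]
        for label, items in (("Supplies", self.supplies),
                             ("Resources", self.resources),
                             ("PPE", self.ppe)):
            if items:
                lines.append(f"{label}: {', '.join(items)}")
        return "\n".join(lines)


fan_check = TaskList(
    job="Inspect the cooling fan",
    steps=["Check for any visible cracks", "Inspect for loose parts"],
    supplies=["flashlight"],   # hypothetical
    ppe=["safety glasses"],    # hypothetical
)
print(fan_check.render())
```

Storing steps as data rather than free text also makes it easier to see later which steps were completed on a given work order.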
5. Make additional information accessible
We’ve said it before and we’ll say it again: great maintenance troubleshooting techniques are often the result of great information. However, if that information is difficult to access, you will lose any advantage it provides. That is why it is crucial for your operation to not only create a large resource center, but to also make it highly accessible. This will elevate your maintenance troubleshooting abilities and get your assets back online faster when unplanned downtime occurs.
Let’s start with the elements of a great information hub. We’ve talked about the importance of reports, asset histories, failure codes and task lists when performing a troubleshooting method. Some other key resources include diagrams, standard operating procedures (SOPs), training videos, and manuals. These should all be included and organized by asset. If a technician hits a dead end in a troubleshooting procedure, these tools can offer a solution that may have been missed in the initial analysis.
Now that you’ve gathered all your documents together, it’s time to make them easily accessible to the whole maintenance team. If resources are trapped in a file cabinet, on a spreadsheet, or in a single person’s mind, they don’t do a lot of good for the technician. They can be lost, misplaced and hard to find—not to mention the inefficiency involved with needing to walk from an asset to the office just to grab a manual. One way to get around this obstacle is to create a digital knowledge hub with maintenance software. By making all your resources available through a mobile device, technicians can access any tool they need to troubleshoot a problem. Instead of sifting through paper files to find an asset history or diagram, they can access that same information anywhere, anytime.
Using CMMS software for maintenance troubleshooting
If it sounds like a lot of work to gather, organize, analyze and circulate all the information needed to be successful at maintenance troubleshooting, you’re not wrong. Without the proper tools, this process can be a heavy lift for overwhelmed maintenance teams. Maintenance software is one tool that can help ease the load every step of the way. A digital platform, such as a CMMS, takes care of crunching the numbers, organizing data and making it available wherever and whenever, so you can focus on using that information to make great decisions and troubleshoot more effectively.
For example, when building a detailed asset history, it’s important to document every encounter with a piece of equipment. This is a lot of work for a technician rushing from one job to another and difficult to keep track of after the fact. An investment in maintenance software will help you navigate these roadblocks. It does this by allowing technicians to use a predetermined set of questions to make and retrieve notes in real-time with a few clicks.
The same goes for failure codes. The key to using them effectively is proper organization and accessibility. Without those two key ingredients, failure codes become more of a hindrance than a help. One way to accomplish this is to use maintenance software. A digital platform can organize failure codes better than any filing cabinet or Excel spreadsheet and make it easy for technicians to quickly sort them and identify the relevant ones from the site of the breakdown.
The bottom line
Troubleshooting will always exist in maintenance. You will never be 100 percent sure 100 percent of the time when diagnosing the cause of failure. What you can do is take steps to utilize maintenance troubleshooting techniques to ensure equipment is repaired quickly and effectively. By combining a good understanding of maintenance metrics with detailed asset histories, failure codes, task lists, and other asset resources, and making all this information accessible, you can move your troubleshooting beyond trial and error to a more systematic approach.
Criticality and reliability-centered maintenance go hand-in-hand. Think about it: We’re told to prioritize PMs for critical assets, to build a TPM plan that accommodates critical pieces of equipment, and to perform root cause analysis on machinery that we consider to be high priority based on criticality. But how do we actually decide what makes a piece of equipment “critical”? In short, it all comes down to risk. Performing a criticality analysis allows you to understand the potential risks that could impact your business.
What is criticality analysis?
Criticality analysis is a systematic approach to assigning a criticality rating to assets based on their potential risks. Still sounds kind of abstract, right? How can risk be quantified? It helps to think about criticality analysis as part of a larger failure modes, effects [and criticality] analysis (FMEA / FMECA).
As we’ve defined it recently, FMEA is an approach that identifies all possible ways that equipment can fail, and analyzes the effect those failures can have on the system as a whole. FMECA takes it a step further by conducting a risk assessment for each failure mode and then prioritizing what corrective actions should be taken.
Why is criticality analysis important?
As James Kovacevic of Eruditio describes, using a predetermined system to evaluate risk allows you to remove emotion from the equation. This ensures that reliability is truly approached from a risk-based point of view, rather than individual perception. Once equipment undergoes relative ranking based on its criticality, work can be properly prioritized and a condition monitoring strategy can be put in place. Performing an equipment criticality analysis also helps to clarify what can be done to reduce the risk associated with each asset.
Who’s responsible for criticality analysis?
So who actually carries out a criticality analysis? Industry experts say that it should be a cross-functional effort. We couldn’t agree more. It’s a much more effective process if input from operations, maintenance, engineering, materials management, and employee health and safety functions is considered. After all, risk can be defined differently for different teams. And since assigning risk will always be somewhat subjective, having a diverse background of knowledge to draw on will help to curb that.
How do you assess the criticality of an asset?
Asset criticality is the numerical value a business assigns to each of its assets based on its own set of criteria. An asset criticality assessment can be done by creating a ranked list of work orders and orders in progress. This is known as an asset criticality ranking (ACR).
How to perform a criticality analysis
According to Kovacevic, there are two ways to carry out a criticality analysis. Both approaches produce a risk priority number (RPN) that allows you to rank the criticality level of each asset.
The first approach uses a criticality matrix, which is a 6×6 grid where severity of a given consequence (on the X axis) is plotted against the probability of that consequence occurring (Y axis). Naturally, if there is a high probability that a piece of equipment will fail in a way that causes great personal injury or severe operational issues, that piece of equipment is highly critical and should be prioritized accordingly. The number at the intersection of severity and probability for any piece of equipment is that piece of equipment’s RPN.
The second recommended approach is to separate the consequence categories by type (for example, health and safety, environmental, and operational). That way, you can rate how severe an equipment failure would be for each consequence category. For example, a piece of machinery that could cause severe personal injury upon asset failure would be a 5 or 6 in the health and safety category, but of almost no consequence to the environmental category (perhaps a 1 or 2), and moderately impactful to operations (somewhere in the middle). Once you’ve determined the severity rating of each consequence category for a given piece of equipment, you can multiply those ratings together to get its RPN.
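As a rough illustration of the second approach, the sketch below multiplies the severity ratings for each consequence category to get an RPN per asset, then sorts the assets from highest to lowest. The asset names, the ratings, and the function name are all made up for the example; they are not taken from Kovacevic's method.

```python
from math import prod

# Hypothetical severity ratings on a 1-6 scale, one per consequence category.
# Asset names and scores are invented for illustration only.
assets = {
    "Press 1": {"health_and_safety": 6, "environmental": 1, "operational": 3},
    "Conveyor 2": {"health_and_safety": 2, "environmental": 1, "operational": 5},
    "Chiller 3": {"health_and_safety": 1, "environmental": 4, "operational": 2},
}


def risk_priority_number(ratings):
    """Multiply the severity ratings of all consequence categories together."""
    return prod(ratings.values())


rpn_by_asset = {name: risk_priority_number(r) for name, r in assets.items()}

# Highest RPN first, so the most critical assets appear at the top.
ranking = sorted(rpn_by_asset.items(), key=lambda item: item[1], reverse=True)

for name, rpn in ranking:
    print(f"{name}: RPN {rpn}")
```

With these invented ratings, the press ranks first because its health and safety score dominates the product, even though its environmental score is minimal.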
Once each piece of equipment has an RPN attached to it, you can rank them to assess which assets are critical. Kovacevic recommends grouping equipment into categories based on their RPN. Here are the categories he suggests:
Once each piece of equipment is ranked, maintenance managers can make decisions that are informed by risk, rather than gut feel. From here, all reliability-related activities and processes will run much more smoothly.
Every day, meat processing plants need to make sure the metal detectors in their machines are working. It’s a simple check to ensure there’s metal where there should be and no metal where there shouldn’t be.
This process involves running test balls through the machine. It takes about 45 minutes to complete (25 minutes of manual labour and 20 minutes of admin time). It’s routine maintenance, the type most people don’t give a second thought to.
It’s also an example of how tweaking maintenance processes can boost production efficiency. Instead of a manual check, the inspection can be done with an automated test-ball shooter. A button is pressed, the balls roll out on their own, and the task is wrapped up in five minutes. The result is more than 160 hours of extra equipment availability per year.
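The arithmetic behind that figure can be sketched as follows. The 45-minute and 5-minute durations come from the example above; the number of production days per year (250 here) is an assumption, since the article does not state it.

```python
def extra_hours_per_year(manual_minutes, automated_minutes, checks_per_year):
    """Hours of equipment availability gained by shortening a recurring check."""
    saved_minutes = (manual_minutes - automated_minutes) * checks_per_year
    return saved_minutes / 60


# 45-minute manual check vs. 5-minute automated check, once per production day.
# 250 production days per year is an assumption, not a figure from the article.
hours = extra_hours_per_year(45, 5, 250)
print(f"{hours:.0f} extra hours per year")
```

Under that assumption, saving 40 minutes per daily check works out to roughly 167 hours a year, which is in line with the "more than 160 hours" quoted above.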
This is just one example of how companies can leverage maintenance to increase production efficiency. This article outlines several other strategies for bolstering production efficiency using maintenance, including:
How maintenance impacts production efficiency
Five ways the maintenance team can boost production capacity
How to measure the impact of maintenance on production
What is production efficiency?
Production efficiency is a measurement used mostly by manufacturers to determine how well (and how long) a company can keep up with demand. It compares current production rates to expected or standard production rates.
A higher rate of production efficiency delivers three critical outcomes for manufacturers:
Reduced resource usage: Efficient production systems produce the same number of goods with fewer resources
Higher financial margins: Efficient production means higher margins throughout the supply chain
A better customer experience: Efficient production allows products and services to be regularly and dependably delivered to customers
How to calculate production efficiency
The calculation for production efficiency compares the actual output rate to the standard output rate. The formula can be applied to either manual or automated work.
When it comes to industrial processes, the calculation takes quality into account. Let’s say you produce 50 units in an hour, but only 30 are useable. Your rate of production for that hour is 30 units.
The following formula is used to calculate production efficiency:
Production Efficiency = (Actual Output Rate / Standard Output Rate) x 100
For example, a manufacturing company receives a new order of 100 units. The standard rate of completion for 100 units is 10 hours, or 10 units per hour. However, the company took 12 hours to complete 100 quality units. In this case, the production efficiency formula would look like this:
Actual Output Rate = 100 units / 12 hours (8.3 units/hour)
Standard Output Rate = 100 units / 10 hours (10 units/hour)
Production Efficiency = (8.3 / 10) x 100 (83%)
In this instance, output and productivity levels are below capacity.
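The same calculation can be written as a small function. This is a minimal sketch; the function and parameter names are illustrative, and only usable units are counted, as described above.

```python
def production_efficiency(good_units, actual_hours, standard_units_per_hour):
    """Return production efficiency as a percentage.

    Only usable (good) units count toward the actual output rate.
    """
    actual_rate = good_units / actual_hours
    return actual_rate / standard_units_per_hour * 100


# The order from the example: 100 quality units in 12 hours,
# against a standard rate of 10 units per hour.
efficiency = production_efficiency(100, 12, 10)
print(f"{efficiency:.1f}%")
```

Without intermediate rounding the result is about 83.3%, which matches the 83% in the worked example.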
How maintenance can increase production efficiency
Proper equipment maintenance is essential for increasing production efficiency. It ensures your total effective equipment performance (TEEP) is as high as it can be. Using preventive maintenance to keep assets operating at their best helps to:
Limit equipment downtime: If equipment is checked regularly, you can find and fix failures before they cause big breakdowns that disrupt production. Having a solid preventive maintenance schedule also allows you to coordinate with production so planned downtime is done quickly.
Establish a corrective action system for failures: Having a strategy to find, analyze, and fix failure (aka a FRACAS) allows you to target recurring issues at their root. You can spot and eliminate problems that impact equipment availability and product quality the most.
Coordinate better shift changeovers: Better changeovers between maintenance shifts mean communicating the right information to technicians quickly and accurately. This includes a run-down of what work needs to be done, when, and any obstacles that might get in the way of that work.
Ensure standard operating procedures are clear and maintained: SOPs train operators to do routine maintenance so machines can be operated with fewer breakdowns and accidents.
Five things your maintenance team can start doing tomorrow to increase production efficiency
There are a lot of projects that take months or years to complete. But getting quick wins is also crucial for building momentum and proving the value of your maintenance team. So, here are five things your maintenance team can start doing tomorrow to increase production efficiency.
1. Optimize the frequency of your PMs
A preventive maintenance schedule can be a good example of having too much of a good thing. Going overboard on preventive maintenance can affect production efficiency in two ways: you can waste valuable time preventing non-existent failures, or you can increase the risk of failure by meddling with a perfectly fine component.
These guidelines can help you find the right balance between too many PMs and too few:
Use equipment maintenance logs to track the found failure rate on preventive maintenance tasks. Start with PMs that take the longest to do or cost the most.
If a PM leads to regular corrective maintenance, keep it at the same frequency.
If a PM rarely identifies failure, try increasing the time between inspections. If failures are found less often than the PM is performed, tweak your schedule so the two are better aligned. For example, an inspection might happen every two weeks, but a failure is usually found only every six weeks. In this case, plan for the PM to happen every 4-6 weeks instead.
If a machine experiences frequent breakdowns between inspections, try shortening maintenance intervals. You can also modify the trigger for maintenance, changing it from a time-based trigger to usage or performance-based trigger.
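The guidelines above can be sketched as a simple decision helper. This is only an illustration: the function name and the two cut-off values (what counts as "regular" finds and "frequent" breakdowns) are assumptions you would tune to your own operation, not industry standards.

```python
def suggest_pm_change(pm_interval_weeks, pms_completed, failures_found,
                      breakdowns_between_pms,
                      regular_find_rate=0.5, frequent_breakdowns=2):
    """Suggest a direction for a PM schedule based on logged results.

    regular_find_rate and frequent_breakdowns are illustrative cut-offs,
    not industry standards.
    """
    if pms_completed <= 0:
        raise ValueError("need at least one completed PM to evaluate")

    if breakdowns_between_pms >= frequent_breakdowns:
        return "shorten the interval or switch to a usage/performance trigger"

    found_failure_rate = failures_found / pms_completed
    if found_failure_rate >= regular_find_rate:
        return "keep the current frequency"

    if failures_found == 0:
        return "lengthen the interval and keep monitoring"

    # Average time between PMs that actually found a failure.
    weeks_between_finds = pm_interval_weeks * pms_completed / failures_found
    return f"lengthen the interval, up to about {weeks_between_finds:.0f} weeks"


# The example above: inspections every 2 weeks, a failure found every 6 weeks
# (4 finds in 12 inspections), and no breakdowns between inspections.
print(suggest_pm_change(2, 12, 4, 0))
```

The helper only points in a direction. The actual new interval (for instance, 4-6 weeks in the example) is still a judgment call that should factor in cost, safety, and how critical the asset is.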
2. Identify machines that can be maintained while running
Some routine maintenance can be done while a machine is still operating. Find out if there are any assets that can be safely worked on while being used for production. The key word there is ‘safely’. This might mean that some work can’t be done because certain areas of a machine aren’t safely accessible while it’s operating. In this scenario, determine if partial maintenance is possible and if it’ll have a positive impact on the performance of the equipment.
It’s also a good idea to track rotating or spare assets and swap them for production equipment when possible. That allows you to do regular maintenance on these machines without sacrificing productivity.
3. Make equipment capabilities transparent and clear
Create an iron-clad list of instructions for operating equipment and common issues to be aware of. You can use a failure modes and effects analysis (FMEA) to create a list of common failures experienced by each asset. This can also include warning signs for breakdowns.
Having this information clearly outlined and easily accessible gives operators a chance to notice the early signs of failure and notify maintenance before it gets worse. Employees will be empowered to observe and identify any potential problems, and report them accordingly.
4. Use work order data to identify where your team can be more efficient
Work order data can tell you what jobs can get done quicker and how to minimize the risk of asset failure so you can boost production efficiency. Look for these telltale signs of broken processes in your work orders:
Unavailable parts and supplies: If this issue is delaying maintenance, review the purchasing process for parts and supplies. That includes making sure your cycle counts are accurate and the threshold for purchase approvals is low enough that inventory can get replenished quickly. You can also create parts kits for frequent repairs or emergency repairs on production equipment so your team can locate and retrieve parts quickly.
Misidentified/misdiagnosed problems or missing instructions: Make sure task lists, failure codes, and descriptions are clear. Attach photos, manuals, and other documentation to the work order.
Diverted resources resulting from emergency work orders: Emergencies can’t always be avoided. Analyze your work order data, find tasks that are too big, and break them down into smaller jobs to reduce the risk of major disruptions.
Scheduling conflicts with production: See if maintenance can be scheduled while production is happening or if work can be done at an alternate time, like evenings or weekends. You can also consider giving operators minor maintenance responsibilities associated with the work order.
Lack of adequate worker skillset: Work order data can show you if the person/people assigned to the work may not have the right skills. Make it very clear on the work request what kind of skills or certifications are necessary for certain maintenance types.
5. Find the biggest obstacles for your team and eliminate them
You can learn a lot from the data that comes from your equipment and work orders. But sometimes, you just have to ask the people who are doing the actual work. They will be able to tell you what barriers they face when completing work. Acting on this information is crucial to continually improve your maintenance processes. All those improvements can add up to a huge boost in production efficiency.
For example, your technicians may spend a lot of time going back and forth from the office to retrieve manuals, asset histories, or other materials that help them on a job. You probably won’t know that just by looking at work order records or wrench time reports. Armed with this information, you can figure out a solution. Maybe that’s creating areas throughout your facility where files can be accessed for nearby assets. Or it could be digitizing those files so they can be accessed through a mobile device.
Here are a few questions to ask your technicians to find any roadblocks:
What tasks commonly take you away from a machine?
Are information and parts easily accessible? If not, why?
What information would help you complete work more efficiently?
Are there processes or systems that are hard to use or you think could be improved?
Is there anything that frequently keeps you from starting a task on time?
Four ways to measure the impact of maintenance on production efficiency
There are many ways to measure how your maintenance efforts are affecting production efficiency. The most common metrics are the following:
Found failure rate on preventive maintenance
This metric will help you measure how efficient your preventive maintenance schedule is. If your found failure rate is high, it means you’re cutting down on unnecessary maintenance while preventing major disruptions to production.
Unplanned asset downtime (last 90 days)
This number tracks the amount of unplanned equipment downtime and compares it to the previous 90-day period. Because each minute of downtime lowers your production efficiency, this number highlights how maintenance is contributing to healthier, higher-performing assets.
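One way to express that comparison is as a percent change between two consecutive 90-day windows. The downtime totals below are hypothetical, and the function name is just for illustration.

```python
def downtime_change_percent(current_hours, previous_hours):
    """Percent change in unplanned downtime versus the previous period.

    A negative result means downtime went down.
    """
    if previous_hours == 0:
        raise ValueError("previous period has no downtime to compare against")
    return (current_hours - previous_hours) / previous_hours * 100


# Hypothetical totals for two consecutive 90-day windows.
change = downtime_change_percent(current_hours=36, previous_hours=48)
print(f"{change:+.0f}% unplanned downtime vs. the previous 90 days")
```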
Average time to respond to and repair breakdowns
This stat quantifies all the work you’ve done to prepare for emergencies. Breakdowns will happen. Having a plan to quickly and safely fix these failures will help you reduce the amount of time production is stalled.
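A minimal sketch of how these averages might be computed from breakdown records is shown below. The record fields (reported, started, restored) and the timestamps are invented for the example; a CMMS would typically store equivalent fields under its own names.

```python
from datetime import datetime
from statistics import mean

# Hypothetical breakdown records: when the failure was reported,
# when work started, and when the asset was back in service.
breakdowns = [
    {
        "reported": datetime(2024, 3, 1, 8, 0),
        "started": datetime(2024, 3, 1, 8, 30),
        "restored": datetime(2024, 3, 1, 10, 30),
    },
    {
        "reported": datetime(2024, 3, 9, 14, 0),
        "started": datetime(2024, 3, 9, 14, 10),
        "restored": datetime(2024, 3, 9, 15, 10),
    },
]


def average_minutes(records, start_key, end_key):
    """Average elapsed minutes between two timestamps across all records."""
    return mean(
        (r[end_key] - r[start_key]).total_seconds() / 60 for r in records
    )


avg_response = average_minutes(breakdowns, "reported", "started")
avg_repair = average_minutes(breakdowns, "started", "restored")
print(f"Average response: {avg_response:.0f} min, average repair: {avg_repair:.0f} min")
```

Tracking response and repair separately helps show whether delays come from getting to the asset or from the fix itself.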
Compare the amount of useable products coming from the equipment prior to and after maintenance is completed. If the machine is running better after maintenance, it’s proof that your team is increasing production capacity in a meaningful way.
Maintenance has the opportunity to drive production efficiency
Maintenance often gets talked about as an expense. A necessary evil. A cost-center. But the reality is, good maintenance can drive your business forward. When you keep the machines running, you can do more, faster, with less. That means happier customers, a better bottom line, and more profit for everyone in the supply chain. It’s a true win-win-win.
In order to turn maintenance from a cost centre to a business driver, you need to reorient maintenance as a business function and start asking how maintenance can drive production efficiency. From there, a world of opportunity opens up.