Maintenance involves a lot of moving parts, which means more chances for something to go wrong. And when problems arise, you want to tackle them with as much information as possible. In other words, you want problem-solving to be predictable. Data is a key ingredient in achieving this goal.
We look at 5 ways to use data to solve common maintenance issues and lead your team to success.
This article walks you through what data to use and how to use it. While you can follow along if your data is in spreadsheets or file cabinets, we’re using the Fiix analytics tool to illustrate the process. Fiix analytics is visual and interactive so you can get a clear view of how to drill into your data and find the answers to your biggest questions.
1. How do I make sure the right maintenance is being done at the right time?
The average facility manages 45 work orders a week. With so much to do (and so little time to do it in), you know how important it is to focus your team’s efforts in the right place. So, this question really has three sub-questions—am I doing too much maintenance, not enough maintenance, or the right amount of maintenance on an asset?
The first step to answering these questions is to identify the assets with lots of work orders associated with them. Then, filter these work orders by asset and maintenance type.
Start by looking for assets with few or no corrective work orders associated with them. This usually means you’re doing PMs too frequently on these assets and can cut the frequency of scheduled maintenance.
Assets that aren’t getting enough preventive maintenance will have lots of emergency work associated with them. Also, look for assets with lower maintenance costs than assets of a similar type, as that is often a sign that they aren’t getting enough maintenance. Increase the frequency of PMs on these assets.
Assets getting the right amount of maintenance show regular preventive work orders alongside some corrective work orders, with little or no emergency work.
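If your work orders can be exported as a list, the three checks above can be sketched in a few lines of Python. This is only an illustration: the asset names, the record layout, and the 0.5 emergency-share threshold are assumptions made for the example, not benchmarks.

```python
from collections import Counter

# Hypothetical work order export: (asset, maintenance type).
work_orders = [
    ("Conveyor A", "preventive"), ("Conveyor A", "preventive"),
    ("Conveyor A", "preventive"), ("Conveyor A", "preventive"),
    ("Press B", "preventive"), ("Press B", "emergency"),
    ("Press B", "emergency"), ("Press B", "emergency"),
    ("Mixer C", "preventive"), ("Mixer C", "preventive"),
    ("Mixer C", "corrective"),
]

def classify_assets(orders, emergency_share_limit=0.5):
    """Label each asset by its mix of work order types.

    The emergency_share_limit default is an illustrative assumption.
    """
    counts = {}
    for asset, kind in orders:
        counts.setdefault(asset, Counter())[kind] += 1

    labels = {}
    for asset, by_kind in counts.items():
        total = sum(by_kind.values())
        if by_kind["emergency"] / total >= emergency_share_limit:
            # Mostly emergency work: PMs may be too infrequent.
            labels[asset] = "possibly too little PM"
        elif by_kind["corrective"] == 0 and by_kind["emergency"] == 0:
            # PMs never lead to follow-up work: they may be too frequent.
            labels[asset] = "possibly too much PM"
        else:
            labels[asset] = "roughly balanced"
    return labels

labels = classify_assets(work_orders)
```

Treat the labels as prompts for a closer look at an asset, not as a verdict.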
2. How is maintenance affecting the performance of equipment?
To get a picture of how maintenance is impacting equipment performance, start by collecting information on assets with associated downtime. Next, filter those assets into two categories – planned and unplanned downtime. Rank those assets by unplanned downtime. Assets with more unplanned downtime are the ones you want to tackle first as they have the biggest negative impact on your company and the most opportunity for improvement. You can further filter those assets by maintenance costs associated with them. The assets with the most downtime and highest costs are where to begin adjusting your strategy.
The next step is to dive into the notes on the emergency work orders attached to those assets. Find out what the most common problems and causes were, and make changes to address them. For example, has a bearing continually failed because of improper lubrication? A simple change might be to increase the frequency of lubrication and specify the proper amount of lubrication needed in each instance.
Revisit this report to see if your adjustments have made a difference. If unplanned downtime and maintenance costs drop across 30, 60, and 90 days, you now have data to support your decisions and show how they impact production.
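For teams working from an export rather than a dashboard, the ranking step can be sketched as a simple sort. The asset names and figures below are hypothetical, and the field names are assumptions made for the example.

```python
# Hypothetical per-asset totals for the review period.
assets = [
    {"asset": "Conveyor A", "unplanned_hours": 12.0, "planned_hours": 6.0, "cost": 4200},
    {"asset": "Press B", "unplanned_hours": 30.5, "planned_hours": 4.0, "cost": 9800},
    {"asset": "Mixer C", "unplanned_hours": 30.5, "planned_hours": 8.0, "cost": 3100},
    {"asset": "Boiler D", "unplanned_hours": 2.0, "planned_hours": 10.0, "cost": 1500},
]

def rank_by_unplanned_downtime(rows):
    """Sort by unplanned downtime first, then maintenance cost, both descending."""
    return sorted(rows, key=lambda r: (r["unplanned_hours"], r["cost"]), reverse=True)

ranked = rank_by_unplanned_downtime(assets)
priority = [r["asset"] for r in ranked]
```

Running the same ranking at 30, 60, and 90 days gives you a like-for-like comparison of whether your changes are working.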
3. How can my facility organize our storeroom so parts are easily accessible?
An unorganized storeroom poses more problems than just being messy. It makes it hard for technicians to access parts when they need them most, leading to delays and potential breakdowns.
To tackle this problem head-on, collect data on assets with the most emergency work orders attached to them.
Take note of what parts are associated most with that emergency work and the equipment they’re needed for. Once that has been determined, you can kit those parts together. Parts kitting makes getting parts easier and more accessible when emergency work is triggered.
For this to work, the data needs to be tracked and kept current. Each time a technician takes a spare part, the record should be updated. That gives you an accurate picture of which parts are used most often and how often they are tied to reactive work.
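As a rough sketch of how that part-usage data could feed kitting decisions, the snippet below counts which parts show up repeatedly on emergency work orders for each asset. The records and the min_uses cutoff are hypothetical assumptions for the example.

```python
from collections import Counter, defaultdict

# Hypothetical emergency work orders with the parts each one consumed.
emergency_work_orders = [
    {"asset": "Press B", "parts": ["bearing", "seal"]},
    {"asset": "Press B", "parts": ["bearing", "belt"]},
    {"asset": "Press B", "parts": ["bearing", "seal"]},
    {"asset": "Mixer C", "parts": ["fuse"]},
]

def suggest_kits(orders, min_uses=2):
    """Return, per asset, the parts used on at least min_uses emergency jobs."""
    usage = defaultdict(Counter)
    for order in orders:
        usage[order["asset"]].update(order["parts"])
    return {
        asset: sorted(part for part, n in counts.items() if n >= min_uses)
        for asset, counts in usage.items()
    }

kits = suggest_kits(emergency_work_orders)
```

An empty list for an asset simply means no part has repeated often enough yet to justify a kit.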
4. Where should I be allocating my maintenance budget?
Figuring out where to spend your maintenance budget can be a headache, and justifying that spending can be even harder.
Let’s say that increasing your team’s headcount would help clear some of the facility’s backlogged maintenance. That decision comes down to two options: do I hire more in-house employees or more contractors? A budget decision that big is hard to justify without proof.
To begin making your case, collect all the information you can about work done in the last quarter to a year. Was it done mostly by internal employees or contractors?
For each category, add up the total associated spend. Take into account costs like employee salaries and benefits, contractors’ hourly pay, and training. Each option has its cost advantages and disadvantages.
Based on those costs, you can make a clear, dollar-based case to your department for whether it’s more cost-effective to hire internal employees or more contractors. Those numbers can help justify why spending on additional hires is necessary.
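The comparison itself is simple arithmetic. Here is a minimal sketch, assuming you track salary, benefits, and training for employees and an hourly rate for contractors. Every figure is a made-up placeholder, so swap in your own numbers.

```python
def in_house_annual_cost(salary, benefits, training):
    """Total yearly cost of one internal hire (all inputs in dollars)."""
    return salary + benefits + training

def contractor_annual_cost(hourly_rate, hours_per_year):
    """Total yearly cost of contracted work at a flat hourly rate."""
    return hourly_rate * hours_per_year

# Hypothetical figures for illustration only.
in_house = in_house_annual_cost(salary=60_000, benefits=15_000, training=3_000)
contractor = contractor_annual_cost(hourly_rate=55, hours_per_year=1_200)
cheaper = "in-house" if in_house < contractor else "contractor"
```

With these placeholder numbers the contractor comes out cheaper, but the answer flips quickly as contracted hours rise, which is exactly the kind of break-even point worth showing your department.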
5. What obstacles are our technicians facing?
It’s easy for technicians to get caught up in their workload when things get busy. Completion notes aren’t updated or information is missed on work orders. It may not seem like a big deal the first time, but once it becomes a habit, it can become an obstacle for other technicians.
As a maintenance manager, you can help reinforce the importance of having complete information. One way to tackle this obstacle is to run bi-weekly checks for work orders with missing information or incomplete notes.
Look for trends in those work orders. Was it done by the same technician? Is it the same type of information being missed? Consider looking at the type of maintenance associated with these work orders. Consider having a department-wide info session on the importance and benefit of filling out work order completion notes.
If it’s the same technician, take a look at their logged hours. If they are logging more hours than average, they might simply be overworked.
Making it a habit to check for these inconsistencies on a regular basis might make a big difference in the performance of your employees and your facility.
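A bi-weekly check like this can be partly automated if your work orders are exported as records. The sketch below flags work orders with empty fields and tallies them by technician and by field. The field names, technician names, and the list of required fields are all assumptions for the example.

```python
from collections import Counter

# Fields we assume every closed work order should have filled in.
REQUIRED_FIELDS = ("completion_notes", "failure_code", "labor_hours")

# Hypothetical closed work orders.
work_orders = [
    {"id": 101, "technician": "Sam", "completion_notes": "Replaced belt",
     "failure_code": "WEAR", "labor_hours": 1.5},
    {"id": 102, "technician": "Alex", "completion_notes": "",
     "failure_code": "WEAR", "labor_hours": 2.0},
    {"id": 103, "technician": "Alex", "completion_notes": "",
     "failure_code": None, "labor_hours": 0.5},
]

def missing_fields(order):
    """List the required fields that are absent, None, or an empty string."""
    return [f for f in REQUIRED_FIELDS if order.get(f) in (None, "")]

incomplete = {o["id"]: missing_fields(o) for o in work_orders if missing_fields(o)}
by_technician = Counter(o["technician"] for o in work_orders if missing_fields(o))
by_field = Counter(f for o in work_orders for f in missing_fields(o))
```

The two tallies map onto the two questions above: by_technician shows whether the same person keeps missing information, and by_field shows whether the same type of information keeps getting skipped.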
Seeing the bigger picture leads to bigger gains
Your facility has lots of moving parts, and keeping track of them all manually can be time-consuming. Using an analytics reporting tool provides a visual representation of your facility’s moving parts. In addition, it gives the power back to the maintenance department, allowing them to tackle problems as they arise and lead their team to a solution-oriented work culture.
Maintenance troubleshooting can be both an art and a science. A common problem is that, while art can be beautiful, it isn’t known for its efficiency. When taken to the next level, maintenance troubleshooting can ditch the trial-and-error moniker and become a purely scientific endeavor. This helps maintenance technicians find the right problems and solutions more quickly. When troubleshooting is done correctly, your whole maintenance operation can overcome backlog, lost production, and compliance issues much more efficiently.
In this troubleshooting guide, we’ll take a look at what it actually is, why it matters to maintenance professionals, and how your team can fine-tune its approach.
What is maintenance troubleshooting?
Systems break down. That’s just a fact of life. Whether it’s a conveyor belt or an industrial drill, we’ve all run across a piece of equipment that is unresponsive, faulty, or acting abnormally for seemingly no reason at all. It can be downright frustrating.
Maintenance troubleshooting is the process of identifying what is wrong with these faulty components and systems when the problem is not immediately obvious. It usually follows a systematic, four-step approach: identify the problem, plan a response, test the solution, and resolve the problem. Steps one to three are often repeated multiple times before a resolution is reached.
Identify the problem
Plan a response
Test the solution
Repeat until problem is resolved
Think about it this way: When a conveyor belt breaks down, you may try a few different methods to fix it. First, you identify which part of the conveyor belt isn’t working. Once you’ve identified the problem area, you plan a response and test it, such as realigning or lubricating a part. If this fails to fix the problem, you might replace the part, which makes the conveyor belt work again. This is troubleshooting.
How is maintenance troubleshooting usually done?
Stop us if you’ve heard this story before. An asset breaks down and no one knows why. You talk to the operator, read some manuals, and check your notes about the asset. You try a couple of things to get the machine up and working again with no luck. Before you can try a third or fourth possible solution, you get called away to another emergency, with the asset still out of commission.
This is often how the process happens when performing maintenance troubleshooting, especially when a facility relies on paper records or Excel spreadsheets. The process is based on collecting as much information as possible from as many sources as possible to identify the most likely cause of the unexpected breakdown. You can never go wrong when you gather information, but it’s the way that information is gathered that can turn troubleshooting from a necessity to a nightmare.
Why does maintenance troubleshooting matter?
Unexpected equipment failure is the entire reason maintenance troubleshooting exists. If assets never broke down without any clear signs of imminent failure, there would be no need to troubleshoot the problem. But we know that’s just not the case.
Machinery failure doesn’t always follow a predictable pattern. Yes, maintenance teams can use preventive maintenance and condition-based maintenance to reduce the likelihood of unplanned downtime. However, you can never eliminate it entirely. What you can do is put processes in place to reduce failure as much as possible and fix it as soon as possible when it does occur. This is where strong maintenance troubleshooting techniques come in handy.
Because troubleshooting will always be part of the maintenance equation, humans will also always have a role. Maintenance technology does not erase the need for a human touch in troubleshooting; it simply makes the process much more efficient. When troubleshooting isn’t refined, it could lead to time wasted tracking down information, a substantial loss of production, an unsafe working environment, and more frequent failures. In short, knowing some maintenance troubleshooting techniques could be the difference between an overwhelming backlog and a stable maintenance program.
Maintenance troubleshooting tips
The following are just a few ways your operation can improve its troubleshooting techniques to conquer chaos and take control of its maintenance.
1. Quantify asset performance and understand how to use the results
It probably goes without saying, but the more deeply you know an asset, the better equipped you’ll be to diagnose a problem. Years of working with a certain asset can help you recognize when it’s not working quite right. But exceptional troubleshooting isn’t just about knowing the normal sounds, speeds, or odours of a particular machine. Instead, it’s about knowing how to analyze asset performance at a deeper level, which is where advanced reporting factors in.
When operators and technicians rely solely on their own past experience with a piece of equipment, it leaves them with huge gaps in knowledge that hurt the maintenance troubleshooting process. For example, it leaves too much room for recency bias to affect decision-making, which means that technicians are most likely to try the last thing that fixed a particular problem without considering other options or delving further into the root cause. Also, if maintenance troubleshooting relies on the proprietary knowledge of a few technicians, it means repairs will have to wait until those particular maintenance personnel are available.
Maintenance staff should have the know-how to conduct an in-depth analysis of an asset’s performance. For example, technicians should know how to run reports and interpret KPIs for critical equipment, such as mean time between failure and overall equipment effectiveness. If using condition-based maintenance, the maintenance team should also know the P-F curve for each asset and what different sensor readings mean. When technicians are equipped with a deeper understanding of an asset, it will be easier for them to pinpoint where a problem occurred and how to fix it, in both the short and long term.
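For readers who want to see the two KPIs mentioned above in concrete terms, here is a minimal sketch using their commonly cited definitions: mean time between failure as operating time divided by the number of failures, and overall equipment effectiveness as availability times performance times quality. The input figures are hypothetical.

```python
def mean_time_between_failures(operating_hours, failure_count):
    """Operating time divided by failures; None when no failures were recorded."""
    if failure_count == 0:
        return None
    return operating_hours / failure_count

def overall_equipment_effectiveness(availability, performance, quality):
    """OEE as the product of three ratios, each expressed between 0 and 1."""
    return availability * performance * quality

# Hypothetical figures for one asset over one quarter.
mtbf = mean_time_between_failures(operating_hours=1_200, failure_count=4)
oee = overall_equipment_effectiveness(availability=0.9, performance=0.8, quality=0.95)
```

Here the asset runs an average of 300 hours between failures and has an OEE of roughly 68 percent. Tracking how those numbers move after a repair is often more informative than any single reading.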
2. Create in-depth asset histories
Information is the fuel that powers exceptional maintenance troubleshooting. Knowing how a particular asset has worked and failed for hundreds of others is a good place to start a repair. That’s why manuals are a useful tool when applying maintenance troubleshooting techniques. However, each asset, facility, and operation is different, which means asset failure doesn’t always follow the script. Detailed notes on an asset’s history can open up a dead end and lead you to a solution much more quickly.
A detailed asset history can give you an edge in maintenance troubleshooting in a variety of ways. It offers a simple method for cross-referencing symptoms of the current issue with elements of past problems. For example, a technician can see if a certain type of material was being handled by a machine or if there were any early warning signs identified for a previous failure. The more a present situation aligns with a past scenario, the more likely it is to need the same fix. Solutions can be prioritized this way, leading to fewer misses, less downtime, fewer unnecessary spare parts being used, and more.
When creating detailed asset histories to help with maintenance troubleshooting (as well as preventive maintenance), it’s important to include as much information as possible. Make sure to record the time and dates of any notable actions taken on an asset or piece of equipment. This can include breakdowns, PMs, inspections, part replacement, production schedules, and abnormal behavior, such as smoke or unusual sounds. Next, document the steps taken during maintenance, including PMs or repairs. Lastly, highlight the successful solution and what was needed to accomplish it, such as necessary parts, labor and safety equipment. Make sure to add any relevant metrics and reports to the asset history as well.
One way to capture all this information in one place is to create a well-built equipment maintenance log.
3. Use root cause analysis and failure codes
Effective maintenance troubleshooting starts with eliminating ambiguity and short-term solutions. Finding the root of an issue quickly, solving it effectively and ensuring it stays solved is a winning formula. Root cause analysis and failure codes are a couple of tools that will help you achieve this goal.
Root cause analysis is a maintenance troubleshooting technique that allows you to pinpoint the reason behind a failure. The method consists of asking “why” until you get to the heart of the problem. For example:
Why did the equipment fail? Because a bearing wore out.
Why did the bearing wear out? Because a coupling was misaligned.
Why was the coupling misaligned? Because it was not serviced recently.
Why was the coupling not serviced? Because maintenance was not scheduled.
Why was maintenance not scheduled? Because we weren’t sure how often it should be scheduled.
This process has two benefits when conducting maintenance troubleshooting. First, it allows you to identify the immediate cause of failure and fix it quickly. Second, it leads you to the core of the issue and a long-term solution. In the example above, it’s clear a better preventive maintenance program is required to improve asset management and reduce unplanned downtime.
Failure codes provide a consistent method to describe why an asset failed. Failure codes are built on three actions: Listing all possible problems, all possible causes, and all possible solutions. This process records key aspects of a failure according to predefined categories, like misalignment or corrosion.
Failure codes are useful during maintenance troubleshooting because technicians can immediately see common failure codes, determine the best solution, and implement it quickly. Failure codes can also be used to uncover a common problem among a group of assets and determine a long-term solution.
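To make the problem, cause, and solution structure concrete, here is a small sketch of a failure log and a lookup for the most common recorded cause of a given problem. The assets, codes, and field names are invented for the example and don’t reflect any particular CMMS.

```python
from collections import Counter

# Hypothetical failure log using predefined problem, cause, and action codes.
failure_log = [
    {"asset": "Pump 1", "problem": "vibration", "cause": "misalignment", "action": "realign coupling"},
    {"asset": "Pump 2", "problem": "vibration", "cause": "misalignment", "action": "realign coupling"},
    {"asset": "Pump 3", "problem": "vibration", "cause": "corrosion", "action": "replace impeller"},
    {"asset": "Pump 1", "problem": "leak", "cause": "corrosion", "action": "replace seal"},
]

def most_common_cause(log, problem):
    """Return (cause, count) for the most frequent cause of a problem, or None."""
    causes = Counter(entry["cause"] for entry in log if entry["problem"] == problem)
    return causes.most_common(1)[0] if causes else None

top_vibration_cause = most_common_cause(failure_log, "vibration")
```

Because every entry uses the same predefined categories, the same query works across a whole group of assets, which is how a shared problem tends to surface.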
4. Build detailed task lists
Exceptional maintenance troubleshooting requires solid planning and foresight. Clear processes provide a blueprint for technicians so they can quickly identify problems and implement more effective solutions. Creating detailed task lists is one way to bolster your planning and avoid headaches down the road. This could also be incorporated into routine maintenance.
A task list outlines a series of tasks that need to be completed to finish a larger job. They ensure crucial steps aren’t missed when performing inspections, audits or PMs. For example, the larger job may be conducting a routine inspection of your facility’s defibrillators. This job is broken down into a list of smaller tasks, such as “Verify battery installation,” and “Inspect exterior components for cracks.”
Detailed task lists are extremely important when conducting maintenance troubleshooting. They act as a guide when testing possible solutions so technicians can either fix the issue or disqualify a diagnosis as quickly as possible. The more explicit the task list, the more thorough the job and the less likely a technician is to make a mistake. Comprehensive task lists can also offer valuable data when failure occurs. They provide insight into the type of work recently done on an asset so you can determine whether any corrective actions were missed and if this was the source of the problem.
There are a few best practices for building detailed task lists. First, include all individual actions that make up a task. For example, instead of instructing someone to “Inspect the cooling fan,” include the steps that comprise that inspection, such as “Check for any visible cracks,” and “Inspect for loose parts.” Next, organize all steps in the order they should be done. Lastly, include any additional information that may be helpful in completing the tasks, including necessary supplies, resources (e.g., manuals), and PPE.
5. Make additional information accessible
We’ve said it before and we’ll say it again: great maintenance troubleshooting techniques are often the result of great information. However, if that information is difficult to access, you will lose any advantage it provides. That is why it is crucial for your operation to not only create a large resource center, but to also make it highly accessible. This will elevate your maintenance troubleshooting abilities and get your assets back online faster when unplanned downtime occurs.
Let’s start with the elements of a great information hub. We’ve talked about the importance of reports, asset histories, failure codes and task lists when performing a troubleshooting method. Some other key resources include diagrams, standard operating procedures (SOPs), training videos, and manuals. These should all be included and organized by asset. If a technician hits a dead end in a troubleshooting procedure, these tools can offer a solution that may have been missed in the initial analysis.
Now that you’ve gathered all your documents together, it’s time to make them easily accessible to the whole maintenance team. If resources are trapped in a file cabinet, on a spreadsheet, or in a single person’s mind, they don’t do a lot of good for the technician. They can be lost, misplaced and hard to find—not to mention the inefficiency involved with needing to walk from an asset to the office just to grab a manual. One way to get around this obstacle is to create a digital knowledge hub with maintenance software. By making all your resources available through a mobile device, technicians can access any tool they need to troubleshoot a problem. Instead of sifting through paper files to find an asset history or diagram, they can access that same information anywhere, anytime.
Using CMMS software for maintenance troubleshooting
If it sounds like a lot of work to gather, organize, analyze and circulate all the information needed to be successful at maintenance troubleshooting, you’re not wrong. Without the proper tools, this process can be a heavy lift for overwhelmed maintenance teams. Maintenance software is one tool that can help ease the load every step of the way. A digital platform, such as a CMMS, takes care of crunching the numbers, organizing data and making it available wherever and whenever, so you can focus on using that information to make great decisions and troubleshoot more effectively.
For example, when building a detailed asset history, it’s important to document every encounter with a piece of equipment. This is a lot of work for a technician rushing from one job to another and difficult to keep track of after the fact. An investment in maintenance software will help you navigate these roadblocks. It does this by allowing technicians to use a predetermined set of questions to make and retrieve notes in real-time with a few clicks.
The same goes for failure codes. The key to using them effectively is proper organization and accessibility. Without those two key ingredients, failure codes become more of a hindrance than a help. One way to accomplish this is to use maintenance software. A digital platform can organize failure codes better than any filing cabinet or Excel spreadsheet and make it easy for technicians to quickly sort them and identify the relevant ones from the site of the breakdown.
The bottom line
Troubleshooting will always exist in maintenance. You will never be 100 percent sure 100 percent of the time when diagnosing the cause of failure. What you can do is take steps to utilize maintenance troubleshooting techniques to ensure equipment is repaired quickly and effectively. By combining a good understanding of maintenance metrics with detailed asset histories, failure codes, task lists, and other asset resources, and making all this information accessible, you can move your troubleshooting beyond trial and error to a more systematic approach.
Criticality and reliability-centered maintenance go hand-in-hand. Think about it: We’re told to prioritize PMs for critical assets, to build a TPM plan that accommodates critical pieces of equipment, and to perform root cause analysis on machinery that we consider to be high priority based on criticality. But how do we actually decide what makes a piece of equipment “critical”? In short, it all comes down to risk. Performing a criticality analysis allows you to understand the potential risks that could impact your business.
What is criticality analysis?
Criticality analysis is a systematic approach to assigning a criticality rating to assets based on their potential risks. Still sounds kind of abstract, right? How can risk be quantified? It helps to think about criticality analysis as part of a larger failure modes, effects [and criticality] analysis (FMEA / FMECA).
As we’ve defined it recently, FMEA is an approach that identifies all possible ways that equipment can fail, and analyzes the effect those failures can have on the system as a whole. FMECA takes it a step further by conducting a risk assessment for each failure mode and then prioritizing what corrective actions should be taken.
Why is criticality analysis important?
As James Kovacevic of Eruditio describes, using a predetermined system to evaluate risk allows you to remove emotion from the equation. This ensures that reliability is truly approached from a risk-based point of view, rather than individual perception. Once equipment undergoes relative ranking based on its criticality, work can be properly prioritized and a condition monitoring strategy can be put in place. Performing an equipment criticality analysis also helps to clarify what can be done to reduce the risk associated with each asset.
Who’s responsible for criticality analysis?
So who actually carries out a criticality analysis? Industry experts say that it should be a cross-functional effort. We couldn’t agree more. It’s a much more effective process if input from operations, maintenance, engineering, materials management, and employee health and safety functions is considered. After all, risk can be defined differently for different teams. And since assigning risk will always be somewhat subjective, having a diverse background of knowledge to draw on will help to curb that.
How do you assess the criticality of an asset?
Asset criticality is the numerical value a business assigns to its assets based on its own set of criteria. An asset criticality assessment can be done by creating a ranked list of work orders and orders in progress. This is known as an asset criticality ranking (ACR).
How to perform a criticality analysis
According to Kovacevic, there are two ways to carry out a criticality analysis. Both approaches produce a risk priority number (RPN) that allows you to rank the criticality level of each asset.
The first approach uses a criticality matrix, which is a 6×6 grid where the severity of a given consequence (on the X axis) is plotted against the probability of that consequence occurring (on the Y axis). Naturally, if there is a high probability that a piece of equipment will fail in a way that causes great personal injury or severe operational issues, that piece of equipment is highly critical and should be prioritized accordingly. The number at the intersection of severity and probability for any piece of equipment is that piece of equipment’s RPN.
The second recommended approach is to separate the consequence categories by type (for example, health and safety, environmental, and operational). That way, you can rate how severe an equipment failure would be for each consequence category. For example, a piece of machinery that could cause severe personal injury upon failure would be a 5 or 6 in the health and safety category, but might be of almost no consequence in the environmental category (perhaps a 1 or 2) and moderately impactful to operations (somewhere in the middle). Once you’ve determined the severity of each consequence category for a given piece of equipment, you can multiply the category scores together to get its RPN.
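The second approach is easy to sketch in code. The example below multiplies the category scores for each asset and ranks the results. The 1 to 6 scale is an assumption inferred from the examples above, and the asset names and scores are hypothetical.

```python
from math import prod

def risk_priority_number(scores, low=1, high=6):
    """Multiply the severity scores across consequence categories.

    The 1 to 6 scale is an assumption for this sketch.
    """
    for category, score in scores.items():
        if not low <= score <= high:
            raise ValueError(f"{category} score {score} is outside {low}-{high}")
    return prod(scores.values())

# Hypothetical severity scores per consequence category.
assets = {
    "Press B": {"health_and_safety": 6, "environmental": 1, "operational": 3},
    "Boiler D": {"health_and_safety": 2, "environmental": 4, "operational": 4},
    "Conveyor A": {"health_and_safety": 1, "environmental": 1, "operational": 5},
}

rpn = {name: risk_priority_number(scores) for name, scores in assets.items()}
ranking = sorted(rpn, key=rpn.get, reverse=True)
```

Note that a multiplied score can rank an asset with moderate risk in every category above one with a single severe risk, as happens with Boiler D and Press B here. That is one reason the cross-functional review described earlier matters.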
Once each piece of equipment has an RPN attached to it, you can rank them to assess which assets are critical. Kovacevic recommends grouping equipment into categories based on their RPN. Here are the categories he suggests:
Once each piece of equipment is ranked, maintenance managers can make decisions that are informed by risk, rather than gut feel. From here, all reliability-related activities and processes will run much more smoothly.
Every day, meat processing plants need to make sure the metal detectors in their machines are working. It’s a simple check to ensure there’s metal where there should be and no metal where there shouldn’t be.
This process involves running test balls through the machine. It takes about 45 minutes to complete (25 minutes of manual labour and 20 minutes of admin time). It’s routine maintenance, the type most people don’t give a second thought to.
It’s also an example of how tweaking maintenance processes can boost production efficiency. Instead of a manual check, the inspection can be done with an automated test-ball shooter. A button is pressed, the balls roll out on their own, and the task is wrapped up in five minutes. The result is more than 160 hours of extra equipment availability per year.
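The arithmetic behind that figure can be sketched as follows. The 45-minute and five-minute durations come from the example above. The number of checks per year is not stated, so the 250 used here is purely an assumption for illustration.

```python
def hours_saved_per_year(manual_minutes, automated_minutes, checks_per_year):
    """Yearly time saved by replacing a manual check with an automated one."""
    minutes_saved = (manual_minutes - automated_minutes) * checks_per_year
    return minutes_saved / 60

# checks_per_year=250 is an assumed number of production days, not a sourced figure.
saved = hours_saved_per_year(manual_minutes=45, automated_minutes=5, checks_per_year=250)
```

Under that assumption, the saving works out to roughly 167 hours a year, which is consistent with the figure quoted above.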
This is just one example of how companies can leverage maintenance to increase production efficiency. This article outlines several other strategies for bolstering production efficiency using maintenance, including:
How maintenance impacts production efficiency
Five ways the maintenance team can boost production capacity
How to measure the impact of maintenance on production
What is production efficiency?
Production efficiency is a measurement used mostly by manufacturers to determine how well (and how long) a company can keep up with demand. It compares current production rates to expected or standard production rates.
A higher rate of production efficiency delivers three critical outcomes for manufacturers:
Reduced resource usage: Efficient production systems produce the same number of goods with fewer resources
Higher financial margins: Efficient production means higher margins throughout the supply chain
A better customer experience: Efficient production allows products and services to be regularly and dependably delivered to customers
How to calculate production efficiency
The calculation for production efficiency compares the actual output rate to the standard output rate. The formula can be applied to either manual or automated work.
When it comes to industrial processes, the calculation takes quality into account. Let’s say you produce 50 units in an hour, but only 30 are useable. Your rate of production for that hour is 30 units.
The following formula is used to calculate production efficiency:
Production Efficiency = (Actual Output Rate / Standard Output Rate) x 100
For example, a manufacturing company receives a new order of 100 units. The standard rate of completion for 100 units is 10 hours, or 10 units per hour. However, the company took 12 hours to complete 100 quality units. In this case, the production efficiency formula would look like this:
Actual Output Rate = 100 units / 12 hours (8.3 units/hour)
Standard Output Rate = 100 units / 10 hours (10 units/hour)
Production Efficiency = (8.3 / 10) x 100 (83%)
In this instance, output and productivity levels are below capacity.
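The same calculation can be expressed as a short function. This is just a sketch of the formula above, and the function and parameter names are our own.

```python
def production_efficiency(actual_units, actual_hours, standard_units, standard_hours):
    """Actual output rate divided by standard output rate, as a percentage."""
    actual_rate = actual_units / actual_hours        # quality units per hour
    standard_rate = standard_units / standard_hours  # expected units per hour
    return actual_rate / standard_rate * 100

# The worked example: 100 quality units in 12 hours against a 10-hour standard.
efficiency = production_efficiency(100, 12, 100, 10)
```

Remember to count only usable units in actual_units, since the calculation takes quality into account.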
How maintenance can increase production efficiency
Proper equipment maintenance is essential for increasing production efficiency. It ensures your total effective equipment performance (TEEP) is as high as it can be. Using preventive maintenance to keep assets operating at their best helps to:
Limit equipment downtime: If equipment is checked regularly, you can find and fix failures before they cause big breakdowns that disrupt production. Having a solid preventive maintenance schedule also allows you to coordinate with production so planned downtime is done quickly.
Establish a corrective action system for failures: Having a strategy to find, analyze, and fix failure (aka a FRACAS) allows you to target recurring issues at their root. You can spot and eliminate problems that impact equipment availability and product quality the most.
Coordinate better shift changeovers: Better changeovers between maintenance shifts means communicating the right information to technicians quickly and accurately. This includes a run-down of what work needs to be done, when, and any obstacles that might get in the way of that work.
Ensure standard operating procedures are clear and maintained: SOPs train operators to do routine maintenance so machines can be operated with fewer breakdowns and accidents.
Five things your maintenance team can start doing tomorrow to increase production efficiency
There are a lot of projects that take months or years to complete. But getting quick wins is also crucial for building momentum and proving the value of your maintenance team. So, here are five things your maintenance team can start doing tomorrow to increase production efficiency.
1. Optimize the frequency of your PMs
A preventive maintenance schedule can be a good example of having too much of a good thing. Going overboard on preventive maintenance can affect production efficiency in two ways: you can waste valuable time preventing non-existent failures, or you can increase the risk of failure by meddling with a perfectly fine component.
These guidelines can help you find the right balance between too many PMs and too few:
Use equipment maintenance logs to track the found failure rate on preventive maintenance tasks. Start with PMs that take the longest to do or cost the most.
If a PM leads to regular corrective maintenance, keep it at the same frequency.
If a PM rarely identifies failure, try increasing the time between inspections. If failures are found far less often than the PM is performed, tweak your schedule so the two are better aligned. For example, an inspection might happen every two weeks, but a failure is usually found only every six weeks. In this case, plan for the PM to happen every 4-6 weeks instead.
If a machine experiences frequent breakdowns between inspections, try shortening maintenance intervals. You can also modify the trigger for maintenance, changing it from a time-based trigger to a usage- or performance-based trigger.
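As a rough sketch, the guidelines above could be turned into a simple review rule like the one below. The function name and both cut-off values (`keep_threshold` and `max_breakdowns`) are assumptions made for illustration. Tune them to your own assets and risk tolerance.

```python
def review_pm(inspections, failures_found, breakdowns_between,
              keep_threshold=0.5, max_breakdowns=1):
    """Suggest what to do with a PM based on its maintenance log.

    inspections: number of times the PM was performed
    failures_found: inspections that led to corrective work
    breakdowns_between: unplanned breakdowns between inspections
    keep_threshold, max_breakdowns: assumed cut-offs, not industry standards
    """
    if inspections <= 0:
        raise ValueError("need at least one inspection")
    found_failure_rate = failures_found / inspections
    if breakdowns_between > max_breakdowns:
        return "shorten interval"   # failures are slipping past the PM
    if found_failure_rate >= keep_threshold:
        return "keep interval"      # the PM regularly finds real problems
    return "lengthen interval"      # the PM rarely finds anything


# Inspected every two weeks for 12 weeks, failure found every six weeks.
print(review_pm(inspections=6, failures_found=2, breakdowns_between=0))
```

With the assumed thresholds, the two-week inspection from the example above comes back as a candidate for a longer interval.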
2. Identify machines that can be maintained while running
Some routine maintenance can be done while a machine is still operating. Find out if there are any assets that can be safely worked on while being used for production. The key word there is ‘safely’. This might mean that some work can’t be done because certain areas of a machine aren’t safely accessible while it’s operating. In this scenario, determine if partial maintenance is possible and if it’ll have a positive impact on the performance of the equipment.
It’s also a good idea to track rotating or spare assets and swap them for production equipment when possible. That allows you to do regular maintenance on these machines without sacrificing productivity.
3. Make equipment capabilities transparent and clear
Create an iron-clad list of instructions for operating equipment and common issues to be aware of. You can use a failure modes and effects analysis (FMEA) to create a list of common failures experienced by each asset. This can also include warning signs for breakdowns.
Having this information clearly outlined and easily accessible gives operators a chance to notice the early signs of failure and notify maintenance before it gets worse. Employees will be empowered to observe and identify any potential problems, and report them accordingly.
4. Use work order data to identify where your team can be more efficient
Work order data can tell you what jobs can get done quicker and how to minimize the risk of asset failure so you can boost production efficiency. Look for these telltale signs of broken processes in your work orders:
Unavailable parts and supplies: If this issue is delaying maintenance, review the purchasing process for parts and supplies. That includes making sure your cycle counts are accurate and the threshold for purchase approvals is low enough that inventory can get replenished quickly. You can also create parts kits for frequent repairs or emergency repairs on production equipment so your team can locate and retrieve parts quickly.
Misidentified/misdiagnosed problems or missing instructions: Make sure task lists, failure codes, and descriptions are clear. Attach photos, manuals, and other documentation to the work order.
Diverted resources resulting from emergency work orders: Emergencies can’t always be avoided. Analyze your work order data, find tasks that are too big, and break them down into smaller jobs to reduce the risk of major disruptions.
Scheduling conflicts with production: See if maintenance can be scheduled while production is happening or if work can be done at an alternate time, like evenings or weekends. You can also consider giving operators minor maintenance responsibilities associated with the work order.
Lack of adequate worker skillset: Work order data can show you if the person/people assigned to the work may not have the right skills. Make it very clear on the work request what kind of skills or certifications are necessary for certain maintenance types.
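If your work orders record why a job was delayed, a quick tally shows which of the issues above to tackle first. This is a minimal sketch: the records and the `delay_reason` field are hypothetical, so substitute whatever your own system captures.

```python
from collections import Counter

# Hypothetical work order records; "delay_reason" is an assumed field.
work_orders = [
    {"id": 1, "delay_reason": "parts unavailable"},
    {"id": 2, "delay_reason": None},
    {"id": 3, "delay_reason": "missing instructions"},
    {"id": 4, "delay_reason": "parts unavailable"},
    {"id": 5, "delay_reason": "scheduling conflict"},
]


def delay_reasons(orders):
    """Count delayed work orders by reason, most common first."""
    reasons = Counter(o["delay_reason"] for o in orders if o["delay_reason"])
    return reasons.most_common()


for reason, count in delay_reasons(work_orders):
    print(f"{reason}: {count}")
```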
5. Find the biggest obstacles for your team and eliminate them
You can learn a lot from the data that comes from your equipment and work orders. But sometimes, you just have to ask the people who are doing the actual work. They will be able to tell you what barriers they face when completing work. Acting on this information is crucial to continually improve your maintenance processes. All those improvements can add up to a huge boost in production efficiency.
For example, your technicians may spend a lot of time going back and forth from the office to retrieve manuals, asset histories, or other materials that help them on a job. You probably won’t know that just by looking at work order records or wrench time reports. Armed with this information, you can figure out a solution. Maybe that’s creating areas throughout your facility where files can be accessed for nearby assets. Or it could be digitizing those files so they can be accessed through a mobile device.
Here are a few questions to ask your technicians to find any roadblocks:
What tasks commonly take you away from a machine?
Are information and parts easily accessible? If not, why?
What information would help you complete work more efficiently?
Are there processes or systems that are hard to use or you think could be improved?
Is there anything that frequently keeps you from starting a task on time?
Four ways to measure the impact of maintenance on production efficiency
There are many ways to measure how your maintenance efforts are affecting production efficiency. The most common metrics are the following:
Found failure rate on preventive maintenance
This metric will help you measure how efficient your preventive maintenance schedule is. If your found failure rate is high, it means you’re cutting down on unnecessary maintenance while preventing major disruptions to production.
Unplanned asset downtime (last 90 days)
This number tracks the amount of unplanned equipment downtime and compares it to the previous 90-day period. Because each minute of downtime lowers your production efficiency, this number highlights how maintenance is contributing to healthier, higher-performing assets.
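One way to sketch this comparison, assuming each downtime event is stored as a (date, hours, planned) record. That record shape and the function name are assumptions for illustration.

```python
from datetime import date, timedelta


def downtime_comparison(events, today, window_days=90):
    """Return (current, previous) unplanned downtime hours.

    events: list of (date, hours, planned) tuples, an assumed record shape.
    current covers the last `window_days`; previous covers the window before.
    """
    current_start = today - timedelta(days=window_days)
    previous_start = today - timedelta(days=2 * window_days)
    current = previous = 0.0
    for day, hours, planned in events:
        if planned:
            continue  # only unplanned downtime counts
        if current_start < day <= today:
            current += hours
        elif previous_start < day <= current_start:
            previous += hours
    return current, previous


events = [
    (date(2024, 6, 10), 4.0, False),
    (date(2024, 5, 2), 2.5, False),
    (date(2024, 5, 20), 3.0, True),   # planned, ignored
    (date(2024, 2, 15), 9.0, False),  # falls in the previous window
]
print(downtime_comparison(events, today=date(2024, 6, 30)))  # (6.5, 9.0)
```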
Average time to respond to and repair breakdowns
This stat quantifies all the work you’ve done to prepare for emergencies. Breakdowns will happen. Having a plan to quickly and safely fix these failures will help you reduce the amount of time production is stalled.
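Here is a minimal sketch of how the two averages might be calculated, assuming each breakdown has three timestamps: when it was reported, when work started, and when the repair was finished. The timestamp names are assumptions.

```python
from datetime import datetime


def average_response_and_repair(breakdowns):
    """Return (mean response hours, mean repair hours).

    Each breakdown is (reported, work_started, repaired) as datetimes.
    Response = reported -> work started; repair = work started -> repaired.
    """
    if not breakdowns:
        raise ValueError("no breakdowns to average")
    response = [(s - r).total_seconds() / 3600 for r, s, _ in breakdowns]
    repair = [(d - s).total_seconds() / 3600 for _, s, d in breakdowns]
    return sum(response) / len(response), sum(repair) / len(repair)


breakdowns = [
    (datetime(2024, 3, 4, 8, 0), datetime(2024, 3, 4, 8, 30),
     datetime(2024, 3, 4, 10, 30)),
    (datetime(2024, 3, 9, 14, 0), datetime(2024, 3, 9, 15, 30),
     datetime(2024, 3, 9, 16, 30)),
]
print(average_response_and_repair(breakdowns))  # (1.0, 1.5)
```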
Clean start-ups
Compare the number of usable products coming from the equipment before and after maintenance is completed. If the machine is running better after maintenance, it’s proof that your team is increasing production capacity in a meaningful way.
Maintenance has the opportunity to drive production efficiency
Maintenance often gets talked about as an expense. A necessary evil. A cost-center. But the reality is, good maintenance can drive your business forward. When you keep the machines running, you can do more, faster, with less. That means happier customers, a better bottom line, and more profit for everyone in the supply chain. It’s a true win-win-win.
To turn maintenance from a cost center into a business driver, you need to reorient maintenance as a business function and start asking how maintenance can drive production efficiency. From there, a world of opportunity opens up.
Maintenance analysis has changed a lot over the last decade or so. New tools and technology have increased our ability to collect and interpret data. It’s enabled us to make informed decisions that wouldn’t have been possible 10 years ago.
But if our understanding of maintenance analysis has changed, why do we still rely on the same handful of metrics we did 40 or 50 years ago?
Metrics like overall equipment effectiveness (OEE) and mean time to repair (MTTR) dominate almost every list of go-to industry measurements. But experts agree that they’re flawed. Not only are these traditional metrics prone to bias and inaccuracy, but they also often don’t have a purpose. And when data doesn’t have a purpose, you can’t use it to make key decisions, like whether to hire an extra technician or increase the frequency of a task.
That’s why we’ve put together 10 useful metrics you won’t see on any other list and some tips for how to use them to improve your maintenance program.
10 maintenance metrics for better maintenance analysis
#1 – Time spent supporting production
What is it?: The total time that the maintenance team spends on production-focused activities. Usually measured weekly, monthly, or quarterly.
How can you use it?: Everyone has to pitch in to complete a big order once in a while. But when once in a while turns into every day, maintenance suffers. This metric helps you catch an unhealthy backlog before it happens and reallocate resources to prevent it. It also helps you advocate for a higher headcount on your team or an increased training budget to help production staff learn minor maintenance tasks.
#2 – Follow-up work created after inspections
What is it?: The number of corrective work orders created from routine inspections. Usually measured monthly, quarterly, or annually.
How can you use it?: There are many different ways you can use this metric for maintenance analysis. You can sort it by machine, shift, or site to get insights into how your assets or team are performing. But the most useful is by task.
It’s a good sign when regular preventive maintenance includes follow-up repairs. It means your schedule is accurate and that you’re preventing bigger problems. It allows you to flag common repairs and build processes to make them more efficient. For example, you can create parts kits for quicker access.
If the failed inspection percentage is low, you can increase preventive maintenance intervals. This will reduce the amount of time and money spent on tasks without increasing risk.
#3 – Cost of follow-up maintenance vs expected cost of total failure
What is it?: A comparison between the cost of corrective maintenance (i.e. labor and parts) and the cost of asset failure if maintenance is not done (i.e. lost production, labor, and parts).
How can you use it?: Use this type of maintenance analysis to plan your maintenance strategy. For example, if regular inspections cost you more than failure, you can likely go with a run-to-failure approach for an asset over a preventive one.
You can also use this metric to prioritize tasks and backlog, and figure out how to allocate your budget.
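A back-of-the-envelope version of this comparison might look like the sketch below. It assumes all inputs are annual estimates you supply yourself, and it looks only at cost, not at safety or compliance, which should also weigh on the decision. The function and parameter names are illustrative.

```python
def cheaper_strategy(inspection_cost, inspections_per_year,
                     follow_up_cost, failure_cost, failures_per_year):
    """Compare annual preventive cost with annual run-to-failure cost.

    follow_up_cost: expected yearly corrective work found by inspections
    failure_cost: lost production, labor, and parts for one total failure
    failures_per_year: expected failures if no preventive work is done
    """
    preventive = inspection_cost * inspections_per_year + follow_up_cost
    run_to_failure = failure_cost * failures_per_year
    if preventive < run_to_failure:
        return "preventive", preventive, run_to_failure
    return "run-to-failure", preventive, run_to_failure


# A cheap, easily replaced asset: inspections cost more than failure.
print(cheaper_strategy(200, 12, 1500, 2500, 1))
# A critical asset: one failure is far more expensive than the upkeep.
print(cheaper_strategy(200, 12, 1500, 40000, 0.5))
```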
#4 – Cost by maintenance type
What is it?: The total cost of maintenance (i.e. labor and parts) by maintenance type (i.e. preventive, emergency, follow-up). Usually measured monthly, quarterly, and/or annually.
How can you use it?: Higher costs are usually the result of broken processes. This view allows you to find out which processes need work so you can increase efficiency.
For example, are work orders unclear and leading to increased repair times and labor costs? Try clarifying instructions.
Are you bringing outside contractors in to do emergency repairs? You could invest in more training for your team or hire a specialist.
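If you can export work orders with a type, a labor cost, and a parts cost, the totals are a simple group-and-sum. The field names below are assumptions about what such an export might contain.

```python
from collections import defaultdict


def cost_by_type(work_orders):
    """Sum labor and parts cost per maintenance type."""
    totals = defaultdict(float)
    for wo in work_orders:
        totals[wo["type"]] += wo["labor_cost"] + wo["parts_cost"]
    return dict(totals)


# Hypothetical export for one month.
work_orders = [
    {"type": "preventive", "labor_cost": 120.0, "parts_cost": 30.0},
    {"type": "emergency", "labor_cost": 600.0, "parts_cost": 900.0},
    {"type": "preventive", "labor_cost": 80.0, "parts_cost": 0.0},
    {"type": "follow-up", "labor_cost": 150.0, "parts_cost": 75.0},
]
print(cost_by_type(work_orders))
```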
#5 – Clean start-ups after maintenance
What is it?: The number of times a production line starts without stoppages or waste after completed maintenance. This is measured monthly, quarterly, and annually.
How can you use it?: Include this metric in your maintenance analysis to draw a direct line between your team’s work and increased output.
If clean start-ups are low, it gives you another chance to spot problems in your processes. For example, you might find that the specs for a production line may be out of date. This will lead technicians to rebuild components incorrectly and the line to stall. Updating the specs is a simple tweak that could lead to higher output.
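To track this as a rate rather than a raw count, you could log each post-maintenance start-up and calculate the share that was clean. The `stoppages` and `scrap_units` fields in this sketch are assumed names.

```python
def clean_startup_rate(startups):
    """Percentage of post-maintenance start-ups with no stoppages or waste.

    startups: list of dicts with assumed keys "stoppages" and "scrap_units".
    """
    if not startups:
        raise ValueError("no start-ups recorded")
    clean = sum(1 for s in startups
                if s["stoppages"] == 0 and s["scrap_units"] == 0)
    return clean * 100 / len(startups)


startups = [
    {"stoppages": 0, "scrap_units": 0},
    {"stoppages": 2, "scrap_units": 15},  # line stalled after a rebuild
    {"stoppages": 0, "scrap_units": 0},
    {"stoppages": 0, "scrap_units": 0},
]
print(clean_startup_rate(startups))  # 75.0
```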
#6 – Size of backlog
What is it?: The total number of hours of overdue and scheduled maintenance tasks. Track this metric weekly and monthly.
How can you use it?: This metric can be a godsend when it comes to getting your team some much-needed relief. Quantify the gap between available labor hours and your total backlog hours. You might find that the amount of backlog far outpaces how much your team can do. Use that to make a case for more budget to spend on extra overtime, hiring another technician, or bringing in more contractors.
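One hedged way to quantify that gap is to estimate how many weeks it would take to clear the backlog with the labor you have. The `share_for_backlog` parameter is an assumption: it stands for the fraction of each technician's week that is not already taken up by new work.

```python
def backlog_weeks(backlog_hours, technicians, hours_per_week,
                  share_for_backlog=0.5):
    """Estimate the weeks needed to clear the current backlog.

    share_for_backlog: assumed fraction of labor hours free for backlog work.
    """
    available = technicians * hours_per_week * share_for_backlog
    if available <= 0:
        raise ValueError("no labor hours available for backlog")
    return backlog_hours / available


# 480 backlog hours, four technicians working 40-hour weeks.
print(backlog_weeks(480, technicians=4, hours_per_week=40))  # 6.0
```

If the result keeps growing from month to month, that trend is the number to bring to a budget conversation.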
#7 – Top 10 assets by downtime
What is it?: This is your heavy hitters list—the equipment that breaks down most often or takes the longest to repair. Keep tabs on these assets weekly, monthly, and quarterly.
How can you use it?: This metric keeps your biggest problems visible. You might raise an eyebrow at that, but highly visible problems get solved the fastest. This kind of maintenance analysis can help you prioritize your problem-solving efforts, make decisions quickly, and measure their impact.
For example, if you know asset A is at the top of your downtime list, you can start by isolating the reason why. Is it because repairs take longer on that asset? Is work being delayed? Does that piece of equipment break down again and again?
The answer to these questions will give you an idea of how to prevent failure in the future. You might get rid of obsolete parts that keep breaking. Or put an extra technician on a job. Or clarify how much lubrication should be used on a bearing. If all else fails, conducting this type of maintenance analysis helps justify a capital expenditure on new equipment.
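Building the list itself is straightforward if you have a log of downtime events. This sketch assumes each event is an (asset name, hours) pair. The asset names are made up.

```python
from collections import defaultdict


def top_assets_by_downtime(events, n=10):
    """Return the n assets with the most total downtime, highest first."""
    totals = defaultdict(float)
    for asset, hours in events:
        totals[asset] += hours
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:n]


events = [
    ("Conveyor A", 5.0),
    ("Press B", 2.0),
    ("Conveyor A", 3.5),
    ("Filler C", 6.0),
]
print(top_assets_by_downtime(events, n=2))
# [('Conveyor A', 8.5), ('Filler C', 6.0)]
```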
#8 – Planned maintenance percentage (last 90 days)
What is it?: The ratio of planned maintenance to all other types of maintenance over the last 90 days.
How can you use it?: This is a measure of progress. Going from reactive to planned maintenance doesn’t happen overnight. The time frame allows you to make a clear connection between action and results. You can draw a line between what happened and its impact on your end goals.
For example, if your percentage has dropped, you can look at what happened in the last 90 days to cause that drop. That could be a massive, unexpected breakdown. Or an increase in production support during the busy season. If you want to increase the percentage, try creating a better work request process to uncover problems earlier. Or shorten inspection intervals on assets with the highest instances of unexpected downtime.
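As a sketch, the percentage can be calculated from labor hours logged over the last 90 days. Expressing it as planned hours over total maintenance hours is an assumption about how you want to measure it. Counting work orders instead of hours would work the same way.

```python
def planned_maintenance_percentage(planned_hours, other_hours):
    """Planned maintenance as a share of all maintenance hours."""
    total = planned_hours + other_hours
    if total <= 0:
        raise ValueError("no maintenance hours recorded")
    return planned_hours * 100 / total


# 300 planned hours and 100 hours of everything else in the last 90 days.
print(planned_maintenance_percentage(300, 100))  # 75.0
```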
#9 – Wrench time (last 90 days)
What is it?: The amount of time technicians spend working on a piece of equipment as part of the total time it takes to complete a job. This is usually measured by job or as a weekly, monthly, and quarterly average.
How can you use it?: Wrench time is a common tool for maintenance analysis, but it’s often used the wrong way. Technicians usually (and unfairly) get the blame for low wrench time. It leads to wrench time inflation as technicians fudge the numbers to avoid trouble.
Low wrench time usually has its roots in broken processes, not the ability of the technician. That leads to bigger backlogs, more reactive maintenance, and avoidable labor costs.
To use wrench time in your maintenance analysis, start with the jobs that have the lowest scores. Review these jobs step-by-step with technicians. Work together to find out where unclear or incomplete processes cause delays. You’ll spot bottlenecks easier when breaking the task down into smaller pieces. The result is more value for your team’s time and money.
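To find the jobs worth reviewing first, you could score each job and sort from lowest to highest. This sketch assumes you record hands-on minutes and total minutes per job. The job names are hypothetical.

```python
def wrench_time_percentage(hands_on_minutes, total_job_minutes):
    """Share of a job's total time spent working on the equipment."""
    if total_job_minutes <= 0:
        raise ValueError("total job time must be positive")
    return hands_on_minutes * 100 / total_job_minutes


def lowest_wrench_time(jobs, n=3):
    """Return the n jobs with the lowest wrench time, lowest first.

    jobs: list of (job name, hands-on minutes, total minutes) tuples.
    """
    scored = [(name, wrench_time_percentage(h, t)) for name, h, t in jobs]
    return sorted(scored, key=lambda item: item[1])[:n]


jobs = [
    ("Replace belt", 45, 90),
    ("Lubricate bearing", 10, 50),
    ("Calibrate sensor", 30, 40),
]
print(lowest_wrench_time(jobs, n=2))
# [('Lubricate bearing', 20.0), ('Replace belt', 50.0)]
```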
#10 – Health and safety work orders completed
What is it?: The number of work orders completed for health and safety or compliance purposes. This is usually tracked monthly, quarterly, and annually.
How can you use it?: Some metrics are quantitative. Others are qualitative. This one is the latter. And it’s essential for measuring the performance of your maintenance team and the impact it has on your business. A safe workplace keeps accidents low, and productivity and morale high. Passing audits and remaining compliant are crucial to staff safety and avoiding fines.
Three big goals you can accomplish by combining these metrics
All the metrics mentioned above are powerful in their own right. But when combined, they supercharge your maintenance analysis and help you achieve three common goals:
Get a bigger budget and more time for maintenance
Metrics to combine:
Cost by maintenance type
Clean start-ups after maintenance
Top 10 assets by downtime
Getting more money and time for maintenance means winning over whoever divvies up the budget, and whoever leads production. The quickest way to get them on board is to align your plan with their goals. The three metrics above will help you get there.
First, highlight the cost-benefit of preventive maintenance. Regular preventive maintenance might seem expensive. But just one instance of emergency maintenance can cost up to $250,000. If you’re tracking cost by maintenance type, you can highlight how much the company is losing to reactive maintenance and how much it can save by investing in preventive maintenance.
Next, it’s time to sway the production team. Use clean start-ups after maintenance to show production that you have their best interests in mind. It emphasizes that what is good for maintenance is often good for production.
No one is going to give you more resources without a plan. Your list of bad actors (the top 10 assets by downtime) is a blueprint for how you’re going to make the most of your extra time and money. It quantifies the problem and makes it very clear where you’ll focus your efforts.
Get your maintenance team to buy into change
Metrics to combine:
Planned maintenance percentage (90 days)
Wrench time (last 90 days)
Follow-up work created after inspections
Change sucks. And that makes it hard for your team to get on board with a new system or process. The best way to change the mind of naysayers is to show them how your plan is eliminating their biggest pains. Tracking the metrics above is one way to do this.
These data points give you a chance to compare how you operated before a change (i.e. lots of reactive maintenance and frustration over guesswork) and what you’ve accomplished since implementing a new system or process. Seeing the pay-off first-hand makes it easier to convert any critics and expand your project, whether it’s setting up a CMMS or allowing machine operators to do routine maintenance.
Build a preventive maintenance program that would make most other companies jealous
Metrics to combine:
Cost by maintenance type
Follow-up work created after inspections
Cost of follow-up maintenance vs expected cost of total failure
The best preventive maintenance programs don’t have the most PMs. Instead, they have the most efficient PMs. That means doing the right work at the right time. These metrics will help you achieve this balance.
Measuring cost by maintenance type helps you allocate resources to preventive tasks and gauge the efficiency of your PMs. You can track if cost-cutting strategies are working and make sure they’re not leading to reactive costs down the line.
Keeping tabs on follow-up work is one way to optimize PM frequencies. If an inspection isn’t leading to corrective work, you can increase inspection intervals. That means you can use fewer labor hours and parts, and spend that money and time elsewhere. Similarly, comparing the costs of corrective maintenance and total failure ensures you’re not spending money on proactive tasks that aren’t worth it.
The best maintenance analysis is constantly evolving
The best maintenance metrics have a purpose. They are collected and used consistently. They guide decisions and inform you on how to run your maintenance program on a daily basis. This is the backbone of successful maintenance analysis.
On the flip side, all maintenance analysis is a work in progress. Revisit your metrics on a regular basis to make sure they’re still relevant to your goals and the way your maintenance team works. Some of the metrics listed above might work for you now, but you might find others are more effective in six months. Or maybe five years.
Lastly, the best maintenance analysis incorporates data that other departments find useful. If you can connect the metrics above to solve the challenges of other business units, you’ll be well on your way to creating a world-class maintenance program.