You might have an extra production line(s) hidden in your factory

Optimizing OEE for capacity and profit

Quentin Samelson
The Future of Electronics
Aug 9, 2018


Availability, performance and quality are the factors you can use to drive optimal output. It might be time to reassess them in light of IoT and AI.

We used to call running out of manufacturing capacity a “good problem to have,” but it’s still a problem. If your customers want more product than you can build, you need to work out a way to increase production levels. In the short term, you can work overtime. In the medium term, you can increase capacity by adding a shift (once you’ve hired and trained people for it). If you’re already working three shifts, five days a week, you can work out a way to schedule weekends as well. You can even work to reduce downtime for preventive maintenance, and find a way to keep lines running during lunch breaks.

What if all those actions aren’t enough? Is it time to drop $5M or $10M to set up another manufacturing line? Does your factory have the open space available? If not, will you need to build, buy or lease additional space, and staff positions like HR and security for that space?

Or… is it time to work on your OEE?

You probably already know that OEE stands for “Overall Equipment Effectiveness”: essentially, the percentage of time that your manufacturing lines (or your manufacturing operation overall) are actually running, at their ideal cycle time, and making good parts. OEE is a way to assess the overall performance of a manufacturing operation. The calculation is simple. OEE is the product of three ratios:

OEE = (Availability) x (Performance) x (Quality), where:

• Availability = Actual Run Time/ Planned Production Time
or = (Planned Production Time - Stop Time) / Planned Production Time

• Performance = Net Run Time/ Actual Run Time
or = (Ideal Cycle Time x Total Count) / Actual Run Time

• Quality = Good Count/ Total Count

Some of these factors cancel each other out, so OEE can be simplified to:

OEE = (Good Count x Ideal Cycle Time) / Planned Production Time.
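To see the cancellation at work, here is a minimal Python sketch that computes OEE both ways from the same inputs. The function names and the sample week (100 planned hours, 10 hours stopped, 900 units built, 855 good) are made up for illustration and are not taken from any real factory.

```python
def oee_components(planned_time_h, stop_time_h, ideal_cycle_time_h,
                   total_count, good_count):
    """Return (availability, performance, quality, oee) as fractions of 1."""
    actual_run_time_h = planned_time_h - stop_time_h
    availability = actual_run_time_h / planned_time_h
    performance = (ideal_cycle_time_h * total_count) / actual_run_time_h
    quality = good_count / total_count
    return availability, performance, quality, availability * performance * quality


def oee_simplified(planned_time_h, ideal_cycle_time_h, good_count):
    """OEE after Actual Run Time and Total Count cancel out."""
    return (good_count * ideal_cycle_time_h) / planned_time_h


# Hypothetical week: one unit every 5 minutes (1/12 hour) is the ideal cycle time.
a, p, q, oee = oee_components(100, 10, 1 / 12, 900, 855)
print(f"A={a:.1%}  P={p:.1%}  Q={q:.1%}  OEE={oee:.2%}")
print(f"Simplified OEE={oee_simplified(100, 1 / 12, 855):.2%}")
```

Both routes give the same OEE. The three-factor version is the one worth keeping, because it shows which factor is dragging the number down.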

OEE can also be viewed as the ratio of good products produced to the number of good products planned to be produced. For instance, if we plan to run a factory 20 hours/ day, and we expect to build a product every five minutes (12/hour), at 100% OEE we’d get a count of 240 good products every day, or 1200 in a five-day week.

DIY OEE Modeling

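Say we start with the plan above: 20 hours/ day for five days (100 hours of Planned Production Time) at an Ideal Cycle Time of five minutes (1/12th hour).

For a total of 700 good products made that week, we can calculate an OEE of: 700 x (1/12th hour) / 100 hours = 700/1200 = 58.33%.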

Understanding which factor — availability, performance, or quality — caused the shortfall is the key to improving OEE:

  • Perhaps the factory built 1200 units, but 500 of them failed for poor quality. That would indicate quality needs to be improved.
  • Maybe quality was ok — 95% — but the line shut down repeatedly for lack of materials (poor availability — in this case, about 61%).
  • Possibly, both availability and quality ran at 95% all week, but the line ran too slowly (slow cycle time/ poor performance) — instead of one unit every 5 minutes, it took (on average) 7 minutes 44 seconds (about 65% performance) to complete a unit.
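The three scenarios above can be checked with a few lines of Python. This is only a sketch: it assumes that any factor not mentioned in a scenario ran at 100%, and the variable names are mine.

```python
TARGET_OEE = 700 / 1200  # 700 good units against 1200 planned

# Scenario 1: the line built all 1200 units, but only 700 were good.
quality_only = 700 / 1200

# Scenario 2: quality at 95%, performance assumed at 100%; solve for availability.
availability_needed = TARGET_OEE / (0.95 * 1.0)

# Scenario 3: availability and quality both at 95%; solve for performance,
# then convert it into an average cycle time (the ideal is 5 minutes).
performance_needed = TARGET_OEE / (0.95 * 0.95)
actual_cycle_min = 5 / performance_needed
minutes, seconds = divmod(round(actual_cycle_min * 60), 60)

print(f"Scenario 1 quality:      {quality_only:.1%}")
print(f"Scenario 2 availability: {availability_needed:.1%}")
print(f"Scenario 3 performance:  {performance_needed:.1%} "
      f"({minutes} min {seconds} s per unit)")
```

Each scenario lands on the same 58.33% OEE, which is why the single number alone cannot tell you where to start.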

The bad news is that in the real world, 100% OEE is an unrealistic expectation. The good news is that OEE can be improved. World-class OEE — around 85% — is achievable. (60% is fairly typical, and 40% isn’t all that unusual. Don’t feel bad if your factory’s OEE is poor at the moment — that means there is a real opportunity to obtain additional capacity without a major investment in new production equipment, buildings, or personnel.) The first step is to measure OEE and begin understanding what the causes of poor performance are, week after week, so you can start to eliminate them.

Better still, improving OEE more than pays for itself. For example, let’s say you have 10 manufacturing lines, running at an average OEE of 55%. You don’t even have to achieve world-class performance to effectively find a new manufacturing line (or two, or three!) hidden inside those numbers. Ten lines running at 75% OEE can output about 36% more (good) product than those same 10 lines running at 55% OEE (0.75 / 0.55 ≈ 1.36).
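Because good output scales directly with OEE, the size of the "hidden" capacity is just the ratio of the two OEE values. A small sketch, with a hypothetical helper name, expresses the gain as the number of extra lines you would otherwise have needed at the old OEE:

```python
def equivalent_extra_lines(line_count, current_oee, improved_oee):
    """Extra output from an OEE improvement, expressed as a number of
    additional lines running at the current (unimproved) OEE."""
    return line_count * (improved_oee / current_oee - 1)


extra = equivalent_extra_lines(10, 0.55, 0.75)
print(f"Equivalent to about {extra:.1f} extra lines at 55% OEE")
```

For 10 lines moving from 55% to 75%, that works out to between three and four lines' worth of output without buying a single machine.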

Isolating the causes of poor OEE performance

Let’s suppose that your operations generally run between, say, 60% on a good day and 42% on a bad day. That indicates plenty of room for improvement — there’s almost a whole second factory hidden inside your factory! But how do you go about increasing your factory’s performance, and decreasing the variability of your OEE metrics?

There’s no single cure — the causes of poor OEE are as varied as the products made, and the equipment used to manufacture them. But there are some simple steps you can take that will lead you to a better understanding of the situation in your own facility — and it isn’t unusual to discover that substantial improvement is within your reach.

Step 1 is to take data, ensuring that it is valid, consistent, and standardized. You really only need five data elements[1], but you may want to add a few more in order to understand what caused specific aspects of performance. The bottom line is that each production line should be keeping a running log of downtime (or “stop time”), actual units produced (“total count”), and good units produced — and any notes that will help them understand why things didn’t go quite as planned. Each line should collect data the same way, using the same definitions and rules.
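One way to make sure every line collects data "the same way" is to give them all the same record to fill in. The sketch below is one possible layout, not a standard: the class and field names are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class StopEvent:
    minutes: float
    reason: str  # free-text note on why the line stopped


@dataclass
class LineLog:
    """One line's running log for one production day (hypothetical layout)."""
    line_id: str
    planned_minutes: float       # Planned Production Time
    ideal_cycle_minutes: float   # Ideal Cycle Time per unit
    total_count: int = 0         # all units produced
    good_count: int = 0          # units that passed
    stops: list[StopEvent] = field(default_factory=list)

    @property
    def stop_minutes(self) -> float:
        return sum(stop.minutes for stop in self.stops)

    @property
    def actual_run_minutes(self) -> float:
        return self.planned_minutes - self.stop_minutes


log = LineLog(line_id="Line 1", planned_minutes=960, ideal_cycle_minutes=1.0)
log.stops.append(StopEvent(minutes=30, reason="waiting for material"))
print(log.actual_run_minutes)
```

Actual Run Time is derived from the stop log rather than entered by hand, so two lines cannot disagree about how it is defined.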

❇︎ ⁃⁃⁃ ❇︎ ⁃⁃⁃ ❇︎ ⁃⁃⁃ ❇

Let’s talk about the Availability metric first. You should know Planned Production Time at the beginning of each work day. If you’re running a two-shift operation, your planned production time will most likely be 16 hours each day. (But note: if you plan to shut the line down for lunch and two breaks each shift, your planned production time may only be 14 or 15 hours per day. And if you schedule overtime, you’ll need to add that time to the Planned Production Time total for the day.)

You can collect Actual Run Time by logging all situations where the line shuts down. The duration of these unplanned downtime events is subtracted from the Planned Production Time to yield Actual Run Time.

The Availability component of OEE is:

Availability = Actual Run Time/ Planned Production Time
or = (Planned Production Time — Downtime) / Planned Production Time

One factory where I worked consistently blamed production shortfalls on downtime due to supplied materials shortages — the production lines had the idea that the primary cause of their downtime was our suppliers’ failure to deliver on time. So we began collecting data — one of our buyers attended every production meeting; every time the production line reported downtime due to a lack of materials, we investigated until we identified the root cause. We discovered that the materials function wasn’t blame-free; but the problem was almost never late deliveries from suppliers. The true causes[2] were things like:

  • Undocumented scrap in production
  • Misplaced stock in the warehouse
  • BOM errors (including one situation where the line was using double the quantity on the official system BOM)
  • Scheduling changes (sometimes initiated by production managers who were measured on total production time — so they would sometimes build extra quantities, or try to build products that weren’t scheduled for production that week at all)
  • Supplier Quality issues
  • Machine issues & unplanned maintenance
  • Lack of a common understanding of basic MRP data (mostly what a requirement quantity-date combination really meant)

Just taking the data and identifying root causes gained a lot of credibility for the materials function — but solving the issues (which was often surprisingly easy) made an enormous difference in our Availability metric, and the whole plant’s performance.
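Once each downtime incident has a root cause attached, ranking the causes by lost time takes only a few lines. The log entries below are invented purely to show the mechanics; they are not the figures from the study mentioned above.

```python
from collections import Counter

# (root cause, minutes of downtime) - made-up entries for illustration only
downtime_log = [
    ("BOM error", 45),
    ("Misplaced stock", 30),
    ("BOM error", 60),
    ("Unplanned maintenance", 90),
    ("Misplaced stock", 20),
    ("Schedule change", 40),
]


def downtime_by_cause(log):
    """Total downtime minutes per root cause, largest first."""
    totals = Counter()
    for cause, minutes in log:
        totals[cause] += minutes
    return totals.most_common()


for cause, minutes in downtime_by_cause(downtime_log):
    print(f"{cause:<22} {minutes:>4} min")
```

Sorting by total minutes rather than by number of incidents keeps attention on the causes that actually cost the most run time.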

And by the way… there’s an interesting calculation you can do. Take the total number of downtime incidents, and the total amount of downtime for a time period, and compare it to the total amount of resolution time for those incidents. For instance, suppose your line was down 10 hours in a 5-day week, with an average of two incidents per day, but on average it only took your team 15 minutes to get the line running again (10 incidents x 15 minutes = 150 minutes, or 2 ½ hours, of resolution time). That tells you that you lost 7 ½ hours of production time waiting to start resolution. This is the lowest-hanging of low-hanging fruit: more timely notifications to the right people when the line stops can cut reaction time and restore hours of active time.
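That comparison is simple enough to script. A minimal sketch, using a hypothetical function name and the numbers from the example:

```python
def reaction_time_lost_h(total_downtime_h, incident_count, avg_resolution_min):
    """Downtime hours spent waiting for someone to start working on the stop."""
    resolution_h = incident_count * avg_resolution_min / 60
    return total_downtime_h - resolution_h


# 10 hours down in the week, 2 incidents/day x 5 days, 15 minutes to fix each.
lost = reaction_time_lost_h(total_downtime_h=10, incident_count=10,
                            avg_resolution_min=15)
print(f"{lost} hours lost waiting to start resolution")
```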

The Performance metric is really about operating to your planned cycle time — if you’ve set up your lines to build 60 units/hour, and you’re only achieving 40 units/ hour, you’d have to run the line 50% longer to get the planned quantity of product. To calculate Performance, you’ll need the same Actual Run Time (Planned Production Time — Stop Time) that you used for the Availability metric. You’ll also need to know the Ideal Cycle Time (time period to build one unit — 60 units/hour is equivalent to 1/60th of an hour per unit) and the Total Count produced (e.g. 640 units).

The Performance component of OEE is:

Performance = Net Run Time/ Actual Run Time
or = (Ideal Cycle Time x Total Count) / Actual Run Time

With the examples above, the Ideal Cycle Time is 1/60 hours per unit. With a total count of 640 units and a 16-hour Actual Run Time (assuming no down time that day), we’d get a performance metric of (1/60 * 640)/16 = roughly 0.67 or 67%.
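The same arithmetic in Python, as a quick sketch with hypothetical function names. It also reproduces the "50% longer" figure for a line planned at 60 units/hour that only achieves 40.

```python
def performance(ideal_cycle_time_h, total_count, actual_run_time_h):
    """Performance = (Ideal Cycle Time x Total Count) / Actual Run Time."""
    return ideal_cycle_time_h * total_count / actual_run_time_h


def extra_run_time_fraction(planned_rate, actual_rate):
    """How much longer the line must run to hit the planned quantity."""
    return planned_rate / actual_rate - 1


print(f"Performance: {performance(1 / 60, 640, 16):.0%}")
print(f"Extra run time needed: {extra_run_time_fraction(60, 40):.0%}")
```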

My experience is that the Performance metric is more likely to be a cause of poor OEE if your manufacturing processes have a large number of manual operations, or transfers between dissimilar pieces of automation. Highly automated and integrated processes usually operate according to their intended cycle time, or they don’t run at all (which would be visible in the Availability metric). If you remember reading “The Goal” by Eliyahu Goldratt, addressing cycle time issues to improve the Performance metric is often an exercise in finding bottlenecks.

The last metric is Quality. This metric is very straightforward:

Quality = Good Count/ Total Count

The Total Count was collected as part of the Performance metric. Good Count requires a little discussion. If a manufacturing line produces 10% nonconforming units, but nine out of ten of those can be easily repaired, this metric will be less troublesome if those nine units are repaired and re-tested the same day they were originally produced. That way, the majority of the good products will be counted on the day they were actually manufactured.

I’ve worked in companies that permitted the creation of “bone piles” — defective products that were planned to be repaired but were allowed to accumulate until finally they represented full days and even weeks of equivalent production. When these are finally repaired, they can disrupt the metrics. Even if matters never grow so far out of control, the continued influence of repaired but delayed units can make the metric less trustworthy.
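A short sketch shows how a bone pile can distort the number. It assumes, purely for illustration, that repaired units are added to Good Count on the day they are repaired; the function name and unit counts are made up.

```python
def daily_quality(total_count, good_first_pass, repaired_counted_today=0):
    """Quality = Good Count / Total Count, where Good Count includes any
    repaired units that are credited to today (an assumed counting rule)."""
    return (good_first_pass + repaired_counted_today) / total_count


# Same-day repair: 100 built, 90 pass, 9 of the 10 rejects fixed that day.
print(f"{daily_quality(100, 90, 9):.0%}")

# Bone pile cleared: 100 built, 95 pass, plus 40 old units repaired today.
print(f"{daily_quality(100, 95, 40):.0%}")
```

Under that counting rule the second day reports a Quality figure above 100%, which is a clear sign that the metric no longer describes what the line did that day.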

These metrics will give you a high-level understanding of what elements are degrading your OEE performance. One of my colleagues described a framework to address the causes of poor performance:

1. Predict and Pinpoint Production Losses (note that these losses extend beyond ‘just’ OEE: they include other areas of concern like energy & tooling)

a. Equipment-Related Losses

b. Process-Related Losses

c. Productivity-Related Losses

2. Prescribe the Best Action

a. Analyze

b. Advise

3. Achieve (Maximize Throughput & Yield, Eliminate Waste)

A simple progression for optimizing production is to first monitor, then predict, and finally to optimize. Often, simply being more aware of what’s happening on your production lines will expose some opportunities for quick wins. As you improve your ability to monitor your production lines, you can begin to predict (and prevent) future issues, and finally optimize your production capability.

IBM Plant Performance Analytics

The gotcha with that progression is that manual monitoring mechanisms have to be maintained, or they will slip. Manual systems can be slow to react, losing minutes or even hours of production time. And manual monitoring systems don’t produce the data that will help you prevent future downtime, let alone predict it. We need data to be able to automatically detect and alert production staff to anomalies, and we need to collect and organize that data so we can move past mere detection to prediction and optimization. So once your company has an effective method to monitor its production lines manually, the next step is to automate that detection and alerting.

In my experience, a lot of companies have allowed themselves to stall right there. They recognize that real-time data from their production lines would let them react faster and more positively, improving their ability to predict and prevent the kind of situation that leads to downtime. They might even recognize that most of the data they need is already being generated by the equipment they’ve installed on their lines. (This is particularly true in electronics assembly, where modern pick-and-place equipment, soldering machines, and other production equipment often have dozens, if not hundreds, of sensors to keep the machines running and help diagnose failures.) But they haven’t acted to harness that data and make use of it.

Why don’t they take that next step — to build an IT backbone so that the data they are generating can be used to improve their operations? I think it boils down to a simple lack of knowledge — and, in some cases, perhaps a little fear and trepidation left over from early proposals when the technology was still new. The fact that many companies are missing is simple: one of the best use cases for the Internet of Things (IoT) is to improve factory performance.

And that’s what we’re talking about — IoT or IIoT, depending on whether you want to tack the word “industrial” on in front. A few years ago, IoT may have looked like “too much money for too little benefit,” but if so, the ratio has reversed itself. In fact, the cost/benefit ratio has shifted to where building an IoT infrastructure (and using it!) may be one of the best investments you make, especially if your OEE is well below world-class levels[3]. IoT data, and analytics that use that data, can enable a permanent improvement in OEE — getting better use from your existing investment in production equipment, and avoiding the need to buy more.

Using the sensors already in your production equipment, and adding wired or wireless sensors where needed, is the most important step in establishing an IoT Infrastructure as data producer. Once we have that, my colleague Peter Xu explains that “establishing an IoT Platform as the data consumer in a manufacturing setting could be as simple as setting up an IIoT gateway/box, which retrieves data with its built-in connectors for industrial equipment and sensors, performs some local analytics, and then feeds the massive amounts of data generated either into an ‘on-premises’ data lake/private cloud, or into a public cloud like IBM’s Watson IoT platform.

Having said this, the process of establishing an IoT infrastructure is relatively straightforward (and not terribly expensive). Essentially, you need:

  • IoT sensors. As mentioned, since electronics manufacturing relies heavily on automated machinery, many electronics manufacturing lines will already have many of these. So will most other industries that have set up automated manufacturing lines in the last couple of decades. As necessary, a company can add a sensor here or there: to monitor conveyor speed, for instance, or temperature & humidity on the line, or to count pieces as they pass by. Sensors are reasonably inexpensive: some are under $10, and even the most sophisticated are typically under $50. (There are a few exceptions, especially for monitoring things like fluid pressures.)

  • Routing capability. This can be done in several different ways. Most companies will purchase an IoT router (see footnote[4] for some options). These work very much like the internet routers used to connect laptops in a home or business, but they’re focused on transferring the data from IoT sensors to a single destination. Some of them have up to 80 ports — which is usually enough for even a complex manufacturing line. Another option is to use wireless technology. This uses LTE technology, just like 4G mobile telephones. It’s a little more expensive but much more flexible (valuable if you reconfigure your production equipment frequently, or you just don’t want the bother of extra wiring.)
  • An IoT database. The data created by the sensors, and consolidated by the router, needs a destination. The easiest, and generally cheapest, way to provide this is to simply send the data to the cloud. IoT data in the cloud can be stored so that the data can be observed over time. A client could also store the IoT data on a server, on their own premises.

The easiest way to establish an IoT Platform, so that the data can be “consumed” and made useful, is to send the data to an existing cloud offering. There, the data will be harmonized and standardized, so you won’t have to worry about which supplier’s sensors are being used in which locations. Once the data has been standardized and stored, it can be accessed by analytical tools on the IoT platform. Those tools can be descriptive, predictive or prescriptive; and we can also build a “digital twin” for each line so that plant managers can tell exactly what is going on down on the plant floor whether they are in their office, or halfway around the world.”

Of course, there isn’t much value in establishing an IoT infrastructure if the IoT data isn’t used for something, so let’s state that the purpose of setting up that infrastructure is to use the data from all of those sensors for one or several of these purposes:

  • Detection of an event or a “limit breach” (an “event” could be that a line has stopped; a “limit breach” might be that a temperature sensor is reading too high or low).
  • Monitoring of Key Performance Indicators (this is really another sort of detection — but of a condition rather than an event. Examples might be that a production line is only producing 80% of expected volume, or that the temperature has been gradually increasing, even though it hasn’t gone past a limit yet).
  • Analytics built using the data from a sensor or a combination of sensors — there is a variety of advanced analytical tools that go past reacting to an event, or taking preventive actions due to monitoring, and begin to utilize predictive analytics. Advanced Manufacturing Analytics can tell you things like “your ____ machine is going to shut down in three hours unless you perform _____ maintenance on it.”
  • Cognitive tools that can go beyond even the algorithms used in analytics, and learn from watching the data over time. Cognitive requires that IoT data be available for an extended period of time, so that it’s possible to learn from it.
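The first two purposes, limit-breach detection and KPI monitoring, are easy to picture in code. The sketch below is deliberately simple and does not represent any particular IoT platform's API: the function names, the window size and the thresholds are all assumptions.

```python
from statistics import mean


def limit_breaches(readings, low, high):
    """Indices of readings that fall outside the allowed [low, high] band."""
    return [i for i, value in enumerate(readings) if value < low or value > high]


def is_drifting_up(readings, window=3, min_rise=1.0):
    """True if the average of the latest `window` readings is at least
    `min_rise` above the average of the earliest `window` readings."""
    if len(readings) < 2 * window:
        return False  # not enough history to compare
    return mean(readings[-window:]) - mean(readings[:window]) >= min_rise


# Hypothetical temperature readings, all inside a 60-80 limit band.
temps = [70.0, 70.5, 71.0, 72.0, 73.0, 74.5]
print(limit_breaches(temps, low=60, high=80))  # no limit breached yet
print(is_drifting_up(temps))                   # but the trend is upward
```

The second check is the "condition rather than an event" case: nothing has crossed a limit, but the trend is worth an alert before it does.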

All of these purposes are accomplished on an IoT platform. The platform will “ingest” and process the data that is generated by the IoT infrastructure. The platform may include a semantic model, which can provide a unified view of the data generated by the sensors. The IoT infrastructure (the sensors, equipment and wired/wireless networking) essentially is the hardware Data Producer, and the IoT platform is the software Data Consumer.

The software tools on the IoT platform continually monitor and analyze the masses of data coming in from the IoT sensors. The tools available can help you:

  • Harmonize/ standardize the data. Not all sensors broadcast data in the same way, and different brands of machinery use different standards as well. (For example, you’ll need to know whether temperature sensors are reading in °F or °C.)
  • Reduce downtime — so that actual production time comes close to planned production time.
  • Maintain expected cycle times — so that your manufacturing lines actually produce the planned number of products in the planned amount of time.
  • Avoid quality defects — so that the products manufactured meet requirements and can be shipped to customers.

All of which, of course, will lead you to world-class Overall Equipment Effectiveness.
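As a small illustration of the harmonization step in that list, here is a sketch that converts temperature readings to a single unit. It assumes each reading arrives with a unit label; the function name and the accepted label formats are hypothetical.

```python
def to_celsius(value, unit):
    """Normalize a temperature reading to degrees Celsius.

    `unit` is a label such as "C", "F", "°C" or "°F" (assumed formats)."""
    normalized = unit.strip().upper().lstrip("°˚")
    if normalized == "C":
        return float(value)
    if normalized == "F":
        return (value - 32) * 5 / 9
    raise ValueError(f"Unknown temperature unit: {unit!r}")


# Hypothetical readings from two differently configured sensors.
readings = [(212, "°F"), (100, "C")]
print([to_celsius(value, unit) for value, unit in readings])
```

Raising an error on an unknown unit is a deliberate choice here: a silently mislabeled sensor is harder to find later than one that fails loudly on day one.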

If you’ve gotten this far, you should definitely connect with Quentin

Quentin’s musings on supply chain transformation

One of the leading blog posts written on supply chain in 2017 — from Quentin

A bunch of end notes:

[1] The five data elements are listed sequentially in the body of this article in bold underlined text, but here they are as a list: Planned Production Time, Actual Run Time, Ideal Cycle Time, Total Count and Good Count.

[2] The results of this study were presented at the APICS International Conference and are available in the conference proceedings. Or send me a message and I’ll forward you the paper.

[3] As mentioned earlier, world-class OEE is typically thought to be in the range of 85% or more.

[4] IoT gateways and routers are available from companies like MICA (https://www.harting-mica.com/en), Telit (https://www.telit.com/industries-solutions/smart-factoryindustry-4-0/), Cisco (https://www.cisco.com/c/en/us/solutions/internet-of-things/iot-routers.html) and others. Wireless IoT infrastructure is available from companies like Sierra (https://www.sierrawireless.com/products-and-solutions/sims-connectivity-and-cloud-services/iot-cloud-platform/).
