How to Track Manufacturing Downtime and Reduce Unplanned Stops

Published March 28, 2026 · 9 min read

The Real Cost of Downtime

Most manufacturers know that downtime is expensive. Few know how expensive it actually is. When a production line stops, the obvious costs are visible: idle labor, missed shipments, and overtime to catch up. The hidden costs are larger: expedited freight for late orders, customer dissatisfaction, lost future orders, and the ripple effect on downstream operations that were waiting for your output.

A useful rule of thumb: calculate revenue per machine operating hour, then multiply by 2 to 3 to account for the indirect costs. If a machine generates $500 per operating hour in revenue, each hour of unplanned downtime actually costs $1,000 to $1,500 once the full impact is included. For a plant with 10 machines averaging 85% Availability (1.2 hours of downtime per 8-hour shift per machine), that is 12 machine-hours lost every shift, and the annual cost often exceeds $500,000 even for small operations.
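To make the arithmetic concrete, here is a minimal Python sketch of the rule of thumb applied to the example plant. The number of shifts per day and working days per year are assumptions added for illustration; the article does not state them, so substitute your own schedule.

```python
# Rough annual downtime cost for the example plant described above.
# shifts_per_day and days_per_year are assumptions, not figures from the article.

def annual_downtime_cost(machines, availability, shift_hours,
                         revenue_per_hour, multiplier,
                         shifts_per_day=1, days_per_year=250):
    """Return (downtime_hours_per_year, annual_cost_in_dollars)."""
    downtime_per_shift = shift_hours * (1 - availability)  # hours, per machine
    hours = machines * downtime_per_shift * shifts_per_day * days_per_year
    return hours, hours * revenue_per_hour * multiplier


hours, low = annual_downtime_cost(10, 0.85, 8, 500, 2)
_, high = annual_downtime_cost(10, 0.85, 8, 500, 3)
print(f"{hours:,.0f} downtime hours/year, ${low:,.0f} to ${high:,.0f}")
```

Under these assumed values the estimate comes out well above the $500,000 mark, which is why even a plant with fewer machines or lower revenue per hour can still cross it.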

The problem is not that manufacturers do not care about downtime. The problem is that they cannot manage what they do not measure. And most plants measure downtime poorly, inconsistently, or not at all.

Manual vs. Automated Tracking

Manual Tracking: The Paper Log

The most common approach in small and mid-size plants is a paper log at each machine or a shared spreadsheet. Operators record when the machine stopped, when it restarted, and (sometimes) why it stopped. This approach has well-documented problems:

Automated Tracking: PLC-Based Detection

Every modern PLC already knows whether the machine is running. The machine state signal — whether it comes from a dedicated status register, a combination of drive run signals and fault bits, or a PackML state machine — is the most reliable source of downtime data. It does not forget, does not round, does not under-report, and does not require operator effort.

Automated detection captures the start time, end time, and duration of every single stop event, including the 90-second jams and 3-minute material waits that operators never record. Those micro-stops, individually insignificant, often account for 10-20% of total downtime when aggregated across a shift.
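As a rough illustration of how micro-stops add up, the sketch below computes their share of total downtime from a list of stop durations. The 5-minute cutoff and the sample durations are assumptions chosen for the example, not a standard definition.

```python
# Share of total downtime that comes from short stops.
# The 5-minute cutoff and the sample durations are illustrative assumptions.

MICRO_STOP_LIMIT_S = 5 * 60  # stops shorter than this count as micro-stops


def micro_stop_share(durations_s, limit_s=MICRO_STOP_LIMIT_S):
    """Return the fraction of total downtime seconds spent in micro-stops."""
    total = sum(durations_s)
    if total == 0:
        return 0.0
    micro = sum(d for d in durations_s if d < limit_s)
    return micro / total


# One hypothetical shift: two long stops plus several short jams and waits.
shift_stops_s = [2400, 1500, 90, 90, 180, 90, 180, 90]
print(f"{micro_stop_share(shift_stops_s):.1%} of downtime came from micro-stops")
```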

The limitation of automated detection is that the PLC knows that the machine stopped, but it may not know why. Some PLCs provide detailed fault codes that map directly to root causes. Others simply report "not running." This is where operator input remains valuable — not to report that the machine stopped (the system already knows), but to assign a reason code after the fact.

Downtime Categories

A well-designed downtime categorization system is hierarchical. The top level separates fundamentally different types of stops. Lower levels provide the detail needed for root cause analysis.

Planned Downtime

Stops that were scheduled and expected. These are excluded from OEE Availability calculations because the equipment was not planned to produce during these periods.

Unplanned Downtime

Stops that were not scheduled. These directly impact OEE Availability and represent the primary target for improvement.

A Note on Changeover

The classification of changeover time is one of the most debated topics in OEE methodology. Some organizations treat it as planned downtime (excluded from OEE) because changeovers are a necessary part of the production schedule. Others count it against Availability (included in OEE) because it is time the machine is not producing, and is therefore an opportunity for improvement through SMED (Single-Minute Exchange of Die) techniques.

There is no universally "correct" answer. What matters is consistency: pick one approach and apply it uniformly across all machines. If you exclude changeover from OEE, track it separately so you can still measure and improve it. If you include it, make sure it has its own reason code so it does not get lumped in with breakdowns.
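The effect of that choice on the Availability number can be sketched as follows. This is a simplified illustration using the common definition of Availability as run time divided by planned production time; the minute values are hypothetical.

```python
# OEE Availability under the two changeover conventions.
# All minute values are hypothetical.

def availability(scheduled_min, planned_stop_min, unplanned_stop_min,
                 changeover_min, changeover_is_planned):
    """Availability = run time / planned production time."""
    planned_production = scheduled_min - planned_stop_min
    losses = unplanned_stop_min
    if changeover_is_planned:
        planned_production -= changeover_min  # excluded from OEE
    else:
        losses += changeover_min              # counted as an Availability loss
    if planned_production <= 0:
        raise ValueError("no planned production time")
    return (planned_production - losses) / planned_production


shift = dict(scheduled_min=480, planned_stop_min=30,
             unplanned_stop_min=40, changeover_min=50)
print(f"changeover excluded: {availability(**shift, changeover_is_planned=True):.1%}")
print(f"changeover included: {availability(**shift, changeover_is_planned=False):.1%}")
```

The same shift reports 90.0% or 80.0% depending on the convention, which is why mixing the two approaches across machines makes their numbers incomparable.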

Building a Pareto of Downtime Reasons

The Pareto principle (80/20 rule) is remarkably consistent in downtime analysis. In most plants, 3 to 5 reason codes account for 70-80% of total downtime minutes. A Pareto chart — a bar chart sorted by descending duration with a cumulative percentage line — instantly reveals which problems deserve attention and which are noise.

To build an effective Pareto:
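The aggregation itself is simple enough to sketch. The following Python example is an illustration only, not any particular product's implementation: it sums downtime minutes by reason code, sorts them in descending order, and adds the cumulative percentage. The reason codes and minutes are made up.

```python
from collections import defaultdict


def pareto(events):
    """events: iterable of (reason_code, downtime_minutes).

    Returns rows of (reason, minutes, cumulative_percent), sorted by
    descending minutes.
    """
    totals = defaultdict(float)
    for reason, minutes in events:
        totals[reason] += minutes
    grand_total = sum(totals.values())
    if grand_total == 0:
        return []
    rows, running = [], 0.0
    for reason, minutes in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
        running += minutes
        rows.append((reason, minutes, 100.0 * running / grand_total))
    return rows


# Hypothetical reason codes and minutes for one week.
week = [("jam", 120), ("material wait", 60), ("jam", 90),
        ("tooling fault", 150), ("sensor fault", 20),
        ("material wait", 40), ("operator break overrun", 20)]
for reason, minutes, cum in pareto(week):
    print(f"{reason:<24}{minutes:>6.0f} min {cum:>6.1f}%")
```

In this made-up data set the top three reasons cover 92% of the downtime minutes, so the remaining codes can safely wait.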

Why Operators Under-Report Downtime

Understanding why manual tracking fails is important for designing a system that works. Operators under-report downtime for several reasons:

Using PLC State Signals for Automatic Detection

The most reliable automated downtime detection uses the machine's PLC state signal, published over MQTT using the Sparkplug B specification to the monitoring platform. The implementation pattern is straightforward:

This hybrid approach gives you the accuracy and completeness of automated detection with the contextual knowledge that only a human operator can provide. The machine knows when it stopped and for how long. The operator knows why.
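A minimal sketch of that division of labor, in Python, might look like the following. It assumes the state samples have already been received and decoded (the MQTT and Sparkplug B handling is deliberately left out), and the state names, the fault code "F012", and the reason code are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

RUNNING = "RUNNING"  # hypothetical state name; real values depend on the PLC


@dataclass
class StopEvent:
    start: datetime
    end: datetime
    fault_code: Optional[str] = None   # initial classification from the PLC, if any
    reason_code: Optional[str] = None  # assigned later by the operator

    @property
    def duration_s(self) -> float:
        return (self.end - self.start).total_seconds()


def detect_stops(samples):
    """samples: time-ordered (timestamp, state, fault_code) tuples.

    Emits one StopEvent per contiguous run of non-RUNNING states.
    A stop still open at the end of the input is not emitted.
    """
    events, stop_start, stop_fault = [], None, None
    for ts, state, fault in samples:
        if state != RUNNING and stop_start is None:
            stop_start, stop_fault = ts, fault      # machine just stopped
        elif state == RUNNING and stop_start is not None:
            events.append(StopEvent(stop_start, ts, stop_fault))
            stop_start, stop_fault = None, None     # machine restarted
    return events


samples = [
    (datetime(2026, 3, 28, 8, 0, 0), "RUNNING", None),
    (datetime(2026, 3, 28, 8, 14, 0), "STOPPED", "F012"),
    (datetime(2026, 3, 28, 8, 15, 30), "RUNNING", None),
    (datetime(2026, 3, 28, 9, 2, 0), "STOPPED", None),
    (datetime(2026, 3, 28, 9, 30, 0), "RUNNING", None),
]
stops = detect_stops(samples)
stops[1].reason_code = "material wait"  # operator fills in the "why" afterwards
for s in stops:
    print(s.start.time(), f"{s.duration_s:.0f}s", s.fault_code, s.reason_code)
```

The 90-second stop is captured with its PLC fault code and no operator effort, while the longer stop, which arrived with no fault code, gets its reason from the operator after the fact.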

Calculating the Cost of Downtime

To translate downtime minutes into dollars, you need a cost-per-minute figure for each machine or production line. The basic formula:

Downtime Cost = Downtime Minutes × (Revenue per Minute + Labor Cost per Minute + Overhead per Minute)

For a rough estimate, take the machine's annual revenue contribution, divide by annual operating minutes, and multiply by 1.5 to 2.5 to account for indirect costs (expediting, overtime, customer penalties, etc.). Even a rough cost figure transforms downtime discussions. "We had 47 minutes of downtime on Line 3" is abstract. "We lost $2,800 on Line 3 today due to downtime" gets attention.
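The basic formula translates directly into code. In this sketch the per-minute rates are hypothetical, picked so that a 47-minute stop lands near the $2,800 figure quoted above.

```python
def downtime_cost(downtime_min, revenue_per_min, labor_per_min, overhead_per_min):
    """Downtime Cost = Downtime Minutes x (Revenue + Labor + Overhead per Minute)."""
    return downtime_min * (revenue_per_min + labor_per_min + overhead_per_min)


# Hypothetical per-minute rates for Line 3.
cost = downtime_cost(47, revenue_per_min=40.0, labor_per_min=12.0, overhead_per_min=8.0)
print(f"${cost:,.0f}")  # 47 x 60 = 2,820
```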

From Tracking to Reduction

Tracking downtime is only valuable if it leads to action. The standard improvement cycle:

This is not a one-time exercise. It is a weekly rhythm. The top 2-3 Pareto items should have assigned owners and active improvement projects at all times. As the top items are resolved, the next tier moves up and becomes the focus. Over months, this disciplined approach compounds into significant OEE gains.

How PulseMQ Tracks Downtime

PulseMQ auto-detects every downtime event from PLC machine state signals published via MQTT. Every state transition is logged with precise timestamps. The platform distinguishes between planned and unplanned downtime based on your configured shift schedules and maintenance windows.

For unplanned stops, operators can assign reason codes from a configurable list directly on the production dashboard — on a shop floor tablet, their phone, or any browser. PLC fault codes are automatically captured as the initial classification. The AI agent can suggest reason codes based on patterns it has learned from historical data.

Built-in Pareto analysis shows downtime by reason, by machine, by shift, and by time period. Drill down from a plant-wide view to a single machine's downtime history in two clicks. Export data for deeper analysis or integration with your CMMS (Computerized Maintenance Management System).

Because downtime tracking is integrated with OEE calculation, job tracking, and environmental monitoring, you can correlate downtime events with production context. Did the breakdown happen during a specific product run? Was ambient temperature elevated? Was the machine running a new recipe? These correlations are impossible with standalone downtime tracking tools.

Stop Guessing About Downtime

Automatic detection from PLC signals. Operator reason codes. Pareto analysis. Every stop, every machine, every shift, with no manual logging of start and stop times.
