The Insight-to-Action Gap: How Integrating ML Anomaly Detection with CMMS Eliminated 84% of Unplanned Downtime
Key Highlights
- Integrating ML anomaly detection directly with SAP Plant Maintenance (CMMS) to auto-generate work orders eliminated the manual triage step, cutting unplanned downtime by 84% and avoiding $5.3 million in costs.
- The architecture uses a graduated alerting hierarchy in which only high-confidence detections trigger automatic work orders, which prevents backlog buildup and preserves trust in the alerting system and CMMS data.
- Deploying entirely within the plant's OT network (via Docker Swarm, with no external connectivity) was the key architectural decision that made the historian, Active Directory and CMMS integrations secure and operationally viable.
The most frequent failure mode in industrial predictive maintenance systems is not an incorrect model. It’s a broken handoff. The platform identifies an anomaly. An alert is triggered on a dashboard. Once a reliability engineer sees the alert, they may or may not have a maintenance response workflow to initiate.
This is the insight-to-action gap, and it is an architectural problem, not a data science problem. It is about integrating predictive maintenance systems into the automation and maintenance execution ecosystems, rather than treating them as an analytics island.
This article describes the integration architecture I developed for EagleAPM, which is used at Novelis, an aluminum rolling and recycling company, and how the system went from a monitoring platform to a maintenance execution system once machine learning (ML)-based anomaly detection was integrated with the CMMS work order system.
I want to clarify that EagleAPM is an internal application used at Novelis. The integration architecture and requirements were designed in-house by the data science team. Konverge AI was engaged as the development partner to build the application according to the design document and requirements specification.
Integration architecture
EagleAPM was built with three integration layers to close the insight-to-action gap: historian integration for real-time ingestion of sensor data; identity integration for OT-compliant user management; and maintenance execution integration for the automatic generation of maintenance work orders.
Using a mix of standard OPC UA and proprietary historian APIs, the historian integration layer connects to the plant’s time-series data infrastructure and ingests continuous sensor streams from cold mill assets into EagleAPM’s data tier. This layer is designed to be thin; its only responsibility is to provide high-quality, low-latency data to the processing pipeline.
Data ingestion does not include any transformations or feature engineering, as these processes reside in the business layer and can be controlled and versioned independently.
The identity integration layer connects to the plant’s Active Directory for user authentication, authorization and role-based access control. In Novelis’s OT environment, identity management cannot rely on a cloud identity provider, as such a provider would require external connectivity and violate the company’s OT security policies. Native Active Directory integration therefore ensures that user provisioning, deprovisioning and role changes occur through the same governance processes that control access to other systems within the plant.
CMMS integration
The most important operational integration is with the plant's SAP computerized maintenance management system (CMMS), known as SAP Plant Maintenance. It manages maintenance workflows, work orders, asset histories and scheduling. The integration architecture generates work orders directly in SAP when high-confidence ML alerts are triggered. This integration closes the insight-to-action gap.
When an alert is triggered, whether by a static threshold breach or other anomaly, or by an ML model that detects a precursor to failure across several variables, it can be configured to automatically generate a work order in the CMMS. This integration runs through an automation gateway, which takes structured Microsoft Message Queuing (MSMQ) messages and translates them into application programming interface (API) calls to the CMMS.
The automation gateway is a set of servers that acts as a decoupling layer between the IT and OT networks. It sits at the boundary between the two environments, receiving structured messages (via MSMQ) from the EagleAPM alerting layer on the OT side and translating them into API calls to SAP on the IT side. This separation ensures that the OT network never communicates directly with IT systems, maintaining the security segmentation required in industrial environments while still enabling automated work order creation.
Each message payload includes the asset identifier, alert type, severity classification and sensor values, along with the detection logic that triggered the alert, equipping the maintenance technician with the necessary context to diagnose the asset before arriving.
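The payload and the gateway's translation step can be sketched roughly as follows. This is an illustrative sketch only: the `AlertMessage` class, the JSON encoding, the field names and the shape of the work order request are all assumptions made for the example, not EagleAPM's actual message schema or SAP's API. The MSMQ transport and the SAP call itself are deliberately left out.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class AlertMessage:
    """Hypothetical alert payload carrying the fields the article lists."""
    asset_id: str
    alert_type: str        # e.g. "static_threshold" or "ml_anomaly" (assumed values)
    severity: str          # e.g. "low" or "high" (assumed values)
    sensor_values: dict    # tag name -> latest reading
    detection_logic: str   # description of the rule or model that fired


REQUIRED_FIELDS = ("asset_id", "alert_type", "severity",
                   "sensor_values", "detection_logic")


def encode_alert(alert: AlertMessage) -> str:
    """OT side: serialize the alert into a message body for the queue."""
    return json.dumps(asdict(alert), sort_keys=True)


def to_work_order_request(message_body: str) -> dict:
    """Gateway side: validate the message and build a work order request.

    The returned dict is a made-up request shape; a real gateway would map
    these values onto whatever the CMMS API expects.
    """
    data = json.loads(message_body)
    missing = [name for name in REQUIRED_FIELDS if name not in data]
    if missing:
        raise ValueError(f"alert message missing fields: {missing}")
    readings = ", ".join(
        f"{tag}={value}" for tag, value in sorted(data["sensor_values"].items())
    )
    return {
        "equipment": data["asset_id"],
        "priority": "urgent" if data["severity"] == "high" else "routine",
        "short_text": f"{data['alert_type']} alert on {data['asset_id']}",
        "long_text": f"Detection logic: {data['detection_logic']}. "
                     f"Readings: {readings}",
    }
```

Rejecting incomplete messages at the gateway keeps malformed alerts from becoming half-empty work orders, which matters because the technician relies on the work order text for diagnosis.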
This design has two important consequences for operations. First, it removes the manual triage step, the greatest source of alert-to-response delay in traditional monitoring programs. This means the reliability engineer does not need to view the dashboard alert, interpret it, decide to act on it and then create a work order. Work orders are created automatically and sent to the most appropriate maintenance queue based on the asset type and alert level.
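Routing by asset type and alert level can be as simple as a lookup table with a fallback. The sketch below assumes hypothetical asset types, alert levels and queue names; the article does not describe how EagleAPM or SAP actually represent maintenance queues.

```python
# Hypothetical routing table: (asset type, alert level) -> maintenance queue.
ROUTING_TABLE = {
    ("gearbox", "high"): "mechanical-urgent",
    ("gearbox", "medium"): "mechanical-planned",
    ("motor", "high"): "electrical-urgent",
}

# Assumed fallback: anything unmapped goes to a human for review.
DEFAULT_QUEUE = "reliability-review"


def select_queue(asset_type: str, alert_level: str) -> str:
    """Pick the maintenance queue for a new work order."""
    key = (asset_type.lower(), alert_level.lower())
    return ROUTING_TABLE.get(key, DEFAULT_QUEUE)
```

Keeping the table as data rather than branching logic means a new asset class can be routed by adding a row, and unmapped combinations still land somewhere visible instead of being dropped.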
Second, it establishes a complete audit trail that connects the ML detection event to the maintenance action and its outcome. This audit trail is crucial for the feedback loop needed to improve alert accuracy: if a work order created from an ML alert later confirms a failure, that result validates the alert and feeds back into model retraining.
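One way to picture the feedback loop is as a join between alerts and work order outcomes. The sketch below is an assumption-laden illustration: the `AuditRecord` structure is hypothetical, and "alert accuracy" is taken here to mean the share of closed, alert-generated work orders in which a fault was confirmed, which the article does not define precisely.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AuditRecord:
    """Hypothetical link between an alert and the work order it created."""
    alert_id: str
    work_order_id: str
    failure_confirmed: Optional[bool]  # None while the work order is still open


def alert_accuracy(records):
    """Fraction of closed work orders where the technician confirmed a fault.

    Returns None when no work order has been closed yet.
    """
    closed = [r for r in records if r.failure_confirmed is not None]
    if not closed:
        return None
    return sum(r.failure_confirmed for r in closed) / len(closed)


def training_labels(records):
    """Map alert IDs to 1 (confirmed) or 0 (not confirmed) for retraining."""
    return {
        r.alert_id: int(r.failure_confirmed)
        for r in records
        if r.failure_confirmed is not None
    }
```

Open work orders are excluded from both functions so that unresolved alerts neither inflate nor deflate the accuracy figure.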
The notification layer
Of course, not all alerts should create an automatic work order. EagleAPM’s notification system includes configurable escalation logic. Low-severity alerts create dashboard notifications and, depending on the configuration, an email or message to the reliability engineer. High-severity alerts create dashboard notifications and automatically generate a work order.
Plant-level engineers can set severity thresholds and escalation rules through the self-service interface without changing the integration layer.
Making this level of customizability available to individual users is fundamental to system adoption within an organization.
Consider a system that automatically generates a CMMS work order for each individual threshold alarm. In such a case, work order backlogs will be created and users will lose trust in both the alerting system and the CMMS data. The notification layer in EagleAPM uses a graduated level-of-confidence alerting hierarchy, meaning that automatic work order creation will be triggered only by highly confident, escalated alerts that warrant a response.
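The graduated hierarchy can be sketched as a small rule evaluation in which every alert reaches the dashboard but only severe, high-confidence alerts create a work order. The severity names, the confidence score and the default thresholds below are assumptions chosen for illustration; they are not EagleAPM's actual configuration.

```python
from dataclasses import dataclass

# Assumed severity scale, lowest to highest.
SEVERITY_ORDER = {"low": 0, "medium": 1, "high": 2}


@dataclass(frozen=True)
class EscalationRules:
    """Hypothetical plant-level settings an engineer could adjust."""
    notify_engineer_from: str = "medium"
    work_order_from: str = "high"
    min_confidence_for_work_order: float = 0.9


def actions_for(severity: str, confidence: float,
                rules: EscalationRules = EscalationRules()) -> list:
    """Return the actions an alert triggers under the given rules."""
    level = SEVERITY_ORDER[severity]
    actions = ["dashboard"]  # every alert is shown on the dashboard
    if level >= SEVERITY_ORDER[rules.notify_engineer_from]:
        actions.append("notify_engineer")
    if (level >= SEVERITY_ORDER[rules.work_order_from]
            and confidence >= rules.min_confidence_for_work_order):
        actions.append("work_order")
    return actions
```

Because the thresholds live in a rules object rather than in the integration code, a plant engineer could, for example, pass `EscalationRules(notify_engineer_from="low")` to receive messages for low-severity alerts without touching the work order path.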
Why the integration architecture must live inside the plant network
Historian connectivity, Active Directory, the automation gateway and CMMS API integrations are all located in the plant OT network. EagleAPM is 100% containerized, with a fully functional deployment via Docker Swarm on the plant-site infrastructure, thereby requiring no external connectivity.
This is not simply a notable deployment characteristic; it is the primary architectural choice that enables the integrations to function.
Under IEC 62443, a facility's historian data, identity store and CMMS are all either OT systems or OT-adjacent systems. Reaching these systems from a cloud-hosted analytics platform means either replicating data to a third-party environment or exposing APIs that cross the OT/IT perimeter.
Either approach requires extensive security reviews, procurement approvals and ongoing governance, which makes it unsuitable for real-time monitoring.
The implementation of EagleAPM within the OT network enables all future integrations to be local to the OT network, with local governance similar to the rest of the plant systems.
Results and replication
For the cold mill deployment, the closed-loop integration of EagleAPM's ML detection with the CMMS work order system cut unplanned downtime by 84%, avoided $5.3 million in costs and sustained alert accuracy above 80% in production.
The CMMS integration meant that every high-confidence ML detection triggered a maintenance response, which eliminated the manual triage latency that had previously allowed failure precursors to progress unaddressed.
Konverge AI has implemented the same integration architecture (historian ingestion, Active Directory, notification services and an automation gateway for CMMS work order generation) for other manufacturing clients with the same OT limitations as Novelis. In this project, Konverge AI served as Novelis’s development partner, coding and building the EagleAPM application based on the design documents and requirements authored by the internal team.
The architecture Konverge AI created for this work order generation process has been successfully reproduced in multiple facilities without major redesign, confirming that the architecture captures the structural features of industrial maintenance execution environments rather than particular user needs.
About the Author

Chirag Agrawal
Chirag Agrawal is global head of data science at Novelis, the world’s largest aluminum rolling and recycling company. He originated the EagleAPM architecture and led it through formal governance by the company’s technology architecture review board.


