AVEVA PI Historian Anomaly Detection for Rotating Equipment

Most reliability engineers know what PI Data Archive is. They've been pulling trends on bearing temperatures and discharge pressures for years. What they've done less often is use that same data to build a model that tells them something is wrong before the alarm fires. That gap is exactly where we spend most of our time at Midstreamly.

What PI Actually Stores on a Midstream Asset

Start with the basics, because this matters for model design. A typical midstream compressor station will have hundreds of tags in PI Data Archive, but not all of them are useful for anomaly detection. The ones we care about fall into a few buckets.

On a reciprocating compressor, you generally have suction and discharge pressures per cylinder, interstage pressures, inlet and outlet temperatures, rod drop (if you have proximity sensors), vibration on the frame and crosshead, lube oil pressure and temperature, and sometimes cylinder pressure traces if the unit is instrumented with P-V cards. On a centrifugal compressor, you're looking at bearing vibration (X/Y proximity probes or accelerometers), bearing metal temperatures, seal gas differential pressure, thrust position, inlet guide vane position, and suction/discharge conditions. Pump assets are similar: suction/discharge pressure, flow, motor current, bearing temperatures, and casing vibration.

The raw tag count is not what matters. What matters is that PI has been collecting this data at 15-minute intervals or faster for years. Most sites we've onboarded have 18 months of continuous history per asset, minimum. That's the baseline. You can build something meaningful from it.

PI Asset Framework: The Part Most Teams Underuse

Here's the thing: PI Data Archive is just a time-series database. The intelligence is in PI Asset Framework. AF is where you define your asset hierarchy, create element templates, and build the analytical context that turns raw tag values into something interpretable.

Element templates are the core of what makes AF useful for anomaly detection. When you define a Centrifugal_Compressor element template in AF, you're declaring that every unit of this type has a set of AF attributes: BearingDE_Vib_X, BearingDE_Vib_Y, SuctionPressure, DischargePressure, and so on. Each attribute maps to a PI tag through an AF data reference. The template enforces consistent naming across your fleet. If you have 12 centrifugal compressors across 4 stations, the AF attributes are identical for all of them. That's the precondition for any fleet-level model.
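The schema that a template guarantees is easy to express, and validate, in code. Below is a minimal sketch assuming an illustrative attribute set; the names are placeholders to be matched against your own Centrifugal_Compressor template, not a fixed standard.

```python
# Minimal sketch: the attribute set an element template guarantees.
# Attribute names are illustrative; map them to your own AF template.
EXPECTED_ATTRIBUTES = {
    "BearingDE_Vib_X",
    "BearingDE_Vib_Y",
    "BearingDE_Temp",
    "SuctionPressure",
    "DischargePressure",
    "SealGas_DP",
    "ThrustPosition",
}

def missing_attributes(element_attributes: set[str]) -> set[str]:
    """Return template attributes that an AF element does not expose."""
    return EXPECTED_ATTRIBUTES - element_attributes

# Because every unit built from the template exposes the same attribute
# names, the same check (and the same model feature list) applies to all
# 12 compressors across the fleet without per-unit mapping.
```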

In our experience, this is where a lot of the integration work actually lives. Sites that have mature AF configurations save weeks of onboarding time. Sites where tags are named ABC_123_PIC_0042.PV with no AF structure, where every station named the same physical measurement differently, require a significant mapping effort before you can run a single model.

AF also gives you event frames, which are critical for training data quality. An event frame captures a time interval with a defined context: a maintenance event, an operational mode change, a known anomaly period. When we build a baseline model, we exclude event frame windows that correspond to abnormal operating periods. Without AF event frames, you're doing that exclusion manually from maintenance logs. Slow and incomplete.
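As a rough illustration of that exclusion step, here is a minimal pandas sketch. It assumes the event frame windows have already been retrieved (for example through PI Web API) as start/end pairs; the function and variable names are ours, not part of the PI toolset.

```python
import pandas as pd

def exclude_event_frames(series: pd.Series, event_frames: list[tuple[str, str]]) -> pd.Series:
    """Drop samples that fall inside any known event frame window.

    series       : tag history indexed by timestamp
    event_frames : (start, end) pairs, e.g. pulled from AF before training
    """
    keep = pd.Series(True, index=series.index)
    for start, end in event_frames:
        keep &= ~((series.index >= pd.Timestamp(start)) & (series.index <= pd.Timestamp(end)))
    return series[keep]

# Example: remove a known maintenance outage from the training window.
history = pd.Series(
    [120.0, 121.5, 180.2, 119.8],
    index=pd.to_datetime(["2024-03-01", "2024-03-02", "2024-03-03", "2024-03-04"]),
)
clean = exclude_event_frames(history, [("2024-03-03", "2024-03-03 23:59")])
```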

Rule-Based Alarming in PI vs. Statistical Anomaly Detection

This distinction matters. Traditional PI alarm configurations are threshold-based: if BearingDE_Temp exceeds 185 degrees F, trigger a high alarm. Simple. Defensible. And chronically late. By the time a bearing temperature crosses a fixed limit, the bearing has usually been degrading for days or weeks.

Statistical anomaly detection works differently. Instead of comparing a single tag to a fixed limit, you build a multivariate model that learns the normal operating envelope from historical data. When the current operating point deviates from the predicted envelope, you score the deviation. The key word is multivariate. No single sensor tells the whole story.

Consider a centrifugal compressor running at partial load. At reduced throughput, bearing temperatures naturally run cooler and vibration levels may be lower. A fixed-threshold alarm set for full-load operation will never fire on a bearing that's degrading at low load because the absolute values stay below the limit. A multivariate model trained on that specific load condition will detect the deviation because it's comparing actual behavior against predicted behavior at that operating point. The model expects certain vibration and temperature values given the current suction pressure and flow. If the values don't match, something has changed.
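The envelope model itself can take many forms. One common and simple option is PCA reconstruction error over the scaled operating variables; the sketch below is illustrative only, not a description of any particular vendor's implementation, and uses synthetic data in place of real history.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Illustrative columns (an assumption for this sketch):
# suction P, discharge P, flow, bearing temp, overall vibration
rng = np.random.default_rng(0)
X_train = rng.normal(size=(5000, 5))   # stand-in for cleaned 18-month history

scaler = StandardScaler().fit(X_train)
pca = PCA(n_components=3).fit(scaler.transform(X_train))

def anomaly_score(sample: np.ndarray) -> float:
    """Reconstruction error: how far the operating point sits from the learned envelope."""
    z = scaler.transform(sample.reshape(1, -1))
    recon = pca.inverse_transform(pca.transform(z))
    return float(np.linalg.norm(z - recon))

# A point consistent with training scores low. A bearing running hot at low
# load, where each value is individually in range but the combination is
# inconsistent with the envelope, scores high.
```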

In our validation runs, we've detected compressor bearing defects 11 to 19 days before any PI threshold alarm triggered. That's not unusual for this class of model. What's more interesting is the false-positive rate: with properly tuned multivariate models running on 18-month baselines, roughly 60 to 70% of anomaly flags turn out to be real, actionable issues, versus the 20 to 30% true-positive rate typical of threshold-alarm investigations. Less noise. More signal.

Building the Model on Top of PI Data

We use PI Web API as the primary read path for model training and inference. The alternative is OPC-UA, which some sites prefer because it's a standard protocol and doesn't require a PI-specific client license. Here's the tradeoff in practice: PI Web API gives you direct access to AF asset structure, element attributes, and event frame metadata. If you've built your AF hierarchy correctly, you can query the API and get back a structured representation of your fleet without any additional mapping. OPC-UA exposes the same underlying tag values but flattens the hierarchy. You lose the AF context. For a model that uses AF attribute names as feature identifiers and excludes training windows based on event frames, PI Web API is the right choice.
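A minimal sketch of that read path follows. The endpoint shapes track the public PI Web API pattern (element lookup by path, attribute listing, recorded values per stream), but the base URL, AF path, attribute name, and authentication are placeholders, and the calls should be checked against your PI Web API version.

```python
import requests

# Placeholders: point these at your own PI Web API server and AF database.
BASE = "https://pi-web.example.com/piwebapi"
ELEMENT_PATH = r"\\AF-SERVER\Midstream\Station01\Compressor_K101"

session = requests.Session()   # auth (Kerberos/basic) omitted in this sketch

# 1. Resolve the AF element by path, keeping the AF context intact.
element = session.get(f"{BASE}/elements", params={"path": ELEMENT_PATH}).json()

# 2. List its attributes; the names come straight from the element template.
attrs = session.get(f"{BASE}/elements/{element['WebId']}/attributes").json()["Items"]

# 3. Pull recorded values for one attribute's stream over the training window.
#    (Recorded calls are paged; production code would chunk the 18-month span.)
vib = next(a for a in attrs if a["Name"] == "BearingDE_Vib_X")
history = session.get(
    f"{BASE}/streams/{vib['WebId']}/recorded",
    params={"startTime": "*-18mo", "endTime": "*"},
).json()["Items"]
```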

For feature engineering, we correlate four primary signal groups for rotating equipment on midstream assets:

| Signal Group | Typical PI Tags | Fault Mode Sensitivity |
| --- | --- | --- |
| Vibration | Overall velocity (in/s RMS), bearing-band acceleration, proximity probe gap | Leading indicator; catches imbalance, misalignment, bearing wear |
| Temperature | Bearing metal temp, lube oil supply temp, casing surface temp | Lags vibration; confirms fault progression |
| Pressure differential | Suction/discharge delta, interstage pressure ratios | Process-side anomalies; valve wear, fouling |
| Lube oil system | Lube oil supply pressure and temperature | Oil starvation, contamination, filter differential |

The correlations between these groups carry the diagnostic signal. A bearing running hot with normal vibration is a different fault mode than a bearing running hot with elevated 1X vibration and a dropping oil supply pressure. The model encodes these multi-dimensional relationships during training. A threshold alarm treats each sensor independently. That's the performance difference.
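A hedged sketch of what cross-group feature construction can look like, assuming a pandas DataFrame of historian values indexed by timestamp; the column names are placeholders mapped from AF attribute names, not a fixed schema.

```python
import pandas as pd

def build_features(df: pd.DataFrame) -> pd.DataFrame:
    """Cross-group features from the four signal groups.

    df is indexed by timestamp at the historian rate; column names here
    are assumptions for this sketch and should be mapped from AF attributes.
    """
    f = pd.DataFrame(index=df.index)
    # Pressure differential group: process-side context for the operating point
    f["pressure_ratio"] = df["DischargePressure"] / df["SuctionPressure"]
    # Temperature vs lube oil: a cross-group relationship, not either tag alone
    f["bearing_minus_oil_temp"] = df["BearingDE_Temp"] - df["LubeOilSupplyTemp"]
    f["vib_overall"] = df["Vib_Overall"]
    # Short rolling statistics capture drift at historian timescales (hours to days)
    f["vib_roll_mean_4h"] = df["Vib_Overall"].rolling("4h").mean()
    f["oil_press_roll_min_4h"] = df["LubeOilSupplyPressure"].rolling("4h").min()
    return f.dropna()
```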

How AF Enriches the Anomaly Signal

Raw anomaly scores are useful. Anomaly scores with process context are actionable. This is the integration piece that took us the longest to get right.

When Midstreamly flags an anomaly on a compressor, we don't just surface the score. We pull the AF element attributes to attach the asset's current operating mode, the process unit it belongs to, and any active event frames that might explain the deviation. If a compressor is in a startup transient and vibration is elevated because the unit is at 40% speed, the anomaly flag gets context. If there's no active event frame and the unit is in steady-state operation, the flag gets escalated.
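The decision logic reduces to something like the sketch below. The class, threshold, and mode strings are illustrative stand-ins, not our production rules.

```python
from dataclasses import dataclass

@dataclass
class AnomalyFlag:
    score: float
    operating_mode: str          # pulled from an AF attribute
    active_event_frames: list    # event frames overlapping the scored window

def escalate(flag: AnomalyFlag, threshold: float = 3.0) -> str:
    """Illustrative escalation logic: context decides what a score means."""
    if flag.score < threshold:
        return "no action"
    if flag.active_event_frames or flag.operating_mode != "steady-state":
        # Elevated score with an explaining context (startup, maintenance window)
        return "log with context"
    # High score, steady-state operation, no explaining event frame
    return "escalate to reliability engineer"
```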

Nadia Okafor, our Head of Data Science, spent four years as an AVEVA analytics solutions engineer deploying PI AF models at midstream and refining clients before joining Midstreamly. Her position on this is direct: "AF element templates are not optional. They're the schema for your asset data. If you're trying to do analytics without AF structure, you're working without a data model. Every site I've worked with that had poor AF configuration also had poor model performance, not because the physics were different, but because the data was inconsistent."

Fact: that's exactly what our data shows too. Across every site we've onboarded, AF configuration quality is the single strongest predictor of first-pass model performance. Not sensor density. Not data volume. AF quality.

What This Looks Like in Practice

A typical onboarding sequence for a site with an existing PI Data Archive and AF configuration runs like this: we validate the AF element template against our expected attribute schema, check for gaps or nonstandard tag mappings, identify and exclude event frame windows corresponding to known maintenance periods or upset events, then train the multivariate baseline model on the cleaned 18-month history. From that point, inference runs at 15-minute intervals, matching the PI Data Archive update rate for most midstream sites.
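Once the baseline is trained, the inference side is deliberately simple. Here is a minimal sketch of the 15-minute cadence, with the read path, model, and downstream sink passed in as placeholders; a production scheduler would own the timing rather than a bare loop.

```python
import time

INFERENCE_INTERVAL_S = 15 * 60   # match the historian update rate

def run_inference_cycle(fetch_latest, score, publish):
    """One pass: read the latest window, score it, publish the result.

    fetch_latest, score, and publish are placeholders for the site-specific
    read path (e.g. PI Web API), the trained baseline model, and the sink.
    """
    window = fetch_latest()
    publish(score(window))

def main(fetch_latest, score, publish):
    while True:
        run_inference_cycle(fetch_latest, score, publish)
        time.sleep(INFERENCE_INTERVAL_S)
```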

First-pass detection rate on sites with mature AF configurations typically reaches 60 to 70% of mechanical events that were later confirmed by maintenance records. Sites with flat tag structures and no AF configuration start lower and improve as we add the context layer. The pipeline is clear: better AF structure drives better model features, which drives better anomaly precision, which drives fewer false alarms, which drives more operator trust. The chain matters at every step.

The Practical Limit

One thing worth being direct about: PI historian-based anomaly detection works well for gradual degradation patterns, trending faults, and operating-point deviations that develop over hours to days. It is not a substitute for high-frequency vibration analysis. A bearing spall at 3,600 RPM generates a fault signature in the 3 to 8 kHz range. A 15-minute PI historian tag sampling overall vibration in in/s RMS will not capture that. If you need spectrum-level detection, you need a separate high-frequency data path, either from a continuous monitoring system or from periodic route-based vibration analysis.
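A quick back-of-the-envelope Nyquist check makes the point concrete:

```python
# Why historian-rate data cannot see spall frequencies.
sample_interval_s = 15 * 60                 # 15-minute historian samples
sample_rate_hz = 1 / sample_interval_s      # ~0.0011 Hz
nyquist_hz = sample_rate_hz / 2             # ~0.00056 Hz

fault_band_hz = (3_000, 8_000)              # bearing-defect band cited above
print(f"Nyquist limit: {nyquist_hz:.5f} Hz vs fault band {fault_band_hz[0]}-{fault_band_hz[1]} Hz")
# Resolving even the low end of the band requires sampling faster than 6 kHz,
# more than six orders of magnitude above the historian rate.
```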

What PI historian does well is process-correlated anomaly detection: catching the events that show up as subtle multivariate deviations before they progress to the level where a spectrum measurement would catch them. In our data, that window is typically 11 to 19 days. That's the lead time you're buying with a properly configured multivariate model on PI data.

If your site already has PI Data Archive and AF configured, you have most of the infrastructure needed to run this class of model. The investment is in data validation, model training, and integrating the anomaly signal with your maintenance workflow. Not in new instrumentation. That's the reason this approach makes sense as a first step for most midstream operators, and it's exactly why we built Midstreamly to sit on top of the PI stack rather than replace it.

Related Articles

Compressor Failure Detection with AVEVA PI System
Vibration Analysis for Oil and Gas Compressors
Midstream Vibration Monitoring: A Practical Guide