Session: 15-01-02: General Topics on Risk, Safety, and Reliability II
Paper Number: 173774
Detecting Faults in Triplex Reciprocating Pumps With Synthetic Data Generated Using Simulink
Triplex reciprocating pumps are widely used in the oil and gas industry. Because of their moving parts, pumps are more prone to failure than static equipment such as vessels.
There are typically three types of maintenance: reactive, scheduled, and predictive. Reactive maintenance is performed after a problem arises, which can be costly and dangerous. Scheduled maintenance occurs at regular intervals; it takes a conservative approach that increases maintenance costs and still may not prevent all failures. Predictive maintenance, based on data analysis, is challenging for complex equipment but is the most efficient option because it lets engineers service process equipment at an optimal time. Developing a predictive maintenance algorithm requires raw equipment data from which condition indicators can be extracted. For fault detection models to be robust, sufficient data representing the different fault types is a prerequisite. If such data is not readily available, it can be generated synthetically.
A triplex reciprocating pump is modeled physically in Simulink and Simscape to create a high-fidelity digital twin. The model includes detailed representations of the pump's mechanical and hydraulic subsystems, such as the electric motor, crankshaft, plungers, and valves. These components are modeled using first principles and physical networks, together with CAD models of the pump housing and geometry. The CAD geometry provides an accurate spatial representation for the mechanical dynamics and hydraulic behavior, which allows realistic simulation of fluid flow, pressure dynamics, and mechanical motion at low computational cost. Faults such as cylinder leakage, inlet blockage, and bearing friction are introduced by parameterizing the relevant physical properties (leakage area, flow restriction factors, and friction coefficients) within the computational model. These parameters are varied systematically to simulate different fault severities. The model also incorporates stochastic noise to emulate real-world variability in sensor readings and system behavior. This physical modeling approach generates synthetic data under controlled yet realistic conditions, which is essential for training and validating machine learning models for predictive maintenance.
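The systematic variation of fault parameters described above can be sketched as a grid of simulation configurations. This is a minimal illustration in Python, not the authors' Simulink workflow: the parameter names, severity values, units, and noise level are all hypothetical, and the actual simulation call is left out.

```python
import itertools
import random

# Hypothetical severity levels for each fault parameter (illustrative values only).
LEAK_AREAS = [0.0, 1e-6, 5e-6]        # cylinder leakage area
BLOCKING_FACTORS = [1.0, 0.8, 0.5]    # inlet flow restriction factor (1.0 = unblocked)
FRICTION_COEFFS = [0.0, 2e-4, 6e-4]   # bearing friction coefficient


def build_fault_grid(seed=0, noise_std=0.01):
    """Return one simulation configuration per combination of fault severities."""
    rng = random.Random(seed)
    grid = []
    for leak, block, friction in itertools.product(
        LEAK_AREAS, BLOCKING_FACTORS, FRICTION_COEFFS
    ):
        grid.append({
            "leak_area": leak,
            "blocking_factor": block,
            "bearing_friction": friction,
            # Per-run seed so each simulated run draws its own sensor noise.
            "noise_seed": rng.randrange(2**31),
            "noise_std": noise_std,
            # Boolean flags that can later serve as classification targets.
            "labels": {
                "leak": leak > 0.0,
                "blocking": block < 1.0,
                "friction": friction > 0.0,
            },
        })
    return grid


grid = build_fault_grid()
healthy = [c for c in grid if not any(c["labels"].values())]
print(len(grid), "configurations,", len(healthy), "healthy")
```

Each dictionary would be passed to one simulation run. Storing the labels next to the parameters keeps the resulting ensemble self-describing when features are extracted later.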
After running simulations of the pump under various fault conditions, we process the time series by trimming the initial transient period so that the analysis focuses on steady-state behavior. A set of condition indicators (features) is then extracted from the signals. These include statistical, spectral, and time-domain features such as RMS, kurtosis, and frequency content. Using the guided workflow of the Diagnostic Feature Designer app, we select the most informative features for fault classification. These features are then added back to the ensemble for use in training machine learning models.
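The trimming and feature extraction steps can be illustrated with a small standard-library Python sketch. It stands in for the Diagnostic Feature Designer workflow rather than reproducing it: the function names, the 200 Hz sample rate, the 1 s transient, and the synthetic 15 Hz test signal are assumptions made for this example.

```python
import cmath
import math


def trim_transient(signal, sample_rate, transient_s):
    """Drop the first `transient_s` seconds of samples."""
    return signal[int(round(transient_s * sample_rate)):]


def rms(signal):
    """Root mean square of the samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def kurtosis(signal):
    """Non-excess kurtosis: fourth central moment over squared variance."""
    n = len(signal)
    mean = sum(signal) / n
    m2 = sum((x - mean) ** 2 for x in signal) / n
    m4 = sum((x - mean) ** 4 for x in signal) / n
    return m4 / (m2 * m2)


def dominant_frequency(signal, sample_rate):
    """Frequency in Hz of the largest non-DC bin of a naive DFT."""
    n = len(signal)
    best_bin, best_mag = 1, -1.0
    for k in range(1, n // 2 + 1):
        acc = sum(
            x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(signal)
        )
        if abs(acc) > best_mag:
            best_bin, best_mag = k, abs(acc)
    return best_bin * sample_rate / n


# Synthetic stand-in for a simulated signal: a 15 Hz tone plus a start-up
# ramp that vanishes after 1 s. Both are assumptions for illustration.
fs = 200.0
raw = []
for i in range(400):
    t = i / fs
    ramp = 5.0 * (1.0 - t) if t < 1.0 else 0.0
    raw.append(math.sin(2 * math.pi * 15 * t) + ramp)

steady = trim_transient(raw, fs, transient_s=1.0)
features = {
    "rms": rms(steady),
    "kurtosis": kurtosis(steady),
    "dominant_hz": dominant_frequency(steady, fs),
}
print(features)
```

For this clean tone the features come out at roughly 0.707 for RMS, 1.5 for kurtosis, and 15 Hz for the dominant frequency. The naive DFT is used only to keep the sketch dependency-free; an FFT would be the practical choice for long signals.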
With the features extracted, the next step is to train a multi-class classifier to detect and distinguish between different fault types and severities. Of the methods tested, we use a decision tree classifier trained on labeled data from the simulation ensemble. The classifier is evaluated using a confusion matrix and accuracy metrics, and the trained model is then validated on a separate test dataset to check that it generalizes. This workflow enables automated fault detection and classification, supporting predictive maintenance strategies for complex systems like reciprocating pumps. The model achieves a validation accuracy of 66% and a fault prediction accuracy of 94%. Performance can be improved by retraining the classifier and incorporating additional pump measurements.
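The evaluation step can be sketched with a confusion matrix and two accuracy measures. The abstract does not define how its two accuracy figures were computed, so the split below between exact multi-class accuracy and fault/no-fault detection accuracy is an assumption for illustration only. The class names and label lists are made up and are not results from the study.

```python
from collections import Counter


def confusion_matrix(true_labels, predicted_labels, classes):
    """Rows are true classes, columns are predicted classes."""
    counts = Counter(zip(true_labels, predicted_labels))
    return [[counts[(t, p)] for p in classes] for t in classes]


def accuracy(true_labels, predicted_labels):
    """Fraction of samples whose predicted label matches the true label."""
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)


def detection_accuracy(true_labels, predicted_labels, healthy="none"):
    """Accuracy after collapsing every fault class into one 'faulty' label."""
    true_bin = [t != healthy for t in true_labels]
    pred_bin = [p != healthy for p in predicted_labels]
    return accuracy(true_bin, pred_bin)


# Made-up labels for illustration; these are not results from the study.
classes = ["none", "leak", "blocking", "friction"]
y_true = ["none", "none", "leak", "leak",
          "blocking", "blocking", "friction", "friction"]
y_pred = ["none", "leak", "leak", "blocking",
          "blocking", "friction", "friction", "friction"]

cm = confusion_matrix(y_true, y_pred, classes)
for name, row in zip(classes, cm):
    print(f"{name:>9}: {row}")
print("multi-class accuracy:", accuracy(y_true, y_pred))
print("detection accuracy:  ", detection_accuracy(y_true, y_pred))
```

With these made-up labels, the multi-class accuracy is 0.625 and the detection accuracy is 0.875. A classifier that confuses fault types can still separate faulty from healthy runs reliably, and the off-diagonal entries of the confusion matrix show which classes are being confused.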
We conclude that synthetic fault data can supplement or replace sensor data and thereby enhance the predictive maintenance model's performance. Retraining the classifier with specific fault values treated as non-fault can further improve accuracy, as can additional pump measurements. The code and models for this case study are shared publicly.
Presenting Author: Jordan Olson, MathWorks
Presenting Author Biography: Jordan is an application engineer at MathWorks specializing in artificial intelligence and advanced control design. Jordan supports customers across a wide range of industries, including aerospace, automotive, energy, and robotics. He holds a B.S. and M.S. in Mechanical Engineering, as well as an M.S. in Electrical Engineering, all from The University of Alabama.
Authors:
Jordan Olson, MathWorks
Paper Type
Technical Presentation
