Hydrate formation is a serious flow assurance problem in offshore oil and gas production and a common cause of pipeline blockage, production shutdown and safety risks. The interactions between pressure, temperature, water cut and flow regime are nonlinear, and hydrate events are rare, so datasets are extremely imbalanced and early detection is difficult. This paper examines the use of machine learning (ML) models to predict hydrate formation from skewed failure data. A synthetic dataset of 10,000 samples was generated from the key multiphase flow variables, and Logistic Regression, Random Forest, Support Vector Machine (SVM) and XGBoost models were trained and evaluated. Baseline models reached high overall accuracy (91%-95%) but low recall of hydrate events (12%-22%), demonstrating that conventional training is ineffective on imbalanced data. Oversampling the minority class with SMOTE significantly improved minority-class detection: XGBoost recall increased from 22% to 81%, the F1-score improved from 33% to 73%, and the AUC-PR increased by 0.79. Cost-sensitive learning was more accurate (as high as 74% with SVM) but had lower recall than the SMOTE-enhanced models. The findings show that ensemble tree-based models combined with oversampling provide the best early warning of hydrate formation under imbalanced conditions. The study confirms that ML with adequate imbalance mitigation can significantly improve the operational reliability and safety of subsea pipeline systems.
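As an illustration of the oversampling step named in the abstract, the following is a minimal, standard-library sketch of SMOTE-style interpolation. It is not the authors' implementation: the function name `smote`, the two toy features (pressure and temperature), their value ranges and the 95/5 class split are all assumptions made for the example. Each synthetic point is placed on the line segment between a randomly chosen minority sample and one of its k nearest minority neighbours.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority points by SMOTE-style interpolation.

    Simplified sketch: a base sample is drawn at random, one of its k nearest
    minority neighbours is drawn at random, and the new point is placed at a
    random position on the segment joining the two.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of the base sample.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Hypothetical toy data: (pressure in bar, temperature in deg C).
# The ranges and the 95/5 split are illustrative, not taken from the paper.
data_rng = random.Random(42)
majority = [(data_rng.uniform(50, 150), data_rng.uniform(10, 40)) for _ in range(95)]
minority = [(data_rng.uniform(150, 250), data_rng.uniform(0, 8)) for _ in range(5)]

# Generate enough synthetic hydrate samples to match the majority class size.
new_points = smote(minority, n_new=len(majority) - len(minority), k=3)
balanced_minority = minority + new_points

print(len(majority), len(balanced_minority))
```

In practice the oversampling would be applied to the training split only, so that the test set keeps the original class imbalance and recall, F1 and AUC-PR are measured on realistic data.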
IRE Journals:
Ichenwo John Lander, Ogwu Philip, "Machine Learning Models to Predict Hydrate Formation in Multiphase Flowlines with Imbalanced Failure Datasets," Iconic Research And Engineering Journals, Volume 9, Issue 9, 2026, Page 442-450. https://doi.org/10.64388/IREV9I9-1714838
IEEE:
Ichenwo John Lander, Ogwu Philip
"Machine Learning Models to Predict Hydrate Formation in Multiphase Flowlines with Imbalanced Failure Datasets" Iconic Research And Engineering Journals, 9(9) https://doi.org/10.64388/IREV9I9-1714838