Machine-Learning-Driven Formation Damage Classification Based on SEM Images and Petrophysical Data

Yuxing Wu, Ali Ghalambor, Saeed Salehi, Craig Phillips

Research Article (Research Square preprint). This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-9430254/v1
This work is licensed under a CC BY 4.0 License.
Status: Under Review. Version 1.

Abstract

Oil and gas recovery is significantly reduced by formation damage caused by improper drilling, completion, and production operations. Conventional evaluation methods rely on core sample analysis, well-logging assessments, and interpretation of production data, which require considerable human effort and are susceptible to human error. This study presents an innovative automated framework for formation damage classification.
Computer vision and machine learning are applied to evaluate scanning electron microscopy (SEM) images, petrophysical properties, and mineral data to characterize formation damage mechanisms. A dataset of SEM images taken from core samples is created, including both damaged and undamaged formations. Computer vision tools, such as OpenCV, are employed to preprocess the images. Capillary pressure, X-ray diffraction (XRD), and petrophysical measurements are also considered to improve accuracy. All data are fed to four machine learning algorithms, namely Random Forest (RF), XGBoost, a convolutional neural network (CNN), and a hybrid Light Gradient Boosting Machine (LightGBM) model, to identify different types of formation damage. This paper focuses on four common damage mechanisms: fines migration, phase trapping, rock-fluid interactions, and wettability alteration. The results show that XGBoost achieves the highest classification accuracy (over 77% for all four damage types). For fines migration, the accuracy of both RF and XGBoost is 85.2%. XGBoost and LightGBM are effective for distinguishing phase-trapping problems, achieving more than 88% accuracy. Accuracies for rock-fluid interactions and wettability alteration range from 59.3% to 77.8%. These lower values reflect the difficulty of detecting geochemical interactions and surface energy changes from the current database. The CNN model has the lowest performance, which indicates that transfer learning combined with petrophysical inputs is better than direct pixel-level learning, especially when high-resolution images are not available. Compared with conventional evaluation, the proposed method significantly reduces analysis time and minimizes bias, supporting decision-making in well stimulation design, formation damage mitigation, and production optimization strategies.

Keywords: Formation Damage; Machine Learning; Computer Vision Technologies; Scanning Electron Microscopy (SEM)

1. Introduction

Formation damage impairs the permeability and porosity of reservoirs, reducing natural productivity throughout the well lifespan, including drilling, completion, production, stimulation, and workover operations (Bennion, 2002; Civan, 2023; Xu et al., 2016). The damage significantly reduces production and can also decrease water or gas injection rates (Bennion, 2002). In sandstone reservoirs with clay minerals in particular, the instability of the clay enhances interactions between fluid and rock (Wilson et al., 2014; Zhou et al., 2022). Stimulation and complicated remedial operations are needed to restore injectivity or deliverability (Ghalambor & Economides, 2002). A case study in the Western Desert of Egypt shows that formation damage reduced oil production by 10% to 45% (Alqutt et al., 2024; Noah, 2016). In that case, a NaCl-KCl completion fluid used for workover operations resulted in approximately a 45% reduction in production. Acid stimulation was applied to mitigate the damage; however, because of clay mineral dissolution and precipitation at high temperature, production continued to decrease. Hydraulic fracturing was then used to bypass the damaged zone.

Figure 1 illustrates the major formation damage mechanisms, which are broadly categorized into four groups: mechanical, chemical, biological, and thermal. Mechanical formation damage mechanisms are governed by solid particle transport, fluid entrapment, and changes in effective formation stress. Fines migration is caused by the mobilization of clay minerals and detrital particles within the pore network. When fluid flows at high velocity or the ionic strength changes, small particles detach from the rock and plug the pore throats, significantly reducing permeability (Galal et al., 2016). Solid invasion occurs when drilling fluid filtrate carries solid particles, including weighting agents, polymers, and bridging materials, into the near-wellbore formation.
Pore connectivity is damaged, and the effective permeability therefore decreases (Li et al., 2024). Phase trapping is a mechanical damage mechanism in which water- or hydrocarbon-based fluids introduced during drilling or completion operations become immobilized within pore spaces by capillary forces, effectively blocking the flow paths of reservoir hydrocarbons (Tsar et al., 2012). Perforation-induced damage is another crucial mechanism of formation damage, characterized by rock failure and compaction around the perforation tunnel generated during high-velocity jet perforation. The comminuted particles clog the fluid flow path in the crushed zone (Heiland et al., 2009). Under large pressure drawdowns or geomechanical stress changes, dilatant behavior may develop, temporarily increasing pore volume through microfracture generation, while compactive failure irreversibly reduces pore volume and permeability (Khurshid & Afgan, 2021).

Chemical formation damage mechanisms encompass rock-fluid interactions, fluid-fluid incompatibilities, and wettability alteration. Clay minerals, particularly smectite and mixed-layer illite/smectite, are highly susceptible to hydration and osmotic swelling when exposed to low-salinity or chemically incompatible aqueous fluids, leading to pore throat constriction and permeability reduction. Ion exchange reactions and mineral dissolution-precipitation processes driven by changes in pH, ionic strength, or temperature can further modify pore geometry and surface chemistry. When incompatible fluids mix (for instance, sulfate-rich injection water contacting barium- or strontium-bearing formation brines), scale precipitation may occur, plugging the pore network and reducing effective permeability (Velasquez et al., 2021).
Wettability alteration, induced by the adsorption of polar organic compounds, surfactants, or multivalent ions onto mineral surfaces, modifies the capillary pressure-saturation relationships governing fluid distribution in the pore network, leading to increased water saturation, reduced relative permeability, and fluid trapping (Al-Yaseri et al., 2016). To understand the mechanisms of damage, rock mineralogy and the interactions between rock and fluids need to be examined. When temperature and pressure change, asphaltenes can precipitate (Struchkov et al., 2019; Tavakkoli et al., 2014). Xylene and toluene injections can reduce organic deposits (Kuang et al., 2019).

Formation damage most commonly occurs during drilling and completion. Overbalanced drilling (i.e., wellbore pressure exceeds formation pressure), combined with mud and completion fluids, exacerbates the damage. The formation pore network and pore size distribution, the level of overbalance, the formation mineralogy, the composition and particle size of the drilling and completion fluid, and the exposure time all influence how the damage develops (Civan, 2007).

Although the principles of formation damage have been well investigated in previous studies, conventional methods for damage evaluation still rely on human judgment. They are time-consuming and prone to human error, which limits efficiency and delays operational decision-making (Ghalambor & Economides, 2002). The introduction of artificial intelligence (AI) and machine learning (ML) into the petroleum industry has opened new opportunities. Table 1 summarizes the literature on the use of AI and ML for formation damage diagnosis. Xiong and Holditch (1995) first introduced an expert-system-based framework for formation damage diagnosis and stimulation design, with fuzzy logic inference employed as the core analyzer.
The study demonstrates the applicability of AI approaches for damage classification under uncertainty. Jiaojiao et al. (2010) use an artificial neural network (ANN) to analyze petrophysical parameters and mineralogical data, aiming to predict five types of formation sensitivities and water-blocking damage, and achieve field-validated prediction accuracy exceeding 85%. The results highlight the capability of nonlinear data-driven models to capture complex rock-fluid interactions. Nunez Garcia et al. (2015) develop an integrated ML workflow to identify dominant formation-damage mechanisms. The workflow is validated on water injection wells in a mature Colombian oil field. The approach combines feature engineering with supervised learning algorithms to map operational and reservoir variables to damage indicators. Erbas and Gumrah (2001) apply genetic-algorithm-based mathematical models to simulate permeability alteration during acidizing and geochemical damage, and validate the models against case-study data with high predictive accuracy. More recently, Sun and Chen (2025) present a comprehensive review of AI-driven formation-damage diagnostics, emphasizing advances in neural-network architectures, imaging-based characterization, and coupled numerical modeling. The analysis highlights the increasing integration of data-driven and physics-informed approaches for near-real-time subsurface assessment and decision support.

Table 1. Summary of the literature on the use of AI and ML for formation damage diagnosis.
Paper | ML Model | Input Data | Target Damage Mechanisms
Xiong and Holditch (1995) | Expert system and fuzzy logic model | Engineering rules | Formation damage classification
Jiaojiao et al. (2010) | ANN | Petrophysical properties and chemical interactions | Formation sensitivities and water blocking
Nunez Garcia et al. (2015) | Supervised machine learning workflow | Injection data and reservoir variables | Fines mobilization, solids invasion, and emulsion formation
Erbas and Gumrah (2001) | Genetic algorithm | Acidizing and geochemical reaction parameters | Permeability changes during acidizing

In this study, an innovative workflow is proposed to automatically characterize formation damage by integrating computer vision and ML methods. The approach minimizes human interpretation errors, significantly reduces the analysis time, and enables the evaluation of an extensive database.

2. Data Collection, Preprocessing, and Machine Learning Model Setup

In this study, an ML workflow is developed to identify four major damage mechanisms: fines migration, phase trapping, rock-fluid interactions, and wettability alteration. Each mechanism is treated as an independent binary classification problem that separates high and low risk of damage. SEM images, capillary pressure, petrophysical properties, and mineralogy are considered for the damage characterization. The dataset consists of 155 samples. Computer vision tools (e.g., OpenCV) are used to normalize pixel intensity distributions, resize each image to 224 x 224 pixels, and convert it to RGB format to ensure compatibility with the machine learning algorithms. The other data are standardized to remove scale dependency.

For prelabeling the samples, fines migration is determined by the presence of weak formations, clay minerals, and small, detachable solids in the SEM images and mineral data. Wettability alteration is assessed based on the salinity and mineral components. Variations in salinity and divalent cation concentrations, particularly Ca²⁺ and Mg²⁺, can significantly influence surface charge interactions between polar components of crude oil and mineral surfaces, especially under low-salinity conditions, resulting in wettability alteration (Yang et al., 2016).
Phase trapping assessments are based on the capillary pressure profile. A large change in pressure indicates fluid immobilization, which leads to phase trapping damage. Rock-fluid interactions are identified by the chemical indicators associated with those interactions.

Table 2. Summary of machine learning algorithms used in this study.
Algorithm | Input Data | Key Features | Benefits
Random Forest | Mineralogy, XRD, petrophysical data, SEM images, and capillary pressure curves | Pretrained model (ResNet50, 2048-dim) for image feature extraction + 200 decision trees, bootstrap aggregation | Easy to train, interpretable feature importance, handles nonlinear data well
XGBoost | Mineralogy, XRD, and petrophysical data | 500 estimators, max depth = 5, learning rate = 0.05, subsampling = 0.8 | Handles class imbalance well, efficient on structured data
CNN | Mineralogy, XRD, petrophysical data, SEM images, and capillary pressure curves | CNN model for image feature extraction + Dense (128) + Dropout + Sigmoid classifier | Learns pore structure, fines distribution, and microfractures directly from images
LightGBM | Fusion of mineralogy, XRD, petrophysical data, SEM images, and capillary pressure curves | Numerical features scaled; ResNet50 (2048-dim) deep image features combined; LightGBM classifier | Combines physical, mineralogical, and visual microstructural information; minimizes bias

Four ML algorithms are studied to compare predictive performance. Table 2 summarizes their features: i) an RF classifier is implemented to capture nonlinear relationships; ii) an XGBoost model is used with 500 estimators, a learning rate of 0.05, a maximum tree depth of 5, and a subsampling ratio of 0.8 to mitigate overfitting; iii) a CNN with the Adam optimizer is developed as a damage classifier based on direct image-based categorization; iv) a hybrid LightGBM algorithm with deep image feature embeddings is also included.
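As a minimal sketch of the setup summarized above, the following Python fragment standardizes one numerical feature column, declares the four binary targets, and collects the hyperparameters reported for RF and XGBoost. The names `MECHANISMS`, `RF_PARAMS`, `XGB_PARAMS`, `standardize`, and `build_model_specs`, as well as the example porosity values, are illustrative assumptions rather than the authors' code. Reading "standardized normalization" as a z-score is also an assumption. The dictionary keys follow the keyword names used by scikit-learn's `RandomForestClassifier` and XGBoost's `XGBClassifier`, but no model is trained here.

```python
from statistics import mean, pstdev

# The four damage mechanisms, each treated as an independent binary target
# (1 = high risk, 0 = low risk), as described in the text.
MECHANISMS = [
    "fines_migration",
    "phase_trapping",
    "rock_fluid_interaction",
    "wettability_alteration",
]

# Hyperparameters stated in the text and Table 2. The key names mirror
# scikit-learn / XGBoost keyword arguments; that mapping is an assumption.
RF_PARAMS = {"n_estimators": 200, "bootstrap": True}
XGB_PARAMS = {
    "n_estimators": 500,
    "max_depth": 5,
    "learning_rate": 0.05,
    "subsample": 0.8,
}


def standardize(values):
    """Z-score a feature column: subtract the mean, divide by the std."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return [0.0 for _ in values]  # a constant column carries no signal
    return [(v - mu) / sigma for v in values]


def build_model_specs():
    """Return one spec per (mechanism, algorithm) binary classifier."""
    specs = []
    for mechanism in MECHANISMS:
        specs.append(
            {"target": mechanism, "algorithm": "RF", "params": dict(RF_PARAMS)}
        )
        specs.append(
            {"target": mechanism, "algorithm": "XGBoost", "params": dict(XGB_PARAMS)}
        )
    return specs


# Hypothetical porosity values (fractions), used only to exercise the scaler.
porosity = [0.08, 0.12, 0.15, 0.21, 0.24]
scaled = standardize(porosity)
specs = build_model_specs()
```

After scaling, the column has zero mean and unit standard deviation, so features measured in different units contribute on a comparable scale. Keeping one specification per mechanism mirrors the independent binary framing: a sample can be flagged as high risk for several mechanisms at once.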
For all models, the data, including SEM images, capillary pressure, petrophysical properties, mineralogy, and prelabeling results, are fed to the ML algorithms. The dataset is randomly split into training (80%) and testing (20%) sets, which are used to train and test the algorithms, respectively.

3. Results

Table 3 reports the accuracy of the four machine learning models in identifying formation damage. XGBoost and Random Forest give the most accurate predictions, with average accuracies above 80%. The CNN model achieved slightly lower accuracies, suggesting that learning directly from the images is less efficient than using a pretrained model to extract image features. Because of the limited data, the CNN cannot develop a robust classifier that learns textures, pore structures, clay components, fracture distributions, and particle shapes. The pretrained model achieves a higher accuracy. The hybrid LightGBM model, which integrates numerical and deep image features, achieved competitive performance, especially in predicting fluid trapping. Table A1 presents details of the predictions compared with the manual labels from the prelabeling procedure, including SEM images, true labels, and predicted labels from the different machine learning algorithms.

Figures 3–6 present the confusion matrices based on 27 test samples, comparing actual and predicted labels for the four types of formation damage investigated. For fines migration (Fig. 3), the confusion matrices indicate that most samples with fines-migration issues were correctly identified. The CNN model also performs adequately, with an accuracy of 81.5%. For phase trapping detection (Fig. 4), XGBoost, LightGBM, and Random Forest have similar performance, while the accuracy of the CNN is lower at 77.8%. Both gradient boosting and decision-tree methods are effective for capturing nonlinear dependencies among mineralogical factors, pore geometry, and capillary response. The detection of wettability alteration yields the lowest accuracy (Fig. 6). The confusion matrices suggest a tendency to overpredict the "no alteration" class, implying that the models are unable to detect surface-energy changes from the current data.

4. Discussion

The above results show that machine learning, combined with SEM image feature extraction, petrophysical data, and mineralogical information, can automatically classify formation damage issues. Even with only 155 samples, the workflow performs well. The varying performance of the algorithms across damage types is worth discussing.

4.1 Model Performance

In general, XGBoost and RF perform better than the other models. Specifically, the XGBoost accuracies for the studied damage mechanisms range from 77.8% to 88.9%. This is consistent with a rock type classification study in which gradient boosting models performed better when structured tabular data dominated the feature space (Houshmand et al., 2022). In this model, features are extracted from the SEM images, organized into tabular form, and combined with the other numerical data for analysis. The sequential boosting mechanism of XGBoost is well suited to the class imbalance in damage datasets (no issues, fines migration, and phase trapping are the most common classes in the dataset). RF uses 200 independent decision trees to minimize the influence of noise and nonlinear features. The CNN results indicate that its accuracy is limited by its need for high-resolution images and by the small number of samples.
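Because each confusion matrix is built from 27 test samples, every reported accuracy corresponds to a whole number of correct predictions, and one additional error shifts the accuracy by roughly 3.7 percentage points. The sketch below checks this arithmetic for the accuracies quoted in the paper. The function names and the example confusion-matrix counts are illustrative assumptions, not values read from Figures 3–6.

```python
N_TEST = 27  # number of test samples behind each confusion matrix


def accuracy_from_confusion(tn, fp, fn, tp):
    """Accuracy of a binary classifier from its confusion-matrix counts."""
    total = tn + fp + fn + tp
    return (tn + tp) / total


def correct_count(accuracy_percent, n=N_TEST):
    """Number of correct predictions implied by a rounded accuracy."""
    return round(accuracy_percent / 100 * n)


# Accuracies quoted in the text, in percent.
reported = [59.3, 70.4, 77.8, 81.5, 85.2, 88.9]
implied = {acc: correct_count(acc) for acc in reported}

# Each quoted value is recovered by rounding k / 27 to one decimal place.
recovered = {acc: round(100 * k / N_TEST, 1) for acc, k in implied.items()}

# Cost of a single extra misclassification, in percentage points.
penalty = round(100 / N_TEST, 1)

# Hypothetical confusion matrix with 23 of 27 samples classified correctly.
example_acc = round(100 * accuracy_from_confusion(tn=9, fp=2, fn=2, tp=14), 1)

for acc, k in implied.items():
    print(f"{acc:.1f}% -> {k}/{N_TEST} correct")
print(f"one misclassification costs {penalty} points")
```

With so few test samples the accuracy scale is coarse: the gap between 85.2% and 88.9%, for example, is a single sample, so small differences between models should be interpreted with caution.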
With insufficient samples, the model cannot learn the roles of influencing parameters, such as pore throat distribution and geometry, clay components, particle shape, and microfracture networks, across different mechanisms. Chen et al. (2023) observe a similar phenomenon in rock type classification. When the sample size is limited, transfer learning methods can better capture rock texture features for classification. In contrast, models without transfer learning (i.e., with no pretrained model for image feature extraction) achieve lower accuracy. The hybrid LightGBM uses a feature-extraction model to mitigate this limitation and performs better than the CNN model. However, the 2048-dimensional embeddings cannot accurately represent all image features.

4.2 Four Types of Damage Classification

Fines migration has the highest accuracy among the four damage types for RF and XGBoost (85.2% for both). This is due to the distinctiveness of its visual and mineralogical features. Mobilized clay platelets, small particles near pore throats, and a high percentage of clay minerals are indicators of migration damage (Galal et al., 2016; Wilson et al., 2014). The accuracy of phase-trapping identification exceeds 88% with XGBoost and LightGBM. Steep inflection points in capillary pressure-saturation curves typically indicate narrow pore throats, which increase the likelihood of phase trapping. Pressure is the primary evidence, while SEM images play a supporting role.

Geochemistry-related phenomena, such as rock-fluid interactions and wettability alteration, are less accurately categorized. Wettability alteration occurs through adsorption at mineral surfaces. The process may influence capillary pressure, but it does not produce visible changes in SEM images (Al-Yaseri et al., 2016). For rock-fluid interactions, distinct features can be observed only when damage has already progressed significantly. For example, in the Western Desert of Egypt case study (Alqutt et al., 2024), insoluble silica gel formed when the HCl treatment triggered illite and kaolinite decomposition.

Additionally, the small sample size is a major challenge in this study (i.e., 155 samples for training and 27 for testing). One misclassification results in a 3.7% accuracy penalty. The effect is strongest in the wettability alteration evaluation: because few samples exhibit the issue, all models overpredict the "no alteration" class.

Conclusions

A computer-vision-based workflow integrating SEM images, capillary pressure curves, and petrophysical and mineralogical data was developed to automate the characterization of four formation-damage mechanisms: fines migration, phase trapping, rock-fluid interactions, and wettability alteration. Compared with traditional evaluation techniques, the method significantly reduces manual interpretation time and minimizes human error. Notably, the study extends previous AI-based formation-damage diagnosis tools by introducing computer vision techniques that integrate SEM and capillary-pressure images with mineralogical and petrophysical data. Despite using a small dataset (155 samples), the model achieved high accuracy, indicating the workflow's potential. The findings of this study are summarized as follows:

- Among the four evaluated machine learning algorithms (Random Forest, XGBoost, CNN, and a hybrid LightGBM model), the tree-based ensemble methods (Random Forest and XGBoost) consistently delivered the highest prediction accuracy (70.4% to 88.9%, depending on the formation damage type). The other two methods have relatively low performance.
- Fines migration and phase trapping were identified with higher accuracy than rock-fluid interactions and wettability alteration because of their relatively clear mineralogical and pore-structure signatures.
- The CNN model showed a lower performance, indicating that learning directly from pixel-level information alone is less effective than using pretrained feature extractors combined with structured data. High-resolution images are required; otherwise, the algorithm cannot identify the type of damage.
- The hybrid LightGBM model demonstrated competitive performance, particularly for fluid-trapping predictions, confirming the value of fusing numerical and deep image features to capture heterogeneous formation-damage signals.

Abbreviations

AI: artificial intelligence
ANN: artificial neural network
CNN: convolutional neural network
LightGBM: Light Gradient Boosting Machine
ML: machine learning
RF: Random Forest
SEM: scanning electron microscopy
XRD: X-ray diffraction

Declarations

Declaration of Competing Interest
The authors report no declarations of interest.

Author Contribution
YW: Conceptualization, Machine Learning Algorithm Development, Drafting and Revising the Manuscript. AG: Review and Editing, Resources. SS: Review and Editing, Supervision. CP: Review and Editing, Resources.

Acknowledgment
The authors would like to express their sincere appreciation to the Research in Energy Sustainability and Innovative Low-Impact Environmental Technologies (RESILIENT) Laboratory at Southern Methodist University for their support of the study.

References

Al-Yaseri A, Al Mukainah H, Lebedev M, Barifcani A, Iglauer S (2016) Impact of fines and rock wettability on reservoir formation damage. Geophys Prospect 64:860–874
Alqutt M, Sabaa A, Salem A, Seleem M, Abdel-Aziz I, Bassem A, Hussein A (2024) Reservoir Formation Damage, Myths, Facts, and Lessons Learned From the Damage of a Deep Well in the Western Desert of Egypt. Mediterranean Offshore Conference: SPE, D031S027R002
Bennion DB (2002) An overview of formation damage mechanisms causing a reduction in the productivity and injectivity of oil and gas producing formations. J Can Pet Technol 41
Chen W, Su L, Chen X, Huang Z (2023) Rock image classification using deep residual neural network with transfer learning. Front Earth Sci 10:1079447
Civan F (2007) Formation damage mechanisms and their phenomenological modeling - an overview. SPE European Formation Damage Conference and Exhibition: SPE, SPE-107857-MS
Civan F (2023) Reservoir formation damage: fundamentals, modeling, assessment, and mitigation. Gulf Professional Publishing
Erbas D, Gumrah F (2001) The use of genetic algorithms as an optimization tool for predicting permeability alteration in formation damage and improvement modelling. PETSOC Canadian International Petroleum Conference: PETSOC, PETSOC-2001-2052
Galal SK, Elgibaly AA, Elsayed SK (2016) Formation damage due to fines migration and its remedial methods. Egyptian J Petroleum 25:515–524
Ghalambor A, Economides M (2002) Formation damage abatement: A quarter-century perspective. SPE J 7:4–13
Heiland J, Grove B, Harvey J, Walton I, Martin A (2009) New fundamental insights into perforation-induced formation damage. SPE European Formation Damage Conference and Exhibition: SPE, SPE-122845-MS
Houshmand N, GoodFellow S, Esmaeili K, Calderón JCO (2022) Rock type classification based on petrophysical, geochemical, and core imaging data using machine and deep learning techniques. Appl Comput Geosci 16:100104
Jiaojiao G, Jienian Y, Zhiyoong L, Zhong H (2010) Mechanisms and Prevention of Damage for Formations with Low-porosity and Low-permeability. SPE International Oil and Gas Conference and Exhibition in China: SPE, SPE-130961-MS
Khurshid I, Afgan I (2021) Investigation of water composition on formation damage and related energy recovery from geothermal reservoirs: Geochemical and geomechanics insights. Energies 14:7415
Kuang J, Yarbrough J, Enayat S, Edward N, Wang J, Vargas FM (2019) Evaluation of solvents for in-situ asphaltene deposition remediation. Fuel 241:1076–1084
Li J, Xiong G, Li N, Zhang Y, Rui Y (2024) Simulation of sandstone formation damage caused by solid particle invasion. J Dispers Sci Technol 45:1767–1778
Noah ZA (2016) Impact of Formation Damage on Well Productivity throughout Experimental Work and Field case study. Africa Oil and Gas conference. Ridge, Accra
Nunez Garcia W, Kleber M, Polo R, Franco C, Escobar M, Sierra A, Arango M (2015) Comprehensive methodology to identify, quantify and eliminate the formation damage mechanisms, succesfully applied for the first time by the operator in a colombian mature field; including formation damage modeling, well candidate selection, stimulation treatment design and execution: A case history. SPE Latin America and Caribbean Petroleum Engineering Conference: SPE, D021S015R003
Struchkov I, Rogachev M, Kalinin E, Roschin P (2019) Laboratory investigation of asphaltene-induced formation damage. J Petroleum Explor Prod Technol 9:1443–1455
Sun Z, Chen Z (2025) Research Status and Development Direction of Formation Damage Prediction and Diagnosis Technologies. Appl Sci 15:1169
Tavakkoli M, Panuganti SR, Taghikhani V, Pishvaie MR, Chapman WG (2014) Precipitated asphaltene amount at high-pressure and high-temperature conditions. Energy Fuels 28:1596–1610
Tsar M, Bahrami H, Rezaee R, Murickan G, Mehmood S, Ghasemi M, Ameri A, Mehdizadeh M (2012) Effect of Drilling Fluid (Water-based vs Oil-based) on Phase Trap Damage in Tight Sand Gas Reservoirs (SPE 154652). 74th EAGE Conference and Exhibition incorporating EUROPEC 2012: European Association of Geoscientists & Engineers, cp-293-00197
Velasquez I, Silva I, Martínez L, Rattia L, Labrador H, Villanueva I, Pérez V, Agüero B, Gonzalez M, Monroy R (2021) Interfacial phenomena in petroleum reservoir conditions related to fluid-fluid interactions and rock-fluid interactions: Formulation effects and in porous medium test. J Petrol Sci Eng 207:109076
Wilson M, Wilson L, Patey I (2014) The influence of individual clay minerals on formation damage of reservoir sandstones: a critical review with some new insights. Clay Miner 49:147–164
Xiong H, Holditch SA (1995) A comprehensive approach to formation damage diagnosis and corresponding stimulation type and fluid selection. SPE Oklahoma City Oil and Gas Symposium/Production and Operations Symposium: SPE, SPE-29531-MS
Xu C, Kang Y, You Z, Chen M (2016) Review on formation damage mechanisms and processes in shale gas reservoir: Known and to be known. J Nat Gas Sci Eng 36:1208–1219
Yang J, Dong Z, Dong M, Yang Z, Lin M, Zhang J, Chen C (2016) Wettability alteration during low-salinity waterflooding and the relevance of divalent ions in this process. Energy Fuels 30:72–79
Zhou Y, Yang W, Yin D (2022) Experimental investigation on reservoir damage caused by clay minerals after water injection in low permeability sandstone reservoirs. J Petroleum Explor Prod Technol 12:915–924

Supplementary Files: AppendixA.docx

Editorial history (Version 1, Under Review):
First submitted to journal: 15 Apr, 2026
Submission checks completed at journal: 17 Apr, 2026
Editor assigned by journal: 20 Apr, 2026
Reviewers invited by journal: 03 May, 2026
Reviewers agreed at journal: 04 May, 2026
Reviewers agreed at journal: 05 May, 2026
Author affiliations
Yuxing Wu (corresponding author): Southern Methodist University
Ali Ghalambor: Oil Center Research International, LLC
Saeed Salehi: Southern Methodist University
Craig Phillips: Crested Butte Petrophysical Consultants

Figure captions
Figure 1. Major formation damage mechanisms.
Figure 2. Flowchart of the study.
Figure 3. Confusion matrix visualization plots of different models for identifying fines migration formation damage.
Figure 4. Confusion matrix visualization plots of different models for identifying phase trapping formation damage.
Figure 5. Confusion matrix visualization plots of different models for identifying rock-fluid interaction formation damage.
Figure 6. Confusion matrix visualization plots of different models for identifying wettability alteration formation damage.
14:20:38","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":714209,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-9430254/v1/c13eb2f4-10b3-4c62-a566-3c21f428217c.pdf"},{"id":109100118,"identity":"59e98f1a-55ea-41ef-9086-003c72525949","added_by":"auto","created_at":"2026-05-12 14:20:04","extension":"docx","order_by":1,"title":"","display":"","copyAsset":false,"role":"supplement","size":3835884,"visible":true,"origin":"","legend":"","description":"","filename":"AppendixA.docx","url":"https://assets-eu.researchsquare.com/files/rs-9430254/v1/c2b55f571c48e0aaa7543895.docx"}],"financialInterests":"No competing interests reported.","formattedTitle":"Machine-Learning-Driven Formation Damage Classification Based on SEM Images and Petrophysical Data","fulltext":[{"header":"1. Introduction","content":"\u003cp\u003eFormation damage impairs the permeability and porosity of reservoirs, reducing the natural productivity during the entire well lifespan - drilling, completion, production, and stimulation operations, and workover processes (Bennion, \u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e2002\u003c/span\u003e, Civan, \u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e2023\u003c/span\u003e, Xu et al., \u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e2016\u003c/span\u003e). The damage significantly reduces production and can also decrease water or gas injection rates (Bennion, \u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e2002\u003c/span\u003e). Especially in sandstone reservoirs with clay minerals, the instability of the clay enhances interactions between fluid and rock (Wilson et al., \u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e2014\u003c/span\u003e, Zhou et al., \u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e2022\u003c/span\u003e). 
Stimulation and complicated remedial operations are needed to restore injectivity or deliverability (Ghalambor \u0026amp; Economides, \u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e2002\u003c/span\u003e). A case study in the Western Desert of Egypt shows that formation damage reduces oil production by 10% to 45% (Alqutt et al., \u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2024\u003c/span\u003e, Noah, \u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e2016\u003c/span\u003e). A NaCl-KCl completion fluid is employed for workover operations, resulting in approximately a 45% reduction in production. Acid stimulation aims to mitigate the damage. However, due to clay mineral dissolution and precipitation in high temperatures, production continues to decrease. Hydraulic fracturing is used to bypass the damage zone.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eFigure \u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e illustrates the major formation damage mechanisms, which are broadly categorized into four groups: mechanical, chemical, biological, and thermal. Mechanical formation damage mechanisms are governed by solid particle transport, fluid entrapment, and changes in effective formation stress. Fine migration occurs because of the mobilization of clay minerals and detrital particles within the pore network. When fluid flows with a high velocity or ionic strength changes, small particles detach from the main rock and plug the pore throats, significantly reducing permeability (Galal et al., \u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e2016\u003c/span\u003e). Solid invasion refers to when drilling fluid filtrate carries solid particles, including weighting agents, polymers, and bridging materials, into the near-wellbore formation. The pore connectivity is damaged. 
Therefore, the effective permeability decreases (Li et al., \u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e2024\u003c/span\u003e). Phase trapping is a damage mechanism in which water- or hydrocarbon-based fluids introduced during drilling or completion operations become immobilized within pore spaces by capillary forces, effectively blocking the flow paths of reservoir hydrocarbons (Tsar et al., \u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e2012\u003c/span\u003e). Perforation-induced damage is another crucial mechanism of formation damage, characterized by rock failure and compaction around the perforation tunnel generated during high-velocity jet perforations. The comminuted particles clog the fluid flow path in the crushed zone (Heiland et al., \u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e2009\u003c/span\u003e). Under large pressure drawdowns or geomechanical stress changes, dilatant behavior may develop, temporarily increasing pore volume through microfracture generation, while compactive failure irreversibly reduces pore volume and permeability (Khurshid \u0026amp; Afgan, \u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e2021\u003c/span\u003e). Chemical formation damage mechanisms encompass rock-fluid interactions, fluid-fluid incompatibilities, and wettability alteration. Clay minerals \u0026mdash; particularly smectite and mixed-layer illite/smectite \u0026mdash; are highly susceptible to hydration and osmotic swelling when exposed to low-salinity or chemically incompatible aqueous fluids, leading to pore throat constriction and permeability reduction. Ion exchange reactions and mineral dissolution-precipitation processes driven by changes in pH, ionic strength, or temperature can further modify pore geometry and surface chemistry. 
When incompatible fluids mix \u0026mdash; for instance, sulfate-rich injection water contacting barium- or strontium-bearing formation brines \u0026mdash; scale precipitation may occur, plugging the pore network and reducing effective permeability (Velasquez et al., \u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e2021\u003c/span\u003e). Wettability alteration, induced by the adsorption of polar organic compounds, surfactants, or multivalent ions onto mineral surfaces, modifies the capillary pressure-saturation relationships governing fluid distribution in the pore network, leading to increased water saturation, relative permeability reduction, and fluid trapping (Al-Yaseri et al., \u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e2016\u003c/span\u003e). To understand the mechanisms of damage, rock mineralogy and interactions between rock and fluids need to be examined. When temperature and pressure change, asphaltene components form and precipitate (Struchkov et al., \u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e2019\u003c/span\u003e, Tavakkoli et al., \u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e2014\u003c/span\u003e). Xylene and toluene injections are potential inhibitors that reduce organic deposits (Kuang et al., \u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e2019\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eFormation damage during drilling and completion is the most common issue. Overbalanced drilling (i.e., wellbore pressure exceeds formation pressure), combined with mud and completion fluids, exacerbates the damage. 
Formation pore network and pore size distribution, level of overbalance conditions, formation mineralogy, composition and particle size of the drilling and completion fluid, and exposure time influence the damage development (Civan, \u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e2007\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eAlthough the principles of formation damage have been well investigated in previous studies, conventional methods for damage evaluation still rely on human judgment. Long analysis times and human error limit efficiency, resulting in delays in operational decision-making (Ghalambor \u0026amp; Economides, \u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e2002\u003c/span\u003e). The introduction of artificial intelligence (AI) and machine learning (ML) into the petroleum industry has opened new opportunities. Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e summarizes the literature on the use of AI and ML for formation damage diagnosis. Xiong and Holditch (\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e1995\u003c/span\u003e) first introduced an expert-system-based framework for formation damage diagnosis and stimulation operation designs. Fuzzy logic inference is employed as a core analyzer. The study demonstrates the applicability of AI approaches for damage classification under uncertainty. Jiaojiao et al. (\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e2010\u003c/span\u003e) use an artificial neural network (ANN) algorithm to analyze petrophysical parameters and mineralogical data, aiming to predict five types of formation sensitivities and water-blocking damage, and achieve field-validated prediction accuracy exceeding 85%. The results highlight the capability of nonlinear data-driven models to capture complex rock\u0026ndash;fluid interactions. Nunez Garcia et al. 
(\u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e2015\u003c/span\u003e) develop an integrated ML workflow to identify dominant formation-damage mechanisms. The algorithm is validated on water injection wells in a mature Colombian oil field. The approach combines feature engineering with supervised learning algorithms to map operational and reservoir variables to damage indicators. Erbas and Gumrah (\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e2001\u003c/span\u003e) apply genetic algorithm-based mathematical models to simulate permeability alteration during acidizing and geochemical damage and validate the models against case-study data with high predictive accuracy. More recently, Sun and Chen (\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e2025\u003c/span\u003e) present a comprehensive review of AI-driven formation-damage diagnostics, emphasizing advances in neural-network architectures, imaging-based characterization, and coupled numerical modeling. 
The analysis highlights the increasing integration of data-driven and physics-informed approaches for near\u0026ndash;real-time subsurface assessment and decision support.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eSummary of the literature on the use of AI and ML for formation damage diagnosis.\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"4\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003ePaper\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eML Model\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eInput Data\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eTarget Damage Mechanisms\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eXiong and Holditch (\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e1995\u003c/span\u003e)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eExpert system and fuzzy logic model\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eEngineering rules\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eFormation damage 
classification\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eJiaojiao et al. (\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e2010\u003c/span\u003e)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eANN\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003ePetrophysical properties and chemical interactions\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eFormation sensitivities and water blocking\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eNunez Garcia et al. (\u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e2015\u003c/span\u003e)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eSupervised machine learning workflow\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eInjection data and reservoir variables\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eFines mobilization, solids invasion, and emulsion formation\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eErbas and Gumrah (\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e2001\u003c/span\u003e)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eGenetic Algorithm\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eAcidizing and geochemical reaction parameters\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003ePermeability changes during acidizing\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003eIn this study, an innovative workflow is proposed 
to automatically characterize formation damage by the integration of computer vision and ML methods. The approach minimizes human interpretation errors, significantly reduces the analysis time, and enables the evaluation of an extensive database.\u003c/p\u003e"},{"header":"2. Data Collection, Preprocessing, and Machine Learning Model Setup","content":"\u003cp\u003eIn this study, an ML workflow is developed to identify the four major damage mechanisms: fines migration, phase trapping, rock\u0026ndash;fluid interactions, and wettability alteration. Each mechanism is treated as an independent binary classification, indicating high or low risk of damage. SEM images, capillary pressure, petrophysical properties, and mineralogy are considered for the damage characterization. The dataset consists of 155 samples. Computer vision tools (e.g., OpenCV) are used to normalize pixel intensity distributions, resize each image to 224 x 224 pixels, and convert it to RGB format to ensure compatibility with machine learning algorithms. The remaining numerical data are normalized by standardization to remove scale dependency.\u003c/p\u003e\n\u003cp\u003eFor prelabeling the samples, fine migration is determined by the presence of weak formations, clay minerals, and small and detachable solids in SEM images and mineral data. Wettability alteration is assessed based on the salinity and mineral components. Variations in salinity and divalent cation concentrations\u0026mdash;particularly Ca\u0026sup2;⁺ and Mg\u0026sup2;⁺\u0026mdash;can significantly influence surface charge interactions between polar components of crude oil and mineral surfaces, especially under low-salinity conditions, resulting in wettability alterations (Yang et al., \u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e2016\u003c/span\u003e). Phase trapping assessments are based on the capillary pressure profile. A large change in pressure indicates fluid immobilization, which leads to phase trapping damage. 
Rock\u0026ndash;fluid interactions are identified by chemical indicators associated with the interactions.\u003c/p\u003e\n\u003ctable float=\"Yes\" id=\"Tab2\" border=\"1\" class=\"fr-table-selection-hover\"\u003e\n \u003ccaption language=\"En\"\u003e\n \u003cdiv class=\"CaptionNumber\"\u003eTable 2\u003c/div\u003e\n \u003cdiv class=\"CaptionContent\"\u003e\n \u003cp\u003eSummary of Machine Learning Algorithms Used in This Study.\u003c/p\u003e\n \u003c/div\u003e\n \u003c/caption\u003e\n \u003ccolgroup cols=\"4\"\u003e\u003c/colgroup\u003e\n \u003cthead\u003e\n \u003ctr\u003e\n \u003cth align=\"left\" colname=\"c1\"\u003e\n \u003cp\u003eAlgorithm\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\" colname=\"c2\"\u003e\n \u003cp\u003eInput Data\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\" colname=\"c3\"\u003e\n \u003cp\u003eKey Features\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\" colname=\"c4\"\u003e\n \u003cp\u003eBenefits\u003c/p\u003e\n \u003c/th\u003e\n \u003c/tr\u003e\n \u003c/thead\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colname=\"c1\"\u003e\n \u003cp\u003eRandom Forest\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c2\"\u003e\n \u003cp\u003eMineralogy, XRD, petrophysical data, SEM images, and capillary pressure curves\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c3\"\u003e\n \u003cp\u003ePretrained model (ResNet50, 2048-dim) for image feature extraction\u0026thinsp;+\u0026thinsp;200 decision trees, bootstrap aggregation\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c4\"\u003e\n \u003cp\u003eEasy to train, interpretable feature importance, handles nonlinear data well\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colname=\"c1\"\u003e\n \u003cp\u003eXGBoost\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c2\"\u003e\n \u003cp\u003eMineralogy, XRD, and petrophysical 
data\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c3\"\u003e\n \u003cp\u003e500 estimators, max depth\u0026thinsp;=\u0026thinsp;5, learning rate\u0026thinsp;=\u0026thinsp;0.05, subsampling\u0026thinsp;=\u0026thinsp;0.8\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c4\"\u003e\n \u003cp\u003eHandles class imbalance well, efficient on structured data\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colname=\"c1\"\u003e\n \u003cp\u003eCNN\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c2\"\u003e\n \u003cp\u003eMineralogy, XRD, petrophysical data, SEM images, and capillary pressure curves\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c3\"\u003e\n \u003cp\u003eCNN model for image feature extraction\u0026thinsp;+\u0026thinsp;Dense (128) + Dropout\u0026thinsp;+\u0026thinsp;Sigmoid classifier\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c4\"\u003e\n \u003cp\u003eLearns pore structure, fines distribution, and microfractures directly from images\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colname=\"c1\"\u003e\n \u003cp\u003eLightGBM\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c2\"\u003e\n \u003cp\u003eFusion of mineralogy, XRD, petrophysical data, SEM images, and capillary pressure curves\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c3\"\u003e\n \u003cp\u003eNumerical features scaled; ResNet50 (2048-dim) deep image features combined; LightGBM classifier\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c4\"\u003e\n \u003cp\u003eCombines physical, mineralogical, and visual microstructural information. Minimizes bias.\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n \u003c/table\u003e\n\u003c/div\u003e\n\u003cp\u003eFour ML algorithms are studied to compare predictive performance. 
Table \u003cspan refid=\"Tab2\" class=\"InternalRef\"\u003e2\u003c/span\u003e summarizes the features of the ML algorithms used in this study: i) The RF classifier is implemented to capture nonlinear relationships; ii) An XGBoost model is used with 500 estimators, a learning rate of 0.05, maximum tree depth of 5, and subsampling ratio of 0.8 to mitigate overfitting; iii) A CNN trained with the Adam optimizer is developed as a damage classifier based on direct image-based categorization; iv) Additionally, a hybrid LightGBM algorithm with deep image feature embeddings is included. For all models, data, including SEM images, capillary pressure, petrophysical properties, mineralogy, and prelabeling results, are fed to the ML models. The dataset is randomly split into training (80%) and testing (20%) sets, which are then used to train and test the algorithms, respectively.\u003c/p\u003e"},{"header":"3. Results","content":"\u003cp\u003eTable\u0026nbsp;3 reports the accuracy of the four types of machine learning models in identifying formation damage. XGBoost and Random Forest demonstrate the most accurate predictions with average accuracies above 80%. The CNN model achieved slightly lower accuracies, suggesting that learning directly from the images is less effective than using a pretrained model to extract image features. With limited data, the CNN cannot develop a robust classifier to learn textures, pore structures, clay components, fracture distributions, and particle shapes. The pretrained model achieves a higher accuracy. The hybrid LightGBM model, which integrates numerical and deep image features, achieved competitive performance, especially in predicting fluid trapping. 
Table \u003cspan refid=\"Tab3\" class=\"InternalRef\"\u003eA1\u003c/span\u003e presents details of the predictions compared with the manual labels from the prelabeling procedure, including SEM images, true labels, and predicted labels from different machine learning algorithms.\u003c/p\u003e \u003cp\u003eFigures \u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e\u0026ndash;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003e present the confusion matrices for the 27 test samples, comparing actual and predicted labels for the four investigated formation damage types. For fine migration (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e), the confusion matrices indicate that most samples with fine-migration issues were correctly identified. The CNN model also performs adequately, with an accuracy of 81.5%.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eFor phase trapping detection (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e), XGBoost, LightGBM, and Random Forest have similar performance, while the accuracy of CNN is lower at 77.8%. Both gradient boosting and decision-tree methods are effective for capturing nonlinear dependencies among mineralogical factors, pore geometry, and capillary response.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e 
\u003cp\u003e \u003c/p\u003e \u003cp\u003eThe detection of wettability alterations yields the lowest accuracy (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003e). The confusion matrices suggest a tendency to overpredict the \u0026ldquo;no alteration\u0026rdquo; class, implying that the models are unable to detect surface-energy changes from the current data.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e"},{"header":"4. Discussion","content":"\u003cp\u003eThe above results show that machine learning, combined with SEM image feature extraction technologies, petrophysical data, and mineralogical information, can automatically classify formation damage issues. Even with only 155 samples, the models achieve good performance. The varying performances of different algorithms across different damage types are worth discussing.\u003c/p\u003e \u003cdiv id=\"Sec5\" class=\"Section2\"\u003e \u003ch2\u003e4.1 Model Performance\u003c/h2\u003e \u003cp\u003eIn general, XGBoost and RF outperform the other models. Specifically, the XGBoost accuracies for the studied damage mechanisms range from 77.8% to 88.9%. The results are consistent with a rock type classification study in which gradient boosting models perform better when structured tabular data dominate the feature space (Houshmand et al., \u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e2022\u003c/span\u003e). In this model, features from SEM images are extracted and organized into tabular data. These features are combined with other numerical data for analysis. The sequential boosting mechanism of XGBoost is well-suited to the class imbalance in damage datasets (no damage, fine migration, and phase trapping are the most common classes in the dataset). 
RF uses 200 independent decision trees to reduce the influence of noise and nonlinear features.\u003c/p\u003e \u003cp\u003eThe CNN results indicate that the combination of high-resolution images and a small number of samples limits accuracy. With insufficient samples, the model cannot examine the roles of influencing parameters, such as pore throat distribution/geometry, clay components, particle shape, and microfracture networks, across different mechanisms. Chen et al. (\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e2023\u003c/span\u003e) observe a similar phenomenon in rock type classification projects. When the sample size is limited, transfer learning methods can better capture rock texture features for classification. In contrast, models without transfer learning (i.e., no pretrained model for image feature extraction) achieve lower accuracy. The hybrid LightGBM model uses a pretrained feature-extraction model to mitigate this limitation and performs better than the CNN model. However, the 2048-dimensional embeddings cannot accurately represent image features.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec6\" class=\"Section2\"\u003e \u003ch2\u003e4.2 Four Types of Damage Classification\u003c/h2\u003e \u003cp\u003eFine migration has the highest accuracy (i.e., 85.2% for both RF and XGBoost). This is due to the distinctiveness of its visual and mineralogical features. Mobilized clay platelets, small particles near pore throats, and a high percentage of clay minerals are indicators of migration damage (Galal et al., \u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e2016\u003c/span\u003e, Wilson et al., \u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e2014\u003c/span\u003e). The accuracy of phase-trapping identification exceeds 88% with XGBoost and LightGBM. Steep inflection points in capillary pressure-saturation curves typically indicate narrow pore throats, increasing the likelihood of phase trapping. 
Capillary pressure is the primary evidence, while SEM images play a supporting role.\u003c/p\u003e \u003cp\u003eGeochemistry-related phenomena, such as rock-fluid interactions and wettability, are less accurately categorized. Wettability alteration occurs due to adsorption onto mineral surfaces. The process may influence capillary pressure, but cannot cause changes in SEM images (Al-Yaseri et al., \u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e2016\u003c/span\u003e). For rock-fluid interactions, distinct features can be observed only when damage has already progressed significantly. For example, in the Western Desert of Egypt case study (Alqutt et al., \u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2024\u003c/span\u003e), insoluble silica gel is formed when HCl treatment triggers illite and kaolinite decomposition.\u003c/p\u003e \u003cp\u003eAdditionally, the small sample size is a major challenge in this study (i.e., 155 samples for training and 27 for testing). Each misclassification in the 27-sample test set results in a 3.7% accuracy penalty. In the wettability alteration evaluation in particular, because few samples exhibit the issue, all models overpredict the \"no alteration\" class.\u003c/p\u003e \u003c/div\u003e"},{"header":"Conclusions","content":"\u003cp\u003eA computer vision\u0026ndash;based workflow integrating SEM images, capillary pressure curves, and petrophysical\u0026ndash;mineralogical data was successfully developed to automate the characterization of four formation-damage mechanisms: fine migration, phase trapping, rock\u0026ndash;fluid interactions, and wettability alterations. Compared with traditional evaluation techniques, the innovative method significantly reduces manual interpretation time and minimizes human error. 
Notably, the study extends previous AI-based formation-damage diagnosis tools by introducing computer vision techniques that integrate SEM and capillary-pressure images with mineralogical and petrophysical data. Despite the small dataset (155 training samples), the models achieved accuracies of up to 88.9%, indicating the workflow's potential. The following summarizes the findings of this study:\u003c/p\u003e \u003cp\u003e \u003cul\u003e \u003cli\u003e \u003cp\u003eAmong the four evaluated machine learning algorithms (Random Forest, XGBoost, CNN, and a hybrid LightGBM model), the tree-based ensemble methods (Random Forest and XGBoost) consistently delivered the highest prediction accuracy (70.4% to 88.9%, depending on the formation damage type). The other two methods performed less well overall.\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003eFine migration and phase trapping were identified with higher accuracy than rock-fluid interactions and wettability alterations because of their relatively clear mineralogical and pore-structure signatures.\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003eThe CNN model showed lower performance, indicating that learning directly from pixel-level information alone is less effective than using pretrained feature extractors combined with structured data. High-resolution images are required. 
Otherwise, the algorithm cannot identify the type of damage.\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003eThe hybrid LightGBM model demonstrated competitive performance, particularly for phase-trapping predictions, confirming the value of fusing numerical and deep image features to capture heterogeneous formation-damage signals.\u003c/p\u003e \u003c/li\u003e \u003c/ul\u003e \u003c/p\u003e"},{"header":"Abbreviations","content":"\u003cp\u003eAI\u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp;\u0026nbsp;artificial intelligence\u003c/p\u003e\n\u003cp\u003eANN\u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp;\u0026nbsp;artificial neural network\u003c/p\u003e\n\u003cp\u003eCNN\u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp;\u0026nbsp;Convolutional Neural Network\u003c/p\u003e\n\u003cp\u003eLightGBM\u0026nbsp; \u0026nbsp; \u0026nbsp;\u0026nbsp;Light Gradient Boosting Machine\u003c/p\u003e\n\u003cp\u003eML\u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp;\u0026nbsp;machine learning\u003c/p\u003e\n\u003cp\u003eRF\u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp;Random Forest\u003c/p\u003e\n\u003cp\u003eSEM\u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp;\u0026nbsp;Scanning Electron Microscopy\u003c/p\u003e\n\u003cp\u003eXRD \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp;X-Ray Diffraction\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e \u003ch2\u003eDeclaration of Competing Interest\u003c/h2\u003e \u003cp\u003eThe authors report no declarations of interest.\u003c/p\u003e \u003c/p\u003e\u003ch2\u003eAuthor Contribution\u003c/h2\u003e\u003cp\u003eYW: Conceptualization, 
Machine Learning Algorithm Development, Drafting and Revising the Manuscript; AG: Review and Editing, Resources; SS: Review and Editing, Supervision; CP: Review and Editing, Resources\u003c/p\u003e\u003ch2\u003eAcknowledgment\u003c/h2\u003e \u003cp\u003eThe authors would like to express their sincere appreciation to the Research in Energy Sustainability and Innovative Low-Impact Environmental Technologies (RESILIENT) Laboratory at Southern Methodist University for their support of the study.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eAl-Yaseri A, Mukainah A, Lebedev H, Barifcani M, A., Iglauer S (2016) Impact of fines and rock wettability on reservoir formation damage. Geophys Prospect 64:860\u0026ndash;874\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAlqutt M, Sabaa A, Salem A, Seleem M, Abdel-Aziz I, Bassem A, Hussein A (2024) Reservoir Formation Damage, Myths, Facts, and Lessons Learned From the Damage of a Deep Well in the Western Desert of Egypt. \u003cem\u003eMediterranean Offshore Conference\u003c/em\u003e: SPE, D031S027R002\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBennion DB (2002) An overview of formation damage mechanisms causing a reduction in the productivity and injectivity of oil and gas producing formations. J Can Pet Technol 41\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eChen W, Su L, Chen X, Huang Z (2023) Rock image classification using deep residual neural network with transfer learning. Front Earth Sci 10:1079447\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCivan F (2007) Formation damage mechanisms and their phenomenological modeling\u0026mdash;an overview. \u003cem\u003eSPE European Formation Damage Conference and Exhibition\u003c/em\u003e: SPE, SPE-107857-MS\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCivan F (2023) Reservoir formation damage: fundamentals, modeling, assessment, and mitigation. 
Gulf Professional Publishing\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eErbas D, Gumrah F (2001) The use of genetic algorithms as an optimization tool for predicting permeability alteration in formation damage and improvement modelling. \u003cem\u003ePETSOC Canadian International Petroleum Conference\u003c/em\u003e: PETSOC, PETSOC-2001-2052\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGalal SK, Elgibaly AA, Elsayed SK (2016) Formation damage due to fines migration and its remedial methods. Egyptian J Petroleum 25:515\u0026ndash;524\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGhalambor A, Economides M (2002) Formation damage abatement: A quarter-century perspective. SPE J 7:4\u0026ndash;13\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHeiland J, Grove B, Harvey J, Walton I, Martin A (2009) New fundamental insights into perforation-induced formation damage. \u003cem\u003eSPE European Formation Damage Conference and Exhibition\u003c/em\u003e: SPE, SPE-122845-MS\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHoushmand N, GoodFellow S, Esmaeili K, Calder\u0026oacute;n JCO (2022) Rock type classification based on petrophysical, geochemical, and core imaging data using machine and deep learning techniques. Appl Comput Geosci 16:100104\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eJiaojiao G, Jienian Y, Zhiyoong L, Zhong H (2010) Mechanisms and Prevention of Damage for Formations with Low-porosity and Low-permeability. \u003cem\u003eSPE International Oil and Gas Conference and Exhibition in China\u003c/em\u003e: SPE, SPE-130961-MS\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKhurshid I, Afgan I (2021) Investigation of water composition on formation damage and related energy recovery from geothermal reservoirs: Geochemical and geomechanics insights. 
Energies 14:7415\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKuang J, Yarbrough J, Enayat S, Edward N, Wang J, Vargas FM (2019) Evaluation of solvents for in-situ asphaltene deposition remediation. Fuel 241:1076\u0026ndash;1084\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLi J, Xiong G, Li N, Zhang Y, Rui Y (2024) Simulation of sandstone formation damage caused by solid particle invasion. J Dispers Sci Technol 45:1767\u0026ndash;1778\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eNoah ZA (2016) Impact of Formation Damage on Well Productivity throughout Experimental Work and Field case study. \u003cem\u003eAfrica Oil and Gas conference\u003c/em\u003e. Ridge, Accra\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eNunez Garcia W, Kleber M, Polo R, Franco C, Escobar M, Sierra A, Arango M (2015) Comprehensive methodology to identify, quantify and eliminate the formation damage mechanisms, succesfully applied for the first time by the operator in a colombian mature field; including formation damage modeling, well candidate selection, stimulation treatment design and execution: A case history. \u003cem\u003eSPE Latin America and Caribbean Petroleum Engineering Conference\u003c/em\u003e: SPE, D021S015R003\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eStruchkov I, Rogachev M, Kalinin E, Roschin P (2019) Laboratory investigation of asphaltene-induced formation damage. J Petroleum Explor Prod Technol 9:1443\u0026ndash;1455\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSun Z, Chen Z (2025) Research Status and Development Direction of Formation Damage Prediction and Diagnosis Technologies. Appl Sci 15:1169\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTavakkoli M, Panuganti SR, Taghikhani V, Pishvaie MR, Chapman WG (2014) Precipitated asphaltene amount at high-pressure and high-temperature conditions. 
Energy Fuels 28:1596\u0026ndash;1610\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTsar M, Bahrami H, Rezaee R, Murickan G, Mehmood S, Ghasemi M, Ameri A, Mehdizadeh M (2012) Effect of Drilling Fluid (Water-based vs Oil-based) on Phase Trap Damage in Tight Sand Gas Reservoirs (SPE 154652). \u003cem\u003e74th EAGE Conference and Exhibition incorporating EUROPEC 2012\u003c/em\u003e: European Association of Geoscientists \u0026amp; Engineers, cp-293-00197\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eVelasquez I, Silva I, Mart\u0026iacute;nez L, Rattia L, Labrador H, Villanueva I, P\u0026eacute;rez V, Ag\u0026uuml;ero B, Gonzalez M, Monroy R (2021) Interfacial phenomena in petroleum reservoir conditions related to fluid-fluid interactions and rock-fluid interactions: Formulation effects and in porous medium test. J Petrol Sci Eng 207:109076\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWilson M, Wilson L, Patey I (2014) The influence of individual clay minerals on formation damage of reservoir sandstones: a critical review with some new insights. Clay Miner 49:147\u0026ndash;164\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eXiong H, Holditch SA (1995) A comprehensive approach to formation damage diagnosis and corresponding stimulation type and fluid selection. \u003cem\u003eSPE Oklahoma City Oil and Gas Symposium/Production and Operations Symposium\u003c/em\u003e: SPE, SPE-29531-MS\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eXu C, Kang Y, You Z, Chen M (2016) Review on formation damage mechanisms and processes in shale gas reservoir: Known and to be known. J Nat Gas Sci Eng 36:1208\u0026ndash;1219\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eYang J, Dong Z, Dong M, Yang Z, Lin M, Zhang J, Chen C (2016) Wettability alteration during low-salinity waterflooding and the relevance of divalent ions in this process. 
Energy Fuels 30:72\u0026ndash;79\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZhou Y, Yang W, Yin D (2022) Experimental investigation on reservoir damage caused by clay minerals after water injection in low permeability sandstone reservoirs. J Petroleum Explor Prod Technol 12:915\u0026ndash;924\u003c/span\u003e\u003c/li\u003e\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":false,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":false,"email":"","identity":"journal-of-petroleum-exploration-and-production-technology","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"","title":"Journal of Petroleum Exploration and Production Technology","twitterHandle":"","acdcEnabled":false,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"VoR Journals","inReviewEnabled":false,"inReviewRevisionsEnabled":false},"keywords":"Formation Damage, Machine Learning, Computer Vision Technologies, Scanning Electron Microscopy (SEM)","lastPublishedDoi":"10.21203/rs.3.rs-9430254/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-9430254/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eOil and gas recovery is significantly reduced by formation damage caused by improper drilling, completion, and production operations. Conventional evaluation methods rely on core sample analysis, well-logging assessments, and interpretation of production data, which require greater human effort and are susceptible to human error. 
This study presents an automated classification framework for formation damage. Computer vision and machine learning are applied to evaluate scanning electron microscopy (SEM) images, petrophysical properties, and mineral data to characterize formation damage mechanisms. A dataset of SEM images taken from core samples is created, including both damaged and undamaged formations. Computer vision tools, such as OpenCV, are employed to preprocess images. Capillary pressure, X-ray diffraction (XRD), and petrophysical measurements are also considered to improve accuracy. All data are fed to four machine learning algorithms (Random Forest (RF), XGBoost, a convolutional neural network (CNN), and a hybrid Light Gradient Boosting Machine (LightGBM) model) to identify different types of formation damage. This paper focuses on four common damage issues: fine migration, phase trapping, rock-fluid interactions, and wettability alteration. The results show that XGBoost achieves the highest classification accuracy (over 77%). For fine migration, the accuracy of both RF and XGBoost is 85.2%. XGBoost and LightGBM are effective for distinguishing phase-trapping problems, achieving more than 88% accuracy. Accuracies for rock-fluid interactions and wettability alterations range from 59.3% to 77.8%. These lower accuracies result from the difficulty of detecting geochemical interactions and surface energy changes from the current database. The CNN model has the lowest performance, which indicates that transfer learning with petrophysical inputs is better than direct pixel-level learning, especially when high-resolution images cannot be generated. 
Compared with conventional evaluation, the innovative method significantly reduces analysis time and minimizes bias, supporting decision-making in well stimulation design, formation damage mitigation, and production optimization strategies.\u003c/p\u003e","manuscriptTitle":"Machine-Learning-Driven Formation Damage Classification Based on SEM Images and Petrophysical Data","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2026-05-12 14:15:44","doi":"10.21203/rs.3.rs-9430254/v1","editorialEvents":[{"type":"communityComments","content":0},{"type":"reviewerAgreed","content":"20333544303142898395526577304874486279","date":"2026-05-05T17:52:10+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"262726481095655867494287506147167619416","date":"2026-05-04T08:54:47+00:00","index":"hide","fulltext":""},{"type":"reviewersInvited","content":"","date":"2026-05-03T23:26:45+00:00","index":"","fulltext":""},{"type":"editorAssigned","content":"","date":"2026-04-20T12:58:32+00:00","index":"","fulltext":""},{"type":"checksComplete","content":"","date":"2026-04-17T18:51:35+00:00","index":"","fulltext":""},{"type":"submitted","content":"Journal of Petroleum Exploration and Production Technology","date":"2026-04-15T18:15:20+00:00","index":"","fulltext":""}],"status":"published","journal":{"display":false,"email":"","identity":"journal-of-petroleum-exploration-and-production-technology","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"","title":"Journal of Petroleum Exploration and Production Technology","twitterHandle":"","acdcEnabled":false,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"VoR Journals","inReviewEnabled":false,"inReviewRevisionsEnabled":false}}],"origin":"","ownerIdentity":"b7bd2bac-44b9-4205-9df6-df39da516819","owner":[],"postedDate":"May 12th, 
2026","published":true,"recentEditorialEvents":[{"type":"reviewerAgreed","content":"20333544303142898395526577304874486279","date":"2026-05-05T17:52:10+00:00","index":15,"fulltext":""},{"type":"reviewerAgreed","content":"262726481095655867494287506147167619416","date":"2026-05-04T08:54:47+00:00","index":13,"fulltext":""},{"type":"reviewersInvited","content":"7","date":"2026-05-03T23:26:45+00:00","index":"","fulltext":""}],"rejectedJournal":[],"revision":"","amendment":"","status":"under-review","subjectAreas":[],"tags":[],"updatedAt":"2026-05-12T14:15:44+00:00","versionOfRecord":[],"versionCreatedAt":"2026-05-12 14:15:44","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-9430254","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-9430254","identity":"rs-9430254","version":["v1"]},"buildId":"XKTyCvWXoU3ODBz1xrDgd","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}
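
The remark in Section 4.2 that one misclassification costs 3.7% accuracy follows directly from the 27-sample test set, and the same arithmetic shows that every accuracy quoted in the text corresponds to a whole number of correct predictions out of 27. The sketch below is only an illustrative check of that arithmetic, not part of the authors' workflow; the function names and the list of accuracies (collected from the abstract and conclusions) are chosen here for illustration.

```python
# Illustrative check of accuracy granularity for a small test set.
# TEST_SIZE comes from Section 4.2 (27 test samples); everything else
# (function names, the list of accuracies) is assembled for this sketch.

TEST_SIZE = 27


def accuracy_step(n_test: int) -> float:
    """Accuracy change, in percentage points, caused by one misclassification."""
    return 100.0 / n_test


def implied_correct(accuracy_pct: float, n_test: int = TEST_SIZE) -> int:
    """Return the count of correct predictions whose accuracy, rounded to
    one decimal place, equals the reported percentage."""
    for k in range(n_test + 1):
        if round(100.0 * k / n_test, 1) == accuracy_pct:
            return k
    raise ValueError(f"{accuracy_pct}% is not reachable with {n_test} samples")


# Accuracies quoted in the text (percent).
reported = [59.3, 70.4, 77.8, 85.2, 88.9]

if __name__ == "__main__":
    print(f"one misclassification = {accuracy_step(TEST_SIZE):.1f} points")
    for acc in reported:
        k = implied_correct(acc)
        print(f"{acc:5.1f}% -> {k}/{TEST_SIZE} correct")
```

Read this way, the gap between 85.2% and 88.9% is a single test sample, which is why differences of a few points between models should be interpreted cautiously at this sample size.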
