Lithology-Guided Shear Wave Velocity Prediction: Ensemble Machine Learning and Composite Models for Bridging the Generalization Gap

Research Square, Research Article

Md. Asif Uz Zaman Antu, Mohammad Islam Miah, Md. Abdul Matin Mondol, Md. Anwar Hossain Bhuiyan, and Mohammad Solaiman

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-8821053/v1
This work is licensed under a CC BY 4.0 License.
Status: Posted (Version 1)

Abstract

Shear wave velocity (Vs) is a key parameter in seismic reservoir characterization and geomechanical modeling. However, it is often unavailable in wells because acquiring it is expensive. Machine learning (ML) offers an affordable way to create synthetic Vs logs, but previous studies commonly rely on random train-test splitting.
This approach does not consider spatial heterogeneity and therefore yields overly optimistic estimates of generalization. To address this gap, this paper develops a lithology-sensitive composite model for Vs prediction and tests it on a blind-well dataset with complex lithologies: sandstone, heterolithic, and shale. Three ML models, Random Forest (RF), Artificial Neural Network (ANN), and Extreme Gradient Boosting (XGB), are evaluated together with composite models and existing empirical correlations to determine their predictive effectiveness under domain shift. The findings show that no single ML algorithm is dominant across all lithological facies. RF, ANN, and XGB are used for the sandstone, heterolithic, and shale facies, respectively. The composite model performed best, with an operational precision of 71.0 percent of predictions within ±150 m/s of measured values, much better than the individual ML models and the empirical correlations. These results indicate that a geology-guided ensemble approach could offer a sound method for predicting Vs in basins with complex geology and scarce data.

Keywords: Machine learning, Domain adaptation, Shear wave, Geomechanical models

Additional Declarations

No competing interests reported.
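The three ideas the abstract relies on (holding out a whole well instead of splitting randomly, routing each sample to a lithology-specific model, and scoring by the share of predictions within ±150 m/s) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `(lithology, features)` row layout, and the placeholder models in the demo are all assumptions made for illustration, and the real RF, ANN, and XGB models would be trained regressors.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Features = Sequence[float]
Model = Callable[[Features], float]  # any callable mapping features to Vs (m/s)


def blind_well_split(well_ids: Sequence[str], blind_well: str) -> Tuple[List[int], List[int]]:
    """Return (train_idx, test_idx) with one entire well held out for testing."""
    train = [i for i, w in enumerate(well_ids) if w != blind_well]
    test = [i for i, w in enumerate(well_ids) if w == blind_well]
    return train, test


def composite_predict(rows: Sequence[Tuple[str, Features]],
                      models: Dict[str, Model]) -> List[float]:
    """Route each (lithology, features) row to the model assigned to that lithology."""
    return [models[lithology](features) for lithology, features in rows]


def within_tolerance_fraction(predicted: Sequence[float],
                              measured: Sequence[float],
                              tol_m_s: float = 150.0) -> float:
    """Fraction of predictions whose absolute error is at most tol_m_s."""
    if len(predicted) != len(measured):
        raise ValueError("predicted and measured must have the same length")
    if len(measured) == 0:
        raise ValueError("at least one sample is required")
    hits = sum(1 for p, m in zip(predicted, measured) if abs(p - m) <= tol_m_s)
    return hits / len(measured)


if __name__ == "__main__":
    # Hypothetical placeholder models; they stand in for trained regressors
    # and are NOT empirical correlations or results from the paper.
    demo_models: Dict[str, Model] = {
        "sandstone": lambda f: 0.5 * f[0],
        "heterolithic": lambda f: 0.5 * f[0] - 100.0,
        "shale": lambda f: 0.5 * f[0] - 200.0,
    }
    demo_rows = [("sandstone", [4000.0]), ("shale", [3600.0])]
    demo_pred = composite_predict(demo_rows, demo_models)
    print(within_tolerance_fraction(demo_pred, [2100.0, 1500.0]))
```

The tolerance-based score is reported as a fraction; multiplying by 100 gives a percentage comparable in form to the 71.0 percent figure quoted in the abstract. Treating an error of exactly 150 m/s as a hit is a choice made in this sketch, since the abstract does not say whether the bound is inclusive.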
Authors and affiliations

Md. Asif Uz Zaman Antu, University of Dhaka
Mohammad Islam Miah, Chittagong University of Engineering & Technology
Md. Abdul Matin Mondol, Bangladesh Petroleum Exploration and Production Company Limited
Md. Anwar Hossain Bhuiyan, University of Dhaka (corresponding author)
Mohammad Solaiman, University of Dhaka

Posted: February 10th, 2026
License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
Research Square, ISSN 2693-5015 (online)