Predicting Alzheimer's disease in imbalanced datasets focusing on cardiovascular risk scales with machine learning models

Research Article (Research Square preprint)

Authors: Gemma García-Lluch, Angélica Resendiz Mora, Lucrecia Moreno Royo, Consuelo Cháfer-Pericás, Miquel Baquero, Juan Pardo (corresponding author)

Affiliations:
Gemma García-Lluch, Lucrecia Moreno Royo: Cátedra DeCo MICOF-CEU UCH, Universidad Cardenal Herrera-CEU
Angélica Resendiz Mora, Juan Pardo: Embedded Systems and Artificial Intelligence Group, Universidad Cardenal Herrera-CEU
Consuelo Cháfer-Pericás: Research Group in Alzheimer Disease, Instituto de Investigación Sanitaria La Fe
Miquel Baquero: Hospital Universitari i Politècnic La Fe

This is a preprint; it has not been peer reviewed by a journal.
DOI: https://doi.org/10.21203/rs.3.rs-4565529/v1
License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
Status: Version 1, posted June 27th, 2024

Abstract

Purpose
As the population ages, the prevalence of Alzheimer's disease (AD) is rising. Because there is currently no cure for AD, it is crucial to identify the key factors contributing to its progression. Cardiovascular risk is believed to play a significant role in the advancement of AD, potentially leading to neurodegenerative changes in the brain. This project therefore seeks to demonstrate the effectiveness of machine learning (ML) models for developing non-invasive, cost-effective screening tools that incorporate various cardiovascular risk scores.

Methods
We gathered data from the electronic health records (EHR) of a reference hospital in Spain. This process yielded a highly imbalanced dataset of 177 diagnosed subjects and 48 controls aged 50 to 75. To address this common problem, we employed a range of ML models together with balancing techniques and suitable metrics, leading to the development of highly accurate models.

Results
Several bagging, boosting, linear, and stacked models yielded better F1-scores, and cardiovascular risk scales such as SCORE2 were essential to these prediction algorithms. Glucose levels appeared to be important in AD prediction, and drugs such as anticholinergics, antidepressants, and angiotensin-converting enzyme inhibitors were positively related to AD prediction. In contrast, nonsteroidal anti-inflammatory drugs and angiotensin receptor blockers had the opposite effect.

Conclusion
Our research demonstrates the potential of machine learning techniques to improve the screening of AD patients before they undergo invasive and costly diagnostic tests, allowing personalized rationalization of healthcare costs and improving patient care.

Keywords: Alzheimer, Cardiovascular Risk, Risk Prediction, Machine Learning, Imbalance Data, Data-Driven Approach

Additional Declarations: No competing interests reported.

Supplementary Files: SupplementaryMaterial.docx
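
The abstract's point that an imbalanced dataset calls for balancing techniques and suitable metrics can be illustrated with a small sketch. It uses only the class counts reported in the abstract (177 diagnosed subjects, 48 controls) and a hypothetical baseline that labels every subject as AD. The helper names, and the choice of macro-F1, balanced accuracy, and inverse-frequency class weights, are assumptions made for illustration. The abstract does not state which balancing techniques or metric variants the authors used.

```python
from collections import Counter

def f1_for_class(y_true, y_pred, positive):
    """F1 for one class treated as positive: 2*TP / (2*TP + FP + FN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def recall_for_class(y_true, y_pred, positive):
    """Fraction of the true members of a class that were predicted as that class."""
    members = [p for t, p in zip(y_true, y_pred) if t == positive]
    return sum(1 for p in members if p == positive) / len(members)

def balanced_class_weights(y):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(y)
    n, k = len(y), len(counts)
    return {label: n / (k * c) for label, c in counts.items()}

# Class counts taken from the abstract; the labels themselves are synthetic.
y_true = ["AD"] * 177 + ["control"] * 48

# Hypothetical baseline: predict the majority class (AD) for everyone.
y_pred = ["AD"] * len(y_true)

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
f1_ad = f1_for_class(y_true, y_pred, "AD")
f1_control = f1_for_class(y_true, y_pred, "control")
macro_f1 = (f1_ad + f1_control) / 2
balanced_acc = (recall_for_class(y_true, y_pred, "AD")
                + recall_for_class(y_true, y_pred, "control")) / 2
weights = balanced_class_weights(y_true)

print(f"accuracy          {accuracy:.3f}")      # 0.787
print(f"F1 (AD)           {f1_ad:.3f}")         # 0.881
print(f"F1 (control)      {f1_control:.3f}")    # 0.000
print(f"macro F1          {macro_f1:.3f}")      # 0.440
print(f"balanced accuracy {balanced_acc:.3f}")  # 0.500
print("class weights     ", weights)
```

In this cohort the diagnosed group is the majority class, so a model that never identifies a control still reaches about 79% accuracy and a high F1 on the AD class. Per-class or macro-averaged scores expose the failure, and inverse-frequency weights are one common way to make errors on the 48 controls count more during training.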
