Interpretable Machine Learning for Cognitive Impairment Assessment: SHAP-Based Analysis of XGBoost Models Using NHANES 2011-2014 Cognitive Tests in Older Adults

doi:10.21203/rs.3.rs-6872563/v1

2025 · doi:10.21203/rs.3.rs-6872563/v1

preprint OA: closed

Research Article

Ligong Ji, Xiaoming Ma, Yuxiang Zhang, Xiaojing Li, Zenglin Cai, Jiahui Shen, Yijie Wu, Jingwei Li, Jingjing Su

This is a preprint posted on Research Square; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-6872563/v1
This work is licensed under a CC BY 4.0 License.
Status: Posted (Version 1)

Abstract

Background: Cognitive impairment in older adults is a key risk factor for dementia, yet early detection remains challenging. Traditional models lack interpretability and struggle to capture complex risk patterns. Interpretable machine learning offers a solution by combining predictive accuracy with transparency.
This study uses National Health and Nutrition Examination Survey (NHANES) 2011-2014 data to develop explainable models for domain-specific cognitive tests in adults aged 60 and above.

Objective: This study establishes an explainable AI framework that applies interpretable machine learning models to domain-specific cognitive impairment assessment in adults aged 60 years and older, leveraging the NHANES 2011-2014 dataset to identify critical biomarkers and modifiable risk factors.

Methods: Four cognitive tests (Digit Symbol Substitution Test (DSST), Delayed Recall Test (DRT), Animal Fluency Test (AFT), and Immediate Recall Test (IRT)) were individually modeled using XGBoost regression. Model robustness was supported by five-fold cross-validation and grid search hyperparameter optimization. SHapley Additive exPlanations (SHAP) were applied to elucidate feature contributions.

Results: The optimized models achieved the following predictive performance: DSST (RMSE = 12.473, R² = 0.459, MAE = 9.975), DRT (RMSE = 2.004, R² = 0.148, MAE = 1.617), AFT (RMSE = 5.043, R² = 0.191, MAE = 3.977), and IRT (RMSE = 4.082, R² = 0.197, MAE = 3.252).

Conclusion: This is the first study to establish individualized XGBoost models for distinct NHANES cognitive domains, demonstrating how SHAP-driven interpretability enhances early screening strategies. Our findings highlight metabolic markers (e.g., Diabetes Mellitus) and lifestyle factors as actionable targets for dementia prevention in aging populations.

Keywords: NHANES, XGBoost, SHAP, Cognitive Impairment, Cognition Test, Aging

1. Introduction

Cognitive decline in older adults represents a significant and growing public health challenge, with profound implications for individuals, families, and healthcare systems. The prevalence of cognitive impairment among adults aged 65 and older exceeds 20% in the United States, and the associated economic burden of dementia care is projected to reach $1 trillion annually by 2050[1].
Given this, early identification of modifiable risk factors is essential for developing effective preventive interventions and mitigating the societal impact of cognitive decline. However, current methods for cognitive assessment, such as the Mini-Mental State Examination (MMSE) and the Montreal Cognitive Assessment (MoCA), face several limitations that hinder their capacity to detect subtle cognitive deficits and model the complex relationships between risk factors and cognitive outcomes.

Two key limitations are particularly noteworthy: 1) Composite scoring systems, like the MMSE, tend to obscure domain-specific deficits by aggregating diverse cognitive measures into a single score; 2) Traditional statistical methods often fail to capture the nonlinear, interactive effects among a wide array of multidimensional predictors, including demographic factors, lifestyle behaviors, metabolic markers, and neuropsychological test scores[2]. These shortcomings emphasize the need for more advanced approaches that can accurately model the complexity of cognitive decline and provide actionable insights for early intervention.

While traditional statistical approaches have identified various associations between demographic factors and cognitive performance, they often fall short in addressing the intricate, multifaceted nature of cognitive decline. Specifically, factors such as age, sex, education level, metabolic health, and environmental influences interact in complex, nonlinear ways that are not easily captured by conventional methods. In contrast, modern cognitive assessments, such as the National Health and Nutrition Examination Survey (NHANES) cognitive battery, offer a more comprehensive evaluation by incorporating multiple cognitive domains, including memory, fluency, and executive function.
This battery provides a unique opportunity to examine cognitive function through a multifactorial lens. Its tests assess critical aspects of cognitive processing, from hippocampal-dependent memory consolidation to frontostriatal processing speed, enabling a nuanced understanding of cognitive impairment in older adults. The NHANES cognitive assessments comprise the Immediate Recall Test (IRT)[3] and Delayed Recall Test (DRT)[4], which together quantify hippocampal-dependent memory consolidation; the Animal Fluency Test (AFT)[5], which assesses left frontal-temporal semantic networks; and the Digit Symbol Substitution Test (DSST)[6], which serves as a proxy for frontostriatal processing speed.

Despite these advantages, prior research[7] in this area relies on z-score normalization to aggregate scores across the different cognitive tests into a single composite measure. While this approach simplifies statistical analysis, it introduces several critical limitations. Specifically, it ignores the biological differences between cognitive domains, fails to account for threshold effects (e.g., accelerated decline in processing speed associated with high HbA1c levels), and distorts the covariance structure of tests that share neural substrates. As a result, z-score aggregation may obscure domain-specific deficits and fail to capture the full complexity of cognitive impairment.

In summary, analyses[7] based on z-score aggregation introduce three critical biases:
1) Equal weighting of biologically distinct domains (e.g., 1 SD in DSST is not neurobiologically equivalent to 1 SD in DRT).
2) Loss of threshold effects (e.g., accelerated DSST decline when HbA1c > 6.5%).
3) Covariance distortion across tests with shared neural substrates.
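To make the equal-weighting concern concrete, the following is a minimal sketch of a z-score composite. It is written in Python for illustration only (the analyses in this study were run in R), the function names `z_scores` and `composite_z` are hypothetical, and the scores are invented toy values rather than NHANES data. It shows how two participants with opposite domain-specific patterns can end up with the same composite.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Standardize raw scores to mean 0 and SD 1 (population SD)."""
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        raise ValueError("cannot standardize a constant column")
    return [(v - mu) / sd for v in values]

def composite_z(tests):
    """Average per-test z-scores into one composite per participant.

    `tests` maps a test name to a list of raw scores, one per participant.
    Every test receives the same weight, which is the equal-weighting
    assumption discussed in the text.
    """
    standardized = [z_scores(scores) for scores in tests.values()]
    return [mean(per_person) for per_person in zip(*standardized)]

# Hypothetical toy scores for three participants (not NHANES data).
# Participant 1 is slow on the DSST but strong on delayed recall;
# participant 3 shows the opposite pattern.
toy = {
    "DSST": [30, 45, 60],
    "DRT": [8, 6, 4],
}
composite = composite_z(toy)
# All three composites are approximately 0, so the opposite
# domain-specific deficits are no longer visible.
```

Modeling each test separately, as done in this study, avoids that loss of information at the cost of fitting one model per domain.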
Machine learning approaches, particularly ensemble methods like eXtreme Gradient Boosting (XGBoost)[8], offer a promising alternative to traditional statistical methods. XGBoost excels in modeling nonlinear relationships between predictors and cognitive outcomes, enabling more accurate predictions and the identification of complex interactions among variables[8]. Furthermore, XGBoost can effectively handle mixed data types, including continuous, categorical, and ordinal variables, making it well-suited for analyzing the diverse set of predictors that influence cognitive decline.

However, the widespread adoption of machine learning techniques in clinical settings is hindered by their "black-box" nature. While these models provide highly accurate predictions, they often lack interpretability, which is essential for clinical decision-making. To address this issue, explainable AI (XAI) techniques, such as SHapley Additive exPlanations (SHAP), have been developed to provide insights into the underlying decision-making process of complex models. SHAP offers a framework for identifying the contributions of individual features to cognitive outcomes, enabling clinicians to better understand the role of specific risk factors in cognitive decline. By using SHAP to analyze XGBoost models, this study seeks to provide domain-specific, interpretable insights into the relationships between cognitive performance and a wide range of predictors, including lifestyle factors, metabolic markers, and demographic characteristics[9].

In summary, this study aims to bridge the gap between advanced machine learning models and clinically relevant, interpretable explanations of cognitive impairment. By leveraging SHAP-based analysis of XGBoost models trained on NHANES cognitive data, we seek to uncover critical risk factors and interactions that influence cognitive decline in older adults.
This approach promises to provide a more nuanced understanding of cognitive impairment and offer valuable insights for targeted, preventive interventions.

1. IRT
Test Content: Participants are presented with a list of unrelated words (e.g., 10 nouns) and asked to verbally recall as many words as possible immediately after presentation. This assesses short-term episodic memory and auditory-verbal learning capacity.
Scoring Method:
Total score = Number of correctly recalled words (0-10 per learning trial)
No penalty for order errors or intrusions (non-list words)

2. DRT
Test Content: After a 20-30-minute delay (filled with non-memory tasks), participants are asked, without prior warning, to recall the same word list from the IRT. This evaluates long-term memory retention and hippocampal-dependent consolidation.
Scoring Method:
Total score = Number of correctly recalled words (0-10)
Compared with IRT scores to calculate percent retention: (DRT/IRT) × 100

3. AFT
Test Content: Participants name as many animals as possible within 60 seconds, excluding proper nouns or repetitions. This measures semantic memory access, verbal fluency, and executive control of lexical retrieval.
Scoring Method:
Total score = Number of valid unique animal names generated
Exclusions: Non-animal terms (e.g., "dragon"), repetitions, or variants (e.g., "dog"/"puppy" count as one)

4. DSST
Test Content: Participants match numbers (1-9) to corresponding abstract symbols using a reference key within 120 seconds. This assesses processing speed, visual-motor coordination, and working memory.
Scoring Method:
Total score = Number of correct symbol-number pairings (0-133)
Errors (incorrect matches) are recorded but typically not penalized in NHANES scoring.

Recent advances in machine learning, particularly ensemble methods like XGBoost, offer superior predictive accuracy for complex biomedical datasets. However, clinical adoption requires interpretability frameworks that align with neuropsychological theory.
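Returning to the scoring rules listed for the four tests, two of them lend themselves to a short worked sketch: the percent-retention ratio and the count of unique animal names. The Python below is illustrative only. The function names, the handling of an IRT score of zero, and the `non_animals` exclusion set are assumptions of this sketch and are not taken from NHANES documentation.

```python
def percent_retention(drt_score, irt_score):
    """Percent retention = (DRT / IRT) x 100, as given in the DRT scoring rule.

    Returns None when the IRT score is 0 because the ratio is undefined;
    this convention is an assumption of the sketch, not an NHANES rule.
    """
    if irt_score == 0:
        return None
    return 100.0 * drt_score / irt_score

def aft_score(responses, non_animals=frozenset()):
    """Count valid unique animal names, ignoring case and exact repetitions.

    `non_animals` is a hypothetical exclusion set supplied by the scorer.
    Merging variants such as "dog"/"puppy" needs human judgment and is
    deliberately not attempted here.
    """
    seen = set()
    for word in responses:
        key = word.strip().lower()
        if key and key not in non_animals:
            seen.add(key)
    return len(seen)

# Toy usage with invented responses (not NHANES data).
retention = percent_retention(6, 8)   # 75.0
fluency = aft_score(["Dog", "cat", "dog ", "dragon"], non_animals={"dragon"})  # 2
```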
The "black-box" nature of such models also hinders clinical interpretability, a gap addressed by explainable AI (XAI) techniques like SHapley Additive exPlanations (SHAP)[2]. SHAP provides domain-specific feature attribution by:
1) Calculating independent contributions to each cognitive domain.
2) Identifying critical biomarker thresholds (e.g., systolic BP > 140 mmHg).
3) Visualizing interaction effects (e.g., education × pesticide exposure).

2. Methods

2.1 Study Population

This cross-sectional analysis used data from the NHANES 2011-2014 cycles. Among 3,802 participants aged ≥60 years who completed cognitive testing, the following eligibility criteria were applied.

Inclusion Criteria:
1) Age ≥60 years at survey participation;
2) Complete data on all four cognitive tests (CERAD Word Learning, Delayed Recall, Animal Fluency, Digit Symbol Substitution).

Exclusion Criteria:
1) Refusal to take the cognitive tests;
2) Missing responses in any cognitive test item (n=698 excluded);
3) Incomplete covariate data (demographics, comorbidities, or laboratory measures) (n=463 excluded).

The final unweighted analytical sample consisted of 2,733 participants.

Ethical Considerations

As this study involved retrospective analysis of publicly available, de-identified NHANES data, it was exempt from institutional review board (IRB) approval under 45 CFR §46.104(d)(4), which covers secondary analysis of existing public datasets that contain no identifiable private information. The original NHANES protocols were reviewed and approved by the National Center for Health Statistics (NCHS) Ethics Review Board (Protocol #2011-17), with written informed consent obtained from all participants during data collection.

2.2 Statistical Analysis

All statistical analyses were performed using R software (version 4.4.3; R Foundation for Statistical Computing).
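The SHAP attributions used in this study build on Shapley values. As a rough illustration of the underlying idea, the sketch below computes exact Shapley values for one instance by enumerating every feature subset. It is written in Python for brevity even though the analyses here were run in R. The toy model, its coefficients, and the choice to replace absent features with a fixed baseline are all assumptions of this sketch; the tree-specific algorithm described by Lundberg et al.[2] obtains attributions far more efficiently and treats absent features differently.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance by enumerating feature subsets.

    Features outside a subset take their `baseline` value. This replacement
    rule is a simplification chosen for the sketch.
    """
    n = len(x)

    def value(subset):
        mixed = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for subsets of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi

def toy_model(features):
    """Hypothetical score model with a threshold interaction (not fitted)."""
    age, education_years, diabetes = features
    score = 80.0 - 0.5 * age + 1.5 * education_years
    if diabetes and age > 70:
        score -= 5.0  # invented interaction penalty
    return score

x = [75, 12, 1]          # instance to explain
baseline = [65, 12, 0]   # reference participant
phi = shapley_values(toy_model, x, baseline)
# phi is approximately [-7.5, 0.0, -2.5]: the 5-point interaction penalty
# is split equally between age and diabetes, and the attributions sum to
# toy_model(x) - toy_model(baseline) = -10.
```

The enumeration grows exponentially with the number of features, which is why it is only practical for toy cases like this one.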
The study cohort consisted of 2,733 participants, which exceeds the minimum sample size required for machine learning models with up to 20 features, as determined by 5-fold cross-validation. Continuous variables (e.g., cognitive test scores, age) were presented as mean ± standard deviation (SD). Categorical variables (e.g., sex, education level) were presented as counts and proportions (%) (Table 1).

For each machine learning model, such as eXtreme Gradient Boosting (XGBoost), Random Forest, and Support Vector Machines (SVM), independent training-validation splits were generated using stratified random sampling to ensure representative distributions across key variables. The dataset was divided into training and validation sets in a 4:1 ratio, which ensured complete separation between the training processes and prevented any data leakage or inter-model contamination. No variable differed significantly between the training and validation sets. Each validation set was used exclusively to evaluate the corresponding model.

Performance metrics were calculated as follows:
Root Mean Squared Error (RMSE): measures model-specific prediction error, providing an estimate of the typical deviation between predicted and observed values.
R-squared (R²): quantifies the proportion of variance explained by the model, offering insight into overall fit and explanatory power.
Mean Absolute Error (MAE): assesses the absolute accuracy of each model, reflecting the average magnitude of the prediction errors.

No cross-model baseline comparisons were performed, as the primary focus of this analysis was on assessing intra-model generalizability rather than comparing performance across different architectures.

3. Results

3.1 Characteristics of participants

This study included 2,733 participants aged ≥60 years who completed all cognitive assessments.
Table 1 summarizes the demographic and clinical characteristics of the cohort using standard descriptive statistics.

Table 1. Demographic Characteristics & Clinical Information of Participants

Characteristic                                          All (N = 2,733)
Male (%)                                                1331 (48.7)
Age (Years)                                             69.4 ± 6.75
Education (%)
  Less than 9th grade                                   299 (10.9)
  9-11th grade (Includes 12th grade with no diploma)    386 (14.1)
  High school graduate/GED or equivalent                639 (23.4)
  Some college or AA degree                             786 (28.8)
  College graduate or above                             623 (22.8)
Marriage (%)
  Never married                                         154 (5.6)
  Married                                               1514 (55.4)
  Divorced                                              389 (14.2)
  Widowed                                               532 (19.5)
  Separated                                             68 (2.5)
  Living with partner                                   76 (2.8)
Drink (%)
  No                                                    871 (31.9)
  Yes                                                   1862 (68.1)
Hypertension (%)
  No                                                    1025 (37.5)
  Yes
Dyslipidemia
  No                                                    1191 (43.6)
  Yes                                                   1542 (56.4)
Recall1                                                 4.71 ± 1.69
Recall2                                                 6.71 ± 1.82
Recall3                                                 7.53 ± 1.80
Delay                                                   5.95 ± 2.29
Animal Fluency Test                                     16.6 ± 5.44
Diabetes (%)
  No                                                    1982 (72.5)
  Borderline                                            125 (4.6)
  Yes                                                   626 (22.9)
Pesticide
  No                                                    2364 (86.5)
  Yes                                                   369 (13.5)
Herbicide
  No                                                    2507 (91.7)
  Yes                                                   226 (8.3)
High Activity
  No Change                                             2118 (77.5)
  Worse Now                                             472 (17.3)
  Better Now                                            143 (5.2)
Moderate Activity
  Yes                                                   2497 (91.4)
  No                                                    236 (8.6)
Walk/Cycle to Work
  Yes                                                   296 (10.8)
  No                                                    2437 (89.2)
High Exercise
  Yes                                                   759 (27.8)
  No                                                    1974 (72.2)
Moderate Exercise
  Yes                                                   326 (11.9)
  No                                                    2407 (88.1)
Sitting Minutes/Day                                     393 ± 190
Sleeping Hours/Day                                      7.01 ± 1.41
Sleeping Disorder
  Yes                                                   560 (20.5)
  No                                                    2173 (79.5)
Olfactory changes
  Yes                                                   250 (9.1)
  No                                                    2483 (90.9)
Taste changes
  Yes                                                   1062 (38.9)
  No                                                    1671 (61.1)
Smelling Problem
  Yes                                                   264 (9.7)
  No                                                    2469 (90.3)
Tasting Problem
  Yes                                                   164 (6.0)
  No                                                    2569 (94.0)
IRT score                                               19.00 ± 4.62
AFT score                                               16.60 ± 5.44
DSST score                                              45.90 ± 17.10
DRT score                                               5.95 ± 2.29

3.2 Models Development & Validation

Four final models were constructed with the XGBoost algorithm, one regression model each for IRT, DRT, AFT, and DSST, and each was subsequently optimized via grid search. From these models, a variable importance ranking plot (Figure 1), partial dependence plots (Figure 2), and SHAP swarm plots (Figure 3) were generated.

3.3 Models Evaluation

Test    RMSE      R²       MAE
IRT     4.082     0.197    3.252
DRT     2.007     0.144    1.621
AFT     4.862     0.195    3.846
DSST    12.947    0.438    10.271

4. Discussion

A gradual deterioration in cognitive status is commonly observed in individuals aged ≥60 years as part of the natural aging process, with a notable decline in cognitive function over time[10]. As depicted in Figure 1 and Figure 3, education level (years of formal education) accounts for a significant portion of the variance in cognitive impairment within the older population, in whom cognitive decline is closely linked with increasing functional disability[11]. In addition, as shown in Figure 2 and Figure 3, age plays a crucial role in the onset and progression of cognitive impairment. Aging itself is an unavoidable and major risk factor for cognitive decline[12].

Regular and sufficient sleep is vital for maintaining brain health. With advancing age, however, both sleep duration and the occurrence of sleep disorders can significantly affect cognitive function in older adults. A cohort study conducted in China[13] reported that, compared with older individuals who slept 6 hours per day, those with less than 5 hours of sleep had a 30% increased risk of cognitive impairment (HR = 1.30, 95% CI: 1.05-1.62). Furthermore, those who slept 7, 8, and more than 9 hours per day had a 34% (HR = 1.34, 95% CI: 1.09-1.64), 40% (HR = 1.40, 95% CI: 1.17-1.69), and 43% (HR = 1.43, 95% CI: 1.19-1.70) increased risk of cognitive impairment, respectively. Trend tests indicated that as sleep duration increased beyond 6 hours, the risk of cognitive impairment also rose, suggesting a dose-response relationship (P < 0.001).

It is important to emphasize that physical activity, exercise, and moderate work have a protective effect on cognitive health in older adults.
Whether through high-intensity or moderate exercise, or simpler activities such as cycling or walking to work, all of these contribute positively to mitigating cognitive decline[14]. On the other hand, chronic alcohol consumption significantly undermines cognitive reserve. Long-term heavy alcohol use (defined by the World Health Organization/European Medicines Agency as exceeding 60 grams of pure alcohol per day for men and 40 grams per day for women) is strongly associated with an increased risk of cognitive impairment and dementia[15]. Additionally, underlying medical conditions common in older adults, such as hypertension, dyslipidemia, and sensory changes, specifically in taste and smell, can further exacerbate cognitive decline[16]. Finally, exposure to pesticides and herbicides continues to pose a significant threat to cognitive function in older adults. Various studies have shown that such environmental toxins contribute to cognitive deterioration in residents[17, 18].

5. Conclusion

This is the first study to establish individualized XGBoost models for distinct cognitive domains within the NHANES dataset, marking a significant advancement in the application of machine learning for cognitive assessment in older adults. By leveraging SHAP-driven interpretability, this research demonstrates how machine learning models can not only provide accurate predictions of cognitive decline but also offer valuable insights into the underlying mechanisms driving these predictions. SHAP values, which highlight the contribution of individual features to model outcomes, provide a clearer understanding of how specific factors influence cognitive impairment, thereby enhancing early screening strategies for dementia and other cognitive disorders.
Our findings underscore the critical role of metabolic and clinical factors, such as diabetes mellitus and sleep disorders, in the prediction of cognitive decline, highlighting their potential as actionable targets for dementia prevention. Additionally, lifestyle factors, including physical activity and sleep duration, were identified as key modifiable risk factors. These insights provide a framework for developing personalized intervention strategies, in which healthcare providers can target specific risk factors based on an individual's unique cognitive profile. Ultimately, our study contributes to the growing body of knowledge on the importance of early and individualized risk assessments for dementia prevention in aging populations, offering a promising avenue for improving long-term cognitive health outcomes.

Declarations

Funding: This study was supported by the Jiangsu Commission of Health (K2023023, LKZ2024006), the Suzhou Municipal Health Commission (DZXYJ202313), and the Nanjing Drum Tower Hospital Clinical Research Project (2021-LCYJ-MS-21).

Competing interests: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

1. Livingston G, Huntley J, Sommerlad A, Ames D, Ballard C, Banerjee S, et al. Dementia prevention, intervention, and care: 2020 report of the Lancet Commission. Lancet (London, England). 2020;396:413-46. https://doi.org/10.1016/s0140-6736(20)30367-6
2. Lundberg SM, Erion G, Chen H, DeGrave A, Prutkin JM, Nair B, et al. From Local Explanations to Global Understanding with Explainable AI for Trees. Nature machine intelligence. 2020;2:56-67. https://doi.org/10.1038/s42256-019-0138-9
3. Bernhardt EB. Testing foreign language reading comprehension: The immediate recall protocol. Die Unterrichtspraxis/Teaching German. 1983;16:27-33.
4. Camos V, Portrat S. The impact of cognitive load on delayed recall. Psychonomic bulletin & review. 2015;22:1029-34. https://doi.org/10.3758/s13423-014-0772-5
5. Sebaldt R, Dalziel W, Massoud F, Tanguay A, Ward R, Thabane L, et al. Detection of cognitive impairment and dementia using the animal fluency test: the DECIDE study. 2009;36:599-604.
6. Jaeger J. Digit Symbol Substitution Test: The Case for Sensitivity Over Specificity in Neuropsychological Testing. Journal of clinical psychopharmacology. 2018;38:513-9. https://doi.org/10.1097/jcp.0000000000000941
7. Ma X, Huang W, Lu L, Li H, Ding J, Sheng S, et al. Developing and validating a nomogram for cognitive impairment in the older people based on the NHANES. Front Neurosci. 2023;17:1195570. https://doi.org/10.3389/fnins.2023.1195570
8. Chen T, Guestrin C. XGBoost: A Scalable Tree Boosting System. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. San Francisco, California, USA: Association for Computing Machinery; 2016. p. 785-94.
9. Martin SA, Townend FJ, Barkhof F, Cole JH. Interpretable machine learning for dementia: A systematic review. Alzheimer's & dementia: the journal of the Alzheimer's Association. 2023;19:2135-49. https://doi.org/10.1002/alz.12948
10. Yuan Y, Peng C, Burr JA, Lapane KL. Frailty, cognitive impairment, and depressive symptoms in Chinese older adults: an eight-year multi-trajectory analysis. BMC Geriatrics. 2023;23:843. https://doi.org/10.1186/s12877-023-04554-1
11. Lövdén M, Fratiglioni L, Glymour MM, Lindenberger U, Tucker-Drob EM. Education and Cognitive Functioning Across the Life Span. Psychological science in the public interest: a journal of the American Psychological Society. 2020;21:6-41. https://doi.org/10.1177/1529100620920576
12. Lin Y, Wang H, Tian Y, Gong L, Chang C. [Factors influencing cognitive function among the older adults in Beijing]. Beijing da xue xue bao Yi xue ban = Journal of Peking University Health sciences. 2024;56:456-61. https://doi.org/10.19723/j.issn.1671-167X.2024.03.012
13. Wei Y, Lin JL, Chen G, Pei LJ. [A cohort study of association between sleep duration and cognitive impairment in the elderly aged 65 years and older in China]. Zhonghua liu xing bing xue za zhi = Zhonghua liuxingbingxue zazhi. 2022;43:359-65. https://doi.org/10.3760/cma.j.cn112338-20210410-00305
14. Iso-Markku P, Aaltonen S, Kujala UM, Halme HL, Phipps D, Knittle K, et al. Physical Activity and Cognitive Decline Among Older Adults: A Systematic Review and Meta-Analysis. JAMA network open. 2024;7:e2354285. https://doi.org/10.1001/jamanetworkopen.2023.54285
15. Rehm J, Hasan OSM, Black SE, Shield KD, Schwarzinger M. Alcohol use and dementia: a systematic scoping review. Alzheimer's Research & Therapy. 2019;11:1. https://doi.org/10.1186/s13195-018-0453-0
16. Wang X, Wang Z, Qi S, Zhang M, Zhang X, Guan Y, et al. Risk factors and their interaction on cognitive impairment in the elderly in China: case-control study. 2020;41:705-10.
17. Aloizou AM, Siokas V, Vogiatzi C, Peristeri E, Docea AO, Petrakis D, et al. Pesticides, cognitive functions and dementia: A review. Toxicology letters. 2020;326:31-51. https://doi.org/10.1016/j.toxlet.2020.03.005
18. Hsiao CC, Yang AM, Wang C, Lin CY. Association between glyphosate exposure and cognitive function, depression, and neurological diseases in a representative sample of US adults: NHANES 2013-2014 analysis. Environmental research. 2023;237:116860. https://doi.org/10.1016/j.envres.2023.116860

Additional Declarations

No competing interests reported.
Our growing team is made up of researchers and industry professionals working together to solve the most critical problems facing scientific publishing. Also discoverable on Platform About Our Team In Review Editorial Policies Advisory Board Help Center Resources Author Services Accessibility API Access RSS feed Manage Cookie Preferences © Research Square 2026 | ISSN 2693-5015 (online) Privacy Policy Terms of Service Do Not Sell My Personal Information {"props":{"pageProps":{"initialData":{"identity":"rs-6872563","acceptedTermsAndConditions":true,"allowDirectSubmit":true,"archivedVersions":[],"articleType":"Research Article","associatedPublications":[],"authors":[{"id":472771695,"identity":"a3d45ac7-1bef-41e7-a8f0-c549d163c3f5","order_by":0,"name":"Ligong Ji","email":"","orcid":"","institution":"Taixing Second People's Hospital","correspondingAuthor":false,"prefix":"","firstName":"Ligong","middleName":"","lastName":"Ji","suffix":""},{"id":472771696,"identity":"b01fa4bd-ee38-4d7a-91db-f602a2bfd36b","order_by":1,"name":"Xiaoming Ma","email":"","orcid":"","institution":"Suzhou Hospital, Affiliated Hospital of Medical School, Nanjing University","correspondingAuthor":false,"prefix":"","firstName":"Xiaoming","middleName":"","lastName":"Ma","suffix":""},{"id":472771697,"identity":"14847849-1496-4358-8f16-5f5c296fe094","order_by":2,"name":"Yuxiang Zhang","email":"","orcid":"","institution":"Nanjing Drum Tower Hospital","correspondingAuthor":false,"prefix":"","firstName":"Yuxiang","middleName":"","lastName":"Zhang","suffix":""},{"id":472771698,"identity":"2077a95c-e2f6-4a37-9cb0-6bc417f993c0","order_by":3,"name":"Xiaojing Li","email":"","orcid":"","institution":"Suzhou Hospital, Affiliated Hospital of Medical School, Nanjing University","correspondingAuthor":false,"prefix":"","firstName":"Xiaojing","middleName":"","lastName":"Li","suffix":""},{"id":472771699,"identity":"739b5570-3ca7-43fb-a3be-30d240229285","order_by":4,"name":"Zenglin 
Cai","email":"","orcid":"","institution":"Suzhou Hospital, Affiliated Hospital of Medical School, Nanjing University","correspondingAuthor":false,"prefix":"","firstName":"Zenglin","middleName":"","lastName":"Cai","suffix":""},{"id":472771700,"identity":"889dc4e2-5509-4e24-87f4-8afa44cc096b","order_by":5,"name":"Jiahui Shen","email":"","orcid":"","institution":"Suzhou Hospital, Affiliated Hospital of Medical School, Nanjing University","correspondingAuthor":false,"prefix":"","firstName":"Jiahui","middleName":"","lastName":"Shen","suffix":""},{"id":472771701,"identity":"5609ec1e-cfcf-4b88-8228-4f194ccaab51","order_by":6,"name":"Yijie Wu","email":"","orcid":"","institution":"Suzhou Hospital, Affiliated Hospital of Medical School, Nanjing University","correspondingAuthor":false,"prefix":"","firstName":"Yijie","middleName":"","lastName":"Wu","suffix":""},{"id":472771702,"identity":"9baa3cf3-4854-4bf3-a844-b41954ba58f6","order_by":7,"name":"Jingwei Li","email":"","orcid":"","institution":"Suzhou Hospital, Affiliated Hospital of Medical School, Nanjing University","correspondingAuthor":false,"prefix":"","firstName":"Jingwei","middleName":"","lastName":"Li","suffix":""},{"id":472771703,"identity":"3987ccb2-2352-48e8-9772-a3e9f66f3638","order_by":8,"name":"Jingjing Su","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAAAvElEQVRIiWNgGAWjYHCCBAMJAzY5Nvb2AyRosSjgM+bjOZNAgj0VH+QS50k4GBCnWn5GwoOCGwZm6W0SDAkMPyq2EdZicCMhwXCGQVpum3TjAcaeM7eJ0CKRkGAsYXAst03mQAIzYxsRWoAOSzD+Y/A/nU0iwYA4LQxAh4ECOYF4LQZnHoC1GLYBA/kgUX6Rb89JM5D4wyYv395+8MGPCmIcxsCTBo+PA8SoBwL2ww+IVDkKRsEoGAUjFQAAy606Ty3UqRkAAAAASUVORK5CYII=","orcid":"","institution":"Second Affiliated Hospital of Nanjing Medical University","correspondingAuthor":true,"prefix":"","firstName":"Jingjing","middleName":"","lastName":"Su","suffix":""}],"badges":[],"createdAt":"2025-06-11 
13:53:26","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-6872563/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-6872563/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":84941956,"identity":"b1fe66ab-b4b0-4d86-a43f-6de2b434611b","added_by":"auto","created_at":"2025-06-19 05:30:23","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":423190,"visible":true,"origin":"","legend":"\u003cp\u003eVariable importance ranking plot derived from the four models\u003c/p\u003e","description":"","filename":"Figure1.png","url":"https://assets-eu.researchsquare.com/files/rs-6872563/v1/c02d5decbc01f38ee5f1160c.png"},{"id":84941957,"identity":"b0d6a1cf-e53d-4004-9e61-29032fe1abab","added_by":"auto","created_at":"2025-06-19 05:30:23","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":2356152,"visible":true,"origin":"","legend":"\u003cp\u003ePartial dependence plot derived from the four models\u003c/p\u003e","description":"","filename":"Figure2.png","url":"https://assets-eu.researchsquare.com/files/rs-6872563/v1/9e44e28cf8aa8557a6bd09b1.png"},{"id":84941958,"identity":"a3a53d26-3f91-43fd-bc1d-beea7c7420ca","added_by":"auto","created_at":"2025-06-19 05:30:23","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":1121659,"visible":true,"origin":"","legend":"\u003cp\u003eSHAP plot derived from the four models\u003c/p\u003e","description":"","filename":"Figure3.png","url":"https://assets-eu.researchsquare.com/files/rs-6872563/v1/c1f1587b1b328e91598165e4.png"},{"id":84942701,"identity":"8c762b9d-56a1-481c-9558-a08291966e00","added_by":"auto","created_at":"2025-06-19 
05:38:30","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":4274600,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-6872563/v1/ccddfadd-717c-4992-9d85-1d7b4e9cd52d.pdf"}],"financialInterests":"No competing interests reported.","formattedTitle":"Interpretable Machine Learning for Cognitive Impairment Assessment: SHAP-Based Analysis of XGBoost Models Using NHANES 2011-2014 Cognitive Tests in Older Adults","fulltext":[{"header":"1. Introduction","content":"\u003cp\u003eCognitive decline in older adults represents a significant and growing public health challenge, with profound implications for individuals, families, and healthcare systems. The prevalence of cognitive impairment among adults aged 65 and older exceeds 20% in the United States, and the associated economic burden of dementia care is projected to reach $1 trillion annually by 2050[1]. Given this, early identification of modifiable risk factors is essential for developing effective preventive interventions and mitigating the societal impact of cognitive decline. However, current methods for cognitive assessment, such as the Mini-Mental State Examination (MMSE) and the Montreal Cognitive Assessment (MoCA), face several limitations that hinder their capacity to detect subtle cognitive deficits and model the complex relationships between risk factors and cognitive outcomes.\u003c/p\u003e\n\u003cp\u003eTwo key limitations are particularly noteworthy: 1) composite scoring systems, like the MMSE, tend to obscure domain-specific deficits by aggregating diverse cognitive measures into a single score; 2) traditional statistical methods often fail to capture the nonlinear, interactive effects among a wide array of multidimensional predictors, including demographic factors, lifestyle behaviors, metabolic markers, and neuropsychological test scores[2]. 
These shortcomings emphasize the need for more advanced approaches that can accurately model the complexity of cognitive decline and provide actionable insights for early intervention.\u003c/p\u003e\n\u003cp\u003eWhile traditional statistical approaches have identified various associations between demographic factors and cognitive performance, they often fall short in addressing the intricate, multifaceted nature of cognitive decline. Specifically, factors such as age, sex, education level, metabolic health, and environmental influences interact in complex, nonlinear ways that are not easily captured by conventional methods. In contrast, modern cognitive assessments, such as the National Health and Nutrition Examination Survey (NHANES) cognitive battery, offer a more comprehensive evaluation by incorporating multiple cognitive domains, including memory, fluency, and executive function. This multidimensional battery provides a unique opportunity to examine cognitive function through a multifactorial lens. These tests assess critical aspects of cognitive processing, from hippocampal-dependent memory consolidation to frontostriatal processing speed, enabling a nuanced understanding of cognitive impairment in older adults. 
The NHANES cognitive assessments comprise the Immediate Recall Test (IRT)[3] and Delayed Recall Test (DRT)[4], which quantify hippocampal-dependent memory consolidation; the Animal Fluency Test (AFT)[5], which assesses left frontal-temporal semantic networks; and the Digit Symbol Substitution Test (DSST)[6], a proxy for frontostriatal processing speed. Together, they provide a unique opportunity to investigate these multifactorial relationships through advanced analytics.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eDespite their advantages, prior research[7] in this area relies on the Z-score normalization method to aggregate scores across different cognitive tests into a single composite measure. While this approach simplifies statistical analysis, it introduces several critical limitations. Specifically, it ignores the biological differences between cognitive domains, fails to account for threshold effects (e.g., accelerated decline in processing speed associated with high HbA1c levels), and distorts the covariance structure of tests that share neural substrates. As a result, traditional methods like Z-score aggregation may obscure domain-specific deficits and fail to capture the full complexity of cognitive impairment. To summarize, traditional analyses[7] using Z-score aggregation introduce three critical biases: first, equal weighting of biologically distinct domains (e.g., 1 SD in DSST ≠ 1 SD in DRT neurobiologically); second, loss of threshold effects (e.g., accelerated DSST decline when HbA1c \u0026gt;6.5%); and third, covariance distortion across tests with shared neural substrates.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eMachine learning approaches, particularly ensemble methods like eXtreme Gradient Boosting (XGBoost)[8], offer a promising alternative to traditional statistical methods. XGBoost excels in modeling nonlinear relationships between predictors and cognitive outcomes, enabling more accurate predictions and the identification of complex interactions among variables [8]. 
Furthermore, XGBoost can effectively handle mixed data types, including continuous, categorical, and ordinal variables, making it well-suited for analyzing the diverse set of predictors that influence cognitive decline.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eHowever, the widespread adoption of machine learning techniques in clinical settings is hindered by their \"black-box\" nature. While these models provide highly accurate predictions, they often lack interpretability, which is essential for clinical decision-making. To address this issue, explainable AI (XAI) techniques, such as SHapley Additive exPlanations (SHAP), have been developed to provide insights into the underlying decision-making process of complex models. SHAP offers a framework for identifying the contributions of individual features to cognitive outcomes, enabling clinicians to better understand the role of specific risk factors in cognitive decline. By using SHAP to analyze XGBoost models, this study seeks to provide domain-specific, interpretable insights into the relationships between cognitive performance and a wide range of predictors, including lifestyle factors, metabolic markers, and demographic characteristics[9].\u003c/p\u003e\n\u003cp\u003eIn summary, this study aims to bridge the gap between advanced machine learning models and clinically relevant, interpretable explanations of cognitive impairment. By leveraging SHAP-based analysis of XGBoost models trained on NHANES cognitive data, we seek to uncover critical risk factors and interactions that influence cognitive decline in older adults. 
This approach promises to provide a more nuanced understanding of cognitive impairment and offer valuable insights for targeted, preventive interventions.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e1.IRT\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eTest Content:\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eParticipants are presented with a list of unrelated words (e.g., 10 nouns) and asked to verbally recall as many words as possible immediately after presentation. This assesses short-term episodic memory and auditory-verbal learning capacity.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eScoring Method:\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eTotal score = Number of correctly recalled words (0-10)\u003c/p\u003e\n\u003cp\u003eNo penalty for order errors or intrusions (non-list words)\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e2.DRT\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eTest Content:\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eAfter a 20–30-minute delay (filled with non-memory tasks), participants are asked to recall the same word list from the IRT without prior warning. This evaluates long-term memory retention and hippocampal-dependent consolidation.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eScoring Method:\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eTotal score = Number of correctly recalled words (0-10)\u003c/p\u003e\n\u003cp\u003eCompared to IRT scores to calculate percent retention: (DRT/IRT) ×100\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e3.AFT\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eTest Content:\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eParticipants name as many animals as possible within 60 seconds, excluding proper nouns or repetitions. 
This measures semantic memory access, verbal fluency, and executive control of lexical retrieval.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eScoring Method:\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eTotal score = Valid unique animal names generated\u003c/p\u003e\n\u003cp\u003eExclusions: Non-animal terms (e.g., \"dragon\"), repetitions, or variants (e.g., \"dog\"/\"puppy\" count as one)\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e4.DSST\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eTest Content:\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eParticipants match numbers (1-9) to corresponding abstract symbols using a reference key within 120 seconds. This assesses processing speed, visual-motor coordination, and working memory.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eScoring Method:\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eTotal score = Number of correct symbol-number pairings (0-133)\u003c/p\u003e\n\u003cp\u003eErrors (incorrect matches) are recorded but typically not penalized in NHANES scoring.\u003c/p\u003e\n\u003cp\u003eRecent advances in machine learning, particularly ensemble methods like XGBoost, offer superior predictive accuracy for complex biomedical datasets. However, clinical adoption requires interpretability frameworks that align with neuropsychological theory. Moreover, the \"black-box\" nature of such models hinders clinical interpretability—a gap addressed by explainable AI (XAI) techniques like SHapley Additive exPlanations (SHAP)[2]. SHAP provides domain-specific feature attribution by: 1) Calculating independent contributions to each cognitive domain. 2) Identifying critical biomarker thresholds (e.g., systolic BP \u0026gt;140 mmHg). 3) Visualizing interaction effects (e.g., education × pesticide exposure).\u003c/p\u003e"},{"header":"2. 
Methods","content":"\u003cp\u003e\u003cstrong\u003e2.1 Study Population\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis cross-sectional analysis utilized data from the NHANES 2011-2014 cycles. Among 3,802 participants aged ≥60 years who completed cognitive testing, the following eligibility criteria were applied:\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eInclusion Criteria:\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003e2.1.1 Age ≥60 years at survey participation;\u003c/p\u003e\n\u003cp\u003e2.1.2 Complete data on all four cognitive tests (CERAD Word Learning, Delayed Recall, Animal Fluency, Digit Symbol Substitution).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eExclusion Criteria:\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003e2.1.3 Refusal to take the cognitive tests;\u003c/p\u003e\n\u003cp\u003e2.1.4 Missing responses in any cognitive test item (n=698 excluded);\u003c/p\u003e\n\u003cp\u003e2.1.5 Incomplete covariate data (demographics, comorbidities, or laboratory measures) (n=463 excluded).\u003c/p\u003e\n\u003cp\u003eThe final unweighted analytical sample consisted of 2,733 participants.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eEthical Considerations\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eAs this study involved retrospective analysis of publicly available, de-identified NHANES data, it was exempt from institutional review board (IRB) approval under 45 CFR §46.104(d)(4), which covers secondary analysis of existing public datasets\u0026nbsp;that contain no identifiable private information.\u003c/p\u003e\n\u003cp\u003eThe original NHANES protocols were reviewed and approved by the National Center for Health Statistics (NCHS) Ethics Review Board (Protocol #2011-17), with written informed consent obtained from all participants during data collection.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e2.2 Statistical Analysis\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eAll statistical 
analyses were performed using R software (version 4.4.3; R Foundation for Statistical Computing). The study cohort consisted of 2,733 participants, which exceeds the minimum sample size required for machine learning models with up to 20 features as determined by 5-fold cross-validation. Continuous variables (e.g., cognitive test scores, age) were presented as mean ± standard deviation (SD). Categorical variables (e.g., sex, education level) were presented as counts and proportions (%), as shown in \u003cstrong\u003eTable 1\u003c/strong\u003e.\u003c/p\u003e\n\u003cp\u003eFor each machine learning model, such as eXtreme Gradient Boosting (XGBoost), Random Forest, and Support Vector Machines (SVM), independent training-validation splits were generated using stratified random sampling to ensure representative distributions across key variables. The dataset was divided into training and validation sets in a 4:1 ratio, which ensured complete separation between the training processes and prevented any data leakage or inter-model contamination. No variable differed significantly between the training and validation sets.\u003c/p\u003e\n\u003cp\u003eEach validation set was used exclusively to evaluate the corresponding model. 
Performance metrics were calculated as follows:\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eRoot Mean Squared Error (RMSE)\u003c/strong\u003e - Used to measure model-specific prediction errors, providing an estimate of the average deviation between predicted and observed values;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eR-squared (R²)\u003c/strong\u003e - Used to quantify the proportion of variance explained by the model, offering insights into the model's overall fit and explanatory power;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eMean Absolute Error (MAE)\u003c/strong\u003e - Used to assess the absolute accuracy of each model, reflecting the average magnitude of the prediction errors.\u003c/p\u003e\n\u003cp\u003eNo cross-model baseline comparisons were performed, as the primary focus of this analysis was on assessing intra-model generalizability rather than comparing model performance across different architectures.\u003c/p\u003e"},{"header":"3. Results","content":"\u003cp\u003e\u003cstrong\u003e3.1\u003c/strong\u003e\u003cstrong\u003e\u0026nbsp;\u003c/strong\u003e\u003cstrong\u003eCharacteristics of participants\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis study included 2,733 participants aged \u0026ge;60 years who completed all cognitive assessments. \u003cstrong\u003eTable 1\u003c/strong\u003e summarizes the demographic and clinical characteristics of the cohort using standard descriptive statistics.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eTable 1. 
Demographic Characteristics \u0026amp; Clinical Information of Participants\u003c/strong\u003e\u003c/p\u003e\n\u003ctable border=\"0\" cellspacing=\"0\" cellpadding=\"0\"\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u003cstrong\u003eCharacteristic\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u003cstrong\u003eAll (N = 2,733)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eMale (%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e1331 (48.7)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eAge (Years)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e69.4 \u0026plusmn; 6.75\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eEducation (%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eLess than 9th grade\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e299 (10.9)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp; 9-11th grade (Includes 12th grade with no diploma)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e386 (14.1)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp; High school graduate/GED or equivalent\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e639 (23.4)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp; Some college or AA 
degree\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e786 (28.8)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp; College graduate or above\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e623 (22.8)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eMarriage (%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp; Never married\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e154 (5.6)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;Married\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e1514 (55.4)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;Divorced\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e389 (14.2)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;Widowed\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e532 (19.5)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;Separated\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e68 (2.5)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;Living with partner\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e76 (2.8)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd 
valign=\"top\"\u003e\n \u003cp\u003eDrink (%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;No\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e871 (31.9)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;Yes\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e1862 (68.1)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eHypertension (%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;No\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e1025 (37.5)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;Yes\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eDyslipidemia\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;No\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e1191 (43.6)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;Yes\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e1542 (56.4)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n 
\u003cp\u003eRecall1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e4.71 \u0026plusmn; 1.69\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eRecall2\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e6.71 \u0026plusmn; 1.82\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eRecall3\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e7.53 \u0026plusmn; 1.80\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eDelay\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e5.95 \u0026plusmn; 2.29\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eAnimal Fluency Test\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e16.6 \u0026plusmn; 5.44\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eDiabetes (%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;No\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e1982 (72.5)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;Borderline\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e125 (4.6)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;Yes\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e626 (22.9)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd 
valign=\"top\"\u003e\n \u003cp\u003ePesticide\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;No\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e2364 (86.5)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;Yes\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e369 (13.5)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eHerbicide\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eNo\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e2507 (91.7)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;Yes\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e226 (8.3)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eHigh Activity\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;No Change\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e2118 (77.5)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;Worse Now\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e472 (17.3)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n 
\u003cp\u003e\u0026nbsp;Better Now\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e143 (5.2)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eModerate Activity\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;Yes\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e2497 (91.4)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;No\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e236 (8.6)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eWalk/Cycle to Work\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;Yes\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e296 (10.8)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;No\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e2437 (89.2)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eHigh Exercise\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;Yes\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e759 (27.8)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n 
\u003cp\u003e\u0026nbsp;No\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e1974 (72.2)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eModerate Exercise\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;Yes\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e326 (11.9)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;No\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e2407 (88.1)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eSitting Minutes/ Day\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e393 \u0026plusmn; 190\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eSleeping Hours/ Day\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e7.01 \u0026plusmn; 1.41\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eSleeping Disorder\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;Yes\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e560 (20.5)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;No\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e2173 (79.5)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd 
valign=\"top\"\u003e\n \u003cp\u003eOlfactory changes\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;Yes\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e250 (9.1)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;No\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e2483 (90.9)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eTaste changes\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;Yes\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e1062 (38.9)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;No\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e1671 (61.1)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eSmelling Problem\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;Yes\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e264 (9.7)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;No\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e2469 (90.3)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n 
\u003cp\u003eTasting Problem\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;Yes\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e164 (6.0)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;No\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e2569 (94.0)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eIRT score\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e19.00 \u0026plusmn; 4.62\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eAFT score\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e16.60 \u0026plusmn; 5.44\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eDSST score\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e45.90 \u0026plusmn; 17.10\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eDRT score\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e5.95 \u0026plusmn; 2.29\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n\u003c/table\u003e\n\u003cp\u003e\u003cstrong\u003e3.2\u003c/strong\u003e\u003cstrong\u003e\u0026nbsp;\u003c/strong\u003e\u003cstrong\u003eModels Development \u0026amp; Validation\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eBased on the four final models constructed by XGBoost algorithm (with regression modeling for IRT, DRT, AFT, and DSST, subsequently optimized via grid search), variable importance ranking plot (\u003cstrong\u003eFigure 
1\u003c/strong\u003e), partial dependence plot (\u003cstrong\u003eFigure 2\u003c/strong\u003e), and SHAP swarm plots (\u003cstrong\u003eFigure 3\u003c/strong\u003e) were generated.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e3.3\u003c/strong\u003e\u003cstrong\u003e\u0026nbsp;\u003c/strong\u003e\u003cstrong\u003eModels Evaluation\u003c/strong\u003e\u003c/p\u003e\n\u003ctable border=\"0\" cellspacing=\"0\" cellpadding=\"0\"\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 138px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 138px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eRMSE\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 138px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eR\u003csup\u003e2\u003c/sup\u003e\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 138px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eMAE\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 138px;\"\u003e\n \u003cp\u003eIRT\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 138px;\"\u003e\n \u003cp\u003e4.082\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 138px;\"\u003e\n \u003cp\u003e0.197\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 138px;\"\u003e\n \u003cp\u003e3.252\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 138px;\"\u003e\n \u003cp\u003eDRT\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 138px;\"\u003e\n \u003cp\u003e2.007\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 138px;\"\u003e\n \u003cp\u003e0.144\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 138px;\"\u003e\n \u003cp\u003e1.621\u003c/p\u003e\n \u003c/td\u003e\n 
\u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 138px;\"\u003e\n \u003cp\u003eAFT\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 138px;\"\u003e\n \u003cp\u003e4.862\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 138px;\"\u003e\n \u003cp\u003e0.195\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 138px;\"\u003e\n \u003cp\u003e3.846\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 138px;\"\u003e\n \u003cp\u003eDSST\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 138px;\"\u003e\n \u003cp\u003e12.947\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 138px;\"\u003e\n \u003cp\u003e0.438\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 138px;\"\u003e\n \u003cp\u003e10.271\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n\u003c/table\u003e"},{"header":"4. Discussion","content":"\u003cp\u003eCognitive status commonly deteriorates gradually in individuals aged ≥ 60 years as part of the natural aging process[10]. As depicted in \u003cstrong\u003eFigure 1\u003c/strong\u003e and \u003cstrong\u003eFigure 3\u003c/strong\u003e, education level (years of formal education) accounts for a significant portion of the variation in cognitive performance among older adults, in whom cognitive decline is closely linked with increasing functional disability[11].\u003c/p\u003e\n\u003cp\u003eIn addition, as shown in \u003cstrong\u003eFigure 2\u003c/strong\u003e and \u003cstrong\u003eFigure 3\u003c/strong\u003e, age plays a crucial role in the onset and progression of cognitive impairment. Aging itself is an unavoidable and major risk factor for cognitive decline[12]. Regular, sufficient sleep is vital for maintaining brain health. 
However, with advancing age, both sleep duration and sleep disorders can significantly affect cognitive function in older adults. A cohort study conducted in China[13] found that, compared with older adults who slept 6 hours per day, those who slept less than 5 hours had a 30% higher risk of cognitive impairment (HR = 1.30, 95% CI: 1.05 – 1.62). Those who slept 7, 8, and more than 9 hours per day had a 34% (HR = 1.34, 95% CI: 1.09 – 1.64), 40% (HR = 1.40, 95% CI: 1.17 – 1.69), and 43% (HR = 1.43, 95% CI: 1.19 – 1.70) higher risk of cognitive impairment, respectively. Trend tests indicated that the risk of cognitive impairment rose as sleep duration increased beyond 6 hours, suggesting a dose-response relationship (\u003cem\u003eP\u003c/em\u003e \u0026lt; 0.001).\u003c/p\u003e\n\u003cp\u003ePhysical activity, exercise, and moderate work have a protective effect on cognitive health in older adults. High-intensity and moderate exercise, as well as simpler activities such as cycling or walking to work, all help mitigate cognitive decline[14].\u003c/p\u003e\n\u003cp\u003eIn contrast, chronic alcohol consumption significantly undermines cognitive reserve. 
Long-term heavy alcohol use (defined by the World Health Organization/European Medicines Agency as more than 60 grams of pure alcohol per day for men and more than 40 grams per day for women) is strongly associated with an increased risk of cognitive impairment and dementia[15].\u003c/p\u003e\n\u003cp\u003eAdditionally, underlying medical conditions common in older adults, such as hypertension, dyslipidemia, and sensory changes (specifically in taste and smell), can further exacerbate cognitive decline[16].\u003c/p\u003e\n\u003cp\u003eFinally, exposure to pesticides and herbicides continues to pose a significant threat to cognitive function in the elderly. Various studies have shown that such environmental toxins contribute to cognitive deterioration in exposed populations[17, 18].\u003c/p\u003e"},{"header":"5. Conclusion","content":"\u003cp\u003eThis is the first study to establish individualized XGBoost models for distinct cognitive domains within the NHANES dataset, advancing the application of machine learning to cognitive assessment in older adults. By leveraging SHAP-driven interpretability, this research demonstrates how machine learning models can not only predict cognitive test performance but also offer insight into the features driving these predictions. SHAP values, which quantify the contribution of individual features to model outcomes, provide a clearer understanding of how specific factors influence cognitive impairment, thereby enhancing early screening strategies for dementia and other cognitive disorders.\u003c/p\u003e\n\u003cp\u003eOur findings underscore the critical role of metabolic markers such as diabetes mellitus, together with sleep disorders, in the prediction of cognitive decline, highlighting their potential as actionable targets for dementia prevention. Additionally, lifestyle factors, including physical activity and sleep duration, were identified as key modifiable risk factors. 
These insights provide a framework for developing personalized intervention strategies in which healthcare providers target specific risk factors based on an individual's unique cognitive profile. Ultimately, our study contributes to the growing body of knowledge on the importance of early and individualized risk assessments for dementia prevention in aging populations, offering a promising avenue for improving long-term cognitive health outcomes.\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eFunding:\u0026nbsp;\u003c/strong\u003eThis study was supported by the Jiangsu Commission of Health (K2023023, LKZ2024006), the Suzhou Municipal Health Commission (DZXYJ202313), and the Nanjing Drum Tower Hospital Clinical Research Project (2021-LCYJ-MS-21).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eCompeting interests:\u0026nbsp;\u003c/strong\u003eThe authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n\u003cli\u003eLivingston G, Huntley J, Sommerlad A, Ames D, Ballard C, Banerjee S, et al. Dementia prevention, intervention, and care: 2020 report of the Lancet Commission. Lancet (London, England). 2020;396:413-46 https://doi.org/10.1016/s0140-6736(20)30367-6.\u003c/li\u003e\n\u003cli\u003eLundberg SM, Erion G, Chen H, DeGrave A, Prutkin JM, Nair B, et al. From Local Explanations to Global Understanding with Explainable AI for Trees. Nature machine intelligence. 2020;2:56-67 https://doi.org/10.1038/s42256-019-0138-9.\u003c/li\u003e\n\u003cli\u003eBernhardt EB. Testing foreign language reading comprehension: The immediate recall protocol. Die Unterrichtspraxis / Teaching German. 1983;16:27-33.\u003c/li\u003e\n\u003cli\u003eCamos V, Portrat S. The impact of cognitive load on delayed recall. Psychonomic bulletin \u0026amp; review. 
2015;22:1029-34 https://doi.org/10.3758/s13423-014-0772-5.\u003c/li\u003e\n\u003cli\u003eSebaldt R, Dalziel W, Massoud F, Tanguay A, Ward R, Thabane L, et al. Detection of cognitive impairment and dementia using the animal fluency test: the DECIDE study. 2009;36:599-604.\u003c/li\u003e\n\u003cli\u003eJaeger J. Digit Symbol Substitution Test: The Case for Sensitivity Over Specificity in Neuropsychological Testing. Journal of clinical psychopharmacology. 2018;38:513-9 https://doi.org/10.1097/jcp.0000000000000941.\u003c/li\u003e\n\u003cli\u003eMa X, Huang W, Lu L, Li H, Ding J, Sheng S, et al. Developing and validating a nomogram for cognitive impairment in the older people based on the NHANES. Front Neurosci. 2023;17:1195570 https://doi.org/10.3389/fnins.2023.1195570.\u003c/li\u003e\n\u003cli\u003eChen T, Guestrin C. XGBoost: A Scalable Tree Boosting System.\u0026nbsp; Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. San Francisco, California, USA: Association for Computing Machinery; 2016. p. 785\u0026ndash;94.\u003c/li\u003e\n\u003cli\u003eMartin SA, Townend FJ, Barkhof F, Cole JH. Interpretable machine learning for dementia: A systematic review. Alzheimer's \u0026amp; dementia : the journal of the Alzheimer's Association. 2023;19:2135-49 https://doi.org/10.1002/alz.12948.\u003c/li\u003e\n\u003cli\u003e Yuan Y, Peng C, Burr JA, Lapane KL. Frailty, cognitive impairment, and depressive symptoms in Chinese older adults: an eight-year multi-trajectory analysis. BMC Geriatrics. 2023;23:843 https://doi.org/10.1186/s12877-023-04554-1.\u003c/li\u003e\n\u003cli\u003e L\u0026ouml;vd\u0026eacute;n M, Fratiglioni L, Glymour MM, Lindenberger U, Tucker-Drob EM. Education and Cognitive Functioning Across the Life Span. Psychological science in the public interest : a journal of the American Psychological Society. 
2020;21:6-41 https://doi.org/10.1177/1529100620920576.\u003c/li\u003e\n\u003cli\u003e Lin Y, Wang H, Tian Y, Gong L, Chang C. [Factors influencing cognitive function among the older adults in Beijing]. Beijing da xue xue bao Yi xue ban = Journal of Peking University Health sciences. 2024;56:456-61 https://doi.org/10.19723/j.issn.1671-167X.2024.03.012.\u003c/li\u003e\n\u003cli\u003e Wei Y, Lin JL, Chen G, Pei LJ. [A cohort study of association between sleep duration and cognitive impairment in the elderly aged 65 years and older in China]. Zhonghua liu xing bing xue za zhi = Zhonghua liuxingbingxue zazhi. 2022;43:359-65 https://doi.org/10.3760/cma.j.cn112338-20210410-00305.\u003c/li\u003e\n\u003cli\u003e Iso-Markku P, Aaltonen S, Kujala UM, Halme HL, Phipps D, Knittle K, et al. Physical Activity and Cognitive Decline Among Older Adults: A Systematic Review and Meta-Analysis. JAMA network open. 2024;7:e2354285 https://doi.org/10.1001/jamanetworkopen.2023.54285.\u003c/li\u003e\n\u003cli\u003e Rehm J, Hasan OSM, Black SE, Shield KD, Schwarzinger M. Alcohol use and dementia: a systematic scoping review. Alzheimer's Research \u0026amp; Therapy. 2019;11:1 https://doi.org/10.1186/s13195-018-0453-0.\u003c/li\u003e\n\u003cli\u003e Wang X, Wang Z, Qi S, Zhang M, Zhang X, Guan Y, et al. Risk factors and their interaction on cognitive impairment in the elderly in China: case-control study. 2020;41:705-10.\u003c/li\u003e\n\u003cli\u003e Aloizou AM, Siokas V, Vogiatzi C, Peristeri E, Docea AO, Petrakis D, et al. Pesticides, cognitive functions and dementia: A review. Toxicology letters. 2020;326:31-51 https://doi.org/10.1016/j.toxlet.2020.03.005.\u003c/li\u003e\n\u003cli\u003e Hsiao CC, Yang AM, Wang C, Lin CY. Association between glyphosate exposure and cognitive function, depression, and neurological diseases in a representative sample of US adults: NHANES 2013-2014 analysis. Environmental research. 
2023;237:116860 https://doi.org/10.1016/j.envres.2023.116860.\u003c/li\u003e\n\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":true,"hideJournal":true,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"NHANES, XGBoost, SHAP, Cognitive Impairment, Cognition Test, Aging","lastPublishedDoi":"10.21203/rs.3.rs-6872563/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-6872563/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003e\u003cstrong\u003eBackground:\u003c/strong\u003e Cognitive impairment in older adults is a key risk factor for dementia, yet early detection remains challenging. Traditional models lack interpretability and struggle to capture complex risk patterns. Interpretable machine learning offers a solution by combining predictive accuracy with transparency. 
This study uses National Health and Nutrition Examination Survey (NHANES) 2011–2014 data to develop explainable models for domain-specific cognitive tests in adults aged 60 and above.\u003cbr\u003e\n \u003cstrong\u003eObjective:\u003c/strong\u003e This study establishes a novel explainable AI framework that bridges interpretable machine learning models for domain-specific cognitive impairment assessment in adults over 60 years old, leveraging the NHANES 2011-2014 dataset to identify critical biomarkers and modifiable risk factors.\u003cbr\u003e\n \u003cstrong\u003eMethods:\u003c/strong\u003e Four cognitive tests—Digit Symbol Substitution Test (DSST), Delayed Recall Test (DRT), Animal Fluency Test (AFT), and Immediate Recall Test (IRT)—were individually modeled using XGBoost regression. Model robustness was ensured through five-fold cross-validation and grid search hyperparameter optimization. SHapley Additive exPlanations (SHAP) were applied to elucidate feature contributions.\u003cbr\u003e\n \u003cstrong\u003eResults:\u003c/strong\u003e The optimized models achieved superior predictive performance: DSST (RMSE = 12.473, R² = 0.459, MAE = 9.975), DRT (RMSE = 2.004, R² = 0.148, MAE = 1.617), AFT (RMSE = 5.043, R² = 0.191, MAE = 3.977), and IRT (RMSE = 4.082, R² = 0.197, MAE = 3.252). \u003cbr\u003e\n \u003cstrong\u003eConclusion:\u003c/strong\u003e This is the first study to establish individualized XGBoost models for distinct NHANES cognitive domains, demonstrating how SHAP-driven interpretability enhances early screening strategies. 
Our findings highlight metabolic markers (e.g., Diabetes Mellitus) and lifestyle factors as actionable targets for dementia prevention in aging populations.\u003c/p\u003e","manuscriptTitle":"Interpretable Machine Learning for Cognitive Impairment Assessment: SHAP-Based Analysis of XGBoost Models Using NHANES 2011-2014 Cognitive Tests in Older Adults","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2025-06-19 05:30:19","doi":"10.21203/rs.3.rs-6872563/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"2c10a9a2-d125-4a73-b753-1c18bb22c0fc","owner":[],"postedDate":"June 19th, 2025","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[],"tags":[],"updatedAt":"2025-06-19T05:30:19+00:00","versionOfRecord":[],"versionCreatedAt":"2025-06-19 05:30:19","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-6872563","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-6872563","identity":"rs-6872563","version":["v1"]},"buildId":"8U1c8b4HqxoKbykW_rLl7","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}
