Prediction of Estimated Glomerular Filtration Rate Slope and Kidney Prognosis of Patients with Chronic Kidney Disease

Hajime Nagasu, Takaya Nakashima, Katsuhito Ihara, Ryo Fujimori, Tadahiro Goto, Daisuke Nitta, Seiji Kishi, Tamaki Sasaki, Naoki Kashihara

This is a preprint; it has not been peer reviewed by a journal. https://doi.org/10.21203/rs.3.rs-6575802/v1 This work is licensed under a CC BY 4.0 License. Status: Posted, Version 1.

Abstract

Background Chronic kidney disease (CKD) is a significant global health challenge, yet the application of eGFR slope as a metric for CKD progression remains underdeveloped in primary care settings.

Methods Using data from J-CKD-DB-Ex, Japan’s largest CKD database, we developed and validated a machine learning-based model to predict eGFR slope. The study included 10,474 patients aged ≥ 18 years with eGFR < 60 mL/min/1.73m² or proteinuria at baseline.
Predictors included demographic, clinical, and laboratory data. We compared three models: linear regression, LightGBM, and LSTM networks.

Results Among 10,474 patients (median age 69.0 years), the LightGBM model achieved superior performance (RMSE = 2.95 mL/min/1.73m²/year) compared to LSTM (RMSE = 3.94) and conventional linear regression (RMSE = 15.87). The model was implemented as a web-based application for clinical use.

Conclusion This machine learning-based prediction model achieves superior accuracy in estimating eGFR trajectory and enables real-time prediction using single time-point data. The web-based tool supports early identification of high-risk patients, enabling timely interventions and specialist referrals in primary care settings.

Health sciences/Nephrology/Kidney diseases/Chronic kidney disease/End-stage renal disease
Health sciences/Medical research/Epidemiology

INTRODUCTION

Since the concept of chronic kidney disease (CKD) was established in 2002, the importance of early diagnosis and treatment for kidney dysfunction has been widely recognized. However, the global number of patients requiring kidney replacement therapy continues to rise. Furthermore, CKD is a significant risk factor for cardiovascular disease (CVD) and mortality, posing a major public health challenge 1, 2. An epidemiological survey conducted in 2005 estimated that 12.9% of Japanese adults, or approximately 13.3 million people, had CKD. However, when considering the higher prevalence of CKD among individuals who do not undergo health checkups compared to those who do, a revised estimate suggests that one in five Japanese adults, or approximately 20 million people, may have CKD 3. Given the substantial number of patients with CKD, it is impractical to manage all cases exclusively with nephrologists.
As a result, the majority of individuals with early-stage CKD are managed mainly by primary care physicians (PCPs), whereas those with CKD stage G3b or more advanced disease are typically referred to nephrologists for specialized care. However, awareness of CKD and its early detection remains insufficient, as does the establishment of collaborative medical systems between PCPs and nephrologists in various regions. For advanced CKD (G3b–G5 in CKD staging), a 30–40% decline in eGFR or a doubling of serum creatinine levels was established as a surrogate endpoint for end-stage kidney disease (ESKD) by Kidney Disease: Improving Global Outcomes (KDIGO) in 2017 4 . These surrogate endpoints were validated in Japanese populations and incorporated into clinical guidelines for CKD evaluation 5 . In 2018, a scientific workshop sponsored by the National Kidney Foundation, U.S. Food and Drug Administration, and European Medicines Agency recommended the use of a surrogate endpoint for early CKD 6 . These findings highlight the value of assessing eGFR slope in CKD management. However, the integration of this approach into clinical practice by PCPs remains limited. Empowering PCPs to accurately predict CKD progression using routine clinical data is essential for advancing CKD prevention, facilitating early detection, and optimizing the timing of referrals to specialists. At present, several representative CKD prognosis prediction models have been developed. While these models enable risk assessment, none are based on the eGFR slope. Monitoring changes in eGFR trajectory and utilizing the eGFR slope could provide PCPs with a valuable tool for interpreting kidney function trends and integrating these insights into routine clinical practice. Notable CKD prognosis prediction models primarily focus on assessing the risk of progression to ESKD. 
While these models provide valuable insights for long-term risk stratification, tools capable of precisely forecasting future eGFR levels remain underdeveloped. This gap is significant, as predicting kidney function trajectory with precision would allow PCPs to anticipate CKD progression better and optimize patient management. Tracking changes in eGFR trajectory, mainly through the eGFR slope, could bridge this gap. The eGFR slope offers a dynamic and actionable metric that not only aids in interpreting kidney function prognosis but also supports PCPs in incorporating these insights into their daily clinical practice. This approach has the potential to improve early detection and facilitate more effective collaboration with specialists by offering a more precise projection of future kidney function. Using data from the J-CKD-DB-Ex, the largest longitudinal CKD database in Japan, we aimed to develop and validate a prognosis prediction model based on the eGFR slope. The J-CKD-DB-Ex is a real-world database tailored explicitly to CKD in Japan, offering comprehensive and nationally representative data. By systematically collecting information from hospitals nationwide, without significant regional biases, this database provides a detailed overview of CKD management practices in Japan. Its ability to capture the state of CKD diagnosis and treatment across diverse clinical settings makes it an invaluable resource for research. Utilizing this robust and representative dataset enhances the relevance and applicability of our prediction model to real-world clinical scenarios in Japan. The goal of this study is to estimate the extent of future kidney function decline by projecting the eGFR slope from current clinical information and predicting the trajectory 2–3 years into the future. Unlike conventional kidney failure prediction models, this eGFR slope prediction model is designed specifically for the context of daily clinical practice among PCPs. 
Additionally, it allows for on-demand predictions at any given point based on historical health data. By supporting disease awareness and fostering interdisciplinary collaboration, this model is expected to facilitate the prevention and early detection of CKD in primary care settings. This gap is significant, as accurate prediction of kidney function trajectories would allow PCPs to better anticipate CKD progression and optimise patient management. However, a major barrier to incorporating eGFR slope into routine clinical practice is the requirement for multiple eGFR measurements over time. In primary care settings, longitudinal data are often lacking or collected inconsistently. Therefore, the ability to predict eGFR slope from a single time-point measurement - combined with other routinely available clinical parameters - represents a significant advance. This feature greatly enhances the practicality and applicability of the model in real-world clinical settings, particularly for PCPs managing early-stage CKD. The eGFR slope provides a dynamic and actionable metric that not only aids in the interpretation of renal function prognosis, but also facilitates earlier detection and more effective collaboration between PCPs and specialists. METHODS Study design and ethics This is a non-interventional, retrospective, prognostic study to develop and validate machine learning models for predicting eGFR slope, using a large, multicenter registry of CKD. This study was approved by the ethics committee of Kawasaki Medical School. Owing to the nature of this study, it was exempt from the need to obtain informed consent from participants (Nos. 6400). Data source J-CKD-DB-Ex is the largest, longitudinal (for seven years) database of CKD in Japan. The Ministry of Health, Labor and Welfare has developed the Standardized Structured Medical Information eXchange (SS-MIX2) system, which streamlines the compilation of Electronic Health Record (EHR) data from various systems. 
The Japanese Society of Nephrology and the Japan Association for Medical Informatics have established a comprehensive clinical database for patients with CKD, known as the Japan Chronic Kidney Disease Database (J-CKD-DB-Ex), by leveraging the SS-MIX2 system 7, 8. J-CKD-DB-Ex included patients with CKD aged 18 years or older who had either proteinuria ≥ 1 by dipstick test or eGFR < 60 mL/min/1.73 m². Currently, the database is being further expanded with the participation of 15 university hospitals and has registered approximately 250,000 cases of CKD to date. Patient flow is shown in Fig. 4.

Study population

We included patients with CKD aged 18 years or older who had either proteinuria ≥ 1 by dipstick test or eGFR < 60 mL/min/1.73 m². This study included only outpatients who had never been hospitalized throughout the study period, as it was designed to predict outcomes in an outpatient setting. For eGFR measurements, we set one-year windows starting from the initial eGFR measurement (index date) and adopted the first measurement within each window. Specifically, patients were required to have at least one eGFR measurement during each of three one-year windows: 1–2 years, 2–3 years, and 3–4 years after the index date. Therefore, during the observation period, a minimum of four measurements was required, consisting of the index date measurement plus one measurement from each of these three windows (Fig. 5). We excluded patients (1) with CKD G1 at baseline or (2) who had already initiated dialysis at baseline.
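The window rule described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `select_window_measurements`, the half-open window boundaries, and the 365.25-day year are all assumptions made for illustration, since the text does not specify how window edges were handled.

```python
from datetime import date

# Assumption: the source does not say how a "year" is counted in days.
DAYS_PER_YEAR = 365.25


def select_window_measurements(measurements, windows=((1, 2), (2, 3), (3, 4))):
    """Return [index eGFR, first eGFR in each window], or None if a window is empty.

    measurements: list of (date, egfr) tuples for one patient (hypothetical layout).
    windows: (start, end) offsets in years after the index date, treated as
             half-open intervals [start, end); this boundary rule is an assumption.
    """
    ordered = sorted(measurements)
    if not ordered:
        return None
    index_date, index_egfr = ordered[0]  # earliest measurement defines the index date
    selected = [index_egfr]
    for start, end in windows:
        in_window = [
            egfr
            for day, egfr in ordered
            if start <= (day - index_date).days / DAYS_PER_YEAR < end
        ]
        if not in_window:
            return None  # fewer than four usable measurements: patient not eligible
        selected.append(in_window[0])  # adopt the first measurement within the window
    return selected


# Toy patient with made-up values, not study data.
toy_patient = [
    (date(2015, 1, 10), 55.0),
    (date(2016, 3, 1), 53.0),
    (date(2016, 9, 1), 52.0),  # second value in the 1-2 year window, not adopted
    (date(2017, 6, 1), 50.5),
    (date(2018, 4, 1), 49.0),
]
selected_values = select_window_measurements(toy_patient)
```

A patient whose record leaves any of the three windows empty yields `None`, which mirrors the four-measurement eligibility requirement stated in the text.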
Predictors

We used the following baseline predictors: gender (binary), age (continuous, years), baseline eGFR (continuous, mL/min/1.73m²), serum creatinine (continuous, mg/dL), serum albumin (continuous, g/dL), serum sodium (continuous, mEq/L), serum potassium (continuous, mEq/L), qualitative urine protein by dipstick test (binary, positive/negative), quantitative urine protein (continuous, g/gCr), diabetes mellitus and hypertension diagnoses defined by International Classification of Diseases 10th revision (ICD-10) codes (both binary), and medication use including renin-angiotensin system (RAS) inhibitors, sodium-glucose cotransporter-2 (SGLT2) inhibitors, and mineralocorticoid receptor (MR) blockers (all binary). All predictors were used for the LightGBM and LSTM models.

Outcomes

The primary outcome was the eGFR slope, defined as the change in eGFR from the initial measurement to the follow-up measurements divided by the elapsed time, expressed in mL/min/1.73m²/year. The eGFR values were calculated using the Japanese GFR estimation equation based on serum creatinine values 9.

Prediction methods and statistical analysis

We developed and validated the prediction models in five steps: (1) data splitting, (2) definition of baseline data and outcomes, (3) model development in the training set, (4) hyperparameter tuning, and (5) model evaluation.

Data splitting

We first randomly split the entire dataset into training (80%) and test (20%) sets. The test set was held out and used only for the final evaluation of the model's performance. For model development and validation, we employed 5-fold cross-validation on the training set, where the training data were divided into five equal parts. In each fold of the cross-validation, one part served as a validation set for hyperparameter tuning and model selection, while the remaining four parts were used for training. This process was repeated five times, with each fold serving as the validation set once, ensuring that every sample in the training data contributed to both model training and validation.
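As a rough illustration of the splitting scheme just described, the following Python sketch performs a random 80/20 hold-out split followed by 5-fold cross-validation on the training portion. The function names, the fixed seed, and the choice to split by patient identifier are assumptions for illustration; the source does not describe the implementation.

```python
import random


def split_train_test(patient_ids, test_fraction=0.2, seed=0):
    """Randomly split identifiers into (train, test); the seed is an arbitrary assumption."""
    rng = random.Random(seed)
    shuffled = list(patient_ids)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def k_fold_splits(train_ids, k=5):
    """Yield (training, validation) lists; each fold is the validation set exactly once."""
    folds = [train_ids[i::k] for i in range(k)]  # k roughly equal parts
    for i in range(k):
        validation = folds[i]
        training = [pid for j, fold in enumerate(folds) if j != i for pid in fold]
        yield training, validation


# Toy identifiers, not study data.
train_ids, test_ids = split_train_test(range(100))
cv_splits = list(k_fold_splits(train_ids, k=5))
```

The test identifiers are set aside before any cross-validation takes place, so hyperparameter tuning never sees them, which matches the hold-out role the text assigns to the test set.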
Definition of baseline data and outcomes

・Predictors and Missing Data Handling

The baseline predictors were those listed under Predictors above. For missing laboratory data, we first attempted to impute values using the nearest available measurement within one month before or after the visit. For remaining missing values, we employed missing indicators in our analysis 10. The proportions of missing data are provided in Supplement Table 2.

Model development approaches

We employed three distinct approaches to predict eGFR trajectories up to three years from the index date: (1) a conventional linear regression model using historical eGFR measurements prior to the index date, which has been traditionally used in clinical settings, (2) a gradient boosting model using Light Gradient Boosting Machine (LightGBM), which is known for its efficiency and high performance in handling structured data 11, and (3) a recursive prediction model using Long Short-Term Memory (LSTM) networks, which are specifically designed to process sequential data 12. Further details on model development methodologies are provided in Supplement Note 1.

Hyperparameter Tuning

Model hyperparameters were optimized during the training process using the 5-fold cross-validation structure described above.
For both LightGBM and LSTM models, we employed Bayesian optimization for hyperparameter tuning. Performance measures, including RMSE and the coefficient of determination, were calculated as the average across all validation folds. A comprehensive summary of the hyperparameters for both machine learning algorithms is provided in Supplement Table 3.

Evaluation criteria

We evaluated the final model's predictive accuracy using the following systematic approach. The model was designed to predict eGFR values at three annual time points over a three-year period from the initial measurement. For both predicted and actual eGFR values, we calculated individual patient-specific slopes using linear regression analysis across four time points: baseline and years 1, 2, and 3. The model's performance was evaluated using the Root Mean Square Error (RMSE) between the predicted and actual slopes, calculated as:

$$\text{RMSE}=\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\text{predicted slope}_{i}-\text{actual slope}_{i}\right)^{2}}$$

where n is the number of patients in the test dataset, predicted slope is the slope calculated from the model's predictions, and actual slope is the slope derived from the observed eGFR values.

Data Availability

The data cannot be fully shared publicly for the following reasons: the data contain potentially sensitive information; patients did not provide informed consent regarding release of personal data; and the Ethics Board of Kawasaki Medical School imposed data restrictions. The data are owned by the J-CKD-DB project committee. Interested readers may request the data from the J-CKD-DB project committee; URL (Japanese): https://j-ckd-db.jp. The following hospital email address may also be useful for readers: [email protected]

RESULTS

Baseline data

Table 1 shows the baseline characteristics of the study population.
The median age of participants was 69.0 years [IQR: 62.0–77.0], and 52% (5,493/10,474) of the cohort were male. The median baseline eGFR was 52.7 mL/min/1.73m² [IQR: 44.7–57.8]. Laboratory values included serum creatinine of 1.0 mg/dL [IQR: 0.8–1.2], plasma albumin of 4.1 g/dL [IQR: 3.9–4.4], serum sodium of 141.0 mEq/L [IQR: 140.0–143.0], and serum potassium of 4.4 mEq/L [IQR: 4.1–4.7]. Regarding urinary findings, 52% (2,346/4,471) of patients had positive proteinuria by dipstick test, with median urine protein of 21.0 g/gCr [IQR: 7.0–67.5]. Comorbidities and medications were also documented: diabetes mellitus was present in 75% of patients and hypertension in 71%. Regarding medications, 24% of patients were prescribed RAS inhibitors, 2.2% were on MR blockers, and 0.4% were on SGLT2 inhibitors at baseline. With a median follow-up period of 6 years, longitudinal eGFR measurements showed a gradual decline over the three-year follow-up period: median eGFR at 1 year was 52.7 mL/min/1.73m² [IQR: 44.5–59.7], at 2 years was 51.9 mL/min/1.73m² [IQR: 43.0–59.0], and at 3 years was 51.2 mL/min/1.73m² [IQR: 42.0–58.5].

Comparison of model performance

We compared three different approaches to predicting three-year eGFR trajectories (Fig. 1). The linear regression model used only pre-baseline eGFR measurements for prediction, whereas both the LightGBM and LSTM models used baseline eGFR values together with patient characteristics to predict future trajectories. These predictions were compared with the actual slope, calculated by linear regression analysis of the eGFR values measured at four time points (baseline and years 1, 2, and 3). The conventional linear regression model showed an RMSE of 15.87 mL/min/1.73m²/year. The LSTM model, which included additional clinical parameters, achieved an RMSE of 3.94 mL/min/1.73m²/year, while the LightGBM model showed the highest accuracy with an RMSE of 2.95 mL/min/1.73m²/year.
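The slope-based comparison used for these RMSE values can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the time points are taken as exactly 0, 1, 2, and 3 years, and the trajectories are made-up numbers rather than study data.

```python
import math


def ols_slope(times, values):
    """Ordinary least-squares slope of values against times (eGFR units per year)."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    return sxy / sxx


def slope_rmse(predicted_trajectories, actual_trajectories, times=(0, 1, 2, 3)):
    """RMSE between per-patient slopes fitted to predicted and observed eGFR values."""
    squared_errors = [
        (ols_slope(times, predicted) - ols_slope(times, actual)) ** 2
        for predicted, actual in zip(predicted_trajectories, actual_trajectories)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Toy trajectories at baseline and years 1-3 (illustrative values only).
toy_predicted = [[60, 58, 56, 54], [50, 50, 50, 50]]
toy_actual = [[60, 57, 54, 51], [50, 49, 48, 47]]
toy_rmse = slope_rmse(toy_predicted, toy_actual)
```

Because the error is computed on fitted slopes rather than on individual eGFR values, a model can miss single-year values by some margin yet still score well if it captures the overall rate of decline.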
Implementation of the models in clinical settings

Figure 2 shows the web-based application we developed incorporating the LightGBM model, which demonstrated the highest prediction accuracy among the three approaches. The application provides an intuitive interface for healthcare providers to input patient data and visualize predicted eGFR trajectories. When clinicians input patient characteristics and laboratory data, the application displays the predicted eGFR slope values (top right panel) and predicted future trajectories over a three-year period (bottom right panel).

DISCUSSION

A conceptual diagram of the results of this study is shown in Fig. 3. We have successfully developed a prediction model for eGFR slope using a nationwide CKD cohort, achieving a 3-year slope error of 2.8 mL/min/1.73m² in RMSE. This represents a significant improvement over the conventional linear regression model, which showed an error of 17.2 mL/min/1.73m². The model's ability to generate predictions using single time-point data, commonly available in clinical practice, makes it particularly valuable. When baseline data were included in a multiple linear regression model, the RMSE improved markedly to 3.07 compared with the linear regression model based on past eGFR measurements alone. This result suggests that baseline data may have a greater influence on future eGFR slope than past eGFR slope. The model was designed primarily with baseline data so that it could be used in PCP practice settings. Limiting the input information to baseline data raised concerns about accuracy, but the RMSE remained satisfactory. These results are consistent with the report of Kovesdy et al. 13. Regression coefficients are shown in Supplement Table 1. Furthermore, we have developed a web-based application that effectively visualizes predicted results for both PCPs and patients.
Recent randomized controlled trials (RCTs) investigating renoprotective effects have increasingly focused on eGFR slope changes as a surrogate endpoint. Multiple studies have highlighted the importance of both early albuminuria changes and GFR slope as surrogate endpoints for kidney disease progression. Major trials such as CREDENCE and EMPA-KIDNEY have incorporated eGFR slope as an exploratory and tertiary outcome, respectively, while maintaining kidney composite outcomes as their primary endpoint 14 , 15 . More recently, Phase 3 trials have begun using these surrogate endpoints as primary outcomes 16 . The evidence supporting GFR slopes as surrogate endpoints is more robust compared to albuminuria changes, suggesting that routine eGFR slope assessment may be more clinically valuable than UACR monitoring. The eGFR slope provides a tangible measure of CKD progression and addresses crucial needs of PCPs, particularly in facilitating collaboration with nephrologists. Recent developments include tools for visualizing long-term eGFR slopes 17 , with studies demonstrating increased specialist referrals following PCPs' use of such applications. Our prediction model offers enhanced temporal flexibility compared to traditional approaches. While calculating eGFR slope typically requires multiple measurements over time, potentially delaying prognosis assessment 18 , our model can generate predictions at any point using available historical health data. For PCPs managing patients with CKD, the critical information needs include predicting disease progression and assessing the risk of near-term kidney failure. The visualization of kidney function changes through eGFR slope not only serves PCPs' needs but also helps raise patient awareness in early CKD stages. Numerous models exist for predicting kidney failure, with improving accuracy. 
The Kidney Failure Risk Equations, developed using Cox proportional hazards models for patients with CKD, and the survival index developed by DOPPS for hemodialysis patients represent significant advances 19, 20, 21, 22. The Kidney Failure Risk Equations, derived from 3,449 CKD G3–5 patients in Ontario, Canada 23, were externally validated using British Columbia data 24. The Risk Prediction Equations, developed using data from 5,222,711 people across 28 countries in the CKD Prognosis Consortium 25, predict 5-year eGFR decline. While machine learning models have been developed to predict eGFR slope stability 26, these typically provide binary predictions rather than quantitative assessments.

Our study has several limitations. The SS-MIX2 system database lacks information on CKD etiology, body mass index, and blood pressure levels. During model development, we evaluated the impact of SGLT2 inhibitors and MR blockers on eGFR slope predictions. However, their limited prescription rates in Japan during our study period (2014–2022) resulted in minimal predictive contribution. Given their negligible impact on model performance and to enhance practical usability, we opted to exclude these medication factors from the web-based application for clinical use. An additional limitation is selection bias from requiring baseline and at least three follow-up eGFR measurements, which may exclude patients with poor healthcare access. While this requirement improves eGFR slope estimation, it limits generalizability. However, since interventions target patients engaged in care, it also enhances clinical relevance. The model may also overestimate outcomes for those with irregular follow-up. External validation in more diverse datasets is needed to assess broader applicability.

The practical implementation of prediction tools through applications is crucial for clinical utility.
Visualization enhances usefulness for both PCPs and patients, as demonstrated by Kanda et al.'s machine-learning-based web system for CKD risk prediction and treatment 27 . While existing tools like the Kidney Failure Risk Equations (available at https://kidneyfailurerisk.com/ ) offer valuable insights, their 5-year prediction window may be insufficient for clinical practice, particularly for early-stage patients with CKD. Our model's ability to predict actual eGFR slopes and visualize CKD grade transitions provides a distinct advantage in patient education and management. Declarations Acknowledgments We thank the study participants and the members of the J-CKD-DB-Ex Study Group. The authors meet criteria for authorship as recommended by the ICMJE. The authors did not receive payment related to the development of the manuscript. Boehringer Ingelheim was given the opportunity to review the manuscript for medical and scientific accuracy as well as intellectual property considerations. This study was supported and funded by Nippon Boehringer Ingelheim Co., Ltd. and Eli Lilly Japan K.K. Contributors: H.N., T.N., K.I., T.G., D.N., S.K., T.S. and N.K. were involved in conceptualization, study design. H.N., T.N., T.G. and S.K. were responsible for screening data sources and data extraction. T.N. and T.G. performed data analysis and constructed web application. H.N., T.N., K.I., T.G., and S.K. wrote the original draft of the manuscript. T.S. and N.K. revised and wrote the final draft. All authors have read and approved the manuscript. Competing interests: Katsuhito Ihara and Daisuke Nitta are employees of Nippon Boehringer Ingelheim Co., Ltd. Funding: This work was supported in part by Nippon Boehringer Ingelheim Co., Ltd. and Eli Lilly Japan K.K. References Gansevoort, R.T. et al. Chronic kidney disease and cardiovascular risk: epidemiology, mechanisms, and prevention. Lancet 382, 339–352 (2013). Andermann, A. Breaking away from the disease-focused paradigm. 
Lancet 376, 2073–2074 (2010). CKD Clinical Practice Guide 2024 (Japanese). The Japanese Society for Nephrology (2024). Imai, E. et al. Prevalence of chronic kidney disease in the Japanese general population. Clin Exp Nephrol 13, 621–630 (2009). Inker, L.A. et al. GFR Slope as a Surrogate End Point for Kidney Disease Progression in Clinical Trials: A Meta-Analysis of Treatment Effects of Randomized Controlled Trials. J Am Soc Nephrol 30, 1735–1745 (2019). Inker, L.A. et al. KDOQI US commentary on the 2012 KDIGO clinical practice guideline for the evaluation and management of CKD. Am J Kidney Dis 63, 713–735 (2014). Nakagawa, N. et al. J-CKD-DB: a nationwide multicentre electronic health record-based chronic kidney disease database in Japan. Sci Rep 10, 7351 (2020). Nagasu, H. et al. Kidney Outcomes Associated With SGLT2 Inhibitors Versus Other Glucose-Lowering Drugs in Real-world Clinical Practice: The Japan Chronic Kidney Disease Database. Diabetes Care 44, 2542–2551 (2021). Matsuo, S. et al. Revised equations for estimated GFR from serum creatinine in Japan. Am J Kidney Dis 53, 982–992 (2009). Sisk, R., Sperrin, M., Peek, N., van Smeden, M. & Martin, G.P. Imputation and missing indicators for handling missing data in the development and deployment of clinical prediction models: A simulation study. Stat Methods Med Res 32, 1461–1477 (2023). Guolin Ke, Q.M., Thomas Finley, Taifeng Wang, Wei Chen, Weidong Ma, Qiwei Ye, Tie-Yan Liu. LightGBM: a highly efficient gradient boosting decision tree. Proceedings of the 31st International Conference on Neural Information Processing Systems , 3149–3157 (2017). Hochreiter, S. & Schmidhuber, J. Long short-term memory. Neural Comput 9, 1735–1780 (1997). Kovesdy, C.P. et al. Past Decline Versus Current eGFR and Subsequent ESRD Risk. J Am Soc Nephrol 27, 2447–2455 (2016). Perkovic, V. et al. Canagliflozin and Renal Outcomes in Type 2 Diabetes and Nephropathy. N Engl J Med 380, 2295–2306 (2019). The, E.-K.C.G. et al. 
Empagliflozin in Patients with Chronic Kidney Disease. N Engl J Med 388, 117–127 (2023). Heerspink, H.J.L. et al. Sparsentan in patients with IgA nephropathy: a prespecified interim analysis from a randomised, double-blind, active-controlled clinical trial. Lancet 401, 1584–1594 (2023). Nakazawa, J. et al. A Long-term Estimated Glomerular Filtration Rate Plot Analysis Permits the Accurate Assessment of a Decline in the Renal Function by Minimizing the Influence of Estimated Glomerular Filtration Rate Fluctuations. Intern Med 61, 1823–1833 (2022). Itano, S., Kanda, E., Nagasu, H., Nangaku, M. & Kashihara, N. eGFR slope as a surrogate endpoint for clinical study in early stage of chronic kidney disease: from The Japan Chronic Kidney Disease Database. Clin Exp Nephrol 27, 847–856 (2023). Tangri, N. et al. A predictive model for progression of chronic kidney disease to kidney failure. JAMA 305, 1553–1559 (2011). Liu, P. et al. Predicting the risks of kidney failure and death in adults with moderate to severe chronic kidney disease: multinational, longitudinal, population based, cohort study. BMJ 385, e078063 (2024). Tangri, N., Ferguson, T. & Komenda, P. Pro: Risk scores for chronic kidney disease progression are robust, powerful and ready for implementation. Nephrol Dial Transplant 32, 748–751 (2017). Kanda, E., Bieber, B.A., Pisoni, R.L., Robinson, B.M. & Fuller, D.S. Importance of simultaneous evaluation of multiple risk factors for hemodialysis patients' mortality and development of a novel index: dialysis outcomes and practice patterns study. PLoS One 10, e0128652 (2015). Tangri, N. et al. Multinational Assessment of Accuracy of Equations for Predicting Risk of Kidney Failure: A Meta-analysis. JAMA 315, 164–174 (2016). Grams, M.E. et al. The Kidney Failure Risk Equation: Evaluation of Novel Input Variables including eGFR Estimated Using the CKD-EPI 2021 Equation in 59 Cohorts. J Am Soc Nephrol 34, 482–494 (2023). Nelson, R.G. et al. 
Development of Risk Prediction Equations for Incident Chronic Kidney Disease. JAMA 322, 2104–2114 (2019). Lukomski, L. et al. First experiences with machine learning predictions of accelerated declining eGFR slope of living kidney donors 3 years after donation. J Nephrol 37, 1631–1642 (2024). Kanda, E., Epureanu, B.I., Adachi, T. & Kashihara, N. Machine-learning-based Web system for the prediction of chronic kidney disease progression and mortality. PLOS Digit Health 2, e0000188 (2023).

Table 1

Table 1 is available in the Supplementary Files section.

Additional Declarations

There is NO Competing Interest.

Supplementary Files

SupplementtablefigureeGFRslopepredictionfinalv5.docx (supplemental information)
Table1.png: Table 1. Baseline Characteristics. Clinical and demographic characteristics of the study population at baseline, including laboratory values, comorbidities, and medications. Abbreviations: eGFR, estimated Glomerular Filtration Rate; CKD, Chronic Kidney Disease; IQR, Interquartile Range; RAS, Renin-Angiotensin System; SGLT2, Sodium-Glucose Cotransporter-2; MR, Mineralocorticoid Receptor.
© Research Square 2026 | ISSN 2693-5015 (online) {"props":{"pageProps":{"initialData":{"identity":"rs-6575802","acceptedTermsAndConditions":true,"allowDirectSubmit":true,"archivedVersions":[],"articleType":"Article","associatedPublications":[],"authors":[{"id":452792064,"identity":"f6f207b7-c84d-429b-9c90-6bebdabe26a8","order_by":0,"name":"Hajime Nagasu","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAAA90lEQVRIiWNgGAWjYBACAwaGhANw3gcI04x4LYwziNSCAMw8xGgxZz/w8NCNijvy8u7Hn0nbnNlmzyDdvO0BQ41NNC4tlj0JCYdzzjwz3Hgmx0w658btxAaZY+UGDMfSchtwOewAUEtu22HGjQ05bNI5H24n2N/IMZNgbDiMW8v5B0At/w7bb+x//kza4sNtewYJQlpugGxpOJw4XyLBTJrhxm3GBsJagLbkHDucvEHijbFlzxmgXyTSyiQS8PnlfE7y55yaw7bz+9Mf3vhxDOSw5G0SH2pscGphYOBJgIYDsmACTuUgwA5RK4/b0FEwCkbBKBjpAACs7mafNK1w3AAAAABJRU5ErkJggg==","orcid":"https://orcid.org/0000-0003-2549-4126","institution":"Kawasaki Medical School","correspondingAuthor":true,"prefix":"","firstName":"Hajime","middleName":"","lastName":"Nagasu","suffix":""},{"id":452792065,"identity":"94c3ff77-dbe6-4087-8fbd-273ee0f3aba6","order_by":1,"name":"Takaya Nakashima","email":"","orcid":"","institution":"","correspondingAuthor":false,"prefix":"","firstName":"Takaya","middleName":"","lastName":"Nakashima","suffix":""},{"id":452792066,"identity":"3d397e9f-9786-4afb-b0a1-cc5d3aab4c1d","order_by":2,"name":"Katsuhito Ihara","email":"","orcid":"","institution":"","correspondingAuthor":false,"prefix":"","firstName":"Katsuhito","middleName":"","lastName":"Ihara","suffix":""},{"id":452792067,"identity":"a29f5955-a324-4d35-87fe-fb90a6d68579","order_by":3,"name":"Ryo
Fujimori","email":"","orcid":"","institution":"","correspondingAuthor":false,"prefix":"","firstName":"Ryo","middleName":"","lastName":"Fujimori","suffix":""},{"id":452792068,"identity":"94278a67-ccd2-48f7-8f57-15c867bd302b","order_by":4,"name":"Tadahiro Goto","email":"","orcid":"","institution":"","correspondingAuthor":false,"prefix":"","firstName":"Tadahiro","middleName":"","lastName":"Goto","suffix":""},{"id":452792069,"identity":"6b130ce2-345d-488f-b7ce-2398f4cfa26a","order_by":5,"name":"Daisuke Nitta","email":"","orcid":"","institution":"","correspondingAuthor":false,"prefix":"","firstName":"Daisuke","middleName":"","lastName":"Nitta","suffix":""},{"id":452792070,"identity":"deaa42de-1463-4ef0-8606-2c5438957c57","order_by":6,"name":"Seiji Kishi","email":"","orcid":"","institution":"","correspondingAuthor":false,"prefix":"","firstName":"Seiji","middleName":"","lastName":"Kishi","suffix":""},{"id":452792071,"identity":"d4f5e19f-2970-462c-a4fb-268fc2292c46","order_by":7,"name":"Tamaki Sasaki","email":"","orcid":"","institution":"","correspondingAuthor":false,"prefix":"","firstName":"Tamaki","middleName":"","lastName":"Sasaki","suffix":""},{"id":452792072,"identity":"b2c36ccd-e2d8-4edf-a7f1-ac89952273bc","order_by":8,"name":"Naoki Kashihara","email":"","orcid":"","institution":"","correspondingAuthor":false,"prefix":"","firstName":"Naoki","middleName":"","lastName":"Kashihara","suffix":""}],"badges":[],"createdAt":"2025-05-02 06:15:08","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-6575802/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-6575802/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":82892428,"identity":"13643857-f542-409a-abd5-d607d7f83aaa","added_by":"auto","created_at":"2025-05-16 12:19:27","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":28071,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eComparison of eGFR 
Slope Prediction Models\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003ePerformance comparison of three prediction approaches using RMSE.\u003c/p\u003e\n\u003cp\u003eAbbreviations: eGFR, estimated Glomerular Filtration Rate; RMSE, Root Mean Square Error; LSTM, Long Short-Term Memory; LightGBM, Light Gradient Boosting Machine\u003c/p\u003e","description":"","filename":"1.png","url":"https://assets-eu.researchsquare.com/files/rs-6575802/v1/243ad0c8814ac70f7ea267ef.png"},{"id":82893700,"identity":"a3d044f8-dd8a-496a-82d9-f029ce8c2131","added_by":"auto","created_at":"2025-05-16 12:27:27","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":48365,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eApplication of Prediction eGFR Slope\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eWeb-based interface demonstration for clinical implementation of the prediction model.\u003c/p\u003e\n\u003cp\u003eAbbreviations: eGFR, estimated Glomerular Filtration Rate\u003c/p\u003e","description":"","filename":"2.png","url":"https://assets-eu.researchsquare.com/files/rs-6575802/v1/dc2b465d8da22dc2fbf3ad53.png"},{"id":82892426,"identity":"b0c0aa54-a57f-47b9-85ed-18e67cb85df6","added_by":"auto","created_at":"2025-05-16 12:19:27","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":32103,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eConceptual diagram of the results of this study\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eA conceptual diagram of the results of this study. Y-axis: eGFR (estimated Glomerular Filtration Rate); X-axis: year. Blue line: predictions from linear regression. Red line: predictions from LightGBM. Dashed line: true eGFR slope.
Abbreviations: eGFR, estimated Glomerular Filtration Rate; LightGBM, Light Gradient Boosting Machine\u003c/p\u003e","description":"","filename":"3.png","url":"https://assets-eu.researchsquare.com/files/rs-6575802/v1/053587acce6d3566dd9c23ee.png"},{"id":82892430,"identity":"e253ec84-4b9e-449e-b8dd-ed5d988046fe","added_by":"auto","created_at":"2025-05-16 12:19:27","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":30821,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003ePatient flow of this study\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe patient data flow used in this study is shown. The study used data from J-CKD-DB-Ex from 2014 to 2022. eGFR, estimated glomerular filtration rate\u003c/p\u003e","description":"","filename":"4.png","url":"https://assets-eu.researchsquare.com/files/rs-6575802/v1/6a776a63b33298dacc8a6dd7.png"},{"id":82892432,"identity":"af63e06a-d318-480a-a807-1a93ee1a2917","added_by":"auto","created_at":"2025-05-16 12:19:27","extension":"png","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":30214,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eDefinition of measurement windows and data collection points for eGFR\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eStudy timeline depicting the index date and follow-up measurement windows for eGFR data collection.
Abbreviations: eGFR, estimated Glomerular Filtration Rate\u003c/p\u003e","description":"","filename":"5.png","url":"https://assets-eu.researchsquare.com/files/rs-6575802/v1/624c9f28a3496de514706880.png"},{"id":87761630,"identity":"1222ef6e-170a-40b2-a7d4-3dc75b878c84","added_by":"auto","created_at":"2025-07-28 17:04:43","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":914253,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-6575802/v1/fb4d7442-9881-4a1c-a330-c0dcdfbfc051.pdf"},{"id":82893702,"identity":"cf7ef99a-80f7-4c91-b1c4-836d116a1771","added_by":"auto","created_at":"2025-05-16 12:27:27","extension":"docx","order_by":1,"title":"","display":"","copyAsset":false,"role":"supplement","size":26004,"visible":true,"origin":"","legend":"supplemental information","description":"","filename":"SupplementtablefigureeGFRslopepredictionfinalv5.docx","url":"https://assets-eu.researchsquare.com/files/rs-6575802/v1/0cdc04ef6cb4c5c9200a3551.docx"},{"id":82894215,"identity":"4fa515e3-1e70-409d-92bb-1507953e71a5","added_by":"auto","created_at":"2025-05-16 12:35:27","extension":"png","order_by":2,"title":"","display":"","copyAsset":false,"role":"supplement","size":86174,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eTable 1. 
Baseline Characteristics\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eClinical and demographic characteristics of the study population at baseline, including laboratory values, comorbidities, and medications.\u003c/p\u003e\n\u003cp\u003eAbbreviations: eGFR, estimated Glomerular Filtration Rate; CKD, Chronic Kidney Disease; IQR, Interquartile Range; RAS, Renin-Angiotensin System; SGLT2, Sodium-Glucose Cotransporter-2; MR, Mineralocorticoid Receptor\u003c/p\u003e","description":"","filename":"Table1.png","url":"https://assets-eu.researchsquare.com/files/rs-6575802/v1/96966804a6537467db4fa74f.png"}],"financialInterests":"There is \u003cb\u003eNO\u003c/b\u003e Competing Interest.","formattedTitle":"Prediction of Estimated Glomerular Filtration Rate Slope and Kidney Prognosis of Patients with Chronic Kidney Disease","fulltext":[{"header":"INTRODUCTION","content":"\u003cp\u003eSince the concept of chronic kidney disease (CKD) was established in 2002, the importance of early diagnosis and treatment for kidney dysfunction has been widely recognized. However, the global number of patients requiring kidney replacement therapy continues to rise. Furthermore, CKD is a significant risk factor for cardiovascular disease (CVD) and mortality, posing a major public health challenge\u003csup\u003e\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e, \u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e\u003c/sup\u003e. An epidemiological survey conducted in 2005 estimated that 12.9% of Japanese adults, or approximately 13.3\u0026nbsp;million people, had CKD. However, when considering the higher prevalence of CKD among individuals who do not undergo health checkups compared to those who do, a revised estimate suggests that one in five Japanese adults, or approximately 20\u0026nbsp;million people, may have CKD\u003csup\u003e\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e\u003c/sup\u003e. 
Given the substantial number of patients with CKD, it is impractical to manage all cases exclusively with nephrologists. As a result, the majority of individuals with early-stage CKD are managed mainly by primary care physicians (PCPs), whereas those with CKD stage G3b or more advanced disease are typically referred to nephrologists for specialized care. However, awareness of CKD and its early detection remains insufficient, as does the establishment of collaborative medical systems between PCPs and nephrologists in various regions.\u003c/p\u003e \u003cp\u003eFor advanced CKD (G3b\u0026ndash;G5 in CKD staging), a 30\u0026ndash;40% decline in eGFR or a doubling of serum creatinine levels was established as a surrogate endpoint for end-stage kidney disease (ESKD) by Kidney Disease: Improving Global Outcomes (KDIGO) in 2017\u003csup\u003e4\u003c/sup\u003e. These surrogate endpoints were validated in Japanese populations and incorporated into clinical guidelines for CKD evaluation\u003csup\u003e\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e\u003c/sup\u003e. In 2018, a scientific workshop sponsored by the National Kidney Foundation, U.S. Food and Drug Administration, and European Medicines Agency recommended the use of a surrogate endpoint for early CKD\u003csup\u003e\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e\u003c/sup\u003e. These findings highlight the value of assessing eGFR slope in CKD management.\u003c/p\u003e \u003cp\u003eHowever, the integration of this approach into clinical practice by PCPs remains limited. Empowering PCPs to accurately predict CKD progression using routine clinical data is essential for advancing CKD prevention, facilitating early detection, and optimizing the timing of referrals to specialists.\u003c/p\u003e \u003cp\u003eAt present, several representative CKD prognosis prediction models have been developed. 
While these models enable risk assessment, none are based on the eGFR slope. Monitoring changes in eGFR trajectory and utilizing the eGFR slope could provide PCPs with a valuable tool for interpreting kidney function trends and integrating these insights into routine clinical practice.\u003c/p\u003e \u003cp\u003eNotable CKD prognosis prediction models primarily focus on assessing the risk of progression to ESKD. While these models provide valuable insights for long-term risk stratification, tools capable of precisely forecasting future eGFR levels remain underdeveloped. This gap is significant, as predicting kidney function trajectory with precision would allow PCPs to anticipate CKD progression better and optimize patient management. Tracking changes in eGFR trajectory, mainly through the eGFR slope, could bridge this gap. The eGFR slope offers a dynamic and actionable metric that not only aids in interpreting kidney function prognosis but also supports PCPs in incorporating these insights into their daily clinical practice. This approach has the potential to improve early detection and facilitate more effective collaboration with specialists by offering a more precise projection of future kidney function.\u003c/p\u003e \u003cp\u003eUsing data from the J-CKD-DB-Ex, the largest longitudinal CKD database in Japan, we aimed to develop and validate a prognosis prediction model based on the eGFR slope. The J-CKD-DB-Ex is a real-world database tailored explicitly to CKD in Japan, offering comprehensive and nationally representative data. By systematically collecting information from hospitals nationwide, without significant regional biases, this database provides a detailed overview of CKD management practices in Japan. Its ability to capture the state of CKD diagnosis and treatment across diverse clinical settings makes it an invaluable resource for research. 
Utilizing this robust and representative dataset enhances the relevance and applicability of our prediction model to real-world clinical scenarios in Japan. The goal of this study is to estimate the extent of future kidney function decline by projecting the eGFR slope from current clinical information and predicting the trajectory 2\u0026ndash;3 years into the future. Unlike conventional kidney failure prediction models, this eGFR slope prediction model is designed specifically for the context of daily clinical practice among PCPs. Additionally, it allows for on-demand predictions at any given point based on historical health data. By supporting disease awareness and fostering interdisciplinary collaboration, this model is expected to facilitate the prevention and early detection of CKD in primary care settings. However, a major barrier to incorporating eGFR slope into routine clinical practice is the requirement for multiple eGFR measurements over time. In primary care settings, longitudinal data are often lacking or collected inconsistently. Therefore, the ability to predict eGFR slope from a single time-point measurement, combined with other routinely available clinical parameters, represents a significant advance. This feature greatly enhances the practicality and applicability of the model in real-world clinical settings, particularly for PCPs managing early-stage CKD.
The eGFR slope provides a dynamic and actionable metric that not only aids in the interpretation of renal function prognosis, but also facilitates earlier detection and more effective collaboration between PCPs and specialists.\u003c/p\u003e"},{"header":"METHODS","content":"\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e \u003ch2\u003eStudy design and ethics\u003c/h2\u003e \u003cp\u003eThis is a non-interventional, retrospective, prognostic study to develop and validate machine learning models for predicting eGFR slope, using a large, multicenter registry of CKD. This study was approved by the ethics committee of Kawasaki Medical School. Owing to the retrospective, non-interventional nature of this study, it was exempt from the need to obtain informed consent from participants (No. 6400).\u003c/p\u003e \u003c/div\u003e\n\u003ch3\u003eData source\u003c/h3\u003e\n\u003cp\u003eJ-CKD-DB-Ex is the largest longitudinal database of CKD in Japan, covering seven years.\u003c/p\u003e \u003cp\u003eThe Ministry of Health, Labor and Welfare has developed the Standardized Structured Medical Information eXchange (SS-MIX2) system, which streamlines the compilation of Electronic Health Record (EHR) data from various systems. The Japanese Society of Nephrology and the Japan Association for Medical Informatics have established a comprehensive clinical database for patients with CKD, known as the Japan Chronic Kidney Disease Database (J-CKD-DB-Ex), by leveraging the SS-MIX2 system\u003csup\u003e\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e, \u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e\u003c/sup\u003e. J-CKD-DB-Ex includes patients with CKD aged 18 years or older who had either proteinuria\u0026thinsp;\u0026ge;\u0026thinsp;1+ by dipstick test or eGFR\u0026thinsp;\u0026lt;\u0026thinsp;60 mL/min/1.73 m\u003csup\u003e2\u003c/sup\u003e.
Currently, the database is being further expanded with the participation of 15 university hospitals and has registered approximately 250,000 cases of CKD to date. Patient flow is shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e4\u003c/span\u003e.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e\n\u003ch3\u003eStudy population\u003c/h3\u003e\n\u003cp\u003eWe included patients with CKD aged 18 years or older who had either proteinuria\u0026thinsp;\u0026ge;\u0026thinsp;1+ by dipstick test or eGFR\u0026thinsp;\u0026lt;\u0026thinsp;60 mL/min/1.73 m\u003csup\u003e2\u003c/sup\u003e. Because the model was designed to predict outcomes in an outpatient setting, we included only outpatients who were never hospitalized during the study period. For eGFR measurements, we set one-year windows starting from the initial eGFR measurement (index date) and adopted the first measurement within each window. Specifically, patients were required to have at least one eGFR measurement during each of three one-year windows: 1\u0026ndash;2 years, 2\u0026ndash;3 years, and 3\u0026ndash;4 years after the index date.
Therefore, during the observation period, a minimum of four measurements was required, consisting of the index date measurement plus one measurement from each of these three windows (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e5\u003c/span\u003e).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eWe excluded patients who 1) had CKD G1 at baseline or 2) had already initiated dialysis at baseline.\u003c/p\u003e\n\u003ch3\u003ePredictors\u003c/h3\u003e\n\u003cp\u003eWe used the following baseline predictors: gender (binary), age (continuous, years), baseline eGFR (continuous, mL/min/1.73m\u0026sup2;), serum creatinine (continuous, mg/dL), serum albumin (continuous, g/dL), serum sodium (continuous, mEq/L), serum potassium (continuous, mEq/L), qualitative urine protein by dipstick test (binary, positive/negative), quantitative urine protein (continuous, g/gCr), diabetes mellitus and hypertension diagnoses defined by International Classification of Diseases 10th revision (ICD-10) codes (both binary), and medication use including renin-angiotensin system (RAS) inhibitors, sodium-glucose cotransporter-2 (SGLT2) inhibitors, and mineralocorticoid receptor (MR) blockers (all binary). All predictors were used for both the LightGBM and LSTM models.\u003c/p\u003e\n\u003ch3\u003eOutcomes\u003c/h3\u003e\n\u003cp\u003eThe primary outcome was eGFR slope, defined as the change between the initial eGFR and the follow-up eGFR per unit of time, expressed in mL/min/1.73m\u0026sup2;/year.
The eGFR values were calculated using the Japanese GFR estimation equation based on serum creatinine values\u003csup\u003e\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e \u003cdiv id=\"Sec8\" class=\"Section2\"\u003e \u003ch2\u003ePrediction methods and statistical analysis\u003c/h2\u003e \u003cp\u003eWe developed and validated prediction models as follows:\u003c/p\u003e \u003cp\u003e \u003col\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eData splitting\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eDefinition of baseline data and outcomes\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eModel development in the training set\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eHyperparameter tuning\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eModel evaluation\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003c/ol\u003e \u003c/p\u003e \u003c/div\u003e\n\u003ch3\u003eData splitting\u003c/h3\u003e\n\u003cp\u003eWe first randomly split the entire dataset into training (80%) and test (20%) sets. The test set was held out and used only for the final evaluation of the model's performance. For model development and validation, we employed 5-fold cross-validation on the training set, where the training data was divided into five equal parts. In each fold of the cross-validation, one part served as a validation set for hyperparameter tuning and model selection, while the remaining four parts were used for training. 
This process was repeated five times, with each fold serving as the validation set once, ensuring that every sample in the training data contributed to both model training and validation.\u003c/p\u003e\n\u003ch3\u003eDefinition of baseline data and outcomes\u003c/h3\u003e\n\u003cp\u003e・Predictors and Missing Data Handling\u003c/p\u003e \u003cp\u003eWe used the following baseline predictors: gender (binary), age (continuous, years), baseline eGFR (continuous, mL/min/1.73m\u0026sup2;), serum creatinine (continuous, mg/dL), serum albumin (continuous, g/dL), serum sodium (continuous, mEq/L), serum potassium (continuous, mEq/L), qualitative urine protein by dipstick test (binary, positive/negative), quantitative urine protein (continuous, g/gCr), diabetes mellitus and hypertension diagnoses defined by International Classification of Diseases 10th revision (ICD-10) codes (both binary), and medication use including renin-angiotensin system (RAS) inhibitors, sodium-glucose cotransporter-2 (SGLT2) inhibitors, and mineralocorticoid receptor (MR) blockers (all binary).\u003c/p\u003e \u003cp\u003eFor missing laboratory data, we first attempted to impute values using the nearest available measurement within one month before or after the visit. For remaining missing values, we employed missing indicators in our analysis\u003csup\u003e\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e\u003c/sup\u003e. 
The proportions of missing data are provided in \u003cb\u003eSupplement Table 2\u003c/b\u003e.\u003c/p\u003e \u003cdiv id=\"Sec11\" class=\"Section2\"\u003e \u003ch2\u003eModel development approaches\u003c/h2\u003e \u003cp\u003eWe employed three distinct approaches to predict eGFR trajectories up to three years from the index date: (1) a conventional linear regression model using historical eGFR measurements prior to the index date, which has been traditionally used in clinical settings, (2) a gradient boosting model using Light Gradient Boosting Machine (LightGBM), which is known for its efficiency and high performance in handling structured data\u003csup\u003e\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e\u003c/sup\u003e, and (3) a recursive prediction model using Long Short-Term Memory (LSTM) networks, which are specifically designed to process sequential data\u003csup\u003e\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e\u003c/sup\u003e. Further details on model development methodologies are provided in \u003cb\u003eSupplement Note 1\u003c/b\u003e.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec12\" class=\"Section2\"\u003e \u003ch2\u003eHyperparameter Tuning\u003c/h2\u003e \u003cp\u003eModel hyperparameters were optimized during the training process using the 5-fold cross-validation structure described above. For both LightGBM and LSTM models, we employed Bayesian optimization for hyperparameter tuning. Performance measures, including RMSE and the coefficient of determination, were calculated as the average across all validation folds.
A comprehensive summary of the necessary hyperparameters for both machine learning algorithms is provided in \u003cb\u003eSupplement Table 3\u003c/b\u003e.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec13\" class=\"Section2\"\u003e \u003ch2\u003eEvaluation criteria\u003c/h2\u003e \u003cp\u003eWe evaluated the final model's predictive accuracy using the following systematic approach. The model was designed to predict eGFR values at three annual time points over a three-year period from the initial measurement.\u003c/p\u003e \u003cp\u003eFor both predicted and actual eGFR values, we calculated individual patient-specific slopes using linear regression analysis across four time points: baseline and years 1, 2, and 3. The model's performance was evaluated using the Root Mean Square Error (RMSE) between the predicted and actual slopes, calculated as:\u003cdiv id=\"Equa\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equa\" name=\"EquationSource\"\u003e\n$$\\text{RMSE}=\\sqrt{\\frac{1}{n}\\sum_{i=1}^{n}{\\left(\\text{predicted slope}_{i}-\\text{actual slope}_{i}\\right)}^{2}}$$\u003c/div\u003e\u003c/div\u003e\u003c/p\u003e \u003cp\u003ewhere n is the number of patients in the test dataset, predicted slope is the slope calculated from the model's predictions, and actual slope is the slope derived from the observed eGFR values.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec14\" class=\"Section2\"\u003e \u003ch2\u003eData Availability\u003c/h2\u003e \u003cp\u003eThe data cannot be fully shared publicly for the following reasons: the data contain potentially sensitive information; patients did not provide informed consent regarding the release of personal data; and the Ethics Board of Kawasaki Medical School imposed data restrictions. The data are owned by the J-CKD-DB project committee.
Interested readers may request the data from the J-CKD-DB project committee (URL, in Japanese: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://j-ckd-db.jp\u003c/span\u003e\u003cspan address=\"https://j-ckd-db.jp\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e). The following e-mail address of the hospital may also be useful for readers: [email protected]\u003c/p\u003e \u003c/div\u003e"},{"header":"RESULTS","content":"\u003cdiv id=\"Sec16\" class=\"Section2\"\u003e \u003ch2\u003eBaseline data\u003c/h2\u003e \u003cp\u003e \u003cb\u003eTable\u0026nbsp;1\u003c/b\u003e shows the baseline characteristics of the study population. The median age of participants was 69.0 years [IQR: 62.0\u0026ndash;77.0], and 52% (5,493/10,474) of the cohort were male. The median baseline eGFR was 52.7 mL/min/1.73m\u0026sup2; [IQR: 44.7\u0026ndash;57.8]. Laboratory values included serum creatinine of 1.0 mg/dL [IQR: 0.8\u0026ndash;1.2], serum albumin of 4.1 g/dL [IQR: 3.9\u0026ndash;4.4], serum sodium of 141.0 mEq/L [IQR: 140.0\u0026ndash;143.0], and serum potassium of 4.4 mEq/L [IQR: 4.1\u0026ndash;4.7]. Regarding urinary findings, 52% (2,346/4,471) of patients had positive proteinuria by dipstick test, with median urine protein of 21.0 g/gCr [IQR: 7.0\u0026ndash;67.5]. Comorbidities and medications were also documented: diabetes mellitus was present in 75% of patients and hypertension in 71%. Regarding medications, 24% of patients were prescribed RAS inhibitors, 2.2% were on MR blockers, and 0.4% were on SGLT2 inhibitors at baseline.
The median follow-up period was 6 years. Longitudinal eGFR measurements showed a gradual decline over the first three years: median eGFR at 1 year was 52.7 mL/min/1.73m\u0026sup2; [IQR: 44.5\u0026ndash;59.7], at 2 years was 51.9 mL/min/1.73m\u0026sup2; [IQR: 43.0\u0026ndash;59.0], and at 3 years was 51.2 mL/min/1.73m\u0026sup2; [IQR: 42.0\u0026ndash;58.5].\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec17\" class=\"Section2\"\u003e \u003ch2\u003eComparison of model performance\u003c/h2\u003e \u003cp\u003eWe compared three different approaches to predicting three-year eGFR trajectories (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e1\u003c/span\u003e). The linear regression model used only pre-baseline eGFR measurements for prediction, whereas both the LightGBM and LSTM models used baseline eGFR values together with patient characteristics to predict future trajectories. These predictions were compared with the actual slope, calculated by linear regression analysis of the eGFR values measured at four time points over three years: baseline and years 1, 2, and 3.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eThe conventional linear regression model showed an RMSE of 15.87 mL/min/1.73m\u0026sup2;/year. The LSTM model, which included additional clinical parameters, achieved an RMSE of 3.94 mL/min/1.73m\u0026sup2;/year, while the LightGBM model showed the highest accuracy with an RMSE of 2.95 mL/min/1.73m\u0026sup2;/year.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec18\" class=\"Section2\"\u003e \u003ch2\u003eImplementation of the models in clinical settings\u003c/h2\u003e \u003cp\u003eFigure \u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e2\u003c/span\u003e shows the web-based application we developed incorporating the LightGBM model, which demonstrated the highest prediction accuracy among the three approaches.
The application provides an intuitive interface for healthcare providers to input patient data and visualize predicted eGFR trajectories. When clinicians input patient characteristics and laboratory data, the application displays the predicted eGFR slope values (top right panel) and predicted future trajectories over a three-year period (bottom right panel).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e"},{"header":"DISCUSSION","content":"\u003cp\u003eA conceptual diagram of the results of this study is shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e3\u003c/span\u003e. We developed a prediction model for eGFR slope using a nationwide CKD cohort, achieving an RMSE of 2.8 mL/min/1.73m\u003csup\u003e2\u003c/sup\u003e for the 3-year slope. This represents a substantial improvement over the conventional linear regression model, which showed an RMSE of 17.2 mL/min/1.73m\u003csup\u003e2\u003c/sup\u003e. The model's ability to generate predictions using single time-point data, commonly available in clinical practice, makes it particularly valuable. When baseline data were included in a multiple linear regression model, the RMSE improved to 3.07 compared with the linear regression model based on past eGFR measurements. This result suggests that baseline data may have a greater influence on future eGFR slope than past eGFR slope. The model was designed primarily with baseline data so that it could be used in PCP practice settings. Limiting the input information to baseline data raised concerns about accuracy, but the RMSE remained satisfactory. These results are consistent with the report of Kovesdy et al\u003csup\u003e\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e\u003c/sup\u003e. The regression coefficients are shown in \u003cb\u003eSupplement Table\u0026nbsp;1\u003c/b\u003e.
Furthermore, we have developed a web-based application that effectively visualizes predicted results for both PCPs and patients.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eRecent randomized controlled trials (RCTs) investigating renoprotective effects have increasingly focused on eGFR slope changes as a surrogate endpoint. Multiple studies have highlighted the importance of both early albuminuria changes and GFR slope as surrogate endpoints for kidney disease progression. Major trials such as CREDENCE and EMPA-KIDNEY have incorporated eGFR slope as exploratory and tertiary outcomes, respectively, while maintaining kidney composite outcomes as their primary endpoint\u003csup\u003e\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e, \u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e\u003c/sup\u003e. More recently, Phase 3 trials have begun using these surrogate endpoints as primary outcomes\u003csup\u003e\u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e\u003c/sup\u003e. The evidence supporting GFR slope as a surrogate endpoint is more robust than that for albuminuria changes, suggesting that routine eGFR slope assessment may be more clinically valuable than UACR monitoring.\u003c/p\u003e \u003cp\u003eThe eGFR slope provides a tangible measure of CKD progression and addresses crucial needs of PCPs, particularly in facilitating collaboration with nephrologists. Recent developments include tools for visualizing long-term eGFR slopes\u003csup\u003e\u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e\u003c/sup\u003e, with studies demonstrating increased specialist referrals following PCPs' use of such applications. Our prediction model offers enhanced temporal flexibility compared to traditional approaches. 
While calculating eGFR slope typically requires multiple measurements over time, potentially delaying prognosis assessment\u003csup\u003e\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e\u003c/sup\u003e, our model can generate predictions at any point using available historical health data. For PCPs managing patients with CKD, the critical information needs include predicting disease progression and assessing the risk of near-term kidney failure. The visualization of kidney function changes through eGFR slope not only serves PCPs' needs but also helps raise patient awareness in early CKD stages.\u003c/p\u003e \u003cp\u003eNumerous models exist for predicting kidney failure, with improving accuracy. The Kidney Failure Risk Equations, developed using Cox proportional hazards models for patients with CKD, and the survival index developed by DOPPS for hemodialysis patients represent significant advances\u003csup\u003e\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e, \u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e, \u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e, \u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e\u003c/sup\u003e. The Kidney Failure Risk Equations, derived from 3,449 CKD G3-5 patients in Ontario, Canada\u003csup\u003e\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e\u003c/sup\u003e, were externally validated using British Columbia data\u003csup\u003e\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e\u003c/sup\u003e. The Risk Prediction Equations, developed using data from 5,222,711 people across 28 countries in the CKD Prognosis Consortium\u003csup\u003e\u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e\u003c/sup\u003e, predict 5-year eGFR decline. 
While machine learning models have been developed to predict eGFR slope stability\u003csup\u003e\u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e\u003c/sup\u003e, these typically provide binary predictions rather than quantitative assessments.\u003c/p\u003e \u003cp\u003eOur study has several limitations. The SS-MIX2 system database lacks information on CKD etiology, body mass index, and blood pressure levels. During model development, we evaluated the impact of SGLT2 inhibitors and MR blockers on eGFR slope predictions. However, their limited prescription rates in Japan during our study period (2014\u0026ndash;2022) resulted in minimal predictive contribution. Given their negligible impact on model performance and to enhance practical usability, we opted to exclude these medication factors from the web-based application for clinical use. An additional limitation is selection bias from requiring baseline and at least three follow-up eGFR measurements, which may exclude patients with poor healthcare access. While this requirement improves eGFR slope estimation, it limits generalizability. However, since interventions target patients engaged in care, it also enhances clinical relevance. The model may also overestimate outcomes for those with irregular follow-up. External validation in more diverse datasets is needed to assess broader applicability.\u003c/p\u003e \u003cp\u003eThe practical implementation of prediction tools through applications is crucial for clinical utility. Visualization enhances usefulness for both PCPs and patients, as demonstrated by Kanda et al.'s machine-learning-based web system for CKD risk prediction and treatment\u003csup\u003e\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e\u003c/sup\u003e. 
While existing tools like the Kidney Failure Risk Equations (available at \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://kidneyfailurerisk.com/\u003c/span\u003e\u003cspan address=\"https://kidneyfailurerisk.com/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e) offer valuable insights, their 5-year prediction window may be insufficient for clinical practice, particularly for patients with early-stage CKD. Our model's ability to predict actual eGFR slopes and visualize CKD grade transitions provides a distinct advantage in patient education and management.\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eAcknowledgments\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eWe thank the study participants and the members of the J-CKD-DB-Ex Study Group.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eThe authors meet criteria for authorship as recommended by the ICMJE. The authors did not receive payment related to the development of the manuscript. Boehringer Ingelheim was given the opportunity to review the manuscript for medical and scientific accuracy as well as intellectual property considerations. This study was supported and funded by Nippon Boehringer Ingelheim Co., Ltd. and Eli Lilly Japan K.K.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eContributors:\u003c/strong\u003e\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eH.N., T.N., K.I., T.G., D.N., S.K., T.S. and N.K. were involved in conceptualization and study design. H.N., T.N., T.G. and S.K. were responsible for screening data sources and data extraction. T.N. and T.G. performed data analysis and constructed the web application. H.N., T.N., K.I., T.G., and S.K. wrote the original draft of the manuscript. T.S. and N.K. revised and wrote the final draft. 
All authors have read and approved the manuscript.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eCompeting interests:\u003c/strong\u003e\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eKatsuhito Ihara and Daisuke Nitta are employees of Nippon Boehringer Ingelheim Co., Ltd.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eFunding:\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis work was supported in part by Nippon Boehringer Ingelheim Co., Ltd. and Eli Lilly Japan K.K.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eGansevoort, R.T. \u003cem\u003eet al.\u003c/em\u003e Chronic kidney disease and cardiovascular risk: epidemiology, mechanisms, and prevention. Lancet 382, 339\u0026ndash;352 (2013).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAndermann, A. Breaking away from the disease-focused paradigm. Lancet 376, 2073\u0026ndash;2074 (2010).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCKD Clinical Practice Guide 2024 (Japanese). \u003cem\u003eThe Japanese Society for Nephrology\u003c/em\u003e (2024).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eImai, E. \u003cem\u003eet al.\u003c/em\u003e Prevalence of chronic kidney disease in the Japanese general population. Clin Exp Nephrol 13, 621\u0026ndash;630 (2009).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eInker, L.A. \u003cem\u003eet al.\u003c/em\u003e GFR Slope as a Surrogate End Point for Kidney Disease Progression in Clinical Trials: A Meta-Analysis of Treatment Effects of Randomized Controlled Trials. J Am Soc Nephrol 30, 1735\u0026ndash;1745 (2019).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eInker, L.A. \u003cem\u003eet al.\u003c/em\u003e KDOQI US commentary on the 2012 KDIGO clinical practice guideline for the evaluation and management of CKD. Am J Kidney Dis 63, 713\u0026ndash;735 (2014).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eNakagawa, N. 
\u003cem\u003eet al.\u003c/em\u003e J-CKD-DB: a nationwide multicentre electronic health record-based chronic kidney disease database in Japan. Sci Rep 10, 7351 (2020).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eNagasu, H. \u003cem\u003eet al.\u003c/em\u003e Kidney Outcomes Associated With SGLT2 Inhibitors Versus Other Glucose-Lowering Drugs in Real-world Clinical Practice: The Japan Chronic Kidney Disease Database. Diabetes Care 44, 2542\u0026ndash;2551 (2021).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMatsuo, S. \u003cem\u003eet al.\u003c/em\u003e Revised equations for estimated GFR from serum creatinine in Japan. Am J Kidney Dis 53, 982\u0026ndash;992 (2009).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSisk, R., Sperrin, M., Peek, N., van Smeden, M. \u0026amp; Martin, G.P. Imputation and missing indicators for handling missing data in the development and deployment of clinical prediction models: A simulation study. Stat Methods Med Res 32, 1461\u0026ndash;1477 (2023).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKe, G. \u003cem\u003eet al.\u003c/em\u003e LightGBM: a highly efficient gradient boosting decision tree. \u003cem\u003eProceedings of the 31st International Conference on Neural Information Processing Systems\u003c/em\u003e, 3149\u0026ndash;3157 (2017).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHochreiter, S. \u0026amp; Schmidhuber, J. Long short-term memory. Neural Comput 9, 1735\u0026ndash;1780 (1997).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKovesdy, C.P. \u003cem\u003eet al.\u003c/em\u003e Past Decline Versus Current eGFR and Subsequent ESRD Risk. J Am Soc Nephrol 27, 2447\u0026ndash;2455 (2016).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePerkovic, V. \u003cem\u003eet al.\u003c/em\u003e Canagliflozin and Renal Outcomes in Type 2 Diabetes and Nephropathy. 
N Engl J Med 380, 2295\u0026ndash;2306 (2019).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eThe EMPA-KIDNEY Collaborative Group. Empagliflozin in Patients with Chronic Kidney Disease. N Engl J Med 388, 117\u0026ndash;127 (2023).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHeerspink, H.J.L. \u003cem\u003eet al.\u003c/em\u003e Sparsentan in patients with IgA nephropathy: a prespecified interim analysis from a randomised, double-blind, active-controlled clinical trial. Lancet 401, 1584\u0026ndash;1594 (2023).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eNakazawa, J. \u003cem\u003eet al.\u003c/em\u003e A Long-term Estimated Glomerular Filtration Rate Plot Analysis Permits the Accurate Assessment of a Decline in the Renal Function by Minimizing the Influence of Estimated Glomerular Filtration Rate Fluctuations. Intern Med 61, 1823\u0026ndash;1833 (2022).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eItano, S., Kanda, E., Nagasu, H., Nangaku, M. \u0026amp; Kashihara, N. eGFR slope as a surrogate endpoint for clinical study in early stage of chronic kidney disease: from The Japan Chronic Kidney Disease Database. Clin Exp Nephrol 27, 847\u0026ndash;856 (2023).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTangri, N. \u003cem\u003eet al.\u003c/em\u003e A predictive model for progression of chronic kidney disease to kidney failure. JAMA 305, 1553\u0026ndash;1559 (2011).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLiu, P. \u003cem\u003eet al.\u003c/em\u003e Predicting the risks of kidney failure and death in adults with moderate to severe chronic kidney disease: multinational, longitudinal, population based, cohort study. BMJ 385, e078063 (2024).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTangri, N., Ferguson, T. \u0026amp; Komenda, P. Pro: Risk scores for chronic kidney disease progression are robust, powerful and ready for implementation. 
Nephrol Dial Transplant 32, 748\u0026ndash;751 (2017).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKanda, E., Bieber, B.A., Pisoni, R.L., Robinson, B.M. \u0026amp; Fuller, D.S. Importance of simultaneous evaluation of multiple risk factors for hemodialysis patients' mortality and development of a novel index: dialysis outcomes and practice patterns study. PLoS One 10, e0128652 (2015).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTangri, N. \u003cem\u003eet al.\u003c/em\u003e Multinational Assessment of Accuracy of Equations for Predicting Risk of Kidney Failure: A Meta-analysis. JAMA 315, 164\u0026ndash;174 (2016).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGrams, M.E. \u003cem\u003eet al.\u003c/em\u003e The Kidney Failure Risk Equation: Evaluation of Novel Input Variables including eGFR Estimated Using the CKD-EPI 2021 Equation in 59 Cohorts. J Am Soc Nephrol 34, 482\u0026ndash;494 (2023).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eNelson, R.G. \u003cem\u003eet al.\u003c/em\u003e Development of Risk Prediction Equations for Incident Chronic Kidney Disease. JAMA 322, 2104\u0026ndash;2114 (2019).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLukomski, L. \u003cem\u003eet al.\u003c/em\u003e First experiences with machine learning predictions of accelerated declining eGFR slope of living kidney donors 3 years after donation. J Nephrol 37, 1631\u0026ndash;1642 (2024).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKanda, E., Epureanu, B.I., Adachi, T. \u0026amp; Kashihara, N. Machine-learning-based Web system for the prediction of chronic kidney disease progression and mortality. 
PLOS Digit Health 2, e0000188 (2023).\u003c/span\u003e\u003c/li\u003e\u003c/ol\u003e"},{"header":"Table 1","content":"\u003cp\u003eTable 1 is available in the Supplementary Files section.\u003c/p\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":true,"hideJournal":true,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"","lastPublishedDoi":"10.21203/rs.3.rs-6575802/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-6575802/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003ch2\u003eBackground\u003c/h2\u003e \u003cp\u003eChronic kidney disease (CKD) is a significant global health challenge, yet the application of eGFR slope as a metric for CKD progression remains underdeveloped in primary care settings.\u003c/p\u003e\u003ch2\u003eMethods\u003c/h2\u003e \u003cp\u003eUsing data from J-CKD-DB-Ex, Japan\u0026rsquo;s largest CKD database, we developed and validated a machine learning-based model to predict eGFR slope. The study included 10,474 patients aged\u0026thinsp;\u0026ge;\u0026thinsp;18 years with eGFR\u0026thinsp;\u0026lt;\u0026thinsp;60 mL/min/1.73m\u0026sup2; or proteinuria at baseline. Predictors included demographic, clinical, and laboratory data. 
We compared three models: linear regression, LightGBM, and LSTM networks.\u003c/p\u003e\u003ch2\u003eResults\u003c/h2\u003e \u003cp\u003eAmong 10,474 patients (median age 69.0 years), the LightGBM model achieved superior performance (RMSE\u0026thinsp;=\u0026thinsp;2.95 mL/min/1.73m\u0026sup2;/year) compared to LSTM (RMSE\u0026thinsp;=\u0026thinsp;3.94) and conventional linear regression (RMSE\u0026thinsp;=\u0026thinsp;15.87). The model was implemented as a web-based application for clinical use.\u003c/p\u003e\u003ch2\u003eConclusion\u003c/h2\u003e \u003cp\u003eThis machine learning-based prediction model achieves superior accuracy in estimating eGFR trajectory and enables real-time prediction using single time-point data. The web-based tool supports early identification of high-risk patients, enabling timely interventions and specialist referrals in primary care settings.\u003c/p\u003e","manuscriptTitle":"Prediction of Estimated Glomerular Filtration Rate Slope and Kidney Prognosis of Patients with Chronic Kidney Disease","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2025-05-16 12:19:22","doi":"10.21203/rs.3.rs-6575802/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"f2bce04a-f253-4f88-a5ff-560e2b5b4d33","owner":[],"postedDate":"May 16th, 2025","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[{"id":48154251,"name":"Health sciences/Nephrology/Kidney diseases/Chronic kidney disease/End-stage 
renal disease"},{"id":48154252,"name":"Health sciences/Medical research/Epidemiology"}],"tags":[],"updatedAt":"2025-09-09T08:23:24+00:00","versionOfRecord":[],"versionCreatedAt":"2025-05-16 12:19:22","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-6575802","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-6575802","identity":"rs-6575802","version":["v1"]},"buildId":"8U1c8b4HqxoKbykW_rLl7","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}
