A deep learning-based early warning system for renal replacement therapy in the intensive care unit

preprint OA: gold CC-BY-4.0
Research Square, Research Article

Behrooz Mamandipoor, Julian McAuley, Martin Krause, Ulrich Schmidt, and 2 more

This is a preprint; it has not been peer reviewed by a journal. https://doi.org/10.21203/rs.3.rs-8419139/v1

This work is licensed under a CC BY 4.0 License. Status: Posted. Version 1 posted.

Abstract

Background: Renal replacement therapy (RRT) is a life-saving intervention for acute kidney injury in the intensive care unit (ICU). The decision to initiate RRT remains highly complex and subjective. Accurate and timely predictions for the need of RRT initiation, duration of therapy, and subsequent clinical outcomes are crucial components of personalized care.
Using time series patient data (vital signs, laboratory findings, medications, ventilator settings, intake/output, risk scores), we aimed to develop deep learning models for predicting need for RRT.

Methods: Using data from Medical Information Mart for Intensive Care (MIMIC)-III, MIMIC-IV, and eICU, we trained/validated deep learning models for: (1) screening patients for prediction of the need for RRT after their first 12 hours in the ICU; (2) real-time dynamic prediction of impending RRT initiation; (3) prediction of RRT duration; and (4) prediction of mortality following RRT onset.

Results: Here, we summarize the results of the first model aimed at screening patients for early prediction of RRT. In internal validation, area under the receiver operating characteristic curve (AUROC) was 0.90 (95% confidence interval [CI] 0.886–0.906) and area under the precision-recall curve (AUPRC) was 0.53 (95% CI 0.496–0.564) in MIMIC-III. Results were similar with MIMIC-IV and eICU. External validation on ICU patients admitted during the COVID period yielded an AUROC of 0.90 (95% CI 0.894–0.913) and an AUPRC of 0.57 (95% CI 0.534–0.602). Additional hospital-level external validation across eight individual eICU hospitals showed AUROCs ranging from 0.86 to 0.91 and AUPRCs ranging from 0.35 to 0.52.

Discussion: By accurately identifying patients at high risk for RRT within the first 12 hours of ICU admission, the early prediction model could serve as a triage tool to prompt closer nephrology evaluation or optimization of fluid and hemodynamic management before overt renal failure develops.

Introduction

Acute kidney injury (AKI) is a prevalent and severe complication in critically ill patients, frequently necessitating renal replacement therapy (RRT) as a life-saving intervention. 1 – 6 The clinical decision to initiate RRT, however, remains highly complex and subjective, often relying on individual clinician judgment.
7 – 13 This subjectivity can result in delayed treatment or unnecessary interventions, thereby adversely affecting patient outcomes and resource utilization. 14 Therefore, accurate and timely predictions for the need of RRT initiation, duration of therapy, and subsequent clinical outcomes such as mortality are crucial components of effective clinical management and personalized care in the intensive care unit (ICU). Traditional severity-of-illness scoring systems, including the Acute Physiology and Chronic Health Evaluation (APACHE) II, Sequential Organ Failure Assessment (SOFA), and Mortality Scoring system for AKI with continuous RRT (MOSAIC), have historically demonstrated limited accuracy and clinical utility for predicting RRT-related decisions and outcomes. 15 – 17 Leveraging ICU data and machine learning (ML) is one solution for improving prediction and management of RRT. However, reported models often focus on isolated clinical tasks rather than spanning the spectrum of clinical decision-making throughout the RRT care pathway. 18 – 27 Many approaches also rely solely on static, admission-based single-timepoint variables and therefore fail to capture the dynamic, evolving clinical trajectories typical of critically ill patients. 23 – 28 In addition, predictive tools specifically designed to estimate RRT duration—critical for resource planning—remain scarce. This gap likely stems from the absence of standardized RRT cessation protocols and universally accepted liberation thresholds (e.g., urine output or creatinine cutoffs) for discontinuing RRT, as well as a historical emphasis on predicting RRT initiation or mortality rather than treatment course length. 
29 – 33 Furthermore, generalization across hospitals and over time remains challenging when models are validated primarily in single-center cohorts, 20–24 as temporal shifts (driven by evolving ICU protocols, resource allocation, and patient demographics, and exemplified during the COVID-19 pandemic) can alter data distributions and degrade performance. 34 – 36 Predictive models often provide limited population- and patient-level interpretability, which restricts their real-time clinical utility and their ability to offer actionable bedside explanations. 20 , 25 Finally, many studies have not adequately addressed issues of fairness and bias, raising concerns about disparate prediction performance across demographic subgroups, including gender and race/ethnicity. 37–42 Addressing these gaps is essential for developing robust, interpretable, and fair decision-support systems for RRT management in critical care.

To address the outlined knowledge gaps, this study proposed a deep learning-based framework specifically designed to tackle four clinically relevant tasks for optimizing RRT management in critically ill patients: (1) screening patients for early prediction of the need for RRT after their first 12 hours in the ICU; (2) real-time dynamic prediction of impending RRT initiation within the next 24 hours, allowing ongoing patient assessment and proactive interventions; (3) prediction of RRT duration (if needed), classifying patients into short-term (<48 hours) or prolonged (≥48 hours) treatment groups; and (4) prediction of mortality following RRT onset.

Methods

Study population

The study was conducted in accordance with the Declaration of Helsinki. The Institutional Review Board (Human Research Protections Program) determined that this study was exempt from review and waived the need for participant consent because it involved retrospective analysis of de-identified data.
Three datasets were used: Medical Information Mart for Intensive Care (MIMIC)-III, MIMIC-IV, and the eICU Collaborative Research Database. The MIMIC-III database is a widely accessible critical care database that integrates de-identified clinical data from patients admitted to intensive care units at the Beth Israel Deaconess Medical Center in Boston, Massachusetts. This database encompasses records from 53,423 hospital admissions from 2001 to 2012. 43,44 MIMIC-IV extends this resource by including clinical data from more than 90,000 ICU admissions from 2008 to 2022 at the same institution. 45,46 The eICU Collaborative Research Database is a multi-center database specifically tailored to support intensive care research, containing high-granularity data from 200,859 ICU admissions, pertaining to 139,367 distinct patients across 335 units in 208 hospitals in the United States between 2014 and 2015. 47,48

RRT event definition

Use of RRT was a binary outcome identified from structured data in the MIMIC and eICU databases that explicitly indicate its use. In the MIMIC datasets, RRT-related activities were extracted from the “chartevents”, “inputevents”, and “procedureevents” tables using a curated list of item identifiers associated with continuous (CRRT) and intermittent dialysis. In the eICU database, RRT events were derived from the “intakeOutput” and “treatment” tables by detecting records with dialysis volumes or free-text descriptions containing relevant keywords (e.g., “dialysis,” “RRT”). Two post-processing steps were incorporated, based on prior clinical knowledge, to address incomplete documentation common in real-world ICU data. 49,50,51 A temporal imputation strategy was applied in which missing hours between two RRT-positive observations were filled if the gap was less than or equal to six hours, under the assumption of ongoing therapy.
Any RRT episode with a total duration of less than two hours was excluded to minimize the influence of spurious or fragmented documentation.

Cohort inclusion and exclusion criteria

We excluded ICU patients whose length of stay was less than four hours, as these short stays provided insufficient clinical data. Additionally, ICU stays exceeding one month were truncated at 30 days, limiting the analysis to the first month of data. This step was implemented to prevent bias arising from rare instances of exceptionally prolonged ICU stays. ICU stays lacking any physiological measurements were removed. Specifically, we excluded cases in which all vital sign data (heart rate, respiratory rate, oxygen saturation [SpO2], and temperature) were completely missing throughout the observation period. Likewise, stays with no recorded arterial blood gas and basic metabolic panel variables were excluded.

Internal and external evaluation

Each dataset (MIMIC-III, MIMIC-IV, and eICU) was divided into training (70%), validation (10%), and test (20%) subsets for internal evaluation of our predictive models. Since it was crucial that each subset mirrored the overall dataset regarding RRT event and mortality rates, we initially computed these rates across the entire patient population and then used stratified sampling to preserve these proportions consistently within each subset.

We further evaluated the generalizability of the models through external validation. First, we constructed an external test set from the eICU multi-center database by selecting ICU stays from eight distinct hospitals, each with more than 2,500 admissions. Patients from these hospitals were explicitly excluded from the training and validation subsets to prevent overlap.
Second, we assessed temporal generalizability by training our models on MIMIC admissions (combining MIMIC-III and MIMIC-IV) prior to 2019 and evaluating their performance externally on patients admitted during the COVID-19 pandemic period (2020–2022). Data from the pandemic period are considered externally distinct due to substantial shifts in ICU management protocols, resource allocation strategies, and patient characteristics. 34-36

Data preprocessing

Two data modalities were incorporated in the model: time-invariant (static) data, which do not change over time, and time-variant (dynamic) data, which do. Time-invariant data included age, gender, race/ethnicity, body mass index, and admission type. A detailed description of pre-processing of time-invariant data is provided in Supplementary Materials 1. Time-variant data included laboratory values, ventilator settings, clinical scores (e.g., Glasgow Coma Scale, clinical risk scores), clinical events (e.g., procedures, medications), and hemodynamics (e.g., heart rate, respiratory rate, oxygen saturation, blood pressure). Dynamic variables were binned into 1-hour non-overlapping windows; when multiple values were recorded within a bin, the mean (for continuous variables) or the mode (for categorical variables) was used. A detailed description of pre-processing of time-variant data is provided in Supplementary Materials 2. A detailed description of approaches to outlier removal and missing data imputation is provided in Supplementary Materials 3. Feature generation involved transforming raw clinical data, prior to imputation, into structured hourly features suitable for predictive modeling. Seven distinct categories of features were computed for each hourly time point and are explained in detail in Supplementary Materials 4.
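As an illustration of the hourly binning step described above, the following is a minimal sketch using only the Python standard library. The record layout `(hours_since_admission, variable, value)`, the function name `bin_hourly`, and the example variable names are assumptions made for this example; the actual pipeline is described in Supplementary Materials 2 and may differ.

```python
import math
from collections import defaultdict
from statistics import mean, mode


def bin_hourly(records, categorical=frozenset()):
    """Aggregate (hours_since_admission, variable, value) records into 1-hour bins.

    Continuous variables are averaged within a bin; variables listed in
    `categorical` take their most frequent value (the mode) instead.
    """
    grouped = defaultdict(list)
    for t, name, value in records:
        # Hour 0 covers [0, 1), hour 1 covers [1, 2), and so on.
        grouped[(math.floor(t), name)].append(value)

    binned = {}
    for (hour, name), values in grouped.items():
        binned[(hour, name)] = mode(values) if name in categorical else mean(values)
    return binned


# Hypothetical measurements for a single ICU stay.
records = [
    (0.2, "heart_rate", 80.0),
    (0.7, "heart_rate", 90.0),
    (1.5, "heart_rate", 100.0),
    (0.1, "vent_mode", "AC"),
    (0.4, "vent_mode", "AC"),
    (0.9, "vent_mode", "PSV"),
]
hourly = bin_hourly(records, categorical={"vent_mode"})
```

Bins with no measurements are simply absent from the result, which leaves imputation as a separate later step, consistent with the statement that features were generated prior to imputation.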
Problem formulation

To operationalize our four clinical questions, we developed four variants of an early-warning system (EWS) for RRT in ICU patients: screeningEWS, dynamicEWS, durationEWS, and prognosisEWS. Each variant addresses a specific clinical prediction task:

Early prediction of the need for RRT (screeningEWS): This model was designed as an admission-time screening tool to identify patients at high risk of requiring RRT or dying at any point during their ICU stay. For each ICU admission, the model takes as input the clinical variables measured within the first 12 hours and outputs a binary label indicating whether the patient will require RRT or die at any point during the ICU stay.

Real-time dynamic prediction of RRT initiation (dynamicEWS): This model provides continuous hourly predictions of a patient's likelihood of requiring RRT within the next 24 hours.

Prediction of RRT duration (durationEWS): This model focuses on patients who have already started RRT and aims to predict the likely treatment duration by classifying the RRT course as short-term (≤48 hours) versus prolonged (>48 hours).

Mortality prediction post-RRT onset (prognosisEWS): This model evaluates the risk of mortality at ICU discharge among patients who have initiated RRT, using clinical data from the 12-hour window surrounding the onset of RRT.

Details of methods related to labeling of events are provided in Supplementary Materials 5.

Supervised learning of prediction models

We implemented task-specific predictive models for each EWS task, using bidirectional long short-term memory architectures for sequence data classification (screeningEWS, durationEWS, prognosisEWS) and an XGBoost model for real-time hourly data classification (dynamicEWS). Details of the models’ architecture and training process are reported in Supplementary Materials 6.
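To make the dynamicEWS target concrete, the sketch below shows one way hourly labels could be derived from an RRT onset time. It is a simplification under stated assumptions: only the first RRT onset is considered, hours from onset onward are dropped, and the function name `label_hours` and its parameters are hypothetical. The authors' actual labeling rules are given in Supplementary Materials 5.

```python
def label_hours(stay_hours, rrt_start_hour=None, horizon=24):
    """Return one 0/1 label per ICU hour before RRT starts.

    Hour h is positive when RRT begins within the next `horizon` hours,
    i.e. 0 < rrt_start_hour - h <= horizon. Hours at or after RRT onset
    are not labeled (assumption: no prediction while on RRT).
    """
    if rrt_start_hour is None:
        last_hour = stay_hours
    else:
        last_hour = min(stay_hours, rrt_start_hour)

    labels = []
    for h in range(last_hour):
        positive = (
            rrt_start_hour is not None
            and 0 < rrt_start_hour - h <= horizon
        )
        labels.append(int(positive))
    return labels


# Hypothetical stays: RRT at hour 30, RRT at hour 10, and no RRT.
late_onset = label_hours(stay_hours=48, rrt_start_hour=30)
early_onset = label_hours(stay_hours=48, rrt_start_hour=10)
no_rrt = label_hours(stay_hours=5)
```

In this sketch, an onset at hour 30 yields 24 positive hours, whereas an onset at hour 10 yields only 10, which mirrors the observation that later events offer more early-alert opportunities (up to 24).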
System evaluation and performance metrics

Performance metrics included the area under the receiver operating characteristic curve (AUROC), area under the precision-recall curve (AUPRC), F1-score, precision, specificity, and Brier score. To quantify the statistical uncertainty associated with each performance estimate, we calculated 95% confidence intervals using the pivot bootstrap method with 1,000 resamples drawn from the test dataset.

Real-time model assessment and silencing policy

Due to the sequential, time-dependent nature of the predictions in the dynamicEWS model, the total number of positive hours does not directly match the total number of unique RRT events. A single RRT event may have multiple positive early alerting opportunities in advance, especially if the event occurs later in the ICU admission (up to a maximum of 24 early-alert opportunities). Consequently, RRT events occurring later provide more opportunities for early detection compared with events occurring soon after admission. To assess the clinical usefulness of our predictive models, we focused on evaluating the model’s ability to detect actual RRT events (event sensitivity) while maintaining an acceptable false-alarm rate. We calculated recall as the proportion of all RRT events that triggered at least one alarm within the preceding 24-hour period, thereby indicating the model’s sensitivity in detecting true RRT events. Precision was computed as the proportion of alarms that correctly anticipated an event within the following 24 hours at a fixed event-based recall of 80%. Furthermore, to mitigate alarm fatigue resulting from frequent or redundant alerts by the dynamicEWS model, we implemented a silencing policy in conjunction with the early-warning system. After triggering an initial alarm, the system suppresses subsequent alarms for a 6-hour period, preventing repetitive notifications.
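The event-based recall, alarm precision, and silencing policy described above can be sketched as follows. This is an illustrative reading, not the authors' implementation: the function names, the dictionary layout keyed by a hypothetical stay identifier, and the boundary convention (a new alarm may fire once 6 hours have elapsed since the last kept alarm) are all assumptions.

```python
def apply_silencing(alarm_hours, silence=6):
    """Keep an alarm only if at least `silence` hours passed since the last kept alarm."""
    kept = []
    for h in sorted(alarm_hours):
        if not kept or h - kept[-1] >= silence:
            kept.append(h)
    return kept


def event_recall(alarms_by_stay, onset_by_stay, window=24):
    """Fraction of RRT events preceded by at least one alarm within `window` hours."""
    detected = 0
    for stay, onset in onset_by_stay.items():
        alarms = alarms_by_stay.get(stay, [])
        if any(0 < onset - h <= window for h in alarms):
            detected += 1
    return detected / len(onset_by_stay)


def alarm_precision(alarms_by_stay, onset_by_stay, window=24):
    """Fraction of alarms followed by an RRT event within `window` hours."""
    total = correct = 0
    for stay, alarms in alarms_by_stay.items():
        onset = onset_by_stay.get(stay)
        for h in alarms:
            total += 1
            if onset is not None and 0 < onset - h <= window:
                correct += 1
    return correct / total if total else 0.0


# Hypothetical raw alarm hours per stay and RRT onset hours.
raw_alarms = {"a": [10, 11, 12, 13, 20, 21], "b": [3, 4], "c": []}
onsets = {"a": 30, "c": 40}  # stay "b" never receives RRT

silenced = {stay: apply_silencing(hours) for stay, hours in raw_alarms.items()}
recall = event_recall(silenced, onsets)
precision = alarm_precision(silenced, onsets)
```

In this toy example silencing reduces eight raw alarms to three while the one detectable event is still detected, which is the intended effect of the policy: fewer notifications without losing event-level sensitivity.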
Interpretability

To support interpretability at both the population and patient levels, we used explanation methods tailored to each model type. For the long short-term memory classifiers (screeningEWS, durationEWS, and prognosisEWS), we applied Integrated Gradients 52 to quantify feature-level attributions relative to a baseline input at the patient level and to summarize influential predictors across the cohort. For the XGBoost-based dynamicEWS model, we used TreeExplainer 53 within the SHapley Additive exPlanations (SHAP) framework 54 to generate population-level SHAP summary plots and rank globally important predictors. We also developed an hourly SHAP-based prioritization scheme for patient-specific explanations, which was integrated into an animated dashboard that displays real-time risk trajectories alongside the top contributing features and their temporal trends. Further details of the interpretability pipeline are provided in Supplementary Materials 7.

Bias and fairness

We evaluated model bias and fairness across demographic subgroups defined by gender and race, using Male and White cohorts as baseline groups. We computed subgroup-specific discrimination (AUROC), predictive values (positive and negative predictive values), and error-rate metrics (true/false positive/negative rates), and quantified disparities relative to baseline using both absolute differences (Δ) and disparity ratios (R). This audit operationalizes the fairness objectives of equal opportunity, equalized odds, and predictive value parity through subgroup comparisons. To prioritize clinically meaningful differences and avoid ratio artifacts from small baseline values (e.g., a low false positive rate), we flagged disparities only when they exceeded both |Δ| > 0.10 and |R − 1| > 0.20, and summarized results using disparity heatmaps and histograms. Details of the bias and fairness evaluation approach are reported in Supplementary Materials 8.
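The dual-threshold flagging rule above can be written down directly. The sketch below assumes that Δ is the subgroup value minus the baseline value and that R is the subgroup value divided by the baseline value; the function names and the example metric values are hypothetical.

```python
def disparity(subgroup_value, baseline_value):
    """Return (delta, ratio) of a subgroup metric relative to the baseline group."""
    delta = subgroup_value - baseline_value
    # Guard against division by zero when the baseline metric is exactly 0.
    ratio = subgroup_value / baseline_value if baseline_value else float("inf")
    return delta, ratio


def is_flagged(subgroup_value, baseline_value, delta_cut=0.10, ratio_cut=0.20):
    """Flag a disparity only when BOTH |delta| and |ratio - 1| exceed their cutoffs."""
    delta, ratio = disparity(subgroup_value, baseline_value)
    return abs(delta) > delta_cut and abs(ratio - 1) > ratio_cut


# Hypothetical metric values (subgroup, baseline).
tpr_gap = is_flagged(0.60, 0.80)        # large absolute and relative gap
small_fpr = is_flagged(0.04, 0.02)      # ratio of 2.0 but only 0.02 absolute
moderate_tpr = is_flagged(0.85, 0.72)   # 0.13 absolute, about 18% relative
```

The second case shows why both conditions are required: a doubling of a very small false positive rate produces a large ratio but a clinically negligible absolute difference, so it is not flagged.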
Results

Study population

A summary breakdown of the demographics across cohorts from each dataset is provided in Table 1.

In the MIMIC-III dataset, a total of 49,613 ICU admissions were analyzed after exclusion. The rate of RRT use was 6%, with a 20% mortality rate among these patients. The total number of RRT events was 5,643, with a median [quartiles] duration of 44 hours [16–107]. The median [quartiles] time to RRT from admission was 13 hours [2–40]. A detailed breakdown of demographic data for MIMIC-III is provided in Supplementary Materials 9.

In the MIMIC-IV dataset, a total of 65,707 ICU admissions were analyzed after exclusion. The rate of RRT use was 6%, with a 23% mortality rate among these patients. The total number of RRT events was 8,563, with a median [quartiles] duration of 41 hours [14–109]. The median [quartiles] time to RRT from admission was 14 hours [2–42]. A detailed breakdown of demographic data for MIMIC-IV is provided in Supplementary Materials 10.

In the eICU dataset, a total of 131,272 ICU admissions were analyzed after exclusion. The rate of RRT use was 3%, with an 18% mortality rate among these patients. The total number of RRT events was 5,729, with a median [quartiles] duration of 8 hours [6–16]. The median [quartiles] time to RRT from admission was 30 hours [6–72]. A detailed breakdown of demographic data for eICU is provided in Supplementary Materials 11.

In the COVID-19 pandemic period dataset, a total of 9,319 ICU admissions were analyzed after exclusion. The rate of RRT use was 7%, with a 41% mortality rate among these patients. The total number of RRT events was 1,548, with a median [quartiles] duration of 64 hours [24–164]. The median [quartiles] time to RRT from admission was 14 hours [2–49]. A detailed breakdown of demographic data for the pandemic period is provided in Supplementary Materials 12.
Early prediction of the need for RRT (screeningEWS)

This model serves as a screening tool to identify patients at high risk of requiring RRT, using only clinical data collected within the first 12 hours after ICU admission. For the early screening task, patients who initiated RRT within the first 12 hours of ICU admission were excluded, resulting in the removal of 1,471 (3.2%) patients in MIMIC-III, 1,759 (3.4%) in MIMIC-IV, 1,062 (1.9%) in eICU, and 289 (4.1%) in the COVID-period cohort who required RRT.

In internal validation, AUROC was 0.90 (95% CI 0.886–0.906) and AUPRC was 0.53 (95% CI 0.496–0.564) in MIMIC-III. The AUROC was 0.91 (95% CI 0.901–0.918) and the AUPRC was 0.55 (95% CI 0.516–0.578) in MIMIC-IV. In eICU, AUROC was 0.87 (95% CI 0.863–0.878) and AUPRC was 0.44 (95% CI 0.413–0.459). External validation on ICU patients admitted during the COVID period yielded an AUROC of 0.90 (95% CI 0.894–0.913) and an AUPRC of 0.57 (95% CI 0.534–0.602) (Figure 1A). Results for all performance metrics are provided in Supplementary Materials 13. Additional hospital-level external validation across eight individual eICU hospitals (each with >2,500 ICU stays) showed robust performance, with AUROCs ranging from 0.86 to 0.91 and AUPRCs ranging from 0.35 to 0.52 (Supplementary Materials 14).

Population-level interpretability analysis of the screeningEWS model using the Integrated Gradients method revealed that SpO2, temperature, heart rate, respiratory rate, Glasgow Coma Scale, chloride, albumin, mean corpuscular hemoglobin concentration, anion gap, and FiO2 were the most influential predictors of RRT initiation risk during the ICU stay. Supplementary Materials 15 provides the feature-importance rankings for each cohort, along with an illustrative example of individual patient–level explanations.
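The confidence intervals reported above were obtained with the pivot bootstrap named in the Methods (1,000 resamples of the test set). The following sketch shows one common form of that interval, often also called the basic bootstrap, applied to AUROC. It uses only the standard library and synthetic data; the function names, the random seed, and the toy scores are assumptions, and the authors' exact procedure may differ.

```python
import random


def auroc(labels, scores):
    """AUROC as the Mann-Whitney statistic: P(score_pos > score_neg), ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def pivot_bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Return (point, lower, upper) using the pivot (basic) bootstrap interval."""
    rng = random.Random(seed)
    n = len(labels)
    point = auroc(labels, scores)

    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # a resample needs both classes for AUROC
            stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()

    q_lo = stats[int((alpha / 2) * (n_boot - 1))]
    q_hi = stats[int((1 - alpha / 2) * (n_boot - 1))]
    # The pivot interval reflects the bootstrap quantiles around the point estimate.
    return point, 2 * point - q_hi, 2 * point - q_lo


# Synthetic test set: 30 positives and 70 negatives with overlapping scores.
data_rng = random.Random(1)
labels = [1 if i < 30 else 0 for i in range(100)]
scores = [data_rng.gauss(1.0 if y else 0.0, 1.0) for y in labels]
point, lower, upper = pivot_bootstrap_ci(labels, scores)
```

Unlike the percentile interval, the pivot interval is not guaranteed to stay within [0, 1], so implementations sometimes clip the bounds when the point estimate is close to 1.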
Real-time dynamic prediction of RRT initiation (dynamicEWS)

For the dynamicEWS task, we generated hourly prediction instances over the entire ICU stay for patients not currently receiving RRT. This resulted in 4.3 million hourly time points in MIMIC-III (1.5% positive), 6.2 million in MIMIC-IV (2.6% positive), 11 million in eICU (2.1% positive), and 0.75 million in the COVID-period cohort (3.9% positive), where positive labels indicated RRT initiation within the subsequent 24 hours.

The XGBoost-based dynamicEWS model achieved excellent discriminative performance in predicting RRT initiation within the next 24 hours. In MIMIC-III, AUROC was 0.97 (95% CI 0.969–0.972) with an AUPRC of 0.40 (95% CI 0.387–0.402); in MIMIC-IV, AUROC was 0.94 (95% CI 0.934–0.937) with an AUPRC of 0.40 (95% CI 0.397–0.408). In eICU, AUROC was 0.89 (95% CI 0.891–0.893) with an AUPRC of 0.23 (95% CI 0.223–0.229). External validation on ICU patients admitted during the COVID period yielded an AUROC of 0.94 (95% CI 0.935–0.937) with an AUPRC of 0.45 (95% CI 0.449–0.460) (Figure 1B). Results for all performance metrics are reported in Supplementary Materials 16. Hospital-level external validation across eight individual eICU hospitals also showed robust performance, with AUROCs ranging from 0.85 to 0.92 and AUPRCs ranging from 0.15 to 0.34 (Supplementary Materials 17).

Model performance varied over the course of the ICU stay and was not uniform across time. To characterize this temporal behavior, we evaluated AUROC, precision, and recall at the hourly data point level across days since ICU admission, as well as the alarm rate over early alerting windows of up to 24 hours before RRT initiation. As shown in Figure 2, precision and recall increased over later ICU days, while AUROC remained high and largely stable, with only a slight decrease in the eICU cohort.
In parallel, the overall alarm rate increased closer to RRT onset, indicating that the model triggered more alerts as patients approached the event. Together, these trends suggest that the model becomes progressively more effective at identifying impending RRT over the course of the ICU stay.

In the MIMIC-IV cohort, the dynamicEWS model generated predictions for approximately 901,000 patient-hours, corresponding to 13,000 ICU patients and 1,472 unique RRT events, with 33,560 hours falling within predefined early alerting windows. Using a fixed decision threshold chosen to achieve a recall of 80% (95% CI 0.78–0.82), the model correctly identified 1,183 RRT events. At this operating point, the initial hourly alarm rate was 5.5% (50,086 alarms), which was reduced to 0.9% (8,912 alarms) after applying a 6-hour silencing policy while maintaining the same precision–recall performance. At this threshold, the model produced approximately 1.6 false-positive predictions for every true positive and would prompt a clinical assessment in approximately 24% of patient-days (about 24 alerts per 100 patient-days); assuming a standard 12-hour ICU working shift, this corresponds to roughly 0.12 alerts per patient per shift (about 12 alerts per 100 patients per shift). Model evaluations for the remaining patient cohorts are available in Supplementary Materials 18.

Population-level interpretability analysis of the dynamicEWS model using SHAP showed that maximum creatinine (since ICU admission and over the last 48 hours), urine output (maximum since admission and mean over the last 48 hours), maximum blood urea nitrogen (BUN) (since admission and over the last 48 hours), and severity scores were the most influential predictors of near-term RRT initiation risk. Supplementary Materials 19 provides population-level SHAP feature-importance rankings for each cohort.
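The alert-burden figures reported for the MIMIC-IV cohort follow from simple arithmetic on the counts given in the text. The short calculation below is a worked check, not part of the authors' pipeline. It treats the approximate 901,000 patient-hours as exact, so results are only accurate to rounding, and the variable names are chosen for this example.

```python
# Counts reported for the MIMIC-IV dynamicEWS evaluation.
patient_hours = 901_000          # approximate, as stated in the text
rrt_events = 1_472
events_detected = 1_183
alarms_after_silencing = 8_912
false_pos_per_true_pos = 1.6

# Event-level recall at the chosen threshold (about 0.80).
event_recall = events_detected / rrt_events

# Alert burden after the 6-hour silencing policy.
patient_days = patient_hours / 24
alerts_per_patient_day = alarms_after_silencing / patient_days   # about 0.24
alerts_per_patient_shift = alerts_per_patient_day * 12 / 24      # about 0.12

# Precision implied by 1.6 false positives per true positive (about 0.38).
implied_precision = 1 / (1 + false_pos_per_true_pos)

print(f"event recall:             {event_recall:.2f}")
print(f"alerts per 100 pt-days:   {100 * alerts_per_patient_day:.0f}")
print(f"alerts per patient-shift: {alerts_per_patient_shift:.2f}")
print(f"implied precision:        {implied_precision:.2f}")
```

Expressing the burden per patient-shift rather than per hour is a deliberate framing: it translates the model's operating point into the number of assessments a bedside team would actually be asked to perform.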
To illustrate how the model dynamically assesses RRT risk over time with real-time individual-level interpretability, Figure 3A presents an ICU admission in which dynamicEWS generated hourly predictions of future RRT risk across the entire stay, continuously updating its estimates as new clinical data become available. The SHAP-based visualization dashboard simultaneously displays the evolving risk trajectory and the most influential features at each time point, highlighting how specific physiological and laboratory changes drive the model’s predictions over time. Supplementary Materials 20 provides additional examples of real-time RRT risk assessment.

We assessed model performance across demographic subgroups in the MIMIC-IV cohort, stratified by gender and race, to evaluate fairness. Overall, dynamicEWS showed comparable discrimination and error rates across most subgroups (Figure 3B). Notably, performance was higher in African American patients, in whom the model achieved approximately a 19% relative improvement in true positive rate and a corresponding reduction in false negative rate compared with White patients. Fairness evaluations for the remaining cohorts and additional subgroup analyses are reported in Supplementary Materials 21.

Prediction of RRT duration (durationEWS)

For the durationEWS task, we evaluated the risk of prolonged RRT using clinical data from the 12-hour window surrounding RRT initiation. After excluding very short RRT episodes shorter than the observation period, prolonged RRT (>48 hours) accounted for 48% of events in both MIMIC-III (1,359/2,822) and MIMIC-IV (1,705/3,562), 23% of events in eICU (756/3,311), and 66% of events in the COVID-period cohort (374/551). In MIMIC-III, AUROC was 0.86 (95% CI 0.828–0.889) with an AUPRC of 0.87 (95% CI 0.830–0.901), and in MIMIC-IV, AUROC was 0.90 (95% CI 0.873–0.919) with an AUPRC of 0.90 (95% CI 0.873–0.922).
In eICU, AUROC was 0.81 (95% CI 0.785–0.844) with an AUPRC of 0.67 (95% CI 0.608–0.725). External validation in the COVID-period cohort yielded an AUROC of 0.83 (95% CI 0.795–0.871) and an AUPRC of 0.89 (95% CI 0.855–0.925) (Figure 4A). Full performance metrics are provided in Supplementary Materials 22.

Population-level interpretability analysis of the durationEWS model using the Integrated Gradients method showed that Glasgow Coma Scale, SpO2, FiO2, SAPS II risk score, oxygen saturation, temperature, bicarbonate, and base excess were the most influential predictors of prolonged RRT duration. Supplementary Materials 23 provides detailed feature-importance rankings for each cohort, along with an illustrative example of individual patient–level explanations.

Mortality prediction post-RRT onset (prognosisEWS)

For the prognosisEWS task, we evaluated the risk of mortality after RRT initiation using clinical data from the 12-hour window surrounding the first RRT initiation, restricting the cohort to ICU stays of at least 12 hours. This resulted in 2,816 patients in MIMIC-III with an ICU mortality rate of 20%, 3,649 patients in MIMIC-IV with a mortality rate of 23%, 2,800 patients in eICU with a mortality rate of 19%, and 605 patients in the COVID-period cohort with a mortality rate of 40%.

The prognosisEWS model demonstrated strong discriminative performance across all datasets. In MIMIC-III, AUROC was 0.89 (95% CI 0.848–0.917) with an AUPRC of 0.72 (95% CI 0.638–0.797), and in MIMIC-IV, AUROC was 0.88 (95% CI 0.850–0.908) with an AUPRC of 0.70 (95% CI 0.622–0.763). In eICU, AUROC was 0.82 (95% CI 0.788–0.858) with an AUPRC of 0.59 (95% CI 0.517–0.671). External validation on ICU patients admitted during the COVID period yielded an AUROC of 0.84 (95% CI 0.809–0.874) and an AUPRC of 0.77 (95% CI 0.709–0.822) (Figure 4B). Full performance metrics are reported in Supplementary Materials 24.
Population-level interpretability analysis of the prognosisEWS model using Integrated Gradients showed that age, Glasgow Coma Scale, blood pressure, temperature, and composite severity scores were the most influential predictors of post-RRT mortality risk. Supplementary Materials 25 provides detailed feature-importance rankings for each cohort, along with an illustrative example of individual patient–level explanations.

Discussion

We developed and validated predictive deep learning models designed to support four interrelated clinical decisions in RRT: (1) screeningEWS: early prediction of the need for RRT after the initial 12 hours in the ICU; (2) dynamicEWS: real-time dynamic prediction (hourly) of RRT initiation within the next 24 hours; (3) durationEWS: prediction of RRT duration; and (4) prognosisEWS: mortality prediction post-RRT onset.

The optimal timing of RRT initiation has remained elusive despite decades of investigation. Major trials such as ELAIN, AKIKI, IDEAL-ICU, and STARRT-AKI have yielded conflicting conclusions regarding early versus delayed initiation, partly due to heterogeneous patient phenotypes and clinician discretion. 55 , 56 Our findings support an individualized approach: by accurately identifying patients at high risk for RRT within the first 12 hours of ICU admission, the early prediction model could serve as a triage tool to prompt closer nephrology evaluation or optimization of fluid and hemodynamic management before overt renal failure develops. Dynamic prediction further advances this paradigm by incorporating temporal trends, capturing physiologic deterioration not evident in static admission data. The real-time model’s ability to anticipate RRT initiation up to 24 hours before the clinical decision suggests meaningful opportunities for earlier intervention.

Recent studies have increasingly explored the application of machine learning techniques to support personalized decision-making in RRT management. Li et al.
developed an uplift modeling approach to estimate individual treatment effects of RRT in patients with sepsis-associated acute kidney injury (SA-AKI), demonstrating heterogeneity in treatment responses and underscoring the need for individualized clinical decisions rather than uniform approaches [20]. Lee et al. further emphasized the importance of individualized treatment timing, demonstrating improved outcomes when continuous kidney replacement therapy was initiated within 6 hours of SA-AKI diagnosis compared with delayed initiation [23]. Similarly, França et al. used a random forest algorithm to predict the requirement for RRT in critically ill COVID-19 patients, achieving an AUROC of 0.78 based on routinely collected clinical data at ICU admission [25].

Additionally, several studies have explored machine learning (ML) approaches to improve mortality prediction following RRT initiation in critically ill patients. Kang et al. applied a random forest model and achieved superior discrimination compared with traditional scoring systems such as APACHE II and SOFA for ICU mortality prediction [21]. Chang et al. employed an XGBoost model in a multi-center cohort, surpassing conventional scoring methods (SOFA, nonrenal SOFA) in predicting 30-day mortality [18]. Their model’s interpretability was further augmented through SHAP, highlighting important predictors such as age, creatinine, platelet count, FiO2, anion gap, mean arterial pressure, respiratory rate, and vasopressor administration. Similarly, Hung et al. validated a gradient boosting machine model for predicting in-hospital mortality after CRRT initiation [22]. Using SHAP analysis, they identified critical predictors such as APACHE II scores, albumin levels, age, potassium, creatinine, SpO2, and vasopressor use. Zamanzadeh et al.
developed ML algorithms to predict short-term survival following CRRT initiation, underscoring variables such as creatinine levels, white blood cell counts, respiratory rate, and hemodynamic stability [19]. Furthermore, Li et al. developed an interpretable logistic regression model specifically targeting patients with sepsis-associated AKI undergoing RRT, which provided better predictive performance than traditional scoring systems (SOFA, SAPS-III) [24]. Collectively, these studies underscore the clinical value of ML-based predictive models in improving resource allocation, risk stratification, and personalized management of RRT in the ICU.

Many prior prediction models rely on static feature windows (e.g., admission-only data). Such models do not capture the ICU stay as a dynamic physiologic process, so they may miss evolving deterioration and cannot update risk as patient status changes. Our approach instead uses time-updated, hourly signals with a richer feature set (demographics, labs, vitals, medications, ventilator settings, and clinical scores) to reflect the real-time trajectory preceding RRT. Performance is further supported by a dedicated hourly feature-engineering pipeline that transforms raw clinical data into structured inputs by computing seven feature categories per hour, improving signal quality and providing consistent longitudinal representations beyond snapshot-based baselines.

The integration of interpretable predictive modeling into ICU workflows could address two central challenges in critical care nephrology: inconsistent timing of RRT initiation and uncertainty about treatment trajectory. The duration model identified markers of prolonged therapy, particularly hemodynamic instability and multiorgan failure, providing insight into which patients may require extended extracorporeal support and additional resource planning.
Similarly, the post-RRT mortality model outperformed traditional severity scores (APACHE IV, SOFA) and yielded clinically intuitive predictors such as lactate, mean arterial pressure, and recovery of urine output. Taken together, these models offer a foundation for an adaptive decision-support framework that evolves with patient status, guiding both initiation and de-escalation strategies. A major strength of our framework lies in its emphasis on explainability. Using SHAP value analysis, we demonstrated physiologic coherence between model predictions and established clinical determinants of kidney injury severity. This transparency is critical for bedside adoption, as black-box predictions without clinical rationale may erode clinician trust. Moreover, external validation across multiple ICUs and subgroup analyses confirmed consistent performance across demographic and diagnostic groups, suggesting algorithmic fairness—a prerequisite for equitable deployment in heterogeneous ICU populations. Several limitations merit discussion. First, because this was a retrospective study, causal inference cannot be established, and unmeasured confounders (e.g., nephrotoxic exposure, clinician decision thresholds) may influence model performance. Second, despite multicenter validation, local practice patterns and data definitions may limit generalizability to institutions with differing RRT protocols. Third, although dynamic predictions were generated hourly, real-time clinical integration and alert fatigue were not formally tested. Prospective trials are needed to determine whether early alerts improve patient outcomes, resource utilization, or decision-to-dialysis timing. Finally, while explainability techniques enhance interpretability, they do not fully resolve the ethical and operational questions surrounding automated clinical recommendations. 
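The feature rankings reported in the Results were obtained with Integrated Gradients, which attributes a prediction to each input by integrating the model's gradient along a straight path from a baseline to the observed input. The sketch below applies that idea to a small logistic model standing in for the deep network. The weights, the all-zero baseline, the three unnamed inputs, and the step count are illustrative assumptions and are not taken from the study.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """Risk output of a logistic model (a stand-in for the deep network)."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def gradient(weights, bias, x):
    """Analytic gradient of the logistic output with respect to each input."""
    p = predict(weights, bias, x)
    return [p * (1.0 - p) * w for w in weights]


def integrated_gradients(weights, bias, x, baseline, steps=200):
    """Midpoint Riemann-sum approximation of Integrated Gradients."""
    total = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps  # midpoint of the k-th sub-interval of [0, 1]
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        grad = gradient(weights, bias, point)
        total = [t + g for t, g in zip(total, grad)]
    # Scale the averaged gradient by the input's distance from the baseline.
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, total)]


# Hypothetical model and patient vector (three standardized inputs).
weights = [1.5, -2.0, 0.5]
bias = -0.2
x = [1.0, 0.5, 2.0]
baseline = [0.0, 0.0, 0.0]

attributions = integrated_gradients(weights, bias, x, baseline)
# Completeness: attributions should sum to F(x) - F(baseline), up to numerical error.
gap = sum(attributions) - (predict(weights, bias, x) - predict(weights, bias, baseline))
print("attributions:", [round(a, 4) for a in attributions])
print("completeness gap:", abs(gap))
```

The completeness check is a convenient sanity test for any implementation: if the attributions do not sum to the change in model output between baseline and input, the path integral is under-resolved or the gradients are wrong.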
Future work should focus on embedding these models into electronic health record systems to enable real-time risk visualization and clinician feedback. Combining predictive analytics with reinforcement learning may further refine individualized thresholds for RRT initiation or discontinuation. Integration with biomarkers of tubular injury or renal recovery could improve predictive precision. Ultimately, prospective interventional studies are needed to determine whether machine learning–guided RRT strategies improve survival, reduce unnecessary dialysis, or optimize ICU resource allocation.

Our study demonstrates that interpretable, multi-stage machine learning decision-support models can predict RRT initiation, duration, and outcomes with high accuracy and temporal adaptability. By aligning predictive performance with clinical interpretability, these tools offer a viable pathway toward personalized, data-driven RRT decision support in the ICU. Prospective evaluation will determine whether their implementation translates predictive accuracy into improved patient outcomes.

Declarations

Acknowledgements:

Guarantor: Rodney A. Gabriel is the guarantor of the content of the manuscript, including the data and analysis.

Author contributions: BM was responsible for study design, data collection, data pre-processing, analysis, figure/table design, and manuscript preparation. CH was responsible for study design, analysis, figure/table design, and manuscript preparation. JM was responsible for study design, analysis, figure/table design, and manuscript preparation. MK was responsible for study design, figure/table design, and manuscript preparation. US was responsible for study design, figure/table design, and manuscript preparation.
RG was responsible for study design, data collection, data pre-processing, analysis, figure/table design, and manuscript preparation.

Financial/non-financial disclosures: This study is funded by the National Institutes of Health 1OT2OD037995-01.

Role of sponsors: no input or contributions from sponsors.

Consent to participate: The study was conducted in accordance with the Declaration of Helsinki. The Institutional Review Board (Human Research Protections Program) determined that this study was exempt from review and waived the need for participant consent because it involved retrospective analysis of de-identified data.

Summary of conflict of interest statements: This study is funded by the National Institutes of Health 1OT2OD037995-01.

Funding information: This study is funded by the National Institutes of Health 1OT2OD037995-01.

Notation of prior abstract publication/presentation: none.

Author Contributions: Hsu was involved in study design, model development and validation, statistical analysis, and preparation of manuscript. Vutukuru was involved in study design, model development and validation, and preparation of manuscript. Mudumbai was involved in study design, model validation, and preparation of manuscript. Texeira was involved in model validation and preparation of manuscript.
Chen was involved in model validation and preparation of manuscript. Mehdipour was involved in study design and preparation of manuscript. Polston was involved in study design and preparation of manuscript. Marienfeld was involved in study design and preparation of manuscript. Gabriel was involved in study supervision, study design, model development and validation, statistical analysis, and preparation of manuscript.

Competing Interest Statement: Gabriel’s institution has received product and/or funding for research purposes from Pacira Biosciences, Avanos Medical, Takeda, and Merck.

Data availability: Data are available under user agreement access with the All of Us Research Program and the Veterans Affairs dataset.

Code availability: Code will be made available via a public GitHub repository.

Ethics declaration/Consent to Participate: The study was conducted in accordance with the Declaration of Helsinki. The Institutional Review Board determined that this study was exempt from review and waived the need for participant consent because it involved retrospective analysis of de-identified data.

Funding: This project was supported by Wellcome Leap as part of the Untangling Addiction Program.

References

1. Hoste, E.A., Bagshaw, S.M., Bellomo, R., Cely, C.M., Colman, R., Cruz, D.N., Edipidis, K., Forni, L.G., Gomersall, C.D., Govil, D. and Honoré, P.M., 2015. Epidemiology of acute kidney injury in critically ill patients: the multinational AKI-EPI study. Intensive care medicine, 41, pp.1411-1423.
2. Tolwani, A., 2012. Continuous renal-replacement therapy for acute kidney injury. New England Journal of Medicine, 367(26), pp.2505-2514.
3. Griffin, B.R., Liu, K.D. and Teixeira, J.P., 2020. Critical care nephrology: core curriculum 2020. American Journal of Kidney Diseases, 75(3), pp.435-452.
4. Uchino, S., Kellum, J.A., Bellomo, R., Doig, G.S., Morimatsu, H., Morgera, S., Schetz, M., Tan, I., Bouman, C., Macedo, E. and Gibney, N., 2005.
Acute renal failure in critically ill patients: a multinational, multicenter study. JAMA, 294(7), pp.813-818.
5. Wald, R., Adhikari, N.K., Smith, O.M., Weir, M.A., Pope, K., Cohen, A., Thorpe, K., McIntyre, L., Lamontagne, F., Soth, M. and Herridge, M., 2015. Comparison of standard and accelerated initiation of renal replacement therapy in acute kidney injury. Kidney international, 88(4), pp.897-904.
6. Zarbock, A., Kellum, J.A., Schmidt, C., Van Aken, H., Wempe, C., Pavenstädt, H., Boanta, A., Gerß, J. and Meersch, M., 2016. Effect of early vs delayed initiation of renal replacement therapy on mortality in critically ill patients with acute kidney injury: the ELAIN randomized clinical trial. JAMA, 315(20), pp.2190-2199.
7. Kellum, J.A., Mehta, R.L., Angus, D.C., Palevsky, P. and Ronco, C., 2002. The first international consensus conference on continuous renal replacement therapy. Kidney international, 62(5), pp.1855-1863.
8. Ostermann, M., Joannidis, M., Pani, A., Floris, M., De Rosa, S., Kellum, J.A., Ronco, C. and 17th Acute Disease Quality Initiative (ADQI) Consensus Group, 2016. Patient selection and timing of continuous renal replacement therapy. Blood purification, 42(3), pp.224-237.
9. Connor Jr, M.J. and Karakala, N., 2017. Continuous renal replacement therapy: reviewing current best practice to provide high-quality extracorporeal therapy to critically ill patients. Advances in Chronic Kidney Disease, 24(4), pp.213-218.
10. An, J.N., Kim, S.G. and Song, Y.R., 2021. When and why to start continuous renal replacement therapy in critically ill patients with acute kidney injury. Kidney Research and Clinical Practice, 40(4), p.566.
11. Kidney Disease: Improving Global Outcomes (KDIGO) Acute Kidney Injury Work Group, 2012. KDIGO clinical practice guideline for acute kidney injury. Kidney Int Suppl, 2(1), pp.1-138.
12. Gaudry, S., Hajage, D., Schortgen, F., Martin-Lefevre, L., Pons, B., Boulet, E., Boyer, A., Chevrel, G., Lerolle, N., Carpentier, D. and De Prost, N., 2016.
Initiation strategies for renal-replacement therapy in the intensive care unit. New England Journal of Medicine, 375(2), pp.122-133.
13. Barbar, S.D., Clere-Jehl, R., Bourredjem, A., Hernu, R., Montini, F., Bruyère, R., Lebert, C., Bohé, J., Badie, J., Eraldi, J.P. and Rigaud, J.P., 2018. Timing of renal-replacement therapy in patients with acute kidney injury and sepsis. New England Journal of Medicine, 379(15), pp.1431-1442.
14. Clark, E.G. and Bagshaw, S.M., 2015, January. Unnecessary renal replacement therapy for acute kidney injury is harmful for renal recovery. In Seminars in dialysis (Vol. 28, No. 1, pp. 6-11).
15. da Hora Passos, R., Ramos, J.G.R., Mendonça, E.J.B., Miranda, E.A., Dutra, F.R.D., Coelho, M.F.R., Pedroza, A.C., Correia, L.C.L., Batista, P.B.P., Macedo, E. and Dutra, M.M., 2017. A clinical score to predict mortality in septic acute kidney injury patients requiring continuous renal replacement therapy: the HELENICC score. BMC anesthesiology, 17, pp.1-8.
16. Kim, Y., Park, N., Kim, J., Kim, D.K., Chin, H.J., Na, K.Y., Joo, K.W., Kim, Y.S., Kim, S. and Han, S.S., 2019. Development of a new mortality scoring system for acute kidney injury with continuous renal replacement therapy. Nephrology, 24(12), pp.1233-1240.
17. Demirjian, S., Chertow, G.M., Zhang, J.H., O'Connor, T.Z., Vitale, J., Paganini, E.P., Palevsky, P.M. and VA/NIH Acute Renal Failure Trial Network, 2011. Model to predict mortality in critically ill adults with acute kidney injury. Clinical Journal of the American Society of Nephrology, 6(9), pp.2114-2120.
18. Chang, H.H., Chiang, J.H., Wang, C.S., Chiu, P.F., Abdel-Kader, K., Chen, H., Siew, E.D., Yabes, J., Murugan, R., Clermont, G. and Palevsky, P.M., 2022. Predicting mortality using machine learning algorithms in patients who require renal replacement therapy in the critical care unit. Journal of Clinical Medicine, 11(18), p.5289.
19. Zamanzadeh, D., Feng, J., Petousis, P., Vepa, A., Sarrafzadeh, M., Karumanchi, S.A., Bui, A.A. and Kurtz, I., 2024.
Data-driven prediction of continuous renal replacement therapy survival. Nature Communications, 15(1), p.5440.
20. Li, G., Li, B., Song, B., Liu, D., Sun, Y., Ju, H., Xu, X., Mao, J. and Zhou, F., 2024. Uplift modeling to predict individual treatment effects of renal replacement therapy in sepsis-associated acute kidney injury patients. Scientific Reports, 14(1), p.5833.
21. Kang, M.W., Kim, J., Kim, D.K., Oh, K.H., Joo, K.W., Kim, Y.S. and Han, S.S., 2020. Machine learning algorithm to predict mortality in patients undergoing continuous renal replacement therapy. Critical Care, 24, pp.1-9.
22. Hung, P.S., Lin, P.R., Hsu, H.H., Huang, Y.C., Wu, S.H. and Kor, C.T., 2022. Explainable machine learning-based risk prediction model for in-hospital mortality after continuous renal replacement therapy initiation. Diagnostics, 12(6), p.1496.
23. Lee, Y., Seo, J.H., Seong, J., Ahn, S.M., Han, M., Lee, J.A., Kim, J.H., Ahn, J.Y., Jeong, S.J., Choi, J.Y. and Yeom, J.S., 2024. Impact of Early Continuous Kidney Replacement Therapy in Patients With Sepsis-Associated Acute Kidney Injury: An Analysis of the MIMIC-IV Database. Journal of Korean medical science, 39(43).
24. Li, C., Zhao, K., Ren, Q., Chen, L., Zhang, Y., Wang, G. and Xie, K., 2024. Development and validation of a model for predicting in-hospital mortality in patients with sepsis-associated kidney injury receiving renal replacement therapy: a retrospective cohort study based on the MIMIC-IV database. Frontiers in Cellular and Infection Microbiology, 14, p.1488505.
25. França, A.R., Rocha, E., Bastos, L.S., Bozza, F.A., Kurtz, P., Maccariello, E., e Silva, J.R.L. and Salluh, J.I., 2024. Development and validation of a machine learning model to predict the use of renal replacement therapy in 14,374 patients with COVID-19. Journal of Critical Care, 80, p.154480.
26. Zhang, Q., Zheng, P., Hong, Z., Li, L., Liu, N., Bian, Z., Chen, X., Wu, H. and Zhao, S., 2024.
Machine learning in risk prediction of continuous renal replacement therapy after coronary artery bypass grafting surgery in patients. Clinical and Experimental Nephrology, 28(8), pp.811-821.
27. Li, K., Li, Y., Gao, Q., Xu, L., Hu, Q., Ji, B. and Gao, G., 2025. Machine learning in risk prediction of continuous renal replacement therapy after surgical repair of acute type A aortic dissection. Journal of Cardiothoracic and Vascular Anesthesia.
28. Zhuang, C., Hu, R., Li, K., Liu, Z., Bai, S., Zhang, S. and Wen, X., 2025. Machine learning prediction models for mortality risk in sepsis-associated acute kidney injury: evaluating early versus late CRRT initiation. Frontiers in Medicine, 11, p.1483710.
29. Khwaja, A., 2012. KDIGO clinical practice guidelines for acute kidney injury. Nephron Clinical Practice, 120(4), pp.c179-c184.
30. Cerdá, J., Liu, K.D., Cruz, D.N., Jaber, B.L., Koyner, J.L., Heung, M., Okusa, M.D. and Faubel, S., 2015. Promoting kidney function recovery in patients with AKI requiring RRT. Clinical Journal of the American Society of Nephrology, 10(10), pp.1859-1867.
31. Sheng, S., Li, A., Liu, X., Shen, T., Zhou, W., Lv, X., Shen, Y., Wang, C., Ma, Q., Qu, L. and Ma, S., 2024. Factors and machine learning models for predicting successful discontinuation of continuous renal replacement therapy in critically ill patients with acute kidney injury: a retrospective cohort study based on MIMIC-IV database. BMC nephrology, 25(1), p.407.
32. Raina, R., Kashani, K., Sethi, S.K., Mok, Q., Kakajiwala, A., Parikh, S.M., Doshi, K., Hu, J., Alhasan, K., de Sousa Tavares, M. and Yap, H.K., 2025. Liberation from continuous renal replacement therapy due to renal recovery in adults and children: a literature review and Delphi consensus on clinical practice. Critical Care, 29(1), p.287.
33. Katulka, R.J., Al Saadon, A., Sebastianski, M., Featherstone, R., Vandermeer, B., Silver, S.A., Gibney, R.N., Bagshaw, S.M. and Rewa, O.G., 2020.
Determining the optimal time for liberation from renal replacement therapy in critically ill patients: a systematic review and meta-analysis (DOnE RRT). Critical Care, 24(1), p.50.
34. Dhala, A., Gotur, D., Hsu, S.H.-L., Uppalapati, A., Hernandez, M., Alegria, J. and Masud, F. (2021) ‘A Year of Critical Care: The Changing Face of the ICU During COVID-19’, Methodist DeBakey Cardiovascular Journal, 17(5), p. 31-42. Available at: https://doi.org/10.14797/mdcvj.1041.
35. Supady A, Curtis JR, Abrams D, Lorusso R, Bein T, Boldt J, Brown CE, Duerschmied D, Metaxa V, Brodie D. Allocating scarce intensive care resources during the COVID-19 pandemic: practical challenges to theoretical frameworks. Lancet Respir Med. 2021 Apr;9(4):430-434. doi: 10.1016/S2213-2600(20)30580-4. Epub 2021 Jan 12. PMID: 33450202; PMCID: PMC7837018.
36. Meijs, D.A., van Kuijk, S.M., Wynants, L., Stessel, B., Mehagnoul-Schipper, J., Hana, A., Scheeren, C.I., Bergmans, D.C., Bickenbach, J., Vander Laenen, M. and Smits, L.J., 2022. Predicting COVID-19 prognosis in the ICU remained challenging: external validation in a multinational regional cohort. Journal of Clinical Epidemiology, 152, pp.257-268.
37. Franklin, G., Stephens, R., Piracha, M., Tiosano, S., Lehouillier, F., Koppel, R. and Elkin, P.L., 2024. The sociodemographic biases in machine learning algorithms: a biomedical informatics perspective. Life, 14(6), p.652.
38. Hasanzadeh, F., Josephson, C.B., Waters, G., Adedinsewo, D., Azizi, Z. and White, J.A., 2025. Bias recognition and mitigation strategies in artificial intelligence healthcare applications. NPJ Digital Medicine, 8(1), p.154.
39. Kumar, A., Aelgani, V., Vohra, R., Gupta, S.K., Bhagawati, M., Paul, S., Saba, L., Suri, N., Khanna, N.N., Laird, J.R. and Johri, A.M., 2024. Artificial intelligence bias in medical system designs: a systematic review. Multimedia Tools and Applications, 83(6), pp.18005-18057.
40. Chen, Z., Liu, X., Yang, Q., Wang, Y.J., Miao, K., Gong, Z., Yu, Y., Leonov, A., Liu, C., Feng, Z.
and Chuan-Peng, H., 2023. Evaluation of risk of bias in neuroimaging-based artificial intelligence models for psychiatric diagnosis: a systematic review. JAMA Network Open, 6(3), pp.e231671-e231671.
41. Obermeyer, Z., Powers, B., Vogeli, C. and Mullainathan, S., 2019. Dissecting racial bias in an algorithm used to manage the health of populations. Science, 366(6464), pp.447-453.
42. Butt, S., Butt, H. and Gnanappiragasam, D., 2021. Unintentional consequences of artificial intelligence in dermatology for patients with skin of colour. Clinical and Experimental Dermatology, 46(7), pp.1333-1334.
43. Johnson, Alistair, Pollard, Tom, and Roger Mark. "MIMIC-III Clinical Database" (version 1.4). PhysioNet (2016). https://doi.org/10.13026/C2XW26.
44. Johnson, A. E. W., Pollard, T. J., Shen, L., Lehman, L. H., Feng, M., Ghassemi, M., Moody, B., Szolovits, P., Celi, L. A., & Mark, R. G. (2016). MIMIC-III, a freely accessible critical care database. Scientific Data, 3, 160035.
45. Johnson, A.E.W., Bulgarelli, L., Shen, L. et al. MIMIC-IV, a freely accessible electronic health record dataset. Sci Data 10, 1 (2023). https://doi.org/10.1038/s41597-022-01899-x.
46. Johnson, A., Bulgarelli, L., Pollard, T., Gow, B., Moody, B., Horng, S., Celi, L. A., and Mark, R. (2024) 'MIMIC-IV' (version 3.1), PhysioNet. Available at: https://doi.org/10.13026/kpb9-mt58.
47. Pollard, T., Johnson, A., Raffa, J. et al. The eICU Collaborative Research Database, a freely available multi-center database for critical care research. Sci Data 5, 180178 (2018). https://doi.org/10.1038/sdata.2018.178.
48. Pollard, Tom, Johnson, Alistair, Raffa, Jesse, Celi, Leo Anthony, Badawi, Omar, and Roger Mark. "eICU Collaborative Research Database" (version 2.0). PhysioNet (2019). https://doi.org/10.13026/C2WM1R.
49. Raimann, F.J., König, C.J., Neef, V. and Flinspach, A.N., 2024, July. “Mind the Gap”—differences between documentation and reality on intensive care units: a quantitative observational study. In Healthcare (Vol. 12, No. 15, p. 1481). MDPI.
50. Hyland, S.L., Faltys, M., Hüser, M., Lyu, X., Gumbsch, T., Esteban, C., Bock, C., Horn, M., Moor, M., Rieck, B. and Zimmermann, M., 2020. Early prediction of circulatory failure in the intensive care unit using machine learning. Nature medicine, 26(3), pp.364-373.
51. Lyu, X., Fan, B., Hüser, M., Hartout, P., Gumbsch, T., Faltys, M., Merz, T.M., Rätsch, G. and Borgwardt, K., 2024. An empirical study on KDIGO-defined acute kidney injury prediction in the intensive care unit. Bioinformatics, 40(Supplement_1), pp.i247-i256.
52. Sundararajan, Mukund, Ankur Taly, and Qiqi Yan. "Axiomatic attribution for deep networks." In International conference on machine learning, pp. 3319-3328. PMLR, 2017.
53. Lundberg, S.M., Erion, G., Chen, H., DeGrave, A., Prutkin, J.M., Nair, B., Katz, R., Himmelfarb, J., Bansal, N. and Lee, S.I., 2020. From local explanations to global understanding with explainable AI for trees. Nature machine intelligence, 2(1), pp.56-67.
54. Lundberg, S.M. and Lee, S.I., 2017. A unified approach to interpreting model predictions. Advances in neural information processing systems, 30.
55. Zarbock, A., Kellum, J.A., Schmidt, C., Van Aken, H., Wempe, C., Pavenstädt, H., Boanta, A., Gerß, J. and Meersch, M., 2016. Effect of early vs delayed initiation of renal replacement therapy on mortality in critically ill patients with acute kidney injury: the ELAIN randomized clinical trial. JAMA, 315(20), pp.2190-2199.
56. STARRT-AKI Investigators, 2020. Timing of initiation of renal-replacement therapy in acute kidney injury. New England Journal of Medicine, 383(3), pp.240-251.

Table

Table 1. Demographics and data distribution of each dataset used in the study. COVID-19 corresponds to the portion of the MIMIC datasets during the pandemic (2020-2022).
Abbreviations: LOS, length of stay; MIMIC, Medical Information Mart for Intensive Care; RRT, renal replacement therapy.

| Variable | MIMIC-III, No RRT | MIMIC-III, RRT | MIMIC-IV, No RRT | MIMIC-IV, RRT | eICU, No RRT | eICU, RRT | COVID-19 era, No RRT | COVID-19 era, RRT |
|---|---|---|---|---|---|---|---|---|
| Total | 46,616 | 2,997 | 61,795 | 3,912 | 128,416 | 3,499 | 8,652 | 667 |
| Age (years), median [quartiles] | 66 [53,78] | 63 [51,73] | 66 [55,78] | 64 [53,73] | 66 [54,77] | 64 [54,73] | 66 [55,76] | 62 [50,73] |
| Gender (Male), n (%) | 26,249 (56.3) | 1,720 (57.4) | 34,404 (55.7) | 2,332 (59.6) | 69,562 (54.2) | 1,970 (56.3) | 5059 (58.5) | 407 (61.0) |
| Race, n (%) | | | | | | | | |
| Asian | 1080 (2.3) | 69 (2.3) | 1820 (2.9) | 118 (3.0) | 2148 (1.7) | 50 (1.4) | 628 (7.3) | 36 (5.4) |
| African American | 4039 (8.7) | 676 (22.6) | 6314 (10.2) | 773 (19.8) | 14225 (11.1) | 605 (17.3) | 496 (5.7) | 60 (9.0) |
| Hispanic | 1561 (3.3) | 139 (4.6) | 2352 (3.8) | 196 (5.0) | 4670 (3.6) | 220 (6.3) | 491 (5.7) | 35 (5.2) |
| Other | 6075 (13.0) | 318 (10.5) | 8534 (14) | 616 (15.8) | 8021 (6.2) | 284 (8.2) | 2463 (28.5) | 223 (33.4) |
| White | 33861 (72.6) | 1795 (59.9) | 42675 (69.1) | 2209 (56.5) | 99352 (77.4) | 2340 (66.9) | 4574 (52.9) | 313 (46.9) |
| Ethnicity, n (%) | | | | | | | | |
| Hispanic | 1561 (3.3) | 139 (4.6) | 2352 (3.8) | 196 (5.0) | 4670 (3.6) | 220 (6.3) | 491 (5.7) | 35 (5.2) |
| Non Hispanic | 45055 (96.7) | 2858 (95.4) | 59443 (96.2) | 3716 (95.0) | 123746 (96.4) | 3279 (93.7) | 8161 (94.3) | 632 (94.8) |
| LOS (days), median [quartiles] | 2.1 [1.2,4.0] | 3.8 [2.0,8.4] | 2.0 [1.2,3.4] | 3.9 [1.9,9.0] | 2.3 [1.6,4.0] | 4.1 [2.2,8.7] | 2.1 [1.2,4.2] | 5.3 [2.2,12] |
| Inpatient mortality, n (%) | 3220 (6.9) | 604 (20.2) | 3513 (5.7) | 911 (23.3) | 6174 (4.8) | 644 (18.4) | 609 (7.0) | 276 (41.4) |

Additional Declarations

Competing interest reported. This study is funded by the National Institutes of Health 1OT2OD037995-01.

Supplementary Files

SupplementaryRRT121525final.docx
15:06:20","extension":"png","order_by":13,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":252427,"visible":true,"origin":"","legend":"","description":"","filename":"OnlineFigure1.png","url":"https://assets-eu.researchsquare.com/files/rs-8419139/v1/410e5c682f4465edd97ad20d.png"},{"id":99625775,"identity":"bd355067-78ea-42b4-a56f-1178757ad34f","added_by":"auto","created_at":"2026-01-06 15:06:20","extension":"png","order_by":14,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":337976,"visible":true,"origin":"","legend":"","description":"","filename":"OnlineFigure2.png","url":"https://assets-eu.researchsquare.com/files/rs-8419139/v1/b92bd2d7bbf5912bdf47bce1.png"},{"id":99793047,"identity":"b5278116-a33f-441d-ba94-1c5f18d34651","added_by":"auto","created_at":"2026-01-08 13:30:55","extension":"png","order_by":15,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":429865,"visible":true,"origin":"","legend":"","description":"","filename":"OnlineFigure3.png","url":"https://assets-eu.researchsquare.com/files/rs-8419139/v1/d974922da224ee62a7f8508d.png"},{"id":99794773,"identity":"75d0c954-d498-4ff9-91a9-7e82c6069e92","added_by":"auto","created_at":"2026-01-08 13:36:14","extension":"png","order_by":16,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":307235,"visible":true,"origin":"","legend":"","description":"","filename":"OnlineFigure4.png","url":"https://assets-eu.researchsquare.com/files/rs-8419139/v1/81d46bff4290d9cdef1f8dd2.png"},{"id":99625776,"identity":"e5928350-c0e3-492a-94cf-df2041c1aaab","added_by":"auto","created_at":"2026-01-06 
15:06:20","extension":"xml","order_by":17,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":158441,"visible":true,"origin":"","legend":"","description":"","filename":"bc231d5e71124208a948964ccd0bf1b91structuring.xml","url":"https://assets-eu.researchsquare.com/files/rs-8419139/v1/e9fd197b883fb9c16fca6285.xml"},{"id":99625770,"identity":"5e224ac8-7cce-4608-a01f-7057a32b7738","added_by":"auto","created_at":"2026-01-06 15:06:20","extension":"html","order_by":18,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":172671,"visible":true,"origin":"","legend":"","description":"","filename":"earlyproof.html","url":"https://assets-eu.researchsquare.com/files/rs-8419139/v1/01a04c2b635059e128b5f0e4.html"},{"id":99793200,"identity":"e495dc2b-f524-4038-abc3-fd22edfae492","added_by":"auto","created_at":"2026-01-08 13:31:10","extension":"jpg","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":1746222,"visible":true,"origin":"","legend":"\u003cp\u003ePerformance based on area under the receiver operating characteristics curve and area under the precision-recall curve on internal validation for A) ScreeningEWS – which predicts need of renal replacement therapy during hospitalization any time after the initial 12 hours in the ICU; and B) DynamicEWS – which continuously predicts need for renal replacement therapy on an hourly basis. Each plot contains 4 curves, each corresponding to different datasets used for analysis. 
Abbreviations: AUC, area under the curve; AP, average precision; EWS, early warning system.\u003c/p\u003e","description":"","filename":"Figure1.jpg","url":"https://assets-eu.researchsquare.com/files/rs-8419139/v1/cf08e268894fbe1d9ab2225e.jpg"},{"id":99794874,"identity":"9e43c254-46c6-4a86-9e78-30852a9a5191","added_by":"auto","created_at":"2026-01-08 13:36:32","extension":"jpg","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":1879134,"visible":true,"origin":"","legend":"\u003cp\u003eTo characterize this temporal behavior, we evaluated AUROC, precision, and recall at the hourly data point level across days since ICU admission, as well as the alarm rate over early alerting windows of up to 24 hours before RRT initiation. Precision and recall increased over later ICU days, while AUROC remained high and largely stable, with only a slight decrease in the eICU cohort. In parallel, the overall alarm rate increased closer to RRT onset, indicating that the model triggered more alerts as patients approached the event. Together, these trends suggest that the model becomes progressively more effective at identifying impending RRT over the course of the ICU stay. Abbreviations: AUROC, area under the receiver operating characteristics curve; ICU, intensive care unit; RRT, renal replacement therapy\u003c/p\u003e","description":"","filename":"Figure2.jpg","url":"https://assets-eu.researchsquare.com/files/rs-8419139/v1/41aabe65522c7941168bd15a.jpg"},{"id":99625755,"identity":"7304b7fd-8326-422b-b1dd-8ea15bc5ca14","added_by":"auto","created_at":"2026-01-06 15:06:20","extension":"jpg","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":2618145,"visible":true,"origin":"","legend":"\u003cp\u003eInterpretation of dynamicEWS. 
A) Illustration on how the model dynamically assesses RRT risk over time with real-time individual-level interpretability using dynamicEWS, which generates hourly predictions of future RRT risk across the entire stay, continuously updating its estimates as new clinical data become available. B) We assessed model performance across demographic subgroups in the MIMIC-IV cohort, stratified by gender and race, to evaluate fairness. Overall, dynamicEWS showed comparable discrimination and error rates across most subgroups. Abbreviation: EWS, early warning system; MIMIC, Medical Information Mart for Intensive Care; RRT, renal replacement therapy\u003c/p\u003e","description":"","filename":"Figure3.jpg","url":"https://assets-eu.researchsquare.com/files/rs-8419139/v1/78636ad65e175116865a2b83.jpg"},{"id":99793840,"identity":"5ee9558f-aa42-4129-9cb9-b203516106ea","added_by":"auto","created_at":"2026-01-08 13:32:39","extension":"jpg","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":2118536,"visible":true,"origin":"","legend":"\u003cp\u003ePerformance based on area under the receiver operating characteristics curve and area under the precision-recall curve on internal validation for A) Duration EWS – which predicts duration of RRT required; and B) PrognosisEWS – which predicts mortality after RRT initiation. Each plot contains 4 curves, each corresponding to different datasets used for analysis. 
Abbreviations: AUC, area under the curve; AP, average precision; EWS, early warning system.\u003c/p\u003e","description":"","filename":"Figure4.jpg","url":"https://assets-eu.researchsquare.com/files/rs-8419139/v1/aa553adb39cc8c341e95c035.jpg"},{"id":102748948,"identity":"335b38dc-bf5c-43e0-bd5a-a592e41e8b86","added_by":"auto","created_at":"2026-02-16 09:11:44","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":9406235,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-8419139/v1/9ab83f49-c5ea-4d3e-a8d2-6d4f029c57cf.pdf"},{"id":99625779,"identity":"35c3fe9f-271d-4a65-a0a4-cecf53eb15d8","added_by":"auto","created_at":"2026-01-06 15:06:20","extension":"docx","order_by":1,"title":"","display":"","copyAsset":false,"role":"supplement","size":10909716,"visible":true,"origin":"","legend":"","description":"","filename":"SupplementaryRRT121525final.docx","url":"https://assets-eu.researchsquare.com/files/rs-8419139/v1/10b869d2010aff806e1d56a6.docx"}],"financialInterests":"Competing interest reported. 
This study is funded by the National Institutes of Health 1OT2OD037995-01","formattedTitle":"\u003cp\u003eA deep learning-based early warning system for renal replacement therapy in the intensive care unit\u003c/p\u003e","fulltext":[{"header":"Introduction","content":"\u003cp\u003eAcute kidney injury (AKI) is a prevalent and severe complication in critically ill patients, frequently necessitating renal replacement therapy (RRT) as a life-saving intervention.\u003csup\u003e\u003cspan additionalcitationids=\"CR2 CR3 CR4 CR5\" citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e\u003c/sup\u003e The clinical decision to initiate RRT, however, remains highly complex and subjective, often relying on individual clinician judgment.\u003csup\u003e\u003cspan additionalcitationids=\"CR8 CR9 CR10 CR11 CR12\" citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e\u003c/sup\u003e This subjectivity can result in delayed treatment or unnecessary interventions, thereby adversely affecting patient outcomes and resource utilization.\u003csup\u003e\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e\u003c/sup\u003e Therefore, accurate and timely predictions for the need of RRT initiation, duration of therapy, and subsequent clinical outcomes such as mortality are crucial components of effective clinical management and personalized care in the intensive care unit (ICU). 
Traditional severity-of-illness scoring systems, including the Acute Physiology and Chronic Health Evaluation (APACHE) II, Sequential Organ Failure Assessment (SOFA), and Mortality Scoring system for AKI with continuous RRT (MOSAIC), have historically demonstrated limited accuracy and clinical utility for predicting RRT-related decisions and outcomes.\u003csup\u003e\u003cspan additionalcitationids=\"CR16\" citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e\u003c/sup\u003e\u003c/p\u003e \u003cp\u003eLeveraging ICU data and machine learning (ML) is one solution for improving prediction and management of RRT. However, reported models often focus on isolated clinical tasks rather than spanning the spectrum of clinical decision-making throughout the RRT care pathway.\u003csup\u003e\u003cspan additionalcitationids=\"CR19 CR20 CR21 CR22 CR23 CR24 CR25 CR26\" citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e\u003c/sup\u003e Many approaches also rely solely on static, admission-based single-timepoint variables and therefore fail to capture the dynamic, evolving clinical trajectories typical of critically ill patients.\u003csup\u003e\u003cspan additionalcitationids=\"CR24 CR25 CR26 CR27\" citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e\u003c/sup\u003e In addition, predictive tools specifically designed to estimate RRT duration\u0026mdash;critical for resource planning\u0026mdash;remain scarce. 
This gap likely stems from the absence of standardized RRT cessation protocols and universally accepted liberation thresholds (e.g., urine output or creatinine cutoffs) for discontinuing RRT, as well as a historical emphasis on predicting RRT initiation or mortality rather than treatment course length.\u003csup\u003e\u003cspan additionalcitationids=\"CR30 CR31 CR32\" citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e33\u003c/span\u003e\u003c/sup\u003e Furthermore, generalization across hospitals and over time remains challenging when models are validated primarily in single-center cohorts,\u003csup\u003e20\u0026ndash;24\u003c/sup\u003e as temporal shifts\u0026mdash;driven by evolving ICU protocols, resource allocation, and patient demographics, exemplified during the COVID-19 pandemic\u0026mdash;can alter data distributions and degrade performance.\u003csup\u003e\u003cspan additionalcitationids=\"CR35\" citationid=\"CR34\" class=\"CitationRef\"\u003e34\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e36\u003c/span\u003e\u003c/sup\u003e Predictive models often provide limited population- and patient-level interpretability, limiting their real-time clinical utility and actionable bedside explanations.\u003csup\u003e\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e,\u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e\u003c/sup\u003e Finally, many studies have not adequately addressed issues of fairness and bias, raising concerns about disparate prediction performance across demographic subgroups, including gender and race/ethnicity.\u003csup\u003e37\u0026ndash;42\u003c/sup\u003e Addressing these gaps is essential for developing robust, interpretable, and fair decision-support systems for RRT management in critical care.\u003c/p\u003e \u003cp\u003eTo address the outlined knowledge gaps, this 
study proposed a deep learning-based framework specifically designed to tackle four clinically relevant tasks for optimizing RRT management in critically ill patients: (1) screening patients for early prediction of the need for RRT after their first 12 hours in the ICU; (2) real-time dynamic prediction of impending RRT initiation within the next 24 hours, allowing ongoing patient assessment and proactive interventions; (3) prediction of RRT duration (if needed), classifying patients into short-term (\u0026lt;\u0026thinsp;48 hours) or prolonged (\u0026ge;\u0026thinsp;48 hours) treatment groups; and (4) prediction of mortality following RRT onset.\u003c/p\u003e"},{"header":"Methods","content":"\u003cp\u003e\u003cem\u003eStudy population\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eThe study was conducted in accordance with the Declaration of Helsinki. The Institutional Review Board (Human Research Protections Program) determined that this study was exempt from review and waived need for participant consent because it involved retrospective analysis of de-identified data. Three datasets were used: Medical Information Mart for Intensive Care (MIMIC)-III, MIMIC-IV, and the eICU Collaborative Research Database. The MIMIC-III database is a widely accessible critical care database that integrates de-identified clinical data from patients admitted to intensive care units at the Beth Israel Deaconess Medical Center in Boston, Massachusetts. 
This database encompasses records from 53,423 hospital admissions from 2001 to 2012.\u003csup\u003e43,44\u003c/sup\u003e MIMIC-IV extends this resource by including clinical data from more than 90,000 ICU admissions from 2008 to 2022 at the same institution.\u003csup\u003e45,46\u003c/sup\u003e The eICU Collaborative Research Database is a multi-center database specifically tailored to support intensive care research, containing high-granularity data from 200,859 ICU admissions, pertaining to 139,367 distinct patients across 335 units in 208 hospitals in the United States between 2014 and 2015.\u003csup\u003e47,48\u003c/sup\u003e\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eRRT event definition\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eUse of RRT was a binary outcome identified from structured data in the MIMIC and eICU databases that explicitly indicate its use. In the MIMIC dataset, RRT-related activities were extracted from the \u0026ldquo;chartevents\u0026rdquo;, \u0026ldquo;inputevents\u0026rdquo;, and \u0026ldquo;procedureevents\u0026rdquo; tables using a curated list of item identifiers associated with continuous (CRRT) and intermittent dialysis. In the eICU database, RRT events were derived from the \u0026ldquo;intakeOutput\u0026rdquo; and \u0026ldquo;treatment\u0026rdquo; tables by detecting records with dialysis volumes or free-text descriptions containing relevant keywords (e.g., \u0026ldquo;dialysis,\u0026rdquo; \u0026ldquo;RRT\u0026rdquo;). Two post-processing steps were incorporated, based on prior clinical knowledge, to address incomplete documentation common in real-world ICU data.\u003csup\u003e49,50,51\u003c/sup\u003e A temporal imputation strategy was applied in which missing hours between two RRT-positive observations were filled if the gap was less than or equal to six hours, under the assumption of ongoing therapy. 
Any RRT episode with a total duration of less than two hours was excluded to minimize the influence of spurious or fragmented documentation.\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eCohort inclusion and exclusion criteria\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eWe excluded ICU patients whose length of stay was less than four hours, as these short stays provided insufficient clinical data. Additionally, ICU stays exceeding one month were truncated at 30 days, limiting the analysis to the first month of data. This step was implemented to prevent bias arising from rare instances of exceptionally prolonged ICU stays. ICU stays lacking any physiological measurements were removed. Specifically, we excluded cases in which all vital sign data\u0026mdash;including heart rate, respiratory rate, oxygen saturation (SpO\u003csub\u003e2\u003c/sub\u003e), and temperature\u0026mdash;were completely missing throughout the observation period. Likewise, stays with no recorded arterial blood gas and basic metabolic panel variables were excluded.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eInternal and external evaluation\u0026nbsp;\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eFor each dataset (MIMIC-III, MIMIC-IV, and eICU), we divided them into training (70%), validation (10%), and test (20%) subsets for internal evaluation of our predictive models. Since it was crucial that each subset mirrored the overall dataset regarding RRT event and mortality rates, we initially computed these rates across the entire patient population and then used stratified sampling to preserve these proportions consistently within each subset. We further evaluated the generalizability of the models through external validation. First, we constructed an external test set from the eICU multi-center database by selecting ICU stays from eight distinct hospitals, each with more than 2,500 admissions. 
Patients from these hospitals were explicitly excluded from the training and validation subsets to prevent overlap. Second, we assessed temporal generalizability by training our models on MIMIC dataset (combining MIMIC-III and IV) admissions prior to 2019 and evaluating their performance externally on patients admitted during the COVID-19 pandemic period (2020\u0026ndash;2022). Data from the pandemic period are considered externally distinct due to substantial shifts in ICU management protocols, resource allocation strategies, and patient characteristics.\u003csup\u003e34-36\u003c/sup\u003e\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eData preprocessing\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eTwo data modalities were incorporated in the model: time-invariant (static \u0026ndash; does not change through time) data and time-variant (dynamic \u0026ndash; changes through time) data. Time-invariant data included age, gender, race/ethnicity, body mass index, and admission type. A detailed description of pre-processing of time-invariant data is provided in \u003cstrong\u003eSupplementary Materials 1\u003c/strong\u003e. Time-variant data included laboratory values, ventilator settings, clinical scores (e.g., Glasgow Coma Scale, clinical risk scores, etc.), clinical events (e.g., procedures, medications, etc.), and hemodynamics (e.g., heart rate, respiratory rate, oxygen saturation, blood pressure, etc.). Dynamic variables were binned into 1-hour non-overlapping windows, within which the mean for continuous variables or the mode for categorical variables were calculated when multiple values were recorded within a bin. A detailed description of pre-processing of time-variant data is provided in \u003cstrong\u003eSupplementary Materials 2\u003c/strong\u003e. A detailed description of approaches to outlier removal and missing data imputation is provided in \u003cstrong\u003eSupplementary Materials 3\u003c/strong\u003e. 
Feature generation involved transforming raw clinical data\u0026mdash;prior to imputation\u0026mdash;into structured hourly features suitable for predictive modeling. Seven distinct categories of features were computed for each hourly time point and are explained in detail in \u003cstrong\u003eSupplementary Materials 4\u003c/strong\u003e.\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eProblem formulation\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eTo operationalize our four clinical questions, we developed four variants of an early-warning system (EWS) for RRT in ICU patients: screeningEWS, dynamicEWS, durationEWS, and prognosisEWS. Each variant addresses a specific clinical prediction task:\u0026nbsp;\u003c/p\u003e\n\u003col\u003e\n \u003cli\u003e\u003cem\u003eEarly prediction of the need for RRT\u0026nbsp;(screeningEWS):\u003c/em\u003e This model was designed as an admission-time screening tool to identify patients at high risk of requiring RRT or mortality at any point during their ICU stay. 
For each ICU admission, the model takes as input the clinical variables measured within the first 12 hours and outputs a binary label indicating whether the patient will require RRT or die at any point during the ICU stay.\u0026nbsp;\u003c/li\u003e\n \u003cli\u003e\u003cem\u003eReal-time\u0026nbsp;dynamic\u0026nbsp;prediction of RRT initiation\u0026nbsp;(dynamicEWS):\u003c/em\u003e This model provides continuous hourly predictions of a patient\u0026apos;s likelihood of requiring RRT within the next 24 hours.\u0026nbsp;\u003c/li\u003e\n \u003cli\u003e\u003cem\u003ePrediction of RRT duration (durationEWS):\u003c/em\u003e This model focuses on patients who have already started RRT and aims to predict the likely treatment duration by classifying the RRT course as short-term (\u0026le;48 hours) versus prolonged (\u0026gt;48 hours).\u0026nbsp;\u003c/li\u003e\n \u003cli\u003e\u003cem\u003eMortality\u0026nbsp;prediction\u0026nbsp;post-RRT\u0026nbsp;onset (prognosisEWS):\u003c/em\u003e This model evaluates the risk of mortality at ICU discharge among patients who have initiated RRT, using clinical data from the 12-hour window surrounding the onset of RRT.\u0026nbsp;\u003c/li\u003e\n\u003c/ol\u003e\n\u003cp\u003eDetails of methods related to labeling of events are provided in \u003cstrong\u003eSupplementary Materials 5\u003c/strong\u003e.\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eSupervised learning of prediction models\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eWe implemented task-specific predictive models for each EWS task, using bidirectional long short-term memory architectures for sequence data classification (screeningEWS, durationEWS, prognosisEWS) and an XGBoost model for real-time hourly data classification (dynamicEWS). 
Details of the models\u0026rsquo; architecture and training process are reported in \u003cstrong\u003eSupplementary Materials 6\u003c/strong\u003e.\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eSystem evaluation and performance metrics\u0026nbsp;\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003ePerformance metrics included the area under the receiver operating characteristic curve (AUROC), area under the precision-recall curve (AUPRC), F1-score, precision, specificity, and Brier score. To quantify the statistical uncertainty associated with each performance estimate, we calculated 95% confidence intervals using the pivot bootstrap method with 1,000 resamples drawn from the test dataset.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eReal-time model assessment and silencing policy\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eDue to the sequential, time-dependent nature of the predictions in the dynamicEWS model, the total number of positive hours does not directly match the total number of unique RRT events. A single RRT event may have multiple positive early alerting opportunities in advance, especially if the event occurs later in the ICU admission (up to a maximum of 24 early-alert opportunities). Consequently, RRT events occurring later provide more opportunities for early detection compared with events occurring soon after admission. To assess the clinical usefulness of our predictive models, we focused on evaluating the model\u0026rsquo;s ability to detect actual RRT events (event sensitivity) while maintaining an acceptable false-alarm rate. We calculated recall as the proportion of all RRT events that triggered at least one alarm within the preceding 24-hour period, thereby indicating the model\u0026rsquo;s sensitivity in detecting true RRT events. Precision was computed as the proportion of alarms that correctly anticipated an event within the following 24 hours at a fixed event-based recall of 80%. 
Furthermore, to mitigate alarm fatigue resulting from frequent or redundant alerts by the dynamicEWS model, we implemented a silencing policy in conjunction with the early-warning system. After triggering an initial alarm, the system suppresses subsequent alarms for a 6-hour period, preventing repetitive notifications.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eInterpretability\u0026nbsp;\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eTo support interpretability at both the population and patient levels, we used explanation methods tailored to each model type. For the long short-term memory classifiers (screeningEWS, durationEWS, and prognosisEWS), we applied Integrated Gradients\u003csup\u003e52\u003c/sup\u003e to quantify feature-level attributions relative to a baseline input at the patient level and to summarize influential predictors across the cohort. For the XGBoost-based dynamicEWS model, we used TreeExplainer\u003csup\u003e53\u003c/sup\u003e within the SHapley Additive exPlanations (SHAP) framework\u003csup\u003e54\u003c/sup\u003e to generate population-level SHAP summary plots and rank globally important predictors. We also developed an hourly SHAP-based prioritization scheme for patient-specific explanations, which was integrated into an animated dashboard that displays real-time risk trajectories alongside the top contributing features and their temporal trends. Further details of the interpretability pipeline are provided in \u003cstrong\u003eSupplementary Materials 7\u003c/strong\u003e.\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eBias and fairness\u0026nbsp;\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eWe evaluated model bias and fairness across demographic subgroups defined by gender and race, using Male and White cohorts as baseline groups. 
We computed subgroup-specific discrimination (AUROC), predictive values (positive and negative predictive values), and error-rate metrics (true/false positive/negative rates), and quantified disparities relative to baseline using both absolute differences (\u0026Delta;) and disparity ratios (R). This audit operationalizes the fairness objectives of equal opportunity, equalized odds, and predictive value parity through subgroup comparisons. To prioritize clinically meaningful differences and avoid ratio artifacts from small baseline values (e.g., low false positive rate), we flagged disparities only when both ∣\u0026Delta;∣\u0026gt;0.10 and\u0026nbsp;∣R\u0026minus;1∣\u0026gt;0.20 held, and summarized results using disparity heatmaps and histograms. Details of the bias and fairness evaluation approach are reported in \u003cstrong\u003eSupplementary Materials 8\u003c/strong\u003e.\u003c/p\u003e"},{"header":"Results","content":"\u003cp\u003e\u003cem\u003eStudy population\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eA summary of the demographics across cohorts from each dataset is provided in \u003cstrong\u003eTable 1\u003c/strong\u003e. In the MIMIC-III dataset, a total of 49,613 ICU admissions were analyzed after exclusion. The rate of RRT use was 6%, with a 20% mortality rate among these patients. The total number of RRT events was 5,643, with a median [quartiles] duration of 44 hours [16\u0026ndash;107]. The median [quartiles] time to RRT from admission was 13 hours [2\u0026ndash;40]. A detailed breakdown of demographic data for MIMIC-III is provided in \u003cstrong\u003eSupplementary Materials 9\u003c/strong\u003e. In the MIMIC-IV dataset, a total of 65,707 ICU admissions were analyzed after exclusion. The rate of RRT use was 6%, with a 23% mortality rate among these patients. The total number of RRT events was 8,563, with a median [quartiles] duration of 41 hours [14\u0026ndash;109]. 
The median [quartiles] time to RRT from admission was 14 hours [2\u0026ndash;42]. A detailed breakdown of demographic data for MIMIC-IV is provided in \u003cstrong\u003eSupplementary Materials 10\u003c/strong\u003e. In the eICU dataset, a total of 131,272 ICU admissions were analyzed after exclusion. The rate of RRT use was 3%, with an 18% mortality rate among these patients. The total number of RRT events was 5,729, with a median [quartiles] duration of 8 hours [6\u0026ndash;16]. The median [quartiles] time to RRT from admission was 30 hours [6\u0026ndash;72]. A detailed breakdown of demographic data for eICU is provided in \u003cstrong\u003eSupplementary Materials 11\u003c/strong\u003e. In the COVID-19 pandemic period dataset, a total of 9,319 ICU admissions were analyzed after exclusion. The rate of RRT use was 7%, with a 41% mortality rate among these patients. The total number of RRT events was 1,548, with a median [quartiles] duration of 64 hours [24\u0026ndash;164]. The median [quartiles] time to RRT from admission was 14 hours [2\u0026ndash;49]. A detailed breakdown of demographic data for the pandemic period is provided in \u003cstrong\u003eSupplementary Materials 12\u003c/strong\u003e.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eEarly prediction of the need for RRT (screeningEWS)\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eThis model serves as a screening tool to identify patients at high risk of requiring RRT, using only clinical data collected within the first 12 hours after ICU admission. For the early screening task, patients who initiated RRT within the first 12 hours of ICU admission were excluded, resulting in the removal of 1,471 (3.2%) patients in MIMIC-III, 1,759 (3.4%) in MIMIC-IV, 1,062 (1.9%) in eICU, and 289 (4.1%) in the COVID-period cohort who required RRT.\u003c/p\u003e\n\u003cp\u003eIn internal validation, AUROC was 0.90 (95% CI 0.886\u0026ndash;0.906) and AUPRC was 0.53 (95% CI 0.496\u0026ndash;0.564) in MIMIC-III. 
The AUROC was 0.91 (95% CI 0.901\u0026ndash;0.918) and the AUPRC was 0.55 (95% CI 0.516\u0026ndash;0.578) in MIMIC-IV. In eICU, AUROC was 0.87 (95% CI 0.863\u0026ndash;0.878) and AUPRC was 0.44 (95% CI 0.413\u0026ndash;0.459). External validation on ICU patients admitted during the COVID period yielded an AUROC of 0.90 (95% CI 0.894\u0026ndash;0.913) and an AUPRC of 0.57 (95% CI 0.534\u0026ndash;0.602) (\u003cstrong\u003eFigure 1A\u003c/strong\u003e). Results for all performance metrics are provided in \u003cstrong\u003eSupplementary Materials 13\u003c/strong\u003e. Additional hospital-level external validation across eight individual eICU hospitals (each with \u0026gt;2,500 ICU stays) showed robust performance, with AUROCs ranging from 0.86 to 0.91 and AUPRCs ranging from 0.35 to 0.52 (\u003cstrong\u003eSupplementary Materials 14\u003c/strong\u003e). Population-level interpretability analysis of the screeningEWS model using the Integrated Gradients method revealed that SpO\u003csub\u003e2\u003c/sub\u003e, temperature, heart rate, respiratory rate, Glasgow Coma Scale, chloride, albumin, mean corpuscular hemoglobin concentration, anion gap, and FiO\u003csub\u003e2\u003c/sub\u003e were the most influential predictors of RRT initiation risk during the ICU stay. \u003cstrong\u003eSupplementary Materials 15\u003c/strong\u003e provides the feature-importance rankings for each cohort, along with an illustrative example of individual patient\u0026ndash;level explanations.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eReal-time dynamic prediction of RRT initiation (dynamicEWS)\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eFor the dynamicEWS task, we generated hourly prediction instances over the entire ICU stay for patients not currently receiving RRT. 
This resulted in 4.3 million hourly time points in MIMIC-III (1.5% positive), 6.2 million in MIMIC-IV (2.6% positive), 11 million in eICU (2.1% positive), and 0.75 million in the COVID-period cohort (3.9% positive), where positive labels indicated RRT initiation within the subsequent 24 hours.\u003c/p\u003e\n\u003cp\u003eThe XGBoost-based dynamicEWS model achieved excellent discriminative performance in predicting RRT initiation within the next 24 hours. In MIMIC-III, AUROC was 0.97 (95% CI 0.969\u0026ndash;0.972) with an AUPRC of 0.40 (95% CI 0.387\u0026ndash;0.402); in MIMIC-IV, AUROC was 0.94 (95% CI 0.934\u0026ndash;0.937) with an AUPRC of 0.40 (95% CI 0.397\u0026ndash;0.408). In eICU, AUROC was 0.89 (95% CI 0.891\u0026ndash;0.893) with an AUPRC of 0.23 (95% CI 0.223\u0026ndash;0.229). External validation on ICU patients admitted during the COVID period yielded an AUROC of 0.94 (95% CI 0.935\u0026ndash;0.937) with an AUPRC of 0.45 (95% CI 0.449\u0026ndash;0.460) (\u003cstrong\u003eFigure 1B\u003c/strong\u003e). Results for all performance metrics are reported in \u003cstrong\u003eSupplementary Materials 16\u003c/strong\u003e. Hospital-level external validation across eight individual eICU hospitals also showed robust performance, with AUROCs ranging from 0.85 to 0.92 and AUPRCs ranging from 0.15 to 0.34 (\u003cstrong\u003eSupplementary Materials 17\u003c/strong\u003e).\u003c/p\u003e\n\u003cp\u003eModel performance varied over the course of the ICU stay and was not uniform across time. To characterize this temporal behavior, we evaluated AUROC, precision, and recall at the hourly data point level across days since ICU admission, as well as the alarm rate over early alerting windows of up to 24 hours before RRT initiation. As shown in \u003cstrong\u003eFigure 2\u003c/strong\u003e, precision and recall increased over later ICU days, while AUROC remained high and largely stable, with only a slight decrease in the eICU cohort. 
In parallel, the overall alarm rate increased closer to RRT onset, indicating that the model triggered more alerts as patients approached the event. Together, these trends suggest that the model becomes progressively more effective at identifying impending RRT over the course of the ICU stay.\u003c/p\u003e\n\u003cp\u003eIn the MIMIC-IV cohort, the dynamicEWS model generated predictions for approximately 901,000 patient-hours, corresponding to 13,000 ICU patients and 1,472 unique RRT events, with 33,560 hours falling within predefined early alerting windows. Using a fixed decision threshold chosen to achieve a recall of 80% (95% CI 0.78\u0026ndash;0.82), the model correctly identified 1,183 RRT events. At this operating point, the initial hourly alarm rate was 5.5% (50,086 alarms), which was reduced to 0.9% (8,912 alarms) after applying a 6-hour silencing policy while maintaining the same precision\u0026ndash;recall performance. At this threshold, the model produced approximately 1.6 false-positive predictions for every true positive and would prompt a clinical assessment in approximately 24% of patient-days (about 24 alerts per 100 patient-days); assuming a standard 12-hour ICU working shift, this corresponds to roughly 0.12 alerts per patient per shift (about 12 alerts per 100 patients per shift). Model evaluations for the remaining patient cohorts are available in \u003cstrong\u003eSupplementary Materials 18\u003c/strong\u003e.\u003c/p\u003e\n\u003cp\u003ePopulation-level interpretability analysis of the dynamicEWS model using SHAP showed that maximum creatinine (since ICU admission and over the last 48 hours), urine output (maximum since admission and mean over the last 48 hours), maximum blood urea nitrogen (BUN) (since admission and over the last 48 hours), and severity scores were the most influential predictors of near-term RRT initiation risk. 
\u003cstrong\u003eSupplementary Materials 19\u003c/strong\u003e provides population-level SHAP feature-importance rankings for each cohort. To illustrate how the model dynamically assesses RRT risk over time with real-time individual-level interpretability, \u003cstrong\u003eFigure 3A\u003c/strong\u003e presents an ICU admission in which dynamicEWS generated hourly predictions of future RRT risk across the entire stay, continuously updating its estimates as new clinical data become available. The SHAP-based visualization dashboard simultaneously displays the evolving risk trajectory and the most influential features at each time point, highlighting how specific physiological and laboratory changes drive the model\u0026rsquo;s predictions over time. \u003cstrong\u003eSupplementary Materials 20\u003c/strong\u003e provides additional examples of real-time RRT risk assessment.\u003c/p\u003e\n\u003cp\u003eWe assessed model performance across demographic subgroups in the MIMIC-IV cohort, stratified by gender and race, to evaluate fairness. Overall, dynamicEWS showed comparable discrimination and error rates across most subgroups (\u003cstrong\u003eFigure 3B\u003c/strong\u003e). Notably, performance was improved in African American patients, in whom the model achieved approximately a 19% relative improvement in true positive rate and a corresponding reduction in false negative rate compared with the White patients. Fairness evaluations for the remaining cohorts and additional subgroup analyses are reported in \u003cstrong\u003eSupplementary Materials 21\u003c/strong\u003e.\u003c/p\u003e\n\u003cp\u003e\u003cem\u003ePrediction of RRT duration (durationEWS)\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eFor the durationEWS task, we evaluated the risk of prolonged RRT using clinical data from the 12-hour window surrounding RRT initiation. 
After excluding RRT episodes shorter than the observation period, prolonged RRT (\u0026gt;48 hours) accounted for 48% of events in both MIMIC-III (1,359/2,822) and MIMIC-IV (1,705/3,562), 23% of events in eICU (756/3,311), and 66% of events in the COVID-period cohort (374/551).\u003c/p\u003e\n\u003cp\u003eIn MIMIC-III, AUROC was 0.86 (95% CI 0.828\u0026ndash;0.889) with an AUPRC of 0.87 (95% CI 0.830\u0026ndash;0.901), and in MIMIC-IV, AUROC was 0.90 (95% CI 0.873\u0026ndash;0.919) with an AUPRC of 0.90 (95% CI 0.873\u0026ndash;0.922). In eICU, AUROC was 0.81 (95% CI 0.785\u0026ndash;0.844) with an AUPRC of 0.67 (95% CI 0.608\u0026ndash;0.725). External validation in the COVID-period cohort yielded an AUROC of 0.83 (95% CI 0.795\u0026ndash;0.871) and an AUPRC of 0.89 (95% CI 0.855\u0026ndash;0.925) (\u003cstrong\u003eFigure 4A\u003c/strong\u003e). Full performance metrics are provided in \u003cstrong\u003eSupplementary Materials 22\u003c/strong\u003e. Population-level interpretability analysis of the durationEWS model using the Integrated Gradients (IG) method showed that Glasgow Coma Scale, SpO\u003csub\u003e2\u003c/sub\u003e, FiO\u003csub\u003e2\u003c/sub\u003e, SAPS II risk score, oxygen saturation, temperature, bicarbonate, and base excess were the most influential predictors of prolonged RRT duration. \u003cstrong\u003eSupplementary Materials 23\u003c/strong\u003e provides detailed feature-importance rankings for each cohort, along with an illustrative example of individual patient\u0026ndash;level explanations.\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eMortality prediction post-RRT onset (prognosisEWS)\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eFor the prognosisEWS task, we evaluated the risk of mortality after RRT initiation using clinical data from the 12-hour window surrounding the first RRT initiation, restricting the cohort to ICU stays of at least 12 hours. 
This resulted in 2,816 patients in MIMIC-III with an ICU mortality rate of 20%, 3,649 patients in MIMIC-IV with a mortality rate of 23%, 2,800 patients in eICU with a mortality rate of 19%, and 605 patients in the COVID-period cohort with a mortality rate of 40%.\u003c/p\u003e\n\u003cp\u003eThe prognosisEWS model demonstrated strong discriminative performance across all datasets. In MIMIC-III, AUROC was 0.89 (95% CI 0.848\u0026ndash;0.917) with an AUPRC of 0.72 (95% CI 0.638\u0026ndash;0.797), and in MIMIC-IV, AUROC was 0.88 (95% CI 0.850\u0026ndash;0.908) with an AUPRC of 0.70 (95% CI 0.622\u0026ndash;0.763). In eICU, AUROC was 0.82 (95% CI 0.788\u0026ndash;0.858) with an AUPRC of 0.59 (95% CI 0.517\u0026ndash;0.671). External validation on ICU patients admitted during the COVID period yielded an AUROC of 0.84 (95% CI 0.809\u0026ndash;0.874) and an AUPRC of 0.77 (95% CI 0.709\u0026ndash;0.822) (\u003cstrong\u003eFigure 4B\u003c/strong\u003e). Full performance metrics are reported in \u003cstrong\u003eSupplementary Materials 24\u003c/strong\u003e. Population-level interpretability analysis of the prognosisEWS model using Integrated Gradients showed that age, Glasgow Coma Scale, blood pressure, temperature, and composite severity scores were the most influential predictors of post-RRT mortality risk. 
\u003cstrong\u003eSupplementary Materials 25\u003c/strong\u003e provides detailed feature-importance rankings for each cohort, along with an illustrative example of individual patient\u0026ndash;level explanations.\u003c/p\u003e"},{"header":"Discussion","content":"\u003cp\u003eWe developed and validated predictive deep learning models designed to support four interrelated clinical decisions in RRT: (1) screeningEWS \u0026ndash; early prediction of the need for RRT after the initial 12 hours in the ICU; (2) dynamicEWS \u0026ndash; real-time dynamic prediction (hourly) of RRT initiation within the next 24 hours; (3) durationEWS \u0026ndash; prediction of RRT duration; and (4) prognosisEWS \u0026ndash; mortality prediction post-RRT onset. The optimal timing of RRT initiation has remained elusive despite decades of investigation. Major trials such as ELAIN, AKIKI, IDEAL-ICU, and STARRT-AKI have yielded conflicting conclusions regarding early versus delayed initiation, partly due to heterogeneous patient phenotypes and clinician discretion.\u003csup\u003e\u003cspan citationid=\"CR55\" class=\"CitationRef\"\u003e55\u003c/span\u003e,\u003cspan citationid=\"CR56\" class=\"CitationRef\"\u003e56\u003c/span\u003e\u003c/sup\u003e Our findings support an individualized approach: by accurately identifying patients at high risk for RRT within the first 12 hours of ICU admission, the early prediction model could serve as a triage tool to prompt closer nephrology evaluation or optimization of fluid and hemodynamic management before overt renal failure develops. Dynamic prediction further advances this paradigm by incorporating temporal trends, capturing physiologic deterioration not evident in static admission data. 
The real-time model\u0026rsquo;s ability to anticipate RRT initiation up to 24 hours before the clinical decision suggests meaningful opportunities for earlier intervention.\u003c/p\u003e \u003cp\u003eRecent studies have increasingly explored the application of machine learning techniques to support personalized decision-making in RRT management. Li et al. developed an uplift modeling approach to estimate individual treatment effects of RRT in patients with sepsis-associated acute kidney injury, demonstrating heterogeneity in treatment responses and underscoring the necessity for individualized clinical decisions rather than uniform approaches.\u003csup\u003e\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e\u003c/sup\u003e Lee et al. further emphasized the importance of individualized treatment timing, demonstrating improved outcomes when continuous kidney replacement therapy was initiated within 6 hours of SA-AKI diagnosis compared with delayed initiation.\u003csup\u003e\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e\u003c/sup\u003e Similarly, Fran\u0026ccedil;a et al. utilized a random forest algorithm to predict the requirement for RRT in critically ill COVID-19 patients, achieving robust predictive performance (AUROC\u0026thinsp;=\u0026thinsp;0.78) based on routinely collected clinical data at ICU admission.\u003csup\u003e\u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e\u003c/sup\u003e Additionally, several studies have explored ML approaches to enhance mortality prediction following RRT initiation in critically ill patients. Kang et al. applied a random forest model and achieved superior discrimination compared with traditional scoring systems such as APACHE II and SOFA for ICU mortality prediction.\u003csup\u003e\u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e\u003c/sup\u003e Chang et al. 
employed an XGBoost model in a multi-center cohort, surpassing conventional scoring methods (SOFA, nonrenal SOFA) in predicting 30-day mortality.\u003csup\u003e\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e\u003c/sup\u003e Their model\u0026rsquo;s interpretability was further augmented through SHAP, highlighting important predictors such as age, creatinine, platelet count, FiO\u003csub\u003e2\u003c/sub\u003e, anion gap, mean arterial pressure, respiratory rate, and vasopressor administration. Similarly, Hung et al. validated a gradient boosting machine model for predicting in-hospital mortality post-CRRT initiation.\u003csup\u003e\u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e\u003c/sup\u003e Using SHAP analysis, they identified critical predictors such as APACHE II scores, albumin levels, age, potassium, creatinine, SpO\u003csub\u003e2\u003c/sub\u003e, and vasopressor use. Zamanzadeh et al. developed ML algorithms to predict short-term survival following CRRT initiation, underscoring variables such as creatinine levels, white blood cell counts, respiratory rate, and hemodynamic stability.\u003csup\u003e\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e\u003c/sup\u003e Furthermore, Li et al. 
developed an interpretable logistic regression model specifically targeting patients with sepsis-associated AKI undergoing RRT, which provided better predictive performance than traditional scoring systems (SOFA, SAPS-III).\u003csup\u003e\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e\u003c/sup\u003e Collectively, these studies underscore the clinical value of ML-based predictive models in improving resource allocation, risk stratification, and personalized management of RRT in the ICU.\u003c/p\u003e \u003cp\u003eMany prior prediction models rely on static feature windows (e.g., admission-only), which fail to capture the ICU stay as a dynamic physiologic process; such models may miss evolving deterioration and cannot update risk as patient status changes. Our approach instead uses time-updated, hourly signals with a richer feature set (demographics, labs, vitals, medications, ventilator settings, and clinical scores) to reflect the real-time trajectory preceding RRT. Performance is further supported by a dedicated hourly feature-engineering pipeline that transforms raw clinical data into structured inputs by computing seven feature categories per hour, improving signal quality and providing consistent longitudinal representations beyond snapshot-based baselines.\u003c/p\u003e \u003cp\u003eThe integration of interpretable predictive modeling into ICU workflows could address two central challenges in critical care nephrology: inconsistent timing of RRT initiation and uncertainty about treatment trajectory. The duration model identified markers of prolonged therapy, particularly hemodynamic instability and multiorgan failure, providing insight into which patients may require extended extracorporeal support and additional resource planning. 
Similarly, the post-RRT mortality model outperformed traditional severity scores (APACHE IV, SOFA) and yielded clinically intuitive predictors such as lactate, mean arterial pressure, and recovery of urine output. Taken together, these models offer a foundation for an adaptive decision-support framework that evolves with patient status, guiding both initiation and de-escalation strategies.\u003c/p\u003e \u003cp\u003eA major strength of our framework lies in its emphasis on explainability. Using SHAP value analysis, we demonstrated physiologic coherence between model predictions and established clinical determinants of kidney injury severity. This transparency is critical for bedside adoption, as black-box predictions without clinical rationale may erode clinician trust. Moreover, external validation across multiple ICUs and subgroup analyses confirmed consistent performance across demographic and diagnostic groups, suggesting algorithmic fairness\u0026mdash;a prerequisite for equitable deployment in heterogeneous ICU populations.\u003c/p\u003e \u003cp\u003eSeveral limitations merit discussion. First, because this was a retrospective study, causal inference cannot be established, and unmeasured confounders (e.g., nephrotoxic exposure, clinician decision thresholds) may influence model performance. Second, despite multicenter validation, local practice patterns and data definitions may limit generalizability to institutions with differing RRT protocols. Third, although dynamic predictions were generated hourly, real-time clinical integration and alert fatigue were not formally tested. Prospective trials are needed to determine whether early alerts improve patient outcomes, resource utilization, or decision-to-dialysis timing. 
Finally, while explainability techniques enhance interpretability, they do not fully resolve the ethical and operational questions surrounding automated clinical recommendations.\u003c/p\u003e \u003cp\u003eFuture work should focus on embedding these models into electronic health record systems to enable real-time risk visualization and clinician feedback. Combining predictive analytics with reinforcement learning may further refine individualized thresholds for RRT initiation or discontinuation. Integration with biomarkers of tubular injury or renal recovery could improve predictive precision. Ultimately, prospective interventional studies are needed to determine whether machine learning\u0026ndash;guided RRT strategies improve survival, reduce unnecessary dialysis, or optimize ICU resource allocation. Our study demonstrates that interpretable, multi-stage decision support machine learning models can predict RRT initiation, duration, and outcomes with high accuracy and temporal adaptability. By aligning predictive performance with clinical interpretability, these tools offer a viable pathway toward personalized, data-driven RRT decision support in the ICU. Prospective evaluation will determine whether their implementation translates predictive accuracy into improved patient outcomes.\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eAcknowledgements:\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eGuarantor:\u0026nbsp;\u003c/strong\u003eRodney A. Gabriel is the guarantor of the content of the manuscript, including the data and analysis\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAuthor contributions:\u0026nbsp;\u003c/strong\u003eBM was responsible for study design, data collection, data pre-processing, analysis, figure/table design, and manuscript preparation. CH was responsible for study design, analysis, figure/table design, and manuscript preparation. 
JM was responsible for study design, analysis, figure/table design, and manuscript preparation. MK was responsible for study design, figure/table design, and manuscript preparation. US was responsible for study design, figure/table design, and manuscript preparation. RG was responsible for study design, data collection, data pre-processing, analysis, figure/table design, and manuscript preparation\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eFinancial/non-financial disclosures:\u0026nbsp;\u003c/strong\u003eThis study is funded by the National Institutes of Health 1OT2OD037995-01\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eRole of sponsors:\u0026nbsp;\u003c/strong\u003eno input or contributions from sponsors\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eConsent to participate:\u0026nbsp;\u003c/strong\u003eThe study was conducted in accordance with the Declaration of Helsinki. The Institutional Review Board (Human Research Protections Program) determined that this study was exempt from review and waived need for participant consent because it involved retrospective analysis of de-identified data.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eSummary of conflict of interest statements:\u0026nbsp;\u003c/strong\u003eThis study is funded by the National Institutes of Health 1OT2OD037995-01\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eFunding information:\u0026nbsp;\u003c/strong\u003eThis study is funded by the National Institutes of Health 1OT2OD037995-01\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eNotation of prior abstract publication/presentation:\u0026nbsp;\u003c/strong\u003enone\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAuthor Contributions:\u0026nbsp;\u003c/strong\u003eHsu was involved in study design, model development and validation, statistical analysis, and preparation of manuscript. Vutukuru was involved in study design, model development and validation, and preparation of manuscript. Mudumbai was involved in study design, model validation, and preparation of manuscript. Texeira was involved in model validation and preparation of manuscript. Chen was involved in model validation and preparation of manuscript. Mehdipour was involved in study design and preparation of manuscript. Polston was involved in study design and preparation of manuscript. Marienfeld was involved in study design and preparation of manuscript. 
Gabriel was involved in study supervision, study design, model development and validation, statistical analysis, and preparation of manuscript.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eCompeting Interest Statement:\u0026nbsp;\u003c/strong\u003eGabriel\u0026rsquo;s institution has received product and/or funding for research purposes from Pacira Biosciences, Avanos Medical, Takeda, and Merck.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eData availability:\u0026nbsp;\u003c/strong\u003eData is available with user agreement access with the All of Us Research Program and the Veterans Affair dataset.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eCode availability:\u0026nbsp;\u003c/strong\u003eCode will be made available via public github repository\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eEthics declaration/Consent to Participate:\u0026nbsp;\u003c/strong\u003eThe study was conducted in accordance with the Declaration of Helsinki. The Institutional Review Board determined that this study was exempt from review and waived need for participant consent because it involved retrospective analysis of de-identified data.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eFunding:\u0026nbsp;\u003c/strong\u003eThis project was supported by Wellcome Leap as part of the Untangling Addiction Program\u003cstrong\u003e\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n \u003cli\u003eHoste, E.A., Bagshaw, S.M., Bellomo, R., Cely, C.M., Colman, R., Cruz, D.N., Edipidis, K., Forni, L.G., Gomersall, C.D., Govil, D. and Honor\u0026eacute;, P.M., 2015. Epidemiology of acute kidney injury in critically ill patients: the multinational AKI-EPI study. Intensive care medicine, 41, pp.1411-1423.\u003c/li\u003e\n \u003cli\u003eTolwani, A., 2012. Continuous renal-replacement therapy for acute kidney injury. New England Journal of Medicine, 367(26), pp.2505-2514.\u003c/li\u003e\n \u003cli\u003eGriffin, B.R., Liu, K.D. 
and Teixeira, J.P., 2020. Critical care nephrology: core curriculum 2020. American Journal of Kidney Diseases, 75(3), pp.435-452.\u003c/li\u003e\n \u003cli\u003eUchino, S., Kellum, J.A., Bellomo, R., Doig, G.S., Morimatsu, H., Morgera, S., Schetz, M., Tan, I., Bouman, C., Macedo, E. and Gibney, N., 2005. Acute renal failure in critically ill patients: a multinational, multicenter study. Jama, 294(7), pp.813-818.\u003c/li\u003e\n \u003cli\u003eWald, R., Adhikari, N.K., Smith, O.M., Weir, M.A., Pope, K., Cohen, A., Thorpe, K., McIntyre, L., Lamontagne, F., Soth, M. and Herridge, M., 2015. Comparison of standard and accelerated initiation of renal replacement therapy in acute kidney injury. Kidney international, 88(4), pp.897-904.\u003c/li\u003e\n \u003cli\u003eZarbock, A., Kellum, J.A., Schmidt, C., Van Aken, H., Wempe, C., Pavenst\u0026auml;dt, H., Boanta, A., Ger\u0026szlig;, J. and Meersch, M., 2016. Effect of early vs delayed initiation of renal replacement therapy on mortality in critically ill patients with acute kidney injury: the ELAIN randomized clinical trial. Jama, 315(20), pp.2190-2199.\u003c/li\u003e\n \u003cli\u003eKellum, J.A., Mehta, R.L., Angus, D.C., Palevsky, P. and Ronco, C., 2002. The first international consensus conference on continuous renal replacement therapy. Kidney international, 62(5), pp.1855-1863.\u003c/li\u003e\n \u003cli\u003eOstermann, M., Joannidis, M., Pani, A., Floris, M., De Rosa, S., Kellum, J.A., Ronco, C. and 17th Acute Disease Quality Initiative (ADQI) Consensus Group, 2016. Patient selection and timing of continuous renal replacement therapy. Blood purification, 42(3), pp.224-237.\u003c/li\u003e\n \u003cli\u003eConnor Jr, M.J. and Karakala, N., 2017. Continuous renal replacement therapy: reviewing current best practice to provide high-quality extracorporeal therapy to critically ill patients. Advances in Chronic Kidney Disease, 24(4), pp.213-218.\u003c/li\u003e\n \u003cli\u003eAn, J.N., Kim, S.G. and Song, Y.R., 2021. 
When and why to start continuous renal replacement therapy in critically ill patients with acute kidney injury. Kidney Research and Clinical Practice, 40(4), p.566.\u003c/li\u003e\n \u003cli\u003eKidney Disease: Improving Global Outcomes (KDIGO) Acute Kidney Injury Work Group, 2012. KDIGO clinical practice guideline for acute kidney injury. Kidney Int Suppl, 2(1), pp.1-138.\u003c/li\u003e\n \u003cli\u003eGaudry, S., Hajage, D., Schortgen, F., Martin-Lefevre, L., Pons, B., Boulet, E., Boyer, A., Chevrel, G., Lerolle, N., Carpentier, D. and De Prost, N., 2016. Initiation strategies for renal-replacement therapy in the intensive care unit. New England Journal of Medicine, 375(2), pp.122-133.\u003c/li\u003e\n \u003cli\u003eBarbar, S.D., Clere-Jehl, R., Bourredjem, A., Hernu, R., Montini, F., Bruy\u0026egrave;re, R., Lebert, C., Boh\u0026eacute;, J., Badie, J., Eraldi, J.P. and Rigaud, J.P., 2018. Timing of renal-replacement therapy in patients with acute kidney injury and sepsis. New England Journal of Medicine, 379(15), pp.1431-1442.\u003c/li\u003e\n \u003cli\u003eClark, E.G. and Bagshaw, S.M., 2015, January. Unnecessary renal replacement therapy for acute kidney injury is harmful for renal recovery. In Seminars in dialysis (Vol. 28, No. 1, pp. 6-11).\u003c/li\u003e\n \u003cli\u003eda Hora Passos, R., Ramos, J.G.R., Mendon\u0026ccedil;a, E.J.B., Miranda, E.A., Dutra, F.R.D., Coelho, M.F.R., Pedroza, A.C., Correia, L.C.L., Batista, P.B.P., Macedo, E. and Dutra, M.M., 2017. A clinical score to predict mortality in septic acute kidney injury patients requiring continuous renal replacement therapy: the HELENICC score. BMC anesthesiology, 17, pp.1-8.\u003c/li\u003e\n \u003cli\u003eKim, Y., Park, N., Kim, J., Kim, D.K., Chin, H.J., Na, K.Y., Joo, K.W., Kim, Y.S., Kim, S. and Han, S.S., 2019. Development of a new mortality scoring system for acute kidney injury with continuous renal replacement therapy. 
Nephrology, 24(12), pp.1233-1240.\u003c/li\u003e\n \u003cli\u003eDemirjian, S., Chertow, G.M., Zhang, J.H., O\u0026apos;Connor, T.Z., Vitale, J., Paganini, E.P., Palevsky, P.M. and VA/NIH Acute Renal Failure Trial Network, 2011. Model to predict mortality in critically ill adults with acute kidney injury. Clinical Journal of the American Society of Nephrology, 6(9), pp.2114-2120.\u003c/li\u003e\n \u003cli\u003eChang, H.H., Chiang, J.H., Wang, C.S., Chiu, P.F., Abdel-Kader, K., Chen, H., Siew, E.D., Yabes, J., Murugan, R., Clermont, G. and Palevsky, P.M., 2022. Predicting mortality using machine learning algorithms in patients who require renal replacement therapy in the critical care unit. Journal of Clinical Medicine, 11(18), p.5289.\u003c/li\u003e\n \u003cli\u003eZamanzadeh, D., Feng, J., Petousis, P., Vepa, A., Sarrafzadeh, M., Karumanchi, S.A., Bui, A.A. and Kurtz, I., 2024. Data-driven prediction of continuous renal replacement therapy survival. Nature Communications, 15(1), p.5440.\u003c/li\u003e\n \u003cli\u003eLi, G., Li, B., Song, B., Liu, D., Sun, Y., Ju, H., Xu, X., Mao, J. and Zhou, F., 2024. Uplift modeling to predict individual treatment effects of renal replacement therapy in sepsis-associated acute kidney injury patients. Scientific Reports, 14(1), p.5833.\u003c/li\u003e\n \u003cli\u003eKang, M.W., Kim, J., Kim, D.K., Oh, K.H., Joo, K.W., Kim, Y.S. and Han, S.S., 2020. Machine learning algorithm to predict mortality in patients undergoing continuous renal replacement therapy. Critical Care, 24, pp.1-9.\u003c/li\u003e\n \u003cli\u003eHung, P.S., Lin, P.R., Hsu, H.H., Huang, Y.C., Wu, S.H. and Kor, C.T., 2022. Explainable machine learning-based risk prediction model for in-hospital mortality after continuous renal replacement therapy initiation. Diagnostics, 12(6), p.1496.\u003c/li\u003e\n \u003cli\u003eLee, Y., Seo, J.H., Seong, J., Ahn, S.M., Han, M., Lee, J.A., Kim, J.H., Ahn, J.Y., Jeong, S.J., Choi, J.Y. and Yeom, J.S., 2024. 
Impact of Early Continuous Kidney Replacement Therapy in Patients With Sepsis-Associated Acute Kidney Injury: An Analysis of the MIMIC-IV Database. Journal of Korean medical science, 39(43).\u003c/li\u003e\n \u003cli\u003eLi, C., Zhao, K., Ren, Q., Chen, L., Zhang, Y., Wang, G. and Xie, K., 2024. Development and validation of a model for predicting in-hospital mortality in patients with sepsis-associated kidney injury receiving renal replacement therapy: a retrospective cohort study based on the MIMIC-IV database. Frontiers in Cellular and Infection Microbiology, 14, p.1488505.\u003c/li\u003e\n \u003cli\u003eFran\u0026ccedil;a, A.R., Rocha, E., Bastos, L.S., Bozza, F.A., Kurtz, P., Maccariello, E., e Silva, J.R.L. and Salluh, J.I., 2024. Development and validation of a machine learning model to predict the use of renal replacement therapy in 14,374 patients with COVID-19. Journal of Critical Care, 80, p.154480.\u003c/li\u003e\n \u003cli\u003eZhang, Q., Zheng, P., Hong, Z., Li, L., Liu, N., Bian, Z., Chen, X., Wu, H. and Zhao, S., 2024. Machine learning in risk prediction of continuous renal replacement therapy after coronary artery bypass grafting surgery in patients. Clinical and Experimental Nephrology, 28(8), pp.811-821.\u003c/li\u003e\n \u003cli\u003eLi, K., Li, Y., Gao, Q., Xu, L., Hu, Q., Ji, B. and Gao, G., 2025. Machine learning in risk prediction of continuous renal replacement therapy after surgical repair of acute type A aortic dissection. Journal of Cardiothoracic and Vascular Anesthesia.\u003c/li\u003e\n \u003cli\u003eZhuang, C., Hu, R., Li, K., Liu, Z., Bai, S., Zhang, S. and Wen, X., 2025. Machine learning prediction models for mortality risk in sepsis-associated acute kidney injury: evaluating early versus late CRRT initiation. Frontiers in Medicine, 11, p.1483710.\u003c/li\u003e\n \u003cli\u003eKhwaja, A., 2012. KDIGO clinical practice guidelines for acute kidney injury. 
Nephron Clinical Practice, 120(4), pp.c179-c184.\u003c/li\u003e\n \u003cli\u003eCerd\u0026aacute;, J., Liu, K.D., Cruz, D.N., Jaber, B.L., Koyner, J.L., Heung, M., Okusa, M.D. and Faubel, S., 2015. Promoting kidney function recovery in patients with AKI requiring RRT. Clinical Journal of the American Society of Nephrology, 10(10), pp.1859-1867.\u003c/li\u003e\n \u003cli\u003eSheng, S., Li, A., Liu, X., Shen, T., Zhou, W., Lv, X., Shen, Y., Wang, C., Ma, Q., Qu, L. and Ma, S., 2024. Factors and machine learning models for predicting successful discontinuation of continuous renal replacement therapy in critically ill patients with acute kidney injury: a retrospective cohort study based on MIMIC-IV database. BMC nephrology, 25(1), p.407.\u003c/li\u003e\n \u003cli\u003eRaina, R., Kashani, K., Sethi, S.K., Mok, Q., Kakajiwala, A., Parikh, S.M., Doshi, K., Hu, J., Alhasan, K., de Sousa Tavares, M. and Yap, H.K., 2025. Liberation from continuous renal replacement therapy due to renal recovery in adults and children: a literature review and Delphi consensus on clinical practice. Critical Care, 29(1), p.287.\u003c/li\u003e\n \u003cli\u003eKatulka, R.J., Al Saadon, A., Sebastianski, M., Featherstone, R., Vandermeer, B., Silver, S.A., Gibney, R.N., Bagshaw, S.M. and Rewa, O.G., 2020. Determining the optimal time for liberation from renal replacement therapy in critically ill patients: a systematic review and meta-analysis (DOnE RRT). Critical Care, 24(1), p.50.\u003c/li\u003e\n \u003cli\u003eDhala, A., Gotur, D., Hsu, S.H.-L., Uppalapati, A., Hernandez, M., Alegria, J. and Masud, F. (2021) \u0026lsquo;A Year of Critical Care: The Changing Face of the ICU During COVID-19\u0026rsquo;, Methodist DeBakey Cardiovascular Journal, 17(5), p. 31-42. Available at: https://doi.org/10.14797/mdcvj.1041.\u003c/li\u003e\n \u003cli\u003eSupady A, Curtis JR, Abrams D, Lorusso R, Bein T, Boldt J, Brown CE, Duerschmied D, Metaxa V, Brodie D. 
Allocating scarce intensive care resources during the COVID-19 pandemic: practical challenges to theoretical frameworks. Lancet Respir Med. 2021 Apr;9(4):430-434. doi: 10.1016/S2213-2600(20)30580-4. Epub 2021 Jan 12. PMID: 33450202; PMCID: PMC7837018.\u003c/li\u003e\n \u003cli\u003eMeijs, D.A., van Kuijk, S.M., Wynants, L., Stessel, B., Mehagnoul-Schipper, J., Hana, A., Scheeren, C.I., Bergmans, D.C., Bickenbach, J., Vander Laenen, M. and Smits, L.J., 2022. Predicting COVID-19 prognosis in the ICU remained challenging: external validation in a multinational regional cohort. Journal of Clinical Epidemiology, 152, pp.257-268.\u003c/li\u003e\n \u003cli\u003eFranklin, G., Stephens, R., Piracha, M., Tiosano, S., Lehouillier, F., Koppel, R. and Elkin, P.L., 2024. The sociodemographic biases in machine learning algorithms: a biomedical informatics perspective. Life, 14(6), p.652.\u003c/li\u003e\n \u003cli\u003eHasanzadeh, F., Josephson, C.B., Waters, G., Adedinsewo, D., Azizi, Z. and White, J.A., 2025. Bias recognition and mitigation strategies in artificial intelligence healthcare applications. NPJ Digital Medicine, 8(1), p.154.\u003c/li\u003e\n \u003cli\u003eKumar, A., Aelgani, V., Vohra, R., Gupta, S.K., Bhagawati, M., Paul, S., Saba, L., Suri, N., Khanna, N.N., Laird, J.R. and Johri, A.M., 2024. Artificial intelligence bias in medical system designs: a systematic review. Multimedia Tools and Applications, 83(6), pp.18005-18057.\u003c/li\u003e\n \u003cli\u003eChen, Z., Liu, X., Yang, Q., Wang, Y.J., Miao, K., Gong, Z., Yu, Y., Leonov, A., Liu, C., Feng, Z. and Chuan-Peng, H., 2023. Evaluation of risk of bias in neuroimaging-based artificial intelligence models for psychiatric diagnosis: a systematic review. JAMA Network Open, 6(3), pp.e231671-e231671.\u003c/li\u003e\n \u003cli\u003eObermeyer, Z., Powers, B., Vogeli, C. and Mullainathan, S., 2019. Dissecting racial bias in an algorithm used to manage the health of populations. 
Science, 366(6464), pp.447-453.\u003c/li\u003e\n \u003cli\u003eButt, S., Butt, H. and Gnanappiragasam, D., 2021. Unintentional consequences of artificial intelligence in dermatology for patients with skin of colour. Clinical and Experimental Dermatology, 46(7), pp.1333-1334.\u003c/li\u003e\n \u003cli\u003eJohnson, Alistair, Pollard, Tom, and Roger Mark. \u0026quot;MIMIC-III Clinical Database\u0026quot; (version 1.4). PhysioNet (2016). https://doi.org/10.13026/C2XW26.\u003c/li\u003e\n \u003cli\u003eJohnson, A. E. W., Pollard, T. J., Shen, L., Lehman, L. H., Feng, M., Ghassemi, M., Moody, B., Szolovits, P., Celi, L. A., \u0026amp; Mark, R. G. (2016). MIMIC-III, a freely accessible critical care database. Scientific Data, 3, 160035.\u003c/li\u003e\n \u003cli\u003eJohnson, A.E.W., Bulgarelli, L., Shen, L. et al. MIMIC-IV, a freely accessible electronic health record dataset. Sci Data 10, 1 (2023). https://doi.org/10.1038/s41597-022-01899-x.\u003c/li\u003e\n \u003cli\u003eJohnson, A., Bulgarelli, L., Pollard, T., Gow, B., Moody, B., Horng, S., Celi, L. A., and Mark, R. (2024) \u0026apos;MIMIC-IV\u0026apos; (version 3.1), PhysioNet. Available at: https://doi.org/10.13026/kpb9-mt58.\u003c/li\u003e\n \u003cli\u003ePollard, T., Johnson, A., Raffa, J. et al. The eICU Collaborative Research Database, a freely available multi-center database for critical care research. Sci Data 5, 180178 (2018). https://doi.org/10.1038/sdata.2018.178.\u003c/li\u003e\n \u003cli\u003ePollard, Tom, Johnson, Alistair, Raffa, Jesse, Celi, Leo Anthony, Badawi, Omar, and Roger Mark. \u0026quot;eICU Collaborative Research Database\u0026quot; (version 2.0). PhysioNet (2019). https://doi.org/10.13026/C2WM1R.\u003c/li\u003e\n \u003cli\u003eRaimann, F.J., K\u0026ouml;nig, C.J., Neef, V. and Flinspach, A.N., 2024, July. \u0026ldquo;Mind the Gap\u0026rdquo;\u0026mdash;differences between documentation and reality on intensive care units: a quantitative observational study. In Healthcare (Vol. 12, No. 15, p. 
1481). MDPI.\u003c/li\u003e\n \u003cli\u003eHyland, S.L., Faltys, M., H\u0026uuml;ser, M., Lyu, X., Gumbsch, T., Esteban, C., Bock, C., Horn, M., Moor, M., Rieck, B. and Zimmermann, M., 2020. Early prediction of circulatory failure in the intensive care unit using machine learning. Nature Medicine, 26(3), pp.364-373.\u003c/li\u003e\n \u003cli\u003eLyu, X., Fan, B., H\u0026uuml;ser, M., Hartout, P., Gumbsch, T., Faltys, M., Merz, T.M., R\u0026auml;tsch, G. and Borgwardt, K., 2024. An empirical study on KDIGO-defined acute kidney injury prediction in the intensive care unit. Bioinformatics, 40(Supplement_1), pp.i247-i256.\u003c/li\u003e\n \u003cli\u003eSundararajan, Mukund, Ankur Taly, and Qiqi Yan. \u0026quot;Axiomatic attribution for deep networks.\u0026quot; In International Conference on Machine Learning, pp. 3319-3328. PMLR, 2017.\u003c/li\u003e\n \u003cli\u003eLundberg, S.M., Erion, G., Chen, H., DeGrave, A., Prutkin, J.M., Nair, B., Katz, R., Himmelfarb, J., Bansal, N. and Lee, S.I., 2020. From local explanations to global understanding with explainable AI for trees. Nature Machine Intelligence, 2(1), pp.56-67.\u003c/li\u003e\n \u003cli\u003eLundberg, S.M. and Lee, S.I., 2017. A unified approach to interpreting model predictions. Advances in Neural Information Processing Systems, 30.\u003c/li\u003e\n \u003cli\u003eZarbock, A., Kellum, J.A., Schmidt, C., Van Aken, H., Wempe, C., Pavenst\u0026auml;dt, H., Boanta, A., Ger\u0026szlig;, J. and Meersch, M., 2016. Effect of early vs delayed initiation of renal replacement therapy on mortality in critically ill patients with acute kidney injury: the ELAIN randomized clinical trial. JAMA, 315(20), pp.2190-2199.\u003c/li\u003e\n \u003cli\u003eSTARRT-AKI Investigators, 2020. Timing of initiation of renal-replacement therapy in acute kidney injury. 
New England Journal of Medicine, 383(3), pp.240-251.\u003c/li\u003e\n\u003c/ol\u003e"},{"header":"Table","content":"\u003cp\u003e\u003cstrong\u003eTable 1.\u0026nbsp;\u003c/strong\u003eDemographics and data distribution of each dataset used in the study. The COVID-19 era corresponds to the portion of the MIMIC datasets collected during the pandemic (2020-2022). Abbreviations: LOS, length of stay; MIMIC, Medical Information Mart for Intensive Care; RRT, renal replacement therapy.\u003c/p\u003e\n\u003ctable border=\"0\" cellspacing=\"0\" cellpadding=\"0\" width=\"871\"\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 177px;\"\u003e\u003cbr\u003e\u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 173px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eMIMIC-III\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 173px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eMIMIC-IV\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 173px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eeICU\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 173px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eCOVID-19 era\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 177px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eVariable\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eNo RRT\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eRRT\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eNo RRT\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eRRT\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n 
\u003cp\u003e\u003cstrong\u003eNo RRT\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eRRT\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eNo RRT\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eRRT\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 177px;\"\u003e\n \u003cp\u003eTotal\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e46,616\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e2,997\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e61,795\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e3,912\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e128,416\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e3,499\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e8,652\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e667\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 177px;\"\u003e\n \u003cp\u003eAge (years), median [quartiles]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e66 [53,78]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e63 [51,73]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e66 [55,78]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e64 [53,73]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e66 [54,77]\u003c/p\u003e\n \u003c/td\u003e\n 
\u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e64 [54,73]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e66 [55,76]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e62 [50,73]\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 177px;\"\u003e\n \u003cp\u003eGender (Male), n (%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e26,249 (56.3)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e1,720 (57.4)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e34,404 (55.7)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e2,332 (59.6)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e69,562 (54.2)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e1,970 (56.3)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e5,059 (58.5)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e407 (61.0)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 177px;\"\u003e\n \u003cp\u003eRace, n (%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\u003cbr\u003e\u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\u003cbr\u003e\u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\u003cbr\u003e\u003c/td\u003e\n \u003ctd style=\"width: 
87px;\"\u003e\u003cbr\u003e\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 177px;\"\u003e\n \u003cp\u003eAsian\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e1080 (2.3)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e69 (2.3)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e1820 (2.9)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e118 (3.0)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e2148 (1.7)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e50 (1.4)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e628 (7.3)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e36 (5.4)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 177px;\"\u003e\n \u003cp\u003eAfrican American\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e4039 (8.7)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e676 (22.6)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e6314 (10.2)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e773 (19.8)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e14225 (11.1)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e605 (17.3)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e496 (5.7)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e60 (9.0)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 177px;\"\u003e\n \u003cp\u003eHispanic\u003c/p\u003e\n 
\u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e1561 (3.3)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e139 (4.6)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e2352 (3.8)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e196 (5.0)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e4670 (3.6)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e220 (6.3)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e491 (5.7)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e35 (5.2)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 177px;\"\u003e\n \u003cp\u003eOther\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e6075 (13.0)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e318 (10.5)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e8534 (14)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e616 (15.8)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e8021 (6.2)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e284 (8.2)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e2463 (28.5)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e223 (33.4)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 177px;\"\u003e\n \u003cp\u003eWhite\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e33861 (72.6)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e1795 
(59.9)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e42675 (69.1)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e2209 (56.5)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e99352 (77.4)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e2340 (66.9)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e4574 (52.9)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e313 (46.9)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 177px;\"\u003e\n \u003cp\u003eEthnicity, n (%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\u003cbr\u003e\u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\u003cbr\u003e\u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\u003cbr\u003e\u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\u003cbr\u003e\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 177px;\"\u003e\n \u003cp\u003eHispanic\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e1561 (3.3)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e139 (4.6)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e2352 (3.8)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e196 (5.0)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 
87px;\"\u003e\n \u003cp\u003e4670 (3.6)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e220 (6.3)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e491 (5.7)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e35 (5.2)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 177px;\"\u003e\n \u003cp\u003eNon Hispanic\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e45055 (96.7)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e2858 (95.4)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e59443 (96.2)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e3716 (95.0)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e123746 (96.4)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e3279 (93.7)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e8161 (94.3)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e632 (94.8)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 177px;\"\u003e\n \u003cp\u003eLOS (days), median [quartiles]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e2.1 [1.2,4.0]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e3.8 [2.0,8.4]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e2.0 [1.2,3.4]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e3.9 [1.9,9.0]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e2.3 [1.6,4.0]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n 
\u003cp\u003e4.1 [2.2,8.7]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e2.1 [1.2,4.2]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e5.3 [2.2,12]\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 177px;\"\u003e\n \u003cp\u003eInpatient mortality, n (%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e3220 (6.9)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e604 (20.2)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e3513 (5.7)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e911 (23.3)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e6174 (4.8)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e644 (18.4)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e609 (7.0)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 87px;\"\u003e\n \u003cp\u003e276 (41.4)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n\u003c/table\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":true,"hideJournal":true,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research 
Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"","lastPublishedDoi":"10.21203/rs.3.rs-8419139/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-8419139/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003e\u003cstrong\u003eBackground: \u003c/strong\u003eRenal replacement therapy (RRT) is a life-saving intervention for acute kidney injury in the intensive care unit (ICU). The decision to initiate RRT remains highly complex and subjective. Accurate and timely predictions of the need for RRT initiation, the duration of therapy, and subsequent clinical outcomes are crucial components of personalized care.\u003cstrong\u003e \u003c/strong\u003eUsing time series patient data (vital signs, laboratory findings, medications, ventilator settings, intake/output, risk scores), we aimed to develop deep learning models for predicting the need for RRT.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eMethods: \u003c/strong\u003eUsing data from Medical Information Mart for Intensive Care (MIMIC)-III, MIMIC-IV, and eICU, we trained/validated deep learning models for: (1) screening patients for prediction of the need for RRT after their first 12 hours in the ICU; (2) real-time dynamic prediction of impending RRT initiation; (3) prediction of RRT duration; and (4) prediction of mortality following RRT onset.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eResults: \u003c/strong\u003eHere, we summarize the results of the first model aimed at screening patients for early prediction of RRT. In internal validation, the area under the receiver operating characteristic curve (AUROC) was 0.90 (95% confidence interval [CI] 0.886–0.906) and the area under the precision-recall curve (AUPRC) was 0.53 (95% CI 0.496–0.564) in MIMIC-III. Results were similar with MIMIC-IV and eICU. 
External validation on ICU patients admitted during the COVID-19 period yielded an AUROC of 0.90 (95% CI 0.894–0.913) and an AUPRC of 0.57 (95% CI 0.534–0.602). Additional hospital-level external validation across eight individual eICU hospitals showed AUROCs ranging from 0.86 to 0.91 and AUPRCs ranging from 0.35 to 0.52.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eDiscussion: \u003c/strong\u003eBy accurately identifying patients at high risk for RRT within the first 12 hours of ICU admission, the early prediction model could serve as a triage tool to prompt closer nephrology evaluation or optimization of fluid and hemodynamic management before overt renal failure develops.\u003c/p\u003e","manuscriptTitle":"A deep learning-based early warning system for renal replacement therapy in the intensive care unit","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2026-01-06 15:06:14","doi":"10.21203/rs.3.rs-8419139/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"a85e5faa-65c1-4c9a-abca-797019dae2a9","owner":[],"postedDate":"January 6th, 2026","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[],"tags":[],"updatedAt":"2026-02-14T08:24:56+00:00","versionOfRecord":[],"versionCreatedAt":"2026-01-06 
15:06:14","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-8419139","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-8419139","identity":"rs-8419139","version":["v1"]},"buildId":"XKTyCvWXoU3ODBz1xrDgd","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}

License: CC-BY-4.0