Comparison of static and dynamic random forests models for EHR data in the presence of competing risks: predicting central line-associated bloodstream infection

Research Article

Authors: Elena ALBU (KU Leuven), Shan GAO (KU Leuven), Pieter STIJNEN (Universitair Ziekenhuis Leuven), Frank RADEMAKERS (KU Leuven), Christel JANSSENS (Universitair Ziekenhuis Leuven), Veerle COSSEY (KU Leuven), Yves DEBAVEYE (Universitair Ziekenhuis Leuven), Laure WYNANTS (KU Leuven), Ben VAN CALSTER (KU Leuven, corresponding author)

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-4527690/v1
This work is licensed under a CC BY 4.0 License.
Status: Posted. Version 1, posted June 20th, 2024 on Research Square (ISSN 2693-5015, online).

Abstract

Objective: Prognostic outcomes related to hospital admissions typically do not suffer from censoring, and can be modeled either categorically or as time-to-event. Competing events are common but often ignored. We compared the performance of random forest (RF) models to predict the risk of central line-associated bloodstream infections (CLABSI) using different outcome operationalizations.

Methods: We included data from 27478 admissions to the University Hospitals Leuven, covering 30862 catheter episodes (970 CLABSI, 1466 deaths and 28426 discharges) to build static and dynamic RF models for binary (CLABSI vs no CLABSI), multinomial (CLABSI, discharge, death or no event), survival (time to CLABSI) and competing risks (time to CLABSI, discharge or death) outcomes to predict the 7-day CLABSI risk. We evaluated model performance across 100 train/test splits.

Results: Performance of binary, multinomial and competing risks models was similar: AUROC was 0.74 for baseline predictions, rose to 0.78 for predictions at day 5 in the catheter episode, and decreased thereafter. Survival models overestimated the risk of CLABSI (E:O ratios between 1.2 and 1.6), and had AUROCs about 0.01 lower than other models. Binary and multinomial models had lowest computation times. Models including multiple outcome events (multinomial and competing risks) display a different internal structure compared to binary and survival models.

Discussion and Conclusion: In the absence of censoring, complex modelling choices do not considerably improve the predictive performance compared to a binary model for CLABSI prediction in our studied settings. Survival models censoring the competing events at their time of occurrence should be avoided.

Keywords: random forests, competing risks, survival, CLABSI, EHR, dynamic prediction

Additional Declarations: No competing interests reported.

Supplementary Files: supplmaterial.pdf
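
The four outcome operationalizations named in the Methods (binary, multinomial, survival, competing risks) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Episode` class, the `outcomes` and `eo_ratio` functions, the numeric status codes, and the way time is measured from the prediction moment are all assumptions made for illustration. The survival variant shown here censors competing events at their time of occurrence, which is the approach the abstract advises against. The `eo_ratio` helper assumes the usual definition of the E:O ratio as expected (summed predicted risk) over observed events.

```python
from dataclasses import dataclass

HORIZON_DAYS = 7  # the 7-day prediction horizon stated in the abstract


@dataclass
class Episode:
    # Hypothetical record: every catheter episode ends in exactly one event.
    event: str   # "CLABSI", "death" or "discharge"
    time: float  # days from the prediction moment to that event


def outcomes(ep: Episode, horizon: float = HORIZON_DAYS) -> dict:
    """Derive the four outcome encodings for one episode (illustrative only)."""
    within = ep.time <= horizon

    # Binary: CLABSI within the horizon vs. anything else.
    binary = int(within and ep.event == "CLABSI")

    # Multinomial: which event happened within the horizon, if any.
    multinomial = ep.event if within else "no event"

    # Follow-up is truncated at the horizon for both time-to-event encodings.
    followup = min(ep.time, horizon)

    # Survival: only CLABSI counts as an event; death and discharge are
    # treated as censoring at their time of occurrence (status 0).
    survival = (followup, binary)

    # Competing risks: each event type keeps its own status code
    # (codes are an assumption); 0 means no event by the horizon.
    codes = {"CLABSI": 1, "death": 2, "discharge": 3}
    competing_risks = (followup, codes[ep.event] if within else 0)

    return {
        "binary": binary,
        "multinomial": multinomial,
        "survival": survival,
        "competing_risks": competing_risks,
    }


def eo_ratio(predicted_risks, observed_events) -> float:
    """Expected:observed ratio; values above 1 indicate overestimated risk."""
    return sum(predicted_risks) / sum(observed_events)


if __name__ == "__main__":
    print(outcomes(Episode(event="death", time=3.0)))
    print(outcomes(Episode(event="CLABSI", time=10.0)))
```

In this sketch a death at day 3 is indistinguishable from ordinary censoring in the survival encoding, whereas the multinomial and competing risks encodings record it as a distinct event.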
