TOWARDS AN AI-DRIVEN REGISTRY FOR POSTOPERATIVE COMPLICATIONS: A PROOF-OF-CONCEPT STUDY EVALUATING THE OPPORTUNITIES AND CHALLENGES OF AI-MODELS

2025 · doi:10.1101/2025.04.07.25325369

preprint OA: closed

Abstract

Background

Continuous quality improvement is essential in surgery, with clinical registries and quality improvement programs (QIPs) playing a key role. Postoperative complications (PCs) require substantial resources to manage, yet traditional QIPs are expensive and often place a significant data-collection burden on clinicians. Artificial intelligence (AI), particularly natural language processing (NLP), offers a potential solution by automating and streamlining these processes, but models can be optimized for either sensitivity or positive predictive value. This study aimed to develop a mock-up automated registry for PCs using NLP algorithms and to evaluate the effects of optimization strategies on surgical quality control. We hypothesized that using NLP to obtain longitudinal overviews of key quality metrics is feasible, but that the optimization strategy would affect the observed rate of PCs and thus quality management and surveillance in a real-world setting.

Methods

We analyzed 100,505 surgical cases from 12 Danish hospitals between 2016 and 2022. Previously validated NLP models were applied to detect seven types of PCs, using two different threshold settings: a set of thresholds optimized for positive predictive value (PPV or Precision), referred to as F-score of 0.5, and a set of thresholds optimized for sensitivity, referred to as F-score of 2. Trends in PC rates over time were assessed, and hospital-level variations were examined using logistic regression models adjusted for age, sex, and comorbidity.
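The abstract does not describe how the two threshold sets were derived beyond their F-score labels. As a minimal sketch, assuming "F-score of 0.5" and "F-score of 2" denote F-beta scores with β = 0.5 (weighting PPV more heavily) and β = 2 (weighting sensitivity more heavily), a threshold could be chosen by sweeping candidate cut-offs and keeping the one that maximizes the chosen F-beta. The function names and the toy scores and labels below are hypothetical and are not taken from the study.

```python
from typing import List, Tuple


def fbeta(precision: float, recall: float, beta: float) -> float:
    """F-beta score: beta < 1 favors precision (PPV), beta > 1 favors recall (sensitivity)."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def precision_recall(scores: List[float], labels: List[int], threshold: float) -> Tuple[float, float]:
    """Precision and recall when every case with score >= threshold is flagged as a PC."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def best_threshold(scores: List[float], labels: List[int], beta: float) -> float:
    """Return the observed score that maximizes F-beta when used as the cut-off."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda t: fbeta(*precision_recall(scores, labels, t), beta))


# Toy model outputs (hypothetical): predicted probability and true label per case.
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10]
labels = [1, 1, 0, 1, 0, 1, 0, 0, 0, 0]

ppv_threshold = best_threshold(scores, labels, beta=0.5)          # precision-oriented
sensitivity_threshold = best_threshold(scores, labels, beta=2.0)  # recall-oriented

flagged_ppv = sum(s >= ppv_threshold for s in scores)
flagged_sensitivity = sum(s >= sensitivity_threshold for s in scores)
print(ppv_threshold, flagged_ppv)
print(sensitivity_threshold, flagged_sensitivity)
```

On this toy data the sensitivity-oriented threshold is lower than the PPV-oriented one and flags more cases. That is the same direction as the difference in detected PCs between the two settings reported in the Results.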

Results

The NLP models detected 8,512 or 15,892 PCs, depending on threshold selection, corresponding to total PC rates of 9.14% and 17.1%, respectively. Most PCs showed stable or increasing trends over time, regardless of threshold setting. Hospital-level analyses similarly revealed stable or rising PC rates in most institutions. Regression analyses demonstrated that threshold selection significantly influenced findings, impacting hospital comparisons.
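To illustrate why threshold selection can change hospital comparisons, the sketch below computes unadjusted observed PC rates for two hypothetical hospitals at two cut-offs. It is a toy example only: the scores, hospital names, and thresholds are invented for illustration, and it omits the adjustment for age, sex, and comorbidity that the study applied through logistic regression.

```python
from typing import Dict, List


def observed_rate(scores: List[float], threshold: float) -> float:
    """Fraction of cases flagged as having a PC at the given threshold."""
    return sum(s >= threshold for s in scores) / len(scores)


def rank_hospitals(hospital_scores: Dict[str, List[float]], threshold: float) -> List[str]:
    """Hospital names ordered from highest to lowest observed PC rate."""
    return sorted(
        hospital_scores,
        key=lambda name: observed_rate(hospital_scores[name], threshold),
        reverse=True,
    )


# Hypothetical model scores per surgical case (not study data).
hospital_scores = {
    "A": [0.95, 0.92, 0.91, 0.30, 0.20, 0.10, 0.10, 0.05, 0.05, 0.05],
    "B": [0.93, 0.70, 0.65, 0.60, 0.55, 0.20, 0.10, 0.05, 0.05, 0.05],
}

strict, lenient = 0.90, 0.50  # assumed PPV-oriented and sensitivity-oriented cut-offs
for threshold in (strict, lenient):
    rates = {name: observed_rate(s, threshold) for name, s in hospital_scores.items()}
    print(threshold, rates, rank_hospitals(hospital_scores, threshold))
```

Hospital B has many mid-range scores, so it appears to have the lower rate under the strict cut-off and the higher rate under the lenient one. A shift of this kind would affect any comparison between hospitals built on the flagged cases.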

Conclusion

This study demonstrates that NLP can be used for automated PC detection in surgical quality monitoring. However, threshold selection and additional performance metrics, such as precision-recall curves (PPV-sensitivity curves), must be carefully considered to ensure reliable and meaningful results beyond traditional evaluation by area under the receiver operating characteristic curve (ROC AUC).

Competing Interest Statement

Conflicts of interest: Authors MS, AB and AT are co-founders of Aiomic, a company specializing in AI models for health care systems. This work is for research only, and the specific models used in the study were not used commercially.

Funding Statement

Supported by grant #NNF19OC0055183 from the Novo Nordisk Foundation to MS.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below: The study was approved by the Institutional Review Board (IRB) overseeing retrospective patient studies in Denmark, the Danish Patient Safety Board (Styrelsen for Patientsikkerhed, approval #31-1521-182), and the Danish Capital Region Data Safety Board (Videnscenter for Dataanmeldelser, approval #P-2020-180). Since the study was retrospective, used de-identified data, and had no direct patient contact, Danish law did not require informed consent.

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals. Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes

I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable. Yes
