Intro
Acute pancreatitis (AP) is an inflammatory disease of the pancreas. It is the leading cause of hospitalization for gastrointestinal diseases worldwide, with a global annual incidence of 33.74 cases per 100,000 general population (95% CI: 23.33–48.81). 1 , 2 Incidence does not differ significantly between males and females, and the disease mainly affects middle-aged and older adults. 3 , 4 AP is classified as mild, moderately severe, or severe depending on the extent of local damage in and around the pancreas and, more importantly, systemic damage to distal organs. 5 Moderately severe and severe AP are often accompanied by local or systemic inflammatory complications, and affected patients are more prone to systemic organ dysfunction and later organ failure (OF). 6 However, the reported incidence of OF among patients with AP varies widely, mainly because of differences in early diagnosis and early intervention.
OF in patients with AP is classified by duration as persistent organ failure (POF) or transient organ failure: POF lasts >48 h, whereas transient organ failure lasts ≤48 h. 6 OF accounts for almost all deaths in patients with AP. Mortality is about 1.4%–10% for transient OF, whereas the overall mortality of POF is >40%. 7 Patients with POF have a high risk of death in the first 2 weeks. 6 However, as its definition implies, the diagnosis of POF is inherently delayed: even if OF develops on day 1, it takes 48 hours to confirm whether the patient has POF. Early prediction of POF therefore remains a clinical challenge. At present, many single indicators have some predictive value for POF, the most common of which are the Acute Physiology and Chronic Health Evaluation (APACHE) II score, the bedside index for severity in acute pancreatitis (BISAP) score, C-reactive protein (CRP), interleukin 6 (IL-6), and systemic inflammatory response syndrome (SIRS). 7 However, little work has been done to evaluate the predictive accuracy of these indicators.
Network meta-analysis was developed based on the traditional meta-analysis. Network meta-analysis is widely applied to evaluate a variety of intervention methods, and its striking feature is the use of rank probability and rank graphs to rank interventions. 8 Recently, contrast-based model network meta-analysis has been gradually applied to evaluate the accuracy of diagnostic tests. 9 , 10 Nyaga 11 developed the diagnostic network meta-analysis based on the ANOVA model in 2018, which had advantages in the straightforward interpretation of the indicators. In this study, we adopt ANOVA model network meta-analysis to comprehensively summarize POF predictors and evaluate the predictors’ prediction efficiency.
Methods
A comprehensive search of published studies of pancreatitis complicated by POF was performed. We searched PubMed, Cochrane Library, Embase, and Web of Science from the inception of each database to September 29, 2021. The strategy of combining subject headings (MeSH in PubMed) with keywords (Entry terms in PubMed) was adopted. The PubMed search strategy was as follows: (organ failure [Title/Abstract]) AND ( ( (AP [Title/Abstract]) OR ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( (Pancreatitis [Title/Abstract])) OR (Pancreatitis, Acute Edematous [Title/Abstract])) OR (Acute Edematous Pancreatitides [Title/Abstract])) OR (Edematous Pancreatitides, Acute [Title/Abstract])) OR (Edematous Pancreatitis, Acute [Title/Abstract])) OR (Pancreatitides, Acute Edematous [Title/Abstract])) OR (Acute Edematous Pancreatitis [Title/Abstract])) OR (Pancreatic Parenchymal Edema [Title/Abstract])) OR (Edema, Pancreatic Parenchymal [Title/Abstract])) OR (Pancreatic Parenchymal Edemas [Title/Abstract])) OR (Parenchymal Edema, Pancreatic [Title/Abstract])) OR (Pancreatic Parenchyma with Edema [Title/Abstract])) OR (Pancreatitis, Acute [Title/Abstract])) OR (Acute Pancreatitis [Title/Abstract])) OR (Acute Pancreatitides [Title/Abstract])) OR (Pancreatitides, Acute [Title/Abstract])) OR (Peripancreatic Fat Necrosis [Title/Abstract])) OR (Fat Necrosis, Peripancreatic [Title/Abstract])) OR (Necrosis, Peripancreatic Fat [Title/Abstract])) OR (Peripancreatic Fat Necroses [Title/Abstract]))) OR (“Pancreatitis”[Mesh])).
Inclusion criteria were as follows: (1) patients were diagnosed with AP, and POF was clearly defined; (2) sufficient information was provided on the diagnostic value of 1 or more assessment indicators for POF; (3) the study was reported in English; and (4) no restrictions were placed on sex, age, or region.
Exclusion criteria were as follows: (1) non-English literature; (2) duplicate and irrelevant literature; (3) abstract-only publications; and (4) lack of true positive (TP), false positive (FP), false negative (FN), or true negative (TN) data.
Two researchers (H.W., W.L.) independently conducted literature screening and data extraction and cross-checked their results after completion. Any dispute was settled by a third researcher. Screening consisted of 2 steps. First, titles and abstracts were screened to exclude reviews, in vitro trials, and studies on unrelated topics. Second, the full text of each study that passed the first round was downloaded and assessed to determine whether the study should be included.
The extracted information mainly included study information (first author, publication year, country, etc.), patient information (number of patients, sex ratio, and mean age), evaluation indicators, the number of patients with persistent organ failure, sensitivity, specificity, etc.
Early prediction indicators were selected as follows: (1) to avoid potential bias, only indicators reported in at least 2 studies were analyzed further; and (2) TP, FP, FN, and TN for the indicator could be extracted directly or indirectly.
Two researchers (H.W. and W.L.) independently assessed the methodological quality of the included studies using the Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) tool. 12 Disagreements in the quality assessments were resolved through discussion.
If necessary, a third independent researcher (M.L.) made the final judgment. QUADAS-2 consists of 4 crucial parts: summarize the review question, tailor the tool and produce review-specific guidance, construct a flow diagram for the primary study, and judge bias and applicability.
TP, FP, FN, and TN were calculated from a 2×2 contingency table of diagnostic tests if only sensitivity, specificity, the gold standard, and the number of POF cases were reported in the included study. If a study reported multiple cutoff values, the TP, FP, FN, and TN at the best-performing cutoff value were extracted. In addition, when a study reported only the ROC curve, Origin software (2021 edition) was used to extract the sensitivity and specificity corresponding to the optimal threshold of the ROC curve, from which TP, FP, FN, and TN were then calculated.
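The back-calculation of the 2×2 table described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `reconstruct_2x2` is hypothetical, the total sample size is assumed to be available alongside the number of POF cases, and the example values are invented.

```python
def reconstruct_2x2(sensitivity, specificity, n_pof, n_total):
    """Rebuild TP, FP, FN, TN from reported summary statistics.

    Hypothetical helper: assumes sensitivity and specificity are
    proportions in [0, 1] and that the total sample size is known.
    """
    n_non_pof = n_total - n_pof
    tp = round(sensitivity * n_pof)       # POF cases correctly predicted
    fn = n_pof - tp                       # POF cases missed
    tn = round(specificity * n_non_pof)   # non-POF cases correctly ruled out
    fp = n_non_pof - tn                   # non-POF cases falsely flagged
    return tp, fp, fn, tn


# Invented example: 200 patients, 40 with POF, sensitivity 75%, specificity 80%.
tp, fp, fn, tn = reconstruct_2x2(0.75, 0.80, n_pof=40, n_total=200)
print(tp, fp, fn, tn)
```

Because reported sensitivity and specificity are usually rounded, the reconstructed counts may differ from the true counts by a patient or two.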
To evaluate the accuracy of each indicator, the ANOVA model was applied for network meta-analysis. 11 The ANOVA model can also rank diagnostic tests by calculating the superiority index. Compared with ranking by the diagnostic odds ratio (DOR), the superiority index takes both sensitivity and specificity into account, which is an advantage for diagnostic tests with high sensitivity but low specificity, or low sensitivity but high specificity. The meta-analysis was conducted in R (version 4.1.1) with the rstan package (version 2.21.3). With the gold standard for POF diagnosis as the common node linking the predictors, the network graph of the predictors was drawn in Stata (version 15.0).
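For reference, the DOR mentioned above has the standard definition below, written in terms of the 2×2 counts and, equivalently, sensitivity (Se) and specificity (Sp). The numeric line uses invented values (Se = 0.70, Sp = 0.86) purely for illustration. Pooled estimates from a Bayesian model need not equal this plug-in calculation.

```latex
\mathrm{DOR}
  = \frac{TP \times TN}{FP \times FN}
  = \frac{Se / (1 - Se)}{(1 - Sp) / Sp}

% Illustrative values only (not taken from the study):
\mathrm{DOR} = \frac{0.70 / 0.30}{0.14 / 0.86}
             = \frac{0.70 \times 0.86}{0.30 \times 0.14}
             \approx 14.3
```

A DOR of 1 means the indicator does not discriminate between patients with and without POF; larger values indicate better discrimination.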
Results
A total of 4630 related studies were retrieved, and 23 studies 13–35 were finally included for further analysis. The details of the literature screening are shown in Fig. 1 .
Flowchart of the study selection process.
The included studies were published between 2003 and 2021, and 16 were published between 2015 and 2021. Together they comprised 10,393 patients with AP, of whom 2014 had POF. Fifteen of the included studies were retrospective, and the others were prospective. The studies covered 10 predictors, namely APACHE II, BISAP, Ranson, high-density lipoprotein-cholesterol (HDL-C), albumin (ALB), CRP, IL-6, IL-8, SIRS, and BUN (summarized in Table 1 ).
Characteristics of included studies.
The predictors column in Table 1 described the indicators reported in the included study, where 1–10 are APACHE II, BISAP, Ranson, HDL-C, ALB, CRP, IL-6, IL-8, SIRS, and BUN, respectively. ALB indicates albumin; BUN, blood urea nitrogen; CRP, C-reactive protein; HDL-C, high-density lipoprotein-cholesterol; IL-6, Interleukin 6; IL-8, Interleukin 8; SIRS, Systemic Inflammatory Response Syndrome.
QUADAS-2 was used for method quality evaluation, and the evaluation results are shown in Fig. 2 .
A, Risk of bias graph; B, Risk of bias summary.
In this study, network meta-analysis was used to systematically evaluate the early prediction efficacy of different factors for POF. The gold standard for POF was used as the common node linking all predictors, and the network diagram of predictors was drawn. Ten predictors were identified; the APACHE II score and the Ranson score ranked first and second, respectively, in both the number of patients with AP and the number of included studies reporting them, as shown in Fig. 3 .
Network diagram of each predictive indicator and gold standard of POF diagnosis. ALB indicates albumin; BUN, blood urea nitrogen; CRP, C-reactive protein; HDL-C, high-density lipoprotein-cholesterol; IL-8, Interleukin 8; SIRS, Systemic Inflammatory Response Syndrome.
The meta-analysis results showed that the DOR of each indicator for POF was greater than 1 and that the 95% CI did not include 1, which indicated that the indicators included in this study had predictive efficacy for POF. Among them, ALB had the largest DOR [16.92 (95% CI: 4.59–26.99)], with a sensitivity of 70.17% (95% CI: 48.52–79.58) and specificity of 85.97% (95% CI: 74.65–91.68). The DORs of HDL-C, the Ranson score, the BISAP score, and APACHE II were 11.13, 11.30, 10.80, and 10.17, respectively (Table 2 and Fig. 4 ).
The sensitivity and specificity of each indicator were summarized.
ALB indicates albumin; BUN, blood urea nitrogen; CRP, C-reactive protein; HDL-C, high-density lipoprotein-cholesterol; IL-6, Interleukin 6; IL-8, Interleukin 8; SIRS, Systemic Inflammatory Response Syndrome.
The sensitivity and specificity of each indicator for early prediction of POF. ALB indicates albumin; BUN, blood urea nitrogen; CRP, C-reactive protein; HDL-C, high-density lipoprotein-cholesterol; IL-6, Interleukin 6; IL-8, Interleukin 8; SIRS, Systemic Inflammatory Response Syndrome.
The diagnostic sensitivity and specificity of the APACHE II score were used as the reference for analyzing the relative sensitivity and specificity of the other indicators. Compared with APACHE II, none of the other indicators had significantly higher sensitivity, while ALB, CRP, and IL-6 had better specificity. ALB had the greatest specificity advantage over the APACHE II score. The relative sensitivity and specificity are summarized in Table 3 .
Relative sensitivity and specificity.
ALB indicates albumin; BUN, blood urea nitrogen; CRP, C-reactive protein; HDL-C, high-density lipoprotein-cholesterol; IL-6, Interleukin 6; IL-8, Interleukin 8; SIRS, Systemic Inflammatory Response Syndrome
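The relative measures in Table 3 can be illustrated with a short sketch. It assumes that relative sensitivity and relative specificity are simple ratios of the indicator's value to the reference value, which is the usual reading but is not spelled out in the text. The function name `relative_accuracy` and all numbers are hypothetical.

```python
def relative_accuracy(index, reference):
    """Ratio of an indicator's accuracy to that of a reference test.

    Hypothetical helper: both arguments are dicts with 'sensitivity'
    and 'specificity' given as proportions in (0, 1].
    """
    rel_sens = index["sensitivity"] / reference["sensitivity"]
    rel_spec = index["specificity"] / reference["specificity"]
    return rel_sens, rel_spec


# Invented values: an indicator that is less sensitive but more
# specific than the reference score.
reference = {"sensitivity": 0.80, "specificity": 0.75}
indicator = {"sensitivity": 0.60, "specificity": 0.90}

rel_sens, rel_spec = relative_accuracy(indicator, reference)
print(f"relative sensitivity = {rel_sens:.2f}")
print(f"relative specificity = {rel_spec:.2f}")
```

A ratio above 1 favors the indicator over the reference, and a ratio below 1 favors the reference; whether the difference is significant depends on the interval around the ratio, which this sketch does not compute.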
Conclusion
The primary significance of this network meta-analysis is to summarize the early diagnostic indicators and efficacy of POF in patients with AP. Our findings show that ALB, HDL-C, Ranson Score, and BISAP Score are effective in the early prediction of POF in patients with AP, which can provide evidence for the development of effective POF early prediction systems (such as machine learning-based prediction models). However, due to the limitations of the extraction method of predictive indicators in this study, some effective indicators may not be included in this meta-analysis.
Discussion
In this study, the ANOVA model was used to implement a network meta-analysis of the early prediction efficacy of POF; a total of 10 early prediction indicators of POF in patients with AP were summarized. The meta-analysis results showed that the DORs of all 10 predictors and the lower limits of their 95% CIs were greater than 1, indicating that these predictors had some value for the early diagnosis of POF. According to the DOR ranking, ALB seemed to have the best early prediction performance. In addition, we used APACHE II as a reference. Compared with APACHE II, none of the other indicators had significantly higher sensitivity, while ALB, CRP, and IL-6 had better specificity. ALB had the greatest specificity advantage over the APACHE II score.
This study showed that the APACHE II, BISAP, Ranson, and SIRS scoring systems had relatively high sensitivity for the early prediction of POF, which is of great significance for predicting POF in patients with AP. However, because these scoring systems are complex, they are not easy to use widely in clinical practice. 5 IL-6, IL-8, and CRP have been clinically evaluated to predict the outcome of acute pancreatitis. 36 This study showed that these indicators also had good accuracy in the early prediction of POF in patients with AP, which may be related to the severity of POF in AP. ALB seemed to have the highest DOR, but its sensitivity was lower than that of the scoring systems. Recently, a systematic review by Yang et al 37 concluded that based on the diagnostic positive likelihood ratio, the Japanese Severity Scale and BISAP within 48 hours of admission and the Japanese Severity Scale and blood urea nitrogen within 48 hours of admission were the best predictors of POF. However, in our study, the predictive performance of blood urea nitrogen did not seem ideal, which may be due to the difference in the method used to rank predictive performance. In summary, every indicator included in this study appears to predict POF in patients with AP with only moderate accuracy. Although single indicators have some predictive value, developing a prediction system that combines multiple indicators (such as a machine learning-based prediction model) therefore deserves consideration. Meanwhile, in addition to the early prediction indicators summarized in this study, there are other potentially effective prediction indicators. 38
This study had the following advantages. First, it was the first to use network meta-analysis to quantify the predictive efficacy of each indicator for the occurrence of POF in patients with AP, with the DOR identifying the best indicator for the early diagnosis of POF. Second, the total number of patients included in this meta-analysis was relatively large. However, this study also had the following limitations. First, the included studies lacked diagnostic randomized controlled trials, which may bias the interpretation of the results. Second, the methodology of this study did not consider the optimal cutoff value of each indicator. Third, the selection of indicators may not be comprehensive enough, because indicators reported in only 1 study and indicators for which TP, FP, FN, and TN could not be extracted directly or indirectly were excluded. Fourth, network meta-analysis based on the ANOVA model can only provide sensitivity, specificity, DOR, relative sensitivity, relative specificity, and the diagnostic advantage index. Further studies are necessary to calculate other diagnostic indicators.