Predicting failed trial of labor in advanced-age primiparas: development and validation of a clinical nomogram.

doi:10.1186/s12884-026-08794-y

2026 · doi:10.1186/s12884-026-08794-y · PMID:41724959 · PMC13036918

OA: gold CC-BY-NC-ND-4.0

What

Due to the lack of validated objective risk assessment tools, clinical decision-making regarding the mode of delivery for advanced-age primiparas is challenging, resulting in a dual burden of unnecessary cesarean sections and failed trials of labor. This study aimed to develop and validate a predictive model to quantify the risk of labor failure in term, singleton, cephalic-presenting advanced-age primiparas.

Methods

This retrospective cohort study enrolled women who registered for prenatal care and delivered at Peking University First Hospital between October 2019 and September 2022. Data were extracted from the Beijing Maternal and Child Healthcare Network Information System and the electronic medical record system of Peking University First Hospital. The inclusion criteria were: (1) term (gestational age ≥ 37 and < 42 weeks), singleton, cephalic-presenting pregnancy; (2) nulliparity; (3) live birth; (4) complete clinical data on pregnancy and delivery; (5) initiation of a trial of labor (TOL); and (6) maternal age ≥ 35 years at delivery. Exclusion criteria were: (1) scarred uterus (including history of previous cesarean section, myomectomy, or other procedures resulting in uterine myometrial scar); (2) elective cesarean section (including cesarean sections performed without a trial of labor, such as those due to severe funnel pelvis, central placenta previa, or severe maternal medical conditions precluding a trial of labor, as well as cases where the parturient actively withdrew from the trial of labor during the process).
Applying these predefined inclusion and exclusion criteria, we identified a cohort of full-term, singleton, cephalic-presenting advanced-maternal-age primiparous women who underwent a TOL. This cohort served as the model development group for constructing a prediction model for the risk of TOL failure. The prediction model is intended for use at the onset of labor or when the decision to initiate a trial of labor is being made, incorporating baseline characteristics and early peripartum conditions known at that time. To evaluate the generalizability and robustness of the developed model, an independent external validation cohort was established.
This validation cohort comprised 601 full-term, singleton, advanced-maternal-age primiparous women who underwent delivery at the Daxing Branch of Peking University First Hospital between May 2024 and April 2025. The temporal and institutional separation of this cohort ensured minimal overlap with the development cohort, thereby enabling an unbiased assessment of the model’s predictive performance in a distinct clinical population. Model performance was evaluated in terms of discrimination, calibration, and clinical utility. Candidate predictor variables were selected a priori based on clinical relevance and existing literature. They encompassed maternal demographics (age, height) and pre-pregnancy body mass index (BMI), pregnancy characteristics (gestational age, gestational weight gain, conception method), intrapartum factors (labor induction, epidural analgesia), and a spectrum of maternal-fetal comorbidities (e.g., gestational hyperglycemia, hypertensive disorders, cardiac disease). Estimated fetal weight (EFW), obtained from the last ultrasound within one week prior to delivery, was also included. Detailed operational definitions and diagnostic criteria for all variables are available in Additional file 1. Labor induction was defined as the medically indicated initiation of labor prior to spontaneous onset using pharmacological agents (e.g., oxytocin, prostaglandins), mechanical methods, or artificial rupture of membranes [ 9 ]. The primary outcome was the result of the TOL, classified as either successful or failed. A successful TOL was defined as a vaginal delivery, encompassing both spontaneous vaginal delivery and operative vaginal delivery (including vacuum or forceps assistance). A failed TOL was defined as an unplanned cesarean delivery that occurred following the documented intention for a trial of labor, irrespective of whether it was performed before or after the establishment of labor. 
Importantly, this definition of failed TOL did not depend on the specific cervical dilation threshold used to define active labor (e.g., 4 cm vs. 6 cm). Maternal and neonatal outcomes were assessed as secondary endpoints. Maternal outcomes included postpartum blood loss, the duration of the second stage, and the total duration of labor (for vaginal deliveries). Severe postpartum hemorrhage was defined as an estimated blood loss of ≥ 1000 mL within 24 h after delivery [ 10 ]. Labor progress was monitored using the Friedman partograph [ 11 ]. Neonatal outcomes included birth weight, infant sex, Apgar scores at 1 and 5 min (categorized as ≤ 7 vs. > 7), and admission to the neonatal intensive care unit (NICU).
Statistical analyses were performed using SPSS 24.0 (IBM Corp., Armonk, NY, USA, released 2016) and R version 4.4.0 (R Foundation for Statistical Computing, Vienna, Austria, released 2024). A p-value < 0.05 was considered statistically significant. Continuous variables with normal distribution were described as mean ± standard deviation and compared using the independent samples t-test; non-normally distributed variables were described as median and interquartile range (IQR), and compared using the Mann-Whitney U test. Categorical variables were described as frequency (percentage) and compared using the chi-square test or Fisher’s exact test as appropriate.
For the prediction model, the development cohort was used. Predictor variables were selected using least absolute shrinkage and selection operator (LASSO) regression, with 10-fold cross-validation to determine the optimal penalty parameter (λ). The selected variables were incorporated into a multivariable logistic regression model to build the final model. The relative importance of predictors in the final logistic regression model was assessed using the Wald chi-square statistic. Collinearity was assessed using variance inflation factors.
A complete-case analysis was used for predictor variables in the final model. A visual nomogram was created based on the final model to allow for the straightforward calculation of the predicted probability of a failed TOL. The receiver operating characteristic (ROC) curves were plotted, and the areas under the curves (AUC) were calculated to evaluate the model’s discrimination in both the development and validation cohorts. Calibration was assessed with calibration plots and the Hosmer-Lemeshow test. Clinical utility was evaluated using decision curve analysis (DCA). Internal validation was performed via bootstrap resampling (1000 replicates).
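As a rough illustration of the categorical comparisons described above, the sketch below recomputes the between-cohort test for TOL outcome from the counts reported in Table 1 (906 successes and 438 failures in the development cohort; 427 and 174 in the validation cohort). It assumes a Pearson chi-square test with Yates continuity correction on a 2 × 2 table. The paper does not state which variant of the SPSS output was reported, so that choice, and the function name `chi2_2x2`, are assumptions.

```python
import math


def chi2_2x2(a, b, c, d, yates=True):
    """Pearson chi-square test for the 2x2 table [[a, b], [c, d]].

    Returns (statistic, p_value) for 1 degree of freedom.
    The Yates continuity correction is an assumption, not stated in the paper.
    """
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(observed[i][j] - expected)
            if yates:
                diff = max(diff - 0.5, 0.0)
            stat += diff ** 2 / expected
    # With 1 degree of freedom the chi-square survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Rows: development cohort, validation cohort; columns: success, failure.
stat, p = chi2_2x2(906, 438, 427, 174)
print(f"chi-square = {stat:.3f}, p = {p:.3f}")
```

Under the continuity-correction assumption, the p-value rounds to the 0.123 reported for TOL outcome in Table 1; passing `yates=False` gives a somewhat smaller value.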

Results

The model development cohort included 1344 patients, with the selection process detailed in Fig. 1. An independent external validation cohort comprised 601 patients. Baseline characteristics for both cohorts are presented in Table 1. Statistically significant differences (p < 0.05) were observed between cohorts for several variables, including epidural analgesia and labor induction. These differences reflect inherent clinical practice variations and provide a robust test of the model’s generalizability in the external validation cohort.

Fig. 1 Flowchart of study population selection for the model development cohort

Table 1 Comparison of baseline characteristics and outcomes between the model development cohort and the external validation cohort

| Baseline characteristic | Model development cohort (n = 1344) | External validation cohort (n = 601) | p |
|---|---|---|---|
| TOL outcome | | | 0.123 |
| Success | 906 (67.4) | 427 (71.0) | |
| Failure | 438 (32.6) | 174 (29.0) | |
| Obstetric history | | | 0.010 |
| None | 937 (69.7) | 383 (63.7) | |
| Present | 407 (30.3) | 218 (36.3) | |
| Conception method | | | 0.584 |
| Spontaneous conception | 1052 (78.3) | 463 (77.0) | |
| Assisted reproductive technology | 292 (21.7) | 138 (23.0) | |
| Epidural analgesia | | | < 0.001 |
| No | 356 (26.5) | 114 (19.0) | |
| Yes | 988 (73.5) | 487 (81.0) | |
| Labor induction | | | < 0.001 |
| No | 546 (40.6) | 184 (30.6) | |
| Yes | 798 (59.4) | 417 (69.4) | |
| Gestational hyperglycemia | | | < 0.001 |
| No | 877 (65.3) | 457 (76.0) | |
| Yes | 467 (34.7) | 144 (24.0) | |
| - Gestational diabetes mellitus | 432 (32.1) | 129 (21.5) | |
| - Pregestational diabetes mellitus | 35 (2.6) | 15 (2.5) | |
| Hypertensive disorders of pregnancy | | | 0.001 |
| No | 1192 (88.7) | 499 (83.0) | |
| Yes | 152 (11.3) | 102 (17.0) | |
| Cardiac disease in pregnancy | | | 0.627 |
| No | 1327 (98.7) | 591 (98.3) | |
| Yes | 17 (1.3) | 10 (1.7) | |
| Thyroid disease | | | 0.002 |
| No | 1089 (81.0) | 450 (74.9) | |
| Yes | 255 (19.0) | 151 (25.1) | |
| Liver dysfunction | | | 0.922 |
| No | 1326 (98.7) | 594 (98.8) | |
| Yes | 18 (1.3) | 7 (1.2) | |
| Antiphospholipid syndrome | | | 0.292 |
| No | 1329 (98.9) | 590 (98.2) | |
| Yes | 15 (1.1) | 11 (1.8) | |
| PROM | | | 0.176 |
| No | 944 (70.2) | 403 (67.1) | |
| Yes | 400 (29.8) | 198 (32.9) | |
| Amniotic fluid volume | | | 0.704 |
| Normal | 1300 (96.7) | 584 (97.2) | |
| Abnormal | 44 (3.3) | 17 (2.8) | |
| - Oligohydramnios | 38 (2.8) | 14 (2.3) | |
| - Polyhydramnios | 6 (0.4) | 3 (0.5) | |
| Anemia | | | 1.000 |
| No | 1241 (92.3) | 555 (92.3) | |
| Yes | 103 (7.7) | 46 (7.7) | |
| Gestational thrombocytopenia | | | 0.057 |
| No | 1334 (99.3) | 590 (98.2) | |
| Yes | 10 (0.7) | 11 (1.8) | |
| Polycystic ovary syndrome | | | 0.375 |
| No | 1326 (98.7) | 589 (98.0) | |
| Yes | 18 (1.3) | 12 (2.0) | |
| Uterine fibroids | | | < 0.001 |
| No | 1039 (77.3) | 390 (64.9) | |
| Yes | 305 (22.7) | 211 (35.1) | |
| Asthma | | | 0.025 |
| No | 1330 (99.0) | 586 (97.5) | |
| Yes | 14 (1.0) | 15 (2.5) | |
| EFW (kg) | | | 0.098 |
| < 2.5 | 31 (2.3) | 11 (1.8) | |
| 2.5–3.9 | 1281 (95.3) | 584 (97.2) | |
| ≥ 4.0 | 32 (2.4) | 6 (1.0) | |
| Maternal age (years) | 36.5 ± 1.6 | 36.3 ± 1.5 | 0.003 |
| Gestational age (weeks) | 39.6 (38.9, 40.3) | 39.6 (38.7, 40.3) | 0.560 |
| Height (cm) | 163.4 ± 5.1 | 163.3 ± 5.1 | 0.251 |
| Pre-pregnancy BMI (kg/m²) | 21.8 (20.0, 24.0) | 21.9 (20.1, 24.0) | 0.690 |
| Gestational weight gain (kg) | 13.6 ± 4.8 | 13.4 ± 5.5 | 0.359 |

Data are presented as n (%), mean ± standard deviation, or median (interquartile range). Abbreviations: TOL, trial of labor; PROM, premature rupture of membranes; EFW, estimated fetal weight; BMI, body mass index.

In the model development cohort, the TOL success rate among advanced maternal age women was 67.4% (906 of 1344), with a failure rate of 32.6% (438 of 1344). In the independent external validation cohort, the success rate was 71.0% (427 of 601), and the failure rate was 29.0% (174 of 601). A comparison of baseline characteristics by TOL outcome in the development cohort is presented in Table 2.
Relative to the successful TOL group, the failed TOL group exhibited a significantly different profile: later gestational age at delivery, shorter maternal height, higher pre-pregnancy BMI and gestational weight gain, a greater prevalence of conception by assisted reproductive technology, labor induction, hypertensive disorders of pregnancy, premature rupture of membranes (PROM), and abnormal amniotic fluid volume (detailed as oligohydramnios and polyhydramnios in Table 2), along with a lower rate of epidural analgesia use. Furthermore, the proportion of fetuses with an EFW ≥ 4.0 kg was significantly higher in the failure group.

Table 2 Comparison of baseline characteristics by trial of labor outcome in the model development cohort

| Variable | Trial of labor success (n = 906) | Trial of labor failure (n = 438) | p |
|---|---|---|---|
| Maternal age (years) | 36.5 ± 1.6 | 36.5 ± 1.6 | 0.808 |
| Gestational age (weeks) | 39.6 (38.7, 40.1) | 39.9 (39.0, 40.3) | 0.002 |
| Obstetric history | | | 0.212 |
| None | 642 (70.9) | 295 (67.4) | |
| Present | 264 (29.1) | 143 (32.6) | |
| Height (cm) | 163.5 ± 5.5 | 161.9 ± 5.3 | < 0.001 |
| Pre-pregnancy BMI (kg/m²) | 21.6 (19.9, 23.6) | 22.2 (20.3, 24.8) | < 0.001 |
| Gestational weight gain (kg) | 13.2 ± 4.7 | 14.3 ± 5.0 | 0.001 |
| Conception method | | | 0.110 |
| Spontaneous conception | 721 (79.6) | 331 (75.6) | |
| Assisted reproductive technology | 185 (20.4) | 107 (24.4) | |
| Epidural analgesia | | | < 0.001 |
| No | 175 (19.3) | 181 (41.3) | |
| Yes | 731 (80.7) | 257 (58.7) | |
| Labor induction | | | < 0.001 |
| No | 425 (46.9) | 121 (27.6) | |
| Yes | 481 (53.1) | 317 (72.4) | |
| Gestational hyperglycemia | | | 0.239 |
| No | 596 (65.8) | 281 (64.2) | |
| Gestational diabetes mellitus | 291 (32.1) | 141 (32.2) | |
| Pregestational diabetes mellitus | 19 (2.1) | 16 (3.7) | |
| Hypertensive disorders of pregnancy | | | < 0.001 |
| No | 833 (91.9) | 359 (82.0) | |
| Yes | 73 (8.1) | 79 (18.0) | |
| Cardiac disease in pregnancy | | | 0.296 |
| No | 892 (98.5) | 435 (99.3) | |
| Yes | 14 (1.5) | 3 (0.7) | |
| Thyroid disease | | | 0.272 |
| No | 742 (81.9) | 347 (79.2) | |
| Yes | 164 (18.1) | 91 (20.8) | |
| Liver dysfunction | | | 0.853 |
| No | 893 (98.6) | 433 (98.9) | |
| Yes | 13 (1.4) | 5 (1.1) | |
| Antiphospholipid syndrome | | | 0.164 |
| No | 893 (98.6) | 436 (99.5) | |
| Yes | 13 (1.4) | 2 (0.5) | |
| PROM | | | < 0.001 |
| No | 668 (73.7) | 276 (63.0) | |
| Yes | 238 (26.3) | 162 (37.0) | |
| Amniotic fluid volume | | | < 0.001 |
| Normal | 890 (98.2) | 410 (93.6) | |
| Oligohydramnios | 13 (1.4) | 25 (5.7) | |
| Polyhydramnios | 3 (0.3) | 3 (0.7) | |
| Anemia | | | 0.129 |
| No | 844 (93.2) | 397 (90.6) | |
| Yes | 62 (6.8) | 41 (9.4) | |
| Gestational thrombocytopenia | | | 0.736 |
| No | 900 (99.3) | 434 (99.1) | |
| Yes | 6 (0.7) | 4 (0.9) | |
| Polycystic ovary syndrome | | | 0.066 |
| No | 898 (99.1) | 428 (97.7) | |
| Yes | 8 (0.9) | 10 (2.3) | |
| Uterine fibroids | | | 0.478 |
| No | 706 (77.9) | 333 (76.0) | |
| Yes | 200 (22.1) | 105 (24.0) | |
| Asthma | | | 1.000 |
| No | 897 (99.0) | 433 (98.9) | |
| Yes | 9 (1.0) | 5 (1.1) | |
| EFW (kg) | | | < 0.001 |
| < 2.5 | 17 (1.9) | 14 (3.2) | |
| 2.5–3.9 | 878 (96.9) | 403 (92.0) | |
| ≥ 4.0 | 11 (1.2) | 21 (4.8) | |

Data are presented as n (%), mean ± standard deviation, or median (interquartile range). Abbreviations: BMI, body mass index; PROM, premature rupture of membranes; EFW, estimated fetal weight.

Table 3 presents a comparative analysis of maternal and neonatal outcomes based on the results of the trial of labor. Women in the successful TOL group exhibited lower overall postpartum blood loss compared to those in the failed trial group; however, the incidence of severe postpartum hemorrhage did not differ significantly between the groups.
Table 3 Comparison of maternal and neonatal outcomes by trial of labor outcome in the model development cohort

| Variable | Trial of labor success (n = 906) | Trial of labor failure (n = 438) | p |
|---|---|---|---|
| Postpartum hemorrhage volume | | | < 0.001 |
| < 500 ml | 657 (72.5)ᵃ | 259 (59.1)ᵇ | |
| ≥ 500 and < 1000 ml | 216 (23.8)ᵃ | 168 (38.4)ᵇ | |
| ≥ 1000 ml | 33 (3.6)ᵃ | 11 (2.5)ᵃ | |
| Volume (ml), median (interquartile range) | 400.0 (320.0, 525.5) | 400.0 (334.5, 529.3) | 0.012 |
| Neonatal birth weight (kg) | | | < 0.001 |
| < 2.5 | 20 (2.2)ᵃ | 13 (3.0)ᵃ | |
| 2.5–3.9 | 874 (96.5)ᵃ | 402 (91.8)ᵇ | |
| ≥ 4.0 | 12 (1.3)ᵃ | 23 (5.3)ᵇ | |
| Weight (grams), mean ± SD | 3232.3 ± 353.5 | 3361.9 ± 420.5 | < 0.001 |
| 1-minute Apgar score ≤ 7 | 26 (2.9) | 21 (4.8) | 0.072 |
| 5-minute Apgar score ≤ 7 | 1 (0.1) | 0 (0.0) | 1.000 |
| NICU admission rate | 183 (20.2) | 89 (20.3) | 0.959 |

Data are presented as n (%), mean ± standard deviation, or median (interquartile range). Values within a row with different superscript letters (a, b) differ significantly (p < 0.05). Abbreviations: NICU, neonatal intensive care unit.

Regarding neonatal characteristics and outcomes, a higher proportion of neonates in the success group were of normal birthweight. The success group also had a significantly higher proportion of female infants compared to the failure group (51.8% vs. 45.9%, p = 0.043). In terms of neonatal health outcomes, no significant differences were observed between the groups in 1-minute or 5-minute Apgar scores ≤ 7 or in the rate of NICU admission. Among women who achieved a vaginal delivery, the median duration of the second stage of labor was 42.0 (IQR: 21.0, 74.0) minutes, and the median total duration of labor was 695.0 (IQR: 444.5, 952.8) minutes.
In the development cohort of 1344 patients, LASSO regression was used for variable selection to mitigate overfitting, with the penalty set at λ.1se (the largest λ whose cross-validated error lies within one standard error of the minimum). The variable selection path and cross-validation error profile are shown in Fig. 2 a and b, respectively. The selected variables were then used to develop a multivariable logistic regression model predicting a failed trial of labor. The final model, which comprised 11 predictors (detailed in Table 4), was subjected to a variable importance analysis based on the Wald χ² statistic to quantify each predictor’s relative contribution. Epidural analgesia was the strongest predictor by this measure (Wald χ² = 59.495).

Fig. 2 Feature selection using LASSO regression. a Coefficient profiles of candidate variables. The vertical dashed line marks the optimal penalty (λ) chosen by cross-validation, where the final set of non-zero coefficients (retained features) is determined. b Cross-validation curve for λ selection. This study selected λ.1se (right vertical dashed line), which refined the predictor set from 22 variables under λ.min to 11 predictors. This yielded a more parsimonious model without a significant loss in predictive accuracy

Table 4 Multivariable analysis of trial of labor outcome in the model development cohort

| Variable | Regression coefficient | Odds ratio (OR) | 95% CI lower limit | 95% CI upper limit | Wald χ² statistic | p |
|---|---|---|---|---|---|---|
| Gestational weight gain (kg) | 0.057 | 1.058 | 1.030 | 1.088 | 16.276 | < 0.001 |
| Pre-pregnancy BMI (kg/m²) | 0.082 | 1.086 | 1.043 | 1.132 | 15.561 | < 0.001 |
| Height (cm) | -0.070 | 0.933 | 0.911 | 0.955 | 32.723 | < 0.001 |
| Gestational age (weeks) | 0.350 | 1.419 | 1.234 | 1.636 | 23.735 | < 0.001 |
| Estimated fetal weight | | | | | | |
| ≥ 4.0 kg | 1.274 | 3.577 | 1.500 | 8.430 | 9.108 | 0.003 |
| 2.5–3.9 kg (Ref) | - | 1.000 | - | - | - | - |
| < 2.5 kg | 0.706 | 2.025 | 0.882 | 4.600 | 2.841 | 0.092 |
| Anemia | 0.705 | 2.025 | 1.276 | 3.191 | 9.148 | 0.002 |
| Abnormal amniotic fluid volume | 1.379 | 3.972 | 2.019 | 8.026 | 15.526 | < 0.001 |
| Premature rupture of membranes | 0.577 | 1.780 | 1.316 | 2.413 | 13.920 | < 0.001 |
| Hypertensive disorders of pregnancy | 0.895 | 2.447 | 1.634 | 3.669 | 18.850 | < 0.001 |
| Labor induction | 0.305 | 1.356 | 1.002 | 1.837 | 3.881 | 0.049 |
| Epidural analgesia | -1.108 | 0.330 | 0.249 | 0.437 | 59.495 | < 0.001 |
| Intercept | -5.758 | 0.003 | - | - | - | - |

Abbreviations: OR, odds ratio; CI, confidence interval; BMI, body mass index; EFW, estimated fetal weight; PROM, premature rupture of membranes. The reference category for EFW is 2.5–3.9 kg.

Multivariable analysis revealed that greater gestational weight gain, higher pre-pregnancy BMI, greater gestational age at delivery, estimated fetal weight ≥ 4.0 kg, anemia, abnormal amniotic fluid volume, PROM, hypertensive disorders of pregnancy, and labor induction were independent risk factors for failed TOL (OR > 1), whereas taller maternal height and use of epidural analgesia were protective factors (OR < 1).
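To make the Table 4 model concrete, the following minimal sketch turns the printed regression coefficients into a predicted probability of failed TOL using the standard logistic formula, p = 1 / (1 + exp(-(intercept + Σ βᵢxᵢ))). Several things here are assumptions rather than details confirmed by the paper: binary predictors are taken to be coded 0/1, continuous predictors are entered in the units shown in the table, the coefficients are used as rounded to three decimals, and the dictionary keys and the example patient are hypothetical. The output is an illustration of the arithmetic, not a substitute for the published nomogram.

```python
import math

# Coefficients as printed in Table 4 (rounded to three decimals).
INTERCEPT = -5.758
CONTINUOUS = {
    "gestational_weight_gain_kg": 0.057,
    "pre_pregnancy_bmi": 0.082,
    "height_cm": -0.070,
    "gestational_age_weeks": 0.350,
}
# Assumed 0/1 coding; EFW 2.5-3.9 kg is the reference (both EFW flags 0).
BINARY = {
    "efw_ge_4kg": 1.274,
    "efw_lt_2_5kg": 0.706,
    "anemia": 0.705,
    "abnormal_amniotic_fluid": 1.379,
    "prom": 0.577,
    "hypertensive_disorder": 0.895,
    "labor_induction": 0.305,
    "epidural_analgesia": -1.108,
}


def predicted_failure_probability(patient):
    """Logistic prediction of failed TOL from a dict of predictor values.

    Continuous predictors are required; binary flags default to 0.
    """
    linear = INTERCEPT
    for name, beta in CONTINUOUS.items():
        linear += beta * patient[name]
    for name, beta in BINARY.items():
        linear += beta * patient.get(name, 0)
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical patient built from the development-cohort averages in Table 1.
baseline = {
    "gestational_weight_gain_kg": 13.6,
    "pre_pregnancy_bmi": 21.8,
    "height_cm": 163.4,
    "gestational_age_weeks": 39.6,
}
with_epidural = dict(baseline, epidural_analgesia=1)

print(f"no risk flags:  {predicted_failure_probability(baseline):.3f}")
print(f"with epidural:  {predicted_failure_probability(with_epidural):.3f}")
# Each odds ratio in Table 4 is exp(coefficient), e.g. for epidural analgesia:
print(f"exp(-1.108) = {math.exp(-1.108):.3f}")
```

Because the odds ratios in Table 4 are simply the exponentiated coefficients, the last line offers a quick consistency check against the reported OR of 0.330 for epidural analgesia.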
In the final model, EFW < 2.5 kg was retained as a predictor; however, its association with failed TOL did not reach statistical significance (p = 0.092). Based on these variables, a prediction nomogram was constructed (Fig. 3) to visually demonstrate the contribution weight of each predictor and provide a practical tool for calculating individualized risk probabilities.

Fig. 3 Nomogram for the Prediction Model of Trial of Labor Outcome in Women of Advanced Maternal Age. For each predictor variable, locate the patient's specific value on the corresponding axis. Draw a vertical line upwards to the top "Points" axis to read the individual score assigned for that variable. Sum the individual scores for all variables to obtain the Total Points. Locate the Total Points on the "Total Points" axis. Then, draw a vertical line downwards to the bottom "Pr(failed TOL)" axis to read the final estimated probability. TOL, Trial of Labor

The model demonstrated good discriminative ability in the development cohort (AUC = 0.751). Bootstrap internal validation with 1,000 resamples confirmed model robustness, yielding an optimism-corrected AUC of 0.741 (95% CI: 0.740–0.750), which indicated minimal overfitting.
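For readers less familiar with the AUC reported above, it can be read as the probability that a randomly chosen failed-TOL case receives a higher predicted risk than a randomly chosen successful case. The sketch below computes it by direct pairwise comparison; the function name `auc` and the four-patient toy data are hypothetical and are not taken from the study.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: predicted probabilities of the outcome (failed TOL).
    labels: 1 for the outcome (failed TOL), 0 otherwise.
    Ties between a positive and a negative case count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one case in each outcome group")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data (hypothetical): two successful and two failed trials of labor.
toy_scores = [0.10, 0.40, 0.35, 0.80]
toy_labels = [0, 0, 1, 1]
print(auc(toy_scores, toy_labels))
```

The pairwise form is quadratic in cohort size, which is acceptable for cohorts of roughly a thousand women; rank-based formulas give the same value more efficiently.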
A decision threshold of 0.329 was selected, as it provided a balanced performance with both sensitivity and specificity approximating 70% (69.2% and 70.0%, respectively). Good calibration was observed, as supported by the calibration plot and a non-significant Hosmer-Lemeshow test (χ² = 11.312, p = 0.185). DCA demonstrated that across a wide threshold probability range of 20% to 80%, the use of our model to guide clinical decisions provided a superior net clinical benefit compared to both the “cesarean section for all” and “trial of labor for all” strategies (Fig. 4).

Fig. 4 Internal validation and external validation of the prediction model: ROC Curves, Calibration Curves, and Decision Curves. a Internal Validation ROC Curve. b Internal Validation Calibration Curve. c Internal Validation Decision Curve. d External Validation ROC Curve. e External Validation Calibration Curve. f External Validation Decision Curve

In the independent validation cohort (n = 601), the model demonstrated strong generalizability, achieving an AUC of 0.759 (95% CI: 0.716–0.803) despite differences in clinical characteristics (Table 1). Predictor variables, including epidural analgesia (OR: 0.069; 95% CI: 0.040–0.116; p < 0.001) and labor induction (OR: 2.063; 95% CI: 1.243–3.501; p = 0.006), remained statistically significant with effect directions consistent with the development cohort, demonstrating good robustness. Both the calibration curve and the Hosmer-Lemeshow test (χ² = 12.416, p = 0.134) indicated good calibration.
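The Hosmer-Lemeshow p-values quoted above can be checked from the reported χ² statistics. The sketch below assumes 8 degrees of freedom, which is the conventional choice when predicted risks are split into 10 groups (groups minus 2); the paper does not state the number of groups, so this is an assumption, as is the helper name `chi2_sf_even_df`. For an even number of degrees of freedom the chi-square upper-tail probability has a closed form, so no statistics library is needed.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) of a chi-square variable, even df only.

    Uses the closed form exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


# Statistics as reported in the text; df = 8 is an assumption (10 risk groups).
p_development = chi2_sf_even_df(11.312, 8)
p_validation = chi2_sf_even_df(12.416, 8)
print(f"development: p = {p_development:.3f}")
print(f"validation:  p = {p_validation:.3f}")
```

With 8 degrees of freedom both values round to the reported p = 0.185 and p = 0.134, which is consistent with a 10-group test.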
DCA further confirmed that the model provided a clear net benefit for clinical decision-making across a threshold probability range of 20% to 80% (Fig. 4 ).
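The net benefit underlying a decision curve is the true-positive rate minus the false-positive rate weighted by the odds of the threshold probability: NB = TP/n − (FP/n) × pt / (1 − pt). As a hedged illustration, the sketch below evaluates it at a single point, using the development-cohort failure rate (438 of 1344) and the sensitivity and specificity reported at the 0.329 threshold. Mapping "treat all" to the "cesarean section for all" strategy and "treat none" to "trial of labor for all" is an interpretation, and the function name `net_benefit` is hypothetical; the full curves in Fig. 4 cannot be reconstructed from these summary figures.

```python
def net_benefit(sensitivity, specificity, prevalence, threshold):
    """Net benefit of acting on positive predictions at a threshold probability.

    NB = TP/n - FP/n * threshold / (1 - threshold)
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    odds = threshold / (1.0 - threshold)
    true_positive_rate = sensitivity * prevalence
    false_positive_rate = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive_rate - false_positive_rate * odds


prevalence = 438 / 1344   # failed TOL rate in the development cohort
threshold = 0.329         # decision threshold reported in the text

model = net_benefit(0.692, 0.700, prevalence, threshold)
treat_all = net_benefit(1.0, 0.0, prevalence, threshold)   # cesarean for all
treat_none = net_benefit(0.0, 1.0, prevalence, threshold)  # TOL for all

print(f"model:      {model:.3f}")
print(f"treat all:  {treat_all:.3f}")
print(f"treat none: {treat_none:.3f}")
```

At this operating point the model's net benefit is clearly positive, while treating everyone is approximately zero because the threshold sits close to the cohort's failure rate; this single-point calculation agrees in direction with the decision curves described above.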

Background

With ongoing socioeconomic development and shifting societal attitudes toward childbearing, there has been a progressive delay in the average maternal age at first birth. The proportion of advanced maternal age primiparas has shown a steady upward trend since 2018, reaching 3.04% by 2023 in China [ 1 ]. In clinical practice, women with an expected delivery age ≥ 35 years are typically defined as advanced-age parturients, a standard primarily based on evidence indicating declining fertility and increased risks of genetic abnormalities such as fetal chromosomal anomalies from this age onward [ 2 ]. For advanced maternal age primiparas, the decision regarding the mode of delivery represents a complex and clinically significant dilemma. In practice, some individuals, particularly those who have experienced a difficult conception or who exhibit heightened anxiety regarding the unpredictability of labor, may express a preference for elective cesarean delivery as a perceived strategy to mitigate peripartum risks. However, a primary cesarean section is associated with a range of long-term adverse sequelae. These include increased risks of serious obstetric complications in subsequent pregnancies, such as uterine rupture and placenta accreta spectrum disorders [ 3 ], as well as gynecological morbidities, including post-cesarean uterine niche [ 4 ] and iatrogenic endometriosis [ 5 ]. Conversely, advanced-age primiparous women face a substantially higher likelihood of a failed trial of labor (TOL) [ 6 ], and a greater risk of maternal and neonatal complications than their younger nulliparous counterparts [ 7 ]. Moreover, a failed TOL subjects the mother to considerable physical and psychological stress and places the neonate at an increased risk of adverse perinatal outcomes compared to a successful vaginal delivery.
This dilemma underscores that accurately predicting TOL outcome is a critical clinical necessity, essential for mitigating the double burden faced by advanced-age primiparas. An effective tool would allow clinicians to distinguish between women with a high versus low probability of success, thereby enabling personalized perinatal management. For those predicted to have a low risk of failure, vaginal delivery can be encouraged, improving immediate birth outcomes and reducing primary cesarean sections—a key step toward mitigating serious long-term sequelae such as placenta accreta spectrum in future pregnancies [ 8 ]. Conversely, for women identified as high-risk, the option of a planned cesarean delivery can be discussed proactively, optimizing healthcare resource allocation and potentially improving perinatal outcomes. However, there is currently a notable absence of validated, quantitative risk assessment tools designed specifically to predict TOL failure among term, singleton, cephalic-presenting advanced-age primiparas. To address this gap, this retrospective cohort study focused specifically on this population. We systematically analyzed the outcomes of their TOL and identified key influencing factors to develop and validate a clinically applicable prediction model. The goal was to enable the early identification of high-risk individuals, thereby guiding evidence-based, personalized counselling and decision-making on delivery mode.

Discussion

In total, 11 clinically available variables were included in the final model; 10 were statistically significant predictors, and the lowest EFW category (< 2.5 kg) was retained in the final model for clinical completeness and to avoid extreme extrapolation, maintaining symmetry with the macrosomia (≥ 4.0 kg) variable, despite not reaching statistical significance in this cohort. Our model identified key factors linked to TOL failure in advanced-age primiparas, with findings that both align and contrast with the literature. Consistent with prior evidence [ 12 – 14 ], pre-pregnancy BMI, gestational weight gain, and labor induction were risk factors, whereas epidural analgesia was protective—a finding supported by a U.S. prospective study attributing the induction risk to intrapartum factors [ 15 ]. However, our results diverge from a meta-analysis in older women that found no cesarean risk with elective induction [ 16 ]; this may reflect our strictly medical-indication-based induction cohort per WHO guidelines [ 17 ]. Similarly, a Japanese study in women ≥ 40 years reported no effect of epidural analgesia [ 18 ], a discrepancy potentially due to population or protocol differences. Although the strong protective association observed in our study is mechanistically plausible (e.g., via reduction of pain and anxiety [ 19 ] ), it must be emphasized that this observational finding cannot establish causality and is likely influenced by unmeasured confounding factors such as labor management style, patient preferences, or institutional protocols. Therefore, in our model, epidural analgesia serves primarily as a robust predictive marker rather than a directly modifiable biological factor. Its strong association nonetheless underscores it as a clinically relevant factor worthy of further prospective study aimed at improving TOL success rates. 
Beyond the factors discussed above, our model also identified fetal macrosomia (EFW ≥ 4.0 kg), abnormal amniotic fluid volume, and hypertensive disorders as independent risk factors, while greater maternal height was protective. These findings reinforce established physiological principles: fetal macrosomia increases the risk of cephalopelvic disproportion, while maternal height is inversely correlated with cesarean risk [ 20 , 21 ]. The strong association of abnormal amniotic fluid volume with TOL failure is particularly noteworthy. This is corroborated by a meta-analysis linking idiopathic polyhydramnios to a significantly elevated risk of cesarean section (OR = 2.31) and fetal macrosomia (OR = 2.93) [ 22 ], while oligohydramnios has also been independently associated with higher cesarean rates [ 23 ]. Oligohydramnios may also pose a challenge for labor induction, potentially influencing the delivery pathway [ 24 ]. Furthermore, the presence of modifiable maternal conditions such as hypertensive disorders and anemia [ 25 ] underscores a critical opportunity for proactive clinical management. Our results suggest that optimizing the control of these conditions antepartum may represent a tangible pathway to improving the chance of TOL success, although this observational study cannot establish that such optimization changes the outcome. When compared to existing prediction models, our findings offer distinct insights. For instance, unlike the model by Dorwal et al. [ 26 ], we established hypertensive disorders as a key predictor. This highlights the unique risk profile of the advanced-age primiparous population and the specificity of our model. Similarly, a Dutch retrospective study focusing on women ≥ 40 years developed a prediction model for the primiparous subgroup (AUC = 0.74, optimism-corrected to 0.68) which included variables like maternal age, gestational age, BMI, and spontaneous labor onset [ 14 ]; by comparison, our model reached an AUC of 0.751 in the development cohort (optimism-corrected 0.741) and 0.759 in external validation.
Taken together, these findings suggest that a comprehensive management strategy for advanced-age primiparas with multiple high-risk factors (e.g., hypertensive disorders of pregnancy, PROM, abnormal amniotic fluid volume), potentially including intensified intrapartum monitoring and strict control of gestational weight gain, merits evaluation for its effect on the success rate of vaginal delivery. Furthermore, the failed TOL group had greater postpartum blood loss. This suggests that early recognition and more proactive prophylactic use of uterotonics during labor might be investigated to mitigate postpartum hemorrhage risk. A principal strength of this study is the development and rigorous external validation of a clinical prediction model within an independent cohort exhibiting distinct clinical practices. The observed differences in variables such as epidural analgesia and labor induction rates between cohorts directly reflect the heterogeneity in obstetric analgesia philosophies, labor management strategies, and patient characteristics across different medical institutions, thereby enhancing the real-world validity of our findings. The slightly higher TOL success rate in the external cohort, despite differences in epidural analgesia and induction rates, likely reflects institutional differences in labor management philosophies and patient selection. These differences underscore both the need for local calibration and the potential generalizability of the model across heterogeneous settings. Crucially, the model sustained robust performance and demonstrated a positive net benefit across a wide range of probability thresholds, underscoring its strong potential for clinical translation and decision-making.
The decision curve analysis showed net benefit across a threshold range of 20% to 80%. This wide range indicates that the model adds clinical value across most realistic decision-making scenarios, from relatively low to high pre-test probabilities of TOL failure. We propose that predicted probabilities below approximately 20% might reasonably support proceeding with a TOL, whereas probabilities above 40–50% may prompt more detailed counselling about elective cesarean section. These thresholds are illustrative and should be adapted to individual patient preferences and institutional practice. Importantly, the model provides a static risk assessment applicable at the time of hospital admission or early labor planning. Its predictions should therefore serve as an initial guide to augment, not replace, the clinician’s ongoing dynamic assessment of intrapartum progress, fetal well-being, and maternal condition. Integrating this baseline prediction with real-time evaluation is essential for optimizing intrapartum care.
This study has several limitations. First, its retrospective design may leave residual confounding from unmeasured factors such as socioeconomic status, and it relies on the accuracy of clinical documentation, which may introduce misclassification bias. Furthermore, the exclusion of women who actively withdrew from the trial of labor may introduce selection bias, as reasons for withdrawal (e.g., intolerable pain, anxiety, or unrecorded clinical factors) could themselves be associated with a higher underlying risk of labor failure. Relatedly, the use of broadly defined composite variables (e.g., hypertensive disorders of pregnancy, labor induction, gestational hyperglycemia) without subclassification by clinical subtype or management strategy may obscure heterogeneity within these categories.
Specifically regarding epidural analgesia, its retrospective assessment presents a particular challenge: the “No” analgesia group may include parturients who did not receive it due to rapid labor progression, a scenario clinically distinct from elective or contraindication-based non-receipt. Consequently, it is crucial to emphasize that the strong protective association observed for epidural analgesia, derived from observational data, should not be interpreted as evidence of causality. This relationship is likely subject to residual confounding by factors such as labor progression patterns and indication bias, and it highlights a predictive association worthy of future prospective study rather than a modifiable causal effect. Second, the model was developed and validated using cohorts from hospitals within a single healthcare system. Therefore, its generalizability to settings with substantially different obstetric practices, induction policies, epidural analgesia availability, or cesarean delivery thresholds may be limited and requires further external validation. Third, this model is designed to inform the initial decision to initiate a trial of labor, incorporating factors known at that point (e.g., planned induction, PROM). It does not incorporate dynamic intrapartum factors that evolve after labor is established, which constitutes a limitation for managing the ongoing course of labor. Finally, certain measurements (e.g., amniotic fluid index, EFW) were obtained by different clinicians using standard clinical protocols, introducing potential measurement variability. Additionally, pre-pregnancy weight was typically self-reported.
Future well-designed multicenter prospective cohort studies are warranted to further validate and refine our prediction model. Subsequent studies could also investigate the integration of novel intrapartum ultrasound indices (e.g., angle of progression, rotation angle) to enhance predictive accuracy during labor [ 27 ].
Moreover, implementation research is needed to develop and evaluate clinical decision support systems based on this model, with the goal of optimizing delivery mode selection and improving perinatal outcomes in advanced-age primiparas.
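
As an illustration of the decision curve analysis referred to in this Discussion, the following minimal Python sketch computes net benefit at a given threshold probability using the standard formula, net benefit = TP/N − (FP/N) × pt/(1 − pt), where pt is the threshold probability. The function names and the ten-patient toy data are hypothetical and are not drawn from the study cohorts, and this is not the authors' analysis code. "Flagged" here is assumed to mean classified as high risk for failed TOL.

```python
from typing import Sequence


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float], threshold: float) -> float:
    """Net benefit of flagging everyone whose predicted risk is >= threshold.

    y_true: 1 = failed TOL, 0 = successful TOL (hypothetical coding).
    y_prob: predicted probability of failed TOL from any risk model.
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    # True positives: flagged women who did have a failed TOL.
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    # False positives: flagged women whose TOL succeeded.
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true: Sequence[int], threshold: float) -> float:
    """Net benefit of the reference strategy that flags every woman."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical toy data: 3 failed TOLs among 10 women.
Y_TRUE = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
Y_PROB = [0.9, 0.7, 0.3, 0.6, 0.2, 0.1, 0.1, 0.05, 0.3, 0.15]

if __name__ == "__main__":
    for pt in (0.2, 0.5):
        model_nb = net_benefit(Y_TRUE, Y_PROB, pt)
        all_nb = net_benefit_treat_all(Y_TRUE, pt)
        print(f"threshold={pt:.2f}  model={model_nb:.3f}  treat-all={all_nb:.3f}")
```

With this toy data the model's net benefit exceeds that of the flag-everyone strategy at both thresholds (0.225 versus 0.125 at 20%, and 0.100 versus −0.400 at 50%). A model is said to add value at a threshold when its net benefit is higher than both the flag-everyone and flag-no-one (net benefit of zero) strategies.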

Conclusions

This study successfully developed and validated a prediction model for failed trial of labor specifically designed for full-term, singleton, cephalic-presenting advanced-age primiparas. The model, which incorporates 11 clinically accessible predictor variables (10 of which were statistically significant), demonstrates robust discriminative performance and good calibration. It effectively identifies high-risk individuals, serving as an individualized and quantifiable tool to assist in prenatal counseling and decision-making regarding delivery mode. Clinical efforts may focus on optimizing gestational weight gain, correcting maternal anemia and promoting rational use of epidural analgesia, although causal inferences cannot be drawn from this observational study. Future research should focus on prospective, multi-center external validation to further assess generalizability and promote the model’s integration into routine clinical practice. The nomogram provides a static risk assessment intended for use at admission or during early labor planning. Its predictions should serve as an initial guide, integrated with the clinician’s continuous dynamic assessment of intrapartum factors such as labor progression, fetal heart rate patterns, and maternal condition.

Supplementary Material

Supplementary Material 1

License: CC-BY-NC-ND-4.0