Methods
This multicenter retrospective cohort study was conducted in the reproductive centers affiliated with three tertiary hospitals in southern China: the First Affiliated Hospital and the Third Affiliated Hospital of Guangxi Medical University (Nanning), and Yulin Maternal and Child Health Hospital (Yulin). This study included women who had completed at least two ovarian stimulation cycles between January 2016 and December 2023.
The diagnosis of DOR in this study required meeting at least two of the following three criteria [ 8 , 32 – 34 ]: (1) AMH < 1.1 ng/mL, (2) AFC < 7, and (3) basal FSH ≥ 10 mIU/mL. A total of 2,883 patients diagnosed with DOR were included in this study. Figure 1 illustrates the patient inclusion and exclusion flowchart.
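The two-of-three diagnostic rule can be written as a small check. This is a minimal Python sketch for illustration, not the authors' code: the function name `meets_dor_criteria` is hypothetical, and treating an unmeasured marker (`None`) as "criterion not met" is an assumption that the text does not specify.

```python
def meets_dor_criteria(amh_ng_ml, afc, basal_fsh_miu_ml):
    """Return True when at least two of the three DOR criteria are met.

    ASSUMPTION: a marker passed as None (not measured) counts as not met.
    """
    criteria = [
        amh_ng_ml is not None and amh_ng_ml < 1.1,               # AMH < 1.1 ng/mL
        afc is not None and afc < 7,                             # AFC < 7
        basal_fsh_miu_ml is not None and basal_fsh_miu_ml >= 10,  # basal FSH >= 10 mIU/mL
    ]
    return sum(criteria) >= 2


# Hypothetical patients, for illustration only.
low_reserve = meets_dor_criteria(amh_ng_ml=0.65, afc=4, basal_fsh_miu_ml=9.4)
normal_reserve = meets_dor_criteria(amh_ng_ml=2.0, afc=10, basal_fsh_miu_ml=12.0)
amh_missing = meets_dor_criteria(amh_ng_ml=None, afc=5, basal_fsh_miu_ml=11.0)
```

In this sketch a patient with missing AMH can still qualify through AFC and FSH alone, which may or may not match how the cohort was actually assembled.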
Fig. 1 Flowchart of study cohort selection. This figure outlines the process of applying inclusion and exclusion criteria to identify the final analysis cohort from the initial pool of IVF cycles collected from three hospitals
Inclusion criteria:
Underwent IVF or intracytoplasmic sperm injection (ICSI).
Met the diagnostic criteria for DOR as defined above.
Completed at least two OS and OPU cycles within one year of the initial cycle.
Exclusion criteria:
Natural cycle OPU.
Luteal phase stimulation cycles performed immediately after OPU within the same menstrual cycle (cycles initiated after natural ovulation were included).
Cancelled OPU cycles due to premature ovulation or follicular luteinization.
Couples diagnosed with chromosomal abnormalities.
Endometriosis or adenomyosis.
Fertility preservation because of malignant tumors.
Cycles performed for preimplantation genetic testing (PGT).
The outcomes included initial ovarian response parameters (AFC and number of oocytes retrieved) and laboratory embryological parameters (number of 2PN zygotes, transferable embryos, and high-quality embryos).
All analyses were performed using R software (version 4.5.1). Covariate selection was guided by a directed acyclic graph (DAG; see Supplementary Fig. S1), which was constructed and analyzed using the Dagitty web-based tool (version 3.1, available at https://dagitty.net/ ) to identify the minimal sufficient adjustment set and minimize confounding bias. Prior to modeling, we assessed multicollinearity among the predictor variables, which led to the exclusion of total gonadotropin (Gn) dose due to high variance inflation (Supplementary Table S1). To ensure model stability and statistical power, data were grouped for analysis. Specifically, cycles 4–8 were merged into a single group (cycle ≥ 4) due to limited sample sizes, and ovarian stimulation protocols were categorized as detailed in the Results section due to the infrequent use of some regimens.
Missing values for estradiol (E2), luteinizing hormone (LH), and FSH (31 missing values each) were imputed with their respective medians due to skewed distributions (Supplementary Fig. S2). For AMH, which had a substantial proportion of missing data (1,225/6,651 cycles), multiple imputation was employed under the missing-at-random assumption, justified by the temporal pattern of AMH assay introduction across centers (Supplementary Fig. S3). The imputation model incorporated all outcome and predictor variables specified in the final analyses. The stability of the imputation and the plausible distribution of imputed values were confirmed (Supplementary Figs. S4 and S5).
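The single-value median imputation used for E2, LH, and FSH can be sketched as follows. This is an illustrative Python example with hypothetical values. It does not reproduce the multiple imputation applied to AMH, which requires a full imputation model.

```python
import statistics


def impute_median(values):
    """Replace None entries with the median of the observed entries."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


# Hypothetical right-skewed hormone values with one missing entry.
fsh = [4.0, None, 10.0, 6.0, 38.0]
fsh_imputed = impute_median(fsh)

# With a skewed distribution the median is less affected by the extreme
# value than the mean, which is the rationale given in the text.
observed_only = [v for v in fsh if v is not None]
median_value = statistics.median(observed_only)
mean_value = statistics.mean(observed_only)
```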
A hierarchical model selection strategy was adopted for each count outcome. We prioritized comprehensive Generalized Linear Mixed Models (GLMMs) with random effects for center and patient to account for data clustering and repeated measures. Models that failed to converge or exhibited poor fit were systematically simplified—first by reducing the random effects structure, and if necessary, by reverting to a standard Generalized Linear Model (GLM). Final model selection was guided by the Akaike Information Criterion (AIC), the Bayesian Information Criterion (BIC), and diagnostic procedures assessing distributional fit (Kolmogorov-Smirnov test) and overdispersion. Descriptive statistics, a complete comparison of candidate models, and specifications of the final models are detailed in Supplementary Tables S2 and S3.
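To make the selection criteria concrete, the sketch below computes AIC, BIC, and a Pearson dispersion statistic for two simple Poisson models fitted to hypothetical counts. It is a toy example with fixed effects only. The GLMMs in this study include random effects and were fitted in R, and all data and names here are assumptions for illustration.

```python
import math


def poisson_loglik(counts, means):
    """Poisson log-likelihood of counts given their fitted means."""
    return sum(y * math.log(m) - m - math.lgamma(y + 1)
               for y, m in zip(counts, means))


def aic(loglik, k):
    """Akaike Information Criterion for a model with k parameters."""
    return 2 * k - 2 * loglik


def bic(loglik, k, n):
    """Bayesian Information Criterion for k parameters and n observations."""
    return k * math.log(n) - 2 * loglik


def pearson_dispersion(counts, means, k):
    """Pearson chi-square divided by residual df; values well above 1 suggest overdispersion."""
    chi2 = sum((y - m) ** 2 / m for y, m in zip(counts, means))
    return chi2 / (len(counts) - k)


# Hypothetical oocyte counts in two cycles (not study data).
cycle1 = [1, 2, 2, 3, 1, 2]
cycle2 = [3, 4, 2, 5, 3, 4]
counts = cycle1 + cycle2
n = len(counts)

# Model A: one common mean (k = 1). Model B: one mean per cycle (k = 2).
common_mean = sum(counts) / n
means_a = [common_mean] * n
means_b = ([sum(cycle1) / len(cycle1)] * len(cycle1)
           + [sum(cycle2) / len(cycle2)] * len(cycle2))

aic_a = aic(poisson_loglik(counts, means_a), k=1)
aic_b = aic(poisson_loglik(counts, means_b), k=2)
bic_a = bic(poisson_loglik(counts, means_a), k=1, n=n)
bic_b = bic(poisson_loglik(counts, means_b), k=2, n=n)
dispersion_b = pearson_dispersion(counts, means_b, k=2)
```

Lower AIC or BIC indicates the preferred model. BIC penalizes each extra parameter by log(n) rather than 2, so it favors simpler models as the sample grows.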
Subsequently, to formally assess the potential confounding role of regression to the mean (RTM), we conducted a simulation-based causal inference analysis. We specified a null model that contained only patient-level random intercepts and residual variance—explicitly omitting any cycle order effect—to capture the underlying data-generating process. Under this null model, we generated 10,000 simulated datasets. For each, we calculated the apparent treatment effects for cycles 2, 3, and 4 versus cycle 1, thereby constructing empirical distributions of effects expected solely from RTM and natural variation. This procedure was conducted for the overall cohort and repeated within pre-specified age strata. The observed effects from the primary GLMM analysis were then benchmarked against these RTM null distributions to evaluate statistical significance, with extensive sensitivity analyses performed to confirm the robustness of our conclusions.
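The logic of the RTM null simulation can be illustrated with a short Python sketch. It is not the authors' implementation, and several choices are assumptions that the text does not state: outcomes are drawn from a normal distribution on the raw scale, all variance parameters are hypothetical, and RTM is induced by selecting patients whose first-cycle value falls at or below a hypothetical cutoff. Far fewer than 10,000 replicates are used to keep the example fast.

```python
import random
import statistics


def rtm_null_effect(rng, n_patients, grand_mean, sd_between, sd_within, cutoff):
    """Mean (later cycle - first cycle) in one dataset with NO cycle-order effect."""
    diffs = []
    for _ in range(n_patients):
        patient_level = rng.gauss(grand_mean, sd_between)    # patient random intercept
        first = patient_level + rng.gauss(0.0, sd_within)    # cycle 1, residual noise only
        later = patient_level + rng.gauss(0.0, sd_within)    # later cycle, same true level
        if first <= cutoff:  # ASSUMPTION: selection on a poor first-cycle result
            diffs.append(later - first)
    return statistics.mean(diffs)


rng = random.Random(2024)
n_sims = 300  # the study used 10,000 simulated datasets
null_effects = sorted(
    rtm_null_effect(rng, n_patients=400, grand_mean=3.0,
                    sd_between=1.0, sd_within=1.5, cutoff=3.0)
    for _ in range(n_sims)
)

null_mean = statistics.mean(null_effects)
null_low = null_effects[int(0.025 * n_sims)]        # lower bound of 95% percentile interval
null_high = null_effects[int(0.975 * n_sims) - 1]   # upper bound of 95% percentile interval

# Benchmark a hypothetical observed effect against the null distribution.
observed_effect = 0.47
share_at_least_observed = sum(e >= observed_effect for e in null_effects) / n_sims
```

Even though the simulated model contains no treatment effect, the selected patients improve on average in the later cycle, because an unusually low first value is partly noise. Larger within-patient variance relative to between-patient variance produces a larger apparent improvement.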
A substantial proportion of cycles were excluded from the analysis of pregnancy outcomes due to factors precluding fresh embryo transfer. The predominant use of mild stimulation protocols often compromised endometrial receptivity. Furthermore, protocols such as Progestin-Primed Ovarian Stimulation (PPOS) and luteal-phase stimulation inherently prevent fresh transfers. Additionally, many patients opted for multiple consecutive stimulation cycles with embryo cryopreservation to accumulate embryos prior to further ovarian decline, thus deferring transfer. Consequently, only 1,511 cycles proceeded to fresh embryo transfer and were included in this analysis. Outcomes from frozen-thawed embryo transfer cycles were not included. This decision was made because embryos from multiple retrieval cycles were often pooled and cultured together to the blastocyst stage prior to transfer, making it impossible to attribute a subsequent pregnancy to a specific oocyte retrieval cycle.
Results
From an initial pool of 37,129 women who completed 73,538 IVF cycles during the study period, our final analytical cohort comprised 2,883 patients with DOR, who contributed a total of 6,651 treatment cycles.
The mean age at baseline was 37.78 ± 4.55 years, with an infertility duration of 5.00 ± 4.03 years. Basal characteristics included an AFC of 4.18 ± 1.52, FSH of 9.43 ± 4.13 IU/L, and AMH of 0.65 ± 0.34 ng/mL (Table 1 ).
Table 1 Baseline characteristics of patients in the initial IVF cycle (N = 2,883)

| Characteristics | Value |
| --- | --- |
| Age (years) | 37.78 ± 4.55 |
| Infertility duration (years) | 5.00 ± 4.03 |
| Sterility type, n (%) | |
| Primary infertility | 778 (26.99%) |
| Secondary infertility | 2,105 (73.01%) |
| BMI | 22.25 ± 2.92 |
| bE2 (pg/mL) | 36.82 ± 18.26 |
| bLH (IU/L) | 3.61 ± 1.72 |
| bFSH (IU/L) | 9.43 ± 4.13 |
| AMH (ng/mL) (n = 2,311, missing 572) | 0.65 ± 0.34 |
| AFC (n) | 4.18 ± 1.52 |
Patient retention across consecutive cycles is detailed in Supplementary Fig. S6 . The figure illustrates the consolidation of cycle orders, showing a substantial decline in sample size after the second cycle. Given the limited number of patients completing five or more cycles (collectively n = 46), which precluded stable statistical estimates, these higher orders (≥ 5) were merged with cycle 4 to form a composite “cycle ≥ 4” group for all subsequent analyses.
A similar grouping strategy was applied to ovarian stimulation protocols. The initial distribution and the consolidation process into final analytical categories are shown in Supplementary Fig. S7 . Mild stimulation was the most commonly used approach. To enhance statistical power, we consolidated the luteal phase long GnRH agonist, early follicular phase long GnRH agonist, ultra-long, and ultra-short protocols into a single “GnRH agonist protocol” category. Similarly, PPOS and luteal phase stimulation were merged into a “progestin-related stimulation protocols (PRSP)” category. All subsequent analyses were based on this grouped cohort.
To quantify the hypothesized mechanical activation of dormant follicles, we performed a mediation analysis assessing whether the improvement in oocyte yield in subsequent cycles was mediated by an increase in AFC. We estimated the 95% confidence intervals for the indirect effects by applying the bootstrap method with 1000 resamples to the multiply imputed datasets. An indirect effect was considered statistically significant if its 95% bootstrap CI excluded zero.
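A percentile bootstrap for an indirect effect can be sketched as below. This is an illustrative Python example under stated assumptions, not the authors' procedure: it uses a product-of-coefficients estimator with ordinary least squares, synthetic data with a known indirect effect of 0.5, and a single complete dataset rather than multiply imputed datasets. All variable names are hypothetical.

```python
import random
import statistics


def indirect_effect(rows):
    """Product-of-coefficients indirect effect a*b for rows of (x, m, y).

    a: slope of the mediator M on the exposure X.
    b: coefficient of M in the regression of Y on X and M.
    """
    xs, ms, ys = zip(*rows)
    mx, mm, my = statistics.mean(xs), statistics.mean(ms), statistics.mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    smm = sum((m - mm) ** 2 for m in ms)
    sxm = sum((x - mx) * (m - mm) for x, m in zip(xs, ms))
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    smy = sum((m - mm) * (y - my) for m, y in zip(ms, ys))
    a = sxm / sxx
    b = (sxx * smy - sxm * sxy) / (sxx * smm - sxm ** 2)
    return a * b


rng = random.Random(7)

# Synthetic data: x = 0 (first cycle) or 1 (second cycle); true a = 1.0, b = 0.5.
rows = []
for i in range(400):
    x = i % 2
    m = 4.0 + 1.0 * x + rng.gauss(0.0, 1.0)            # mediator (an AFC-like variable)
    y = 2.0 + 0.2 * x + 0.5 * m + rng.gauss(0.0, 1.0)  # outcome (an oocyte-yield-like variable)
    rows.append((x, m, y))

point_estimate = indirect_effect(rows)

n_boot = 500  # the study used 1000 resamples
boot = sorted(indirect_effect(rng.choices(rows, k=len(rows))) for _ in range(n_boot))
ci_low = boot[int(0.025 * n_boot)]
ci_high = boot[int(0.975 * n_boot) - 1]
excludes_zero = ci_low > 0 or ci_high < 0
```

Resampling whole rows ignores the clustering of cycles within patients. With repeated measures, resampling at the patient level would be the more defensible choice.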
The analysis revealed a significant total effect when comparing the second cycle to the first, with an average increase of 0.47 oocytes (95% CI: 0.39 to 0.56, P < 0.001). This total effect was decomposed into a significant direct effect of 0.23 oocytes (95% CI: 0.14 to 0.30, P < 0.001) and a significant indirect effect of 0.24 oocytes (95% CI: 0.21 to 0.28, P < 0.001) mediated through AFC. The mediation proportion indicated that 51.4% of the total benefit in oocyte yield in the second cycle was attributable to the increase in AFC. In the third cycle, the total effect compared to the first cycle was attenuated but remained significant, with an increase of 0.16 oocytes (95% CI: 0.08 to 0.24, P < 0.05). The indirect effect via AFC, though smaller, remained statistically significant (0.07 oocytes, 95% CI: 0.04 to 0.11, P < 0.001), accounting for 45.0% of the total effect. By the fourth cycle, no significant effects were observed. The total, direct, and indirect effects were not statistically significant, with confidence intervals including zero (Table 2 ). The magnitude of the improvement in oocyte yield demonstrated a progressive attenuation with each successive cycle.
Table 2 Mediation analysis of the effect of the first oocyte retrieval on subsequent oocyte yield through AFC

| Cycle comparison | Patients (n) | Total effect Coeff. (95% CI) | Direct effect Coeff. (95% CI) | Indirect effect Coeff. (95% CI) | Mediation proportion (%) |
| --- | --- | --- | --- | --- | --- |
| Cycle 2 vs. 1 | 2,883 | 0.47 (0.39, 0.56)¹ | 0.23 (0.14, 0.30)¹ | 0.24 (0.21, 0.28)¹ | 51.4 |
| Cycle 3 vs. 1 | 677 | 0.16 (0.08, 0.24)¹ | 0.09 (0.02, 0.17)¹ | 0.07 (0.04, 0.11)¹ | 45.0 |
| Cycle 4 vs. 1 | 162 | 0.08 (-0.01, 0.16) | 0.06 (-0.02, 0.14) | 0.02 (-0.01, 0.05) | 23.3 |

AFC, antral follicle count; CI, confidence interval
All effects are presented as mean differences in the number of oocytes retrieved. The total effect represents the overall difference in oocyte yield compared to the first cycle. The direct effect is the residual difference after adjusting for the mediator AFC. The indirect effect is the mediation effect via AFC. ¹ indicates that the 95% confidence interval does not include zero
Table 3 presents the results of the multivariable generalized linear mixed models, identifying factors independently associated with oocyte yield and embryological outcomes across subsequent stimulation cycles, while controlling for inter-cycle interval.
Table 3 Oocyte yield and embryological outcomes in subsequent cycles

| Fixed effects | Oocytes retrieved IRR (95% CI) | P-value | 2PN zygotes IRR (95% CI) | P-value | Transferable embryos IRR (95% CI) | P-value | High-quality embryos IRR (95% CI) | P-value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Age (years) | 0.991 (0.987, 0.995) | < 0.001 | 0.996 (0.991, 1.000) | 0.080 | 0.996 (0.991, 1.001) | 0.118 | 0.994 (0.987, 1.001) | 0.082 |
| Infertility duration (years) | 1.000 (0.996, 1.004) | 0.865 | 0.996 (0.991, 1.001) | 0.126 | 0.992 (0.986, 0.997) | 0.002 | 0.989 (0.982, 0.997) | 0.005 |
| BMI (kg/m²) | 0.994 (0.988, 1.000) | 0.031 | 0.989 (0.982, 0.996) | 0.002 | 0.994 (0.986, 1.001) | 0.100 | 1.000 (0.989, 1.011) | 0.957 |
| AMH (ng/mL) | 1.734 (1.648, 1.824) | < 0.001 | 1.563 (1.469, 1.664) | < 0.001 | 1.510 (1.413, 1.615) | < 0.001 | 1.580 (1.334, 1.870) | < 0.001 |
| FSH (IU/L) | 0.967 (0.963, 0.972) | < 0.001 | 0.969 (0.963, 0.974) | < 0.001 | 0.972 (0.966, 0.978) | < 0.001 | 0.973 (0.965, 0.981) | < 0.001 |
| Gn days | 1.044 (1.039, 1.050) | < 0.001 | 1.041 (1.034, 1.048) | < 0.001 | 1.029 (1.022, 1.037) | < 0.001 | 1.028 (1.018, 1.039) | < 0.001 |
| Cycle order (Ref: 1) | | | | | | | | |
| 2 | 1.194 (1.149, 1.240) | < 0.001 | 1.315 (1.254, 1.379) | < 0.001 | 1.312 (1.243, 1.385) | < 0.001 | 1.365 (1.223, 1.524) | < 0.001 |
| 3 | 1.239 (1.092, 1.406) | < 0.001 | 1.416 (1.217, 1.648) | < 0.001 | 1.343 (1.132, 1.593) | < 0.001 | 1.298 (1.090, 1.546) | 0.003 |
| ≥ 4 | 0.533 (0.169, 1.683) | 0.283 | 0.591 (0.145, 2.404) | 0.463 | 0.632 (0.156, 2.552) | 0.519 | 1.186 (0.919, 1.530) | 0.189 |
| OS protocol (Ref: Mild stimulation) | | | | | | | | |
| GnRH-a | 1.481 (1.383, 1.585) | < 0.001 | 1.512 (1.390, 1.645) | < 0.001 | 1.506 (1.404, 1.616) | < 0.001 | 1.424 (1.167, 1.736) | < 0.001 |
| GnRH-Ant | 1.341 (1.275, 1.410) | < 0.001 | 1.336 (1.257, 1.421) | < 0.001 | 1.254 (1.192, 1.319) | < 0.001 | 1.210 (1.066, 1.372) | 0.003 |
| PRSP | 1.063 (1.004, 1.126) | 0.037 | 1.061 (0.989, 1.138) | 0.099 | 1.061 (0.992, 1.136) | 0.085 | 1.044 (0.929, 1.173) | 0.474 |
| Inter-OPU interval (Ref: 1–3 months) | | | | | | | | |
| 4–6 months | 1.797 (0.558, 5.785) | 0.326 | 1.731 (0.415, 7.211) | 0.451 | 1.486 (0.356, 6.200) | 0.587 | N/A | N/A |
| 7–9 months | 1.905 (0.597, 6.082) | 0.276 | 1.920 (0.466, 7.907) | 0.367 | 1.849 (0.451, 7.579) | 0.393 | N/A | N/A |
| 10–12 months | 2.044 (0.642, 6.507) | 0.226 | 1.892 (0.460, 7.777) | 0.377 | 1.804 (0.441, 7.379) | 0.412 | N/A | N/A |
| Interaction: Interval × Order | | | | | | | | |
| 4–6 months × Order 2 | 0.535 (0.166, 1.725) | 0.295 | 0.546 (0.131, 2.279) | 0.407 | 0.670 (0.160, 2.800) | 0.583 | N/A | N/A |
| 7–9 months × Order 2 | 0.526 (0.164, 1.681) | 0.279 | 0.539 (0.131, 2.226) | 0.393 | 0.585 (0.142, 2.405) | 0.457 | N/A | N/A |
| 10–12 months × Order 2 | 0.470 (0.147, 1.503) | 0.203 | 0.518 (0.125, 2.138) | 0.363 | 0.614 (0.149, 2.525) | 0.499 | N/A | N/A |
| 4–6 months × Order 3 | 0.494 (0.152, 1.606) | 0.241 | 0.491 (0.117, 2.067) | 0.332 | 0.603 (0.142, 2.549) | 0.491 | N/A | N/A |
| 7–9 months × Order 3 | 0.468 (0.145, 1.508) | 0.204 | 0.432 (0.104, 1.798) | 0.248 | 0.507 (0.122, 2.106) | 0.350 | N/A | N/A |
| 10–12 months × Order 3 | 0.433 (0.135, 1.393) | 0.160 | 0.461 (0.111, 1.920) | 0.287 | 0.551 (0.133, 2.290) | 0.412 | N/A | N/A |

| Random effects | Oocytes retrieved Variance | SD | 2PN zygotes Variance | SD | Transferable embryos Variance | SD | High-quality embryos Variance | SD |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Center ID | 0.02 | 0.12 | N/A | N/A | N/A | N/A | N/A | N/A |
| Patient ID | 0.04 | 0.20 | N/A | N/A | N/A | N/A | 0.156 | 0.395 |
| Residual | 2.43 | 1.56 | 1.57 | 1.25 | 1.25 | 1.12 | 0.762 | 0.873 |

IRR, incidence rate ratio; CI, confidence interval; BMI, body mass index; AMH, anti-Müllerian hormone; FSH, follicle-stimulating hormone; OPU, oocyte pick-up; Inter-OPU interval: interval measured in menstrual cycles between the repeated and initial oocyte retrieval; OS: ovarian stimulation; GnRH-Ant: GnRH antagonist protocol; PRSP: progestin-related stimulation protocols, including progestin-primed ovarian stimulation (PPOS) and luteal phase stimulation; GnRH-a, GnRH agonist protocol, including luteal phase long GnRH-a protocol, early follicular phase long GnRH-a protocol, and ultra-long protocol; Gn days: duration of gonadotropin stimulation in days; IVF, in vitro fertilization; ICSI, intracytoplasmic sperm injection
Statistical significance was defined as P < 0.05. Interaction terms for Cycle order ≥ 4 are not shown due to limited data or model specification. Analysis for high-quality embryos included inter-OPU interval as a continuous variable due to limited event counts; interaction terms are not applicable
Cycle order and interval effects. A significant, progressive benefit was observed for the second and third cycles compared to the first cycle. The number of oocytes retrieved was significantly higher in the second (IRR = 1.194, 95% CI 1.149–1.240, P < 0.001) and third cycles (IRR = 1.239, 95% CI 1.092–1.406, P < 0.001). This improvement propagated through the laboratory, with the second and third cycles also yielding significantly higher numbers of 2PN zygotes, transferable embryos, and high-quality embryos (all P ≤ 0.003). None of the interaction terms between inter-OPU interval and cycle order reached statistical significance (all P > 0.15). This indicates that the magnitude of improvement in the second and third cycles remained consistent regardless of whether the subsequent retrieval was performed 1–3 months or up to 12 months after the initial cycle.
Key clinical and laboratory predictors. As expected, anti-Müllerian hormone (AMH) level was the strongest positive predictor for all outcomes (all P < 0.001), while higher basal FSH was consistently associated with poorer outcomes (all P < 0.001). A longer duration of gonadotropin stimulation was associated with higher yields across all outcomes (all P < 0.001). Among stimulation protocols, GnRH agonist and antagonist protocols were associated with significantly higher oocyte and embryo yields compared to the mild stimulation reference.
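For readers less familiar with incidence rate ratios, the sketch below converts an IRR into a percent change in the expected count. In a count model with a log link, the IRR is the exponential of the regression coefficient, so an effect over several units of a covariate is the IRR raised to that power. The helper name `irr_percent_change` is hypothetical; the IRR values are taken from Table 3.

```python
import math


def irr_percent_change(irr, units=1.0):
    """Percent change in the expected count for `units` of covariate change."""
    return (irr ** units - 1.0) * 100.0


# Cycle 2 vs. cycle 1, oocytes retrieved (IRR = 1.194): about a 19.4% higher expected count.
cycle2_change = irr_percent_change(1.194)

# Basal FSH, oocytes retrieved (IRR = 0.967 per IU/L): effect of a 5 IU/L higher FSH.
fsh_5_units_change = irr_percent_change(0.967, units=5)

# The underlying coefficient on the log scale is log(IRR).
cycle2_log_coefficient = math.log(1.194)
```

Because the effects are multiplicative, the absolute gain in oocytes depends on the baseline count: a 19.4% increase on a baseline of three oocytes is well under one additional oocyte.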
Random effects. The models accounted for substantial variability between patients, as indicated by the patient-level random intercept variance.
To isolate the effect of cycle order from potential confounding by changes in stimulation strategy, we analyzed a sub-cohort of patients who underwent identical ovarian stimulation protocols in their first and subsequent cycles (Supplementary Table S4 ). When the protocol was held constant, the second cycle demonstrated significantly improved outcomes across all embryological parameters compared to the first cycle, with increases in oocytes retrieved, 2PN zygotes, transferable embryos, and high-quality embryos (all P < 0.001). The benefit observed in the second cycle, however, showed considerable attenuation in the third cycle, where only oocyte yield remained significantly improved ( P = 0.048), while embryological outcomes were no longer significantly different from the first cycle (all P > 0.05).
Subgroup analysis across different age groups revealed that the benefit of repeated ovarian stimulation was most pronounced and sustained in patients aged 40 years and younger. In these patients, the second cycle was associated with substantial improvements across all outcomes, with significant benefits persisting into the third cycle in the younger subgroups (Supplementary Table S5 ). In contrast, for patients older than 40 years, the benefit was markedly limited, with only a modest improvement in oocyte yield observed in the second cycle ( P = 0.031) and no significant benefits in subsequent cycles or for embryo parameters.
When accounting for regression to the mean (RTM) effects, the apparent cycle improvements were substantially attenuated. As shown in Supplementary Table S6 and Supplementary Figs. S8 and S9, the RTM-expected effects consistently exceeded the observed effects across all age groups and cycle comparisons. The adjusted effects (observed minus RTM-expected) were uniformly negative, ranging from −0.375 to −1.605 oocytes.
The discrepancy between observed and RTM-expected effects was consistent across all age strata (Table 4 ; Fig. 2 ). The adjusted effects (observed minus RTM-expected) were negative in all nine comparisons. The point estimates of these adjusted effects were most negative in older patients (> 40 years, ranging from −1.181 to −0.876 oocytes) and least negative in younger patients (≤ 35 years, ranging from −0.872 to −0.441 oocytes), with the middle age group showing intermediate values. Formal statistical testing confirmed that in no instance did the observed effect significantly differ from its RTM-expected counterpart (all P > 0.49), indicating that the apparent cycle improvements are statistically indistinguishable from what would be expected by chance through RTM alone.
Table 4 Comparison of observed effects vs. RTM-expected effects by age group

| Age group | Cycle comparison | Observed effect (95% CI) | RTM-expected effect (95% CI) | Adjusted effect | P value |
| --- | --- | --- | --- | --- | --- |
| ≤ 35 years | Cycle 2 vs. 1 | 0.813 (0.813, 0.813) | 1.254 (1.052, 1.469) | -0.441 | 0.507 |
| ≤ 35 years | Cycle 3 vs. 1 | 0.394 (0.394, 0.394) | 1.266 (1.065, 1.480) | -0.872 | 0.505 |
| ≤ 35 years | Cycle 4 vs. 1 | 0.627 (0.627, 0.627) | 1.268 (1.064, 1.468) | -0.641 | 0.519 |
| 36–40 years | Cycle 2 vs. 1 | 0.503 (0.503, 0.503) | 1.254 (1.061, 1.464) | -0.751 | 0.507 |
| 36–40 years | Cycle 3 vs. 1 | 0.322 (0.322, 0.322) | 1.246 (1.039, 1.436) | -0.924 | 0.506 |
| 36–40 years | Cycle 4 vs. 1 | 0.697 (0.697, 0.697) | 1.251 (1.047, 1.452) | -0.555 | 0.506 |
| > 40 years | Cycle 2 vs. 1 | 0.250 (0.250, 0.250) | 1.266 (1.057, 1.485) | -1.016 | 0.493 |
| > 40 years | Cycle 3 vs. 1 | 0.381 (0.381, 0.381) | 1.257 (1.054, 1.454) | -0.876 | 0.502 |
| > 40 years | Cycle 4 vs. 1 | 0.084 (0.084, 0.084) | 1.265 (1.061, 1.464) | -1.181 | 0.501 |

RTM, Regression to the Mean; CI, Confidence Interval
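The adjusted effect in Table 4 is simply the observed effect minus the RTM-expected effect. The short Python sketch below recomputes it from the rounded point estimates in the table. The variable names are hypothetical, and small discrepancies in the third decimal are expected because the published values were presumably computed before rounding.

```python
# (age group, cycle comparison, observed, RTM-expected, reported adjusted effect)
TABLE_4 = [
    ("<= 35 years", "2 vs. 1", 0.813, 1.254, -0.441),
    ("<= 35 years", "3 vs. 1", 0.394, 1.266, -0.872),
    ("<= 35 years", "4 vs. 1", 0.627, 1.268, -0.641),
    ("36-40 years", "2 vs. 1", 0.503, 1.254, -0.751),
    ("36-40 years", "3 vs. 1", 0.322, 1.246, -0.924),
    ("36-40 years", "4 vs. 1", 0.697, 1.251, -0.555),
    ("> 40 years", "2 vs. 1", 0.250, 1.266, -1.016),
    ("> 40 years", "3 vs. 1", 0.381, 1.257, -0.876),
    ("> 40 years", "4 vs. 1", 0.084, 1.265, -1.181),
]


def adjusted_effect(observed, rtm_expected):
    """Observed effect minus the effect expected from RTM alone."""
    return observed - rtm_expected


recomputed = [adjusted_effect(obs, rtm) for _, _, obs, rtm, _ in TABLE_4]
max_discrepancy = max(abs(calc - reported)
                      for calc, (_, _, _, _, reported) in zip(recomputed, TABLE_4))
all_negative = all(value < 0 for value in recomputed)
```

A negative adjusted effect means the observed improvement is smaller than the improvement the null simulation predicts from RTM alone.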
Fig. 2 Comparison of Observed Effects vs. RTM Expected Effects Across Age Groups. This dot plot illustrates the comparison of effect sizes across three age groups (≤ 35 years, 36–40 years, > 40 years) and three cycle comparisons (Cycle1-Cycle2, Cycle1-Cycle3, Cycle1-Cycle4). Different colors represent distinct effect types: green dots denote Adjusted Effect, blue dots denote Observed Effect, and red dots with error bars denote RTM Expected Effect (RTM = Regression to the Mean). The x-axis indicates the effect size, and the y-axis indicates the effect type and age group stratification
The sensitivity analysis examined the robustness of our primary findings across a wide range of assumptions regarding variance components (Table 5 ). In five of the six scenarios, the RTM-expected effects exceeded every observed effect (which ranged from 0.084 to 0.813 oocytes, as shown in Table 4 ). Within-patient variance had the most substantial impact on the magnitude of the RTM effect. Under the high within-patient variance scenario, RTM-expected effects reached 1.99 to 2.04 oocytes. Under the low within-patient variance scenario, they were 0.60 to 0.62 oocytes, which still exceeded six of the nine observed effects but fell below the three largest (0.813, 0.697, and 0.627 oocytes). Between-patient variance exhibited a more modest influence. In the remaining five scenarios, the RTM-expected effects were consistently greater than 1.0 oocyte. Overall, this supports the primary conclusion that regression to the mean is a substantial, and likely dominant, contributor to the apparent cycle improvements.
Table 5 Sensitivity analysis of RTM effects under different variance component scenarios

| Age group | Cycle comparison | Base case | Low within variance | High within variance | Low between variance | High between variance | Balanced variance |
| --- | --- | --- | --- | --- | --- | --- | --- |
| ≤ 35 years | Cycle 2 | 1.254 | 0.603 | 2.020 | 1.609 | 1.008 | 1.946 |
| ≤ 35 years | Cycle 3 | 1.266 | 0.619 | 2.025 | 1.606 | 1.028 | 1.950 |
| ≤ 35 years | Cycle 4 | 1.268 | 0.613 | 2.021 | 1.610 | 1.021 | 1.918 |
| 36–40 years | Cycle 2 | 1.254 | 0.603 | 1.991 | 1.614 | 1.016 | 1.940 |
| 36–40 years | Cycle 3 | 1.246 | 0.609 | 2.004 | 1.607 | 1.013 | 1.957 |
| 36–40 years | Cycle 4 | 1.251 | 0.605 | 2.022 | 1.596 | 1.010 | 1.953 |
| > 40 years | Cycle 2 | 1.266 | 0.619 | 2.038 | 1.623 | 1.029 | 1.960 |
| > 40 years | Cycle 3 | 1.257 | 0.615 | 2.007 | 1.614 | 1.016 | 1.960 |
| > 40 years | Cycle 4 | 1.265 | 0.621 | 2.029 | 1.620 | 1.010 | 1.961 |

RTM, Regression to the Mean. Note. See Supplementary Table S6 for detailed variance component parameters
We evaluated the robustness of our RTM estimates across six variance component scenarios (Supplementary Table S7), including variations in both within-patient (cycle-to-cycle) and between-patient variability. These scenarios were designed to test the sensitivity of our findings to different assumptions about the underlying variance structure of ovarian response.
A detailed breakdown of pregnancy outcomes is provided in Supplementary Fig. S10. Given the limited number of pregnancy events, outcomes were consolidated for analysis: live births (including singleton births, twin births, and one heterotopic pregnancy managed successfully via laparoscopic surgery) were classified as Live Birth; early miscarriage, late miscarriage, biochemical pregnancy, and ectopic pregnancy were classified as Pregnancy Loss. Furthermore, as most pregnancies occurred in the first two cycles, all cycles beyond the first were aggregated for comparison (Cycle ≥ 2).
Analysis of these fresh embryo transfer cycles demonstrated superior outcomes in subsequent cycles compared to the initial cycle (Table 6 ). The clinical pregnancy rate was significantly higher in Cycle ≥ 2 (18.74%, 152/811) than in Cycle 1 (12.00%, 84/700) (χ² = 12.45, P < 0.001). This improvement translated into a significantly higher live birth rate (11.34% [92/811] vs. 6.57% [46/700]; χ² = 9.75, P = 0.002). Among cycles achieving a clinical pregnancy, the rate of pregnancy loss did not differ significantly between Cycle 1 (45.24%) and Cycle ≥ 2 (39.47%) (χ² = 0.52, P = 0.470).
Table 6 Pregnancy Outcomes in Cycle 1 vs. Subsequent Cycles (Cycles ≥ 2)

| Outcome | Cycle 1 (n = 700) | Cycle ≥ 2 (n = 811) | χ² (df = 1) | P value |
| --- | --- | --- | --- | --- |
| Pregnancy rate | 12.00% (84/700) | 18.74% (152/811) | 12.453 | < 0.001 |
| Live Birth rate | 6.57% (46/700) | 11.34% (92/811) | 9.745 | 0.002 |
| Pregnancy Loss rate | 45.24% (38/84) | 39.47% (60/152) | 0.522 | 0.470 |

Values are presented as percentage (number of events/total). Pregnancy loss rate is calculated among those who achieved a pregnancy
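The χ² statistics for the pregnancy outcomes can be recomputed from the reported counts. The Python sketch below uses the 2 × 2 chi-square test with Yates' continuity correction. With that correction the counts give approximately 12.453, 9.745, and 0.522, matching the reported values. The text does not state which variant the authors used, so this is an inference from the numbers. The function name is hypothetical.

```python
import math


def chi2_yates(events_1, total_1, events_2, total_2):
    """Chi-square test (df = 1) with Yates' correction for two proportions.

    Returns (chi-square statistic, two-sided P value).
    """
    a, b = events_1, total_1 - events_1
    c, d = events_2, total_2 - events_2
    n = a + b + c + d
    corrected = max(0.0, abs(a * d - b * c) - n / 2)
    chi2 = n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Counts as reported for Cycle 1 vs. Cycle >= 2.
pregnancy_chi2, pregnancy_p = chi2_yates(84, 700, 152, 811)
live_birth_chi2, live_birth_p = chi2_yates(46, 700, 92, 811)
loss_chi2, loss_p = chi2_yates(38, 84, 60, 152)

# Pregnancy loss rate among clinical pregnancies in Cycle >= 2.
loss_rate_later_cycles = 60 / 152 * 100
```

These tests treat each transfer cycle as independent. Patients who contributed more than one fresh transfer would violate that assumption, which the text does not address.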
Discussion
Our study provides a critical methodological reappraisal of the presumed “ovarian awakening” phenomenon [ 31 ] in women with diminished ovarian reserve. The most compelling finding is the consistent superiority of the second and third cycles across all laboratory endpoints, independent of the time elapsed since the first retrieval. The lack of significant interaction between cycle order and inter-OPU interval is particularly informative. It demonstrates that the observed improvement is not stronger when cycles are performed in immediate succession, nor does it attenuate when cycles are spaced up to a year apart. While initial generalized linear mixed models aligned with certain clinical reports suggesting improved outcomes in subsequent cycles [ 35 ], comprehensive regression to the mean (RTM) analyses revealed that these apparent improvements are statistically indistinguishable from effects expected by chance alone. This finding offers a novel perspective on the discordance between clinical observations and animal studies warning of ovarian damage from repeated stimulation [ 23 , 36 – 38 ]. The RTM-expected effects substantially exceeded observed effects across all age strata, indicating that what appeared to be a biological phenomenon may primarily represent a statistical artifact. Consequently, our results compel a re-evaluation of the follicular activation hypothesis and underscore the imperative to control for RTM in the interpretation of sequential treatment cycles.
The most striking finding of our research is not the transient nature of an ovarian activation effect, but rather the profound influence of statistical regression on the interpretation of cycle-to-cycle outcomes. While we initially observed an apparent increase in oocytes retrieved in the second cycle, with over half of this increase statistically mediated through AFC, our comprehensive RTM analysis demonstrates that this entire observed effect falls within the range expected by chance alone. Previous interpretations of similar phenomena in oocyte donors [ 39 , 40 ] as evidence of biological enhancement must now be reconsidered in light of potential RTM confounding. The proposed biological mechanism—that mechanical stimulation from the first OPU activates dormant primordial follicles—while physiologically plausible, requires more rigorous controlled studies to distinguish it from the powerful statistical artifact revealed by our analysis.
While the proposed concept of in vitro activation (IVA) and the role of the Hippo signaling pathway provide a biologically plausible framework for understanding how mechanical manipulation could theoretically activate dormant follicles, our findings necessitate a critical reevaluation of this mechanistic interpretation. The RTM analysis demonstrates that the apparent “profound cycle-order effect” can be fully explained by statistical phenomena, independent of any biological pathway. Therefore, while the hypothesis that OPU-induced disruption of the Hippo pathway could stimulate follicular growth remains mechanistically intriguing [ 41 ], our results indicate that such proposed mechanisms cannot be inferred from uncontrolled cycle-to-cycle comparisons alone. Future investigations into these molecular pathways require study designs that explicitly account for and minimize the substantial confounding influence of regression to the mean.
Our study was specifically designed to rigorously quantify the contribution of regression to the mean (RTM), moving beyond merely considering it as a potential confounder. Contrary to the initial interpretation, our RTM simulations demonstrate that the statistical phenomenon is not just a contributor but rather the predominant explanation for the observed data pattern. The apparent dose-response-like attenuation and the statistical mediation through AFC—previously interpreted as hallmarks of a biological mechanism—are, in fact, fully compatible with a pure RTM model that incorporates no actual treatment effect. The consistent and substantial negative adjusted effects across all age groups and cycles indicate that the observed “improvements” are not merely superimposed upon RTM but are instead entirely subsumed by it. Therefore, the evidence collectively suggests that the perceived “ovarian awakening” effect is more parsimoniously explained by a profound statistical artifact, with no compelling evidence remaining to support the existence of a transient biological phenomenon.
Furthermore, our study defines the temporal boundaries of the “ovarian awakening” effect. The profound benefit observed in the second cycle attenuated by the third and was no longer significant by the fourth, particularly when the stimulation protocol was held constant. This attenuation of the effect does not invalidate the “ovarian awakening” concept but rather defines its scope. It suggests that the ovarian response to repeated stimulation is not limitless. We hypothesize that the initial mechanical and pharmacological stimulus may effectively recruit a cohort of previously dormant or less responsive follicles, leading to the marked improvement in the immediate subsequent cycles. However, the pool of such “awakenable” follicles is likely finite. After this initial reserve is mobilized, the ovary may revert to its baseline functional state, governed by the patient’s inherent ovarian reserve and age.
Our age-stratified RTM analysis reinterprets the apparent age-dependent pattern of benefits [ 4 , 42 , 43 ]. While the point estimates of the observed effects varied by age, the RTM-expected effects were substantial and consistent across all groups. The resulting adjusted effects (observed minus expected) were uniformly negative, and in no age group did the observed effect significantly exceed the RTM-expected range. This indicates that the apparent age gradient is statistically fragile and that all patient groups are susceptible to this profound statistical artifact. Therefore, our data cannot differentiate between a genuine, age-modulated “ovarian awakening” effect and a pure RTM artifact; the current evidence is statistically compatible with both explanations.
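The comparison of an observed effect with its RTM-expected counterpart can be sketched as follows. This is an illustrative helper, not the study's code. It assumes that the RTM-expected effect is the mean of the simulated effects (the median would be an equally plausible choice) and that the simulation interval is the 25th to 75th percentile range, as described for Table S6. The function name and input values are hypothetical.

```python
import statistics


def adjusted_effect(observed, rtm_simulated):
    """Compare an observed effect with effects simulated under pure RTM.

    observed: observed cycle-to-cycle effect (hypothetical input).
    rtm_simulated: effects from repeated RTM-only simulations.
    """
    # Assumption: the RTM-expected effect is the mean of the simulated effects.
    expected = statistics.mean(rtm_simulated)
    # Quartile cut points; the first and third bound the simulation interval.
    q1, _median, q3 = statistics.quantiles(rtm_simulated, n=4)
    return {
        "expected": expected,
        "adjusted": observed - expected,  # observed minus RTM-expected
        "interval": (q1, q3),
        "exceeds_rtm": observed > q3,     # above the RTM-expected range?
    }


if __name__ == "__main__":
    # Toy numbers for illustration only, not study data.
    result = adjusted_effect(observed=2.0, rtm_simulated=[1.0, 2.0, 3.0, 4.0, 5.0])
    print(result)
```

A negative adjusted value together with exceeds_rtm being False corresponds to the pattern described above, in which the observed effect does not rise above what RTM alone would produce.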
The primary strength of this study lies in its rigorous quantification of RTM, moving beyond merely considering it as a potential confounder. Our analyses robustly demonstrate that RTM is a sufficient explanation for the observed cycle-to-cycle improvements. While we cannot definitively rule out a vanishingly small concurrent biological effect, the current evidence presents a scenario of statistical equivalence between a pure RTM explanation and one that combines RTM with a trivial biological effect. The most precise interpretation is therefore not that “ovarian awakening” is definitively absent, but that its purported magnitude is severely overstated in analyses that fail to account for RTM. Conclusively distinguishing between true biological effects and regression to the mean will require future study designs specifically aimed at controlling for this confounder. For instance, patients with DOR and a suboptimal initial response could be randomized to start a second cycle either immediately or after a predefined interval (e.g., 6 months). A significant outcome difference would support a biological effect, whereas similar improvements would suggest regression to the mean as the primary driver. Finally, since most embryos were cryopreserved and often pooled from multiple retrieval cycles for later transfer, a live birth could not be accurately attributed to a specific cycle; we therefore analyzed only the outcomes of fresh embryo transfers.
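The logic of the randomized comparison proposed above (immediate versus delayed second cycle) can be illustrated with a small simulation. It is a sketch under strong assumptions, not a trial design: outcomes are normal, enrollment requires a first-cycle value below a cutoff, and any biological effect is assumed to be transient, so that it acts only in the immediate arm. All names and parameter values are hypothetical.

```python
import random
import statistics


def simulate_trial(n_patients=40000, transient_effect=0.0, true_mean=4.0,
                   between_sd=1.5, within_sd=2.0, cutoff=3.0, seed=1):
    """Mean change from cycle 1 to cycle 2 in each arm of a hypothetical trial."""
    rng = random.Random(seed)
    changes = {"immediate": [], "delayed": []}
    for _ in range(n_patients):
        true_value = rng.gauss(true_mean, between_sd)
        cycle1 = true_value + rng.gauss(0.0, within_sd)
        if cycle1 >= cutoff:
            continue  # only suboptimal first responders are enrolled
        arm = rng.choice(("immediate", "delayed"))  # randomization
        # Assumption: a transient effect has vanished by the delayed cycle.
        boost = transient_effect if arm == "immediate" else 0.0
        cycle2 = true_value + boost + rng.gauss(0.0, within_sd)
        changes[arm].append(cycle2 - cycle1)
    result = {arm: statistics.mean(values) for arm, values in changes.items()}
    # RTM inflates both arms equally, so it cancels in the between-arm difference.
    result["difference"] = result["immediate"] - result["delayed"]
    return result


if __name__ == "__main__":
    print("no biological effect:", simulate_trial(transient_effect=0.0))
    print("transient effect of 1:", simulate_trial(transient_effect=1.0))
```

In this sketch both arms improve because of RTM, whereas the between-arm difference stays near zero without a biological effect and approaches the assumed effect size when one is present.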
Introduction
DOR is characterized by a reduced quantity and/or quality of oocytes [ 1 , 2 ]. Consequently, the limited oocyte yield often necessitates repeated IVF cycles to obtain a sufficient number of oocytes [ 3 , 4 ], raising concerns about potential adverse effects of consecutive OS and OPU on ovarian reserve, response, and embryo quality.
Unlike male mammals, female mammals are born with a fixed number of primary oocytes [ 5 , 6 ]. As women approach menopause or experience POI or ovarian failure, the number of antral follicles declines. Nevertheless, a limited number of inactive primordial follicles remain in the ovaries. These primordial follicles, existing in a quiescent state, serve as a latent reserve within the human ovary. Ovulation-inducing hormones are effective only for stimulating the growth of existing follicles, which are gonadotropin-dependent; they are ineffective for activating dormant primordial follicles [ 7 ].
Primordial follicle growth activation (PFGA) is the process through which these dormant primordial follicles are activated, transitioning into an active growth phase and ultimately maturing into fully developed follicles [ 8 ]. In the mammalian ovary, the PTEN/PI3K/AKT signaling pathway is essential for the activation and growth of primordial follicles [ 9 – 13 ]. The Hippo signaling pathway plays a critical role in regulating and maintaining proper organ size, and specifically prevents the activation of the primordial follicles within the ovary [ 14 – 16 ].
In vitro activation (IVA) of primordial follicles is a novel assisted reproductive technology that utilizes the underlying mechanism of primordial follicle activation, offering new fertility prospects for patients with premature ovarian failure (POF). IVA was first applied in patients with POI. Ovariectomy was performed during laparoscopic surgery, and the ovarian cortex was sectioned into strips for vitrification. After thawing, the ovarian strips were cultured with AKT activators for two days and then autotransplanted under laparoscopic guidance [ 15 , 17 ]. The combination of ovarian segmentation and AKT stimulation in IVA has promoted follicular growth, leading to successful live births in patients with POI [ 15 , 18 – 25 ]. Moreover, IVA can support infertile women of middle age by allowing them to utilize their own oocytes [ 26 , 27 ].
However, this technique requires two laparoscopic operations. In addition, the chemical agents may have negative consequences for oocyte quality [ 9 , 24 ]. The drug-free IVA technique, which disrupts the Hippo pathway through ovarian biopsy/scratch, fragmentation, and autotransplantation, was confirmed to be effective in follicle activation; it also eliminates the need for a drug incubation period and requires only one surgery [ 24 , 28 , 29 ].
In patients with DOR, repeated oocyte retrieval cycles are often necessitated by suboptimal ovarian responsiveness. We hypothesize that the mechanical stimulation from ovarian cortex puncture during the initial oocyte retrieval procedure may mimic the physical principle underlying drug-free IVA, potentially disrupting the Hippo signaling pathway to activate dormant follicles. This concept is supported by a previous study demonstrating that needle puncturing of human ovarian tissue can alter angiogenic gene expression and improve follicle morphology, suggesting its potential to initiate follicular activation [ 30 ].
Intriguingly, this iatrogenic stimulus may tap into an evolutionarily conserved biological process. A review of model organisms has described the fundamental phenomenon of “oocyte awakening,” whereby dormant primordial follicles are activated in response to specific nutrient-sensing signals (e.g., mTOR, insulin) or noradrenergic cues [ 31 ]. Therefore, we propose a novel clinical hypothesis: the initial OPU procedure serves as a therapeutic trigger for a state of “ovarian awakening” in women with DOR, thereby enhancing follicular recruitment and improving oocyte yield and embryological outcomes in subsequent cycles. To date, however, a comprehensive comparative analysis of treatment outcomes between initial and subsequent IVF cycles in this patient population remains conspicuously absent from the literature, a gap this study aims to fill.
Supplementary Material
Supplementary Material 1. Fig. S1. Directed Acyclic Graphs (DAGs) for AFC and IVF outcomes. Panels (a)–(f) depict causal frameworks for (a) antral follicle count, (b) oocytes retrieved, (c) normally fertilized oocytes, (d) transferable embryos, (e) high-quality embryos, and (f) pregnancy outcome, illustrating the relationships among various variables in the reproductive process.
Supplementary Material 2. Fig. S2. Distribution and Q-Q plots of sex hormone levels. Panels (a), (b), and (c) display the distribution (left subplots) and quantile-quantile (Q-Q) plots (right subplots) for (a) follicle-stimulating hormone (FSH), (b) luteinizing hormone (LH), and (c) estradiol (E2) levels, respectively.
Supplementary Material 3. Fig. S3. Annual missing rate of AMH. The bar plot illustrates the annual count of missing Anti-Müllerian Hormone (AMH) measurements in the dataset.
Supplementary Material 4. Fig. S4. Density plot of AMH after multiple imputation. Density plot illustrating the distribution of the AMH variable after multiple imputation. Different curves represent the density distribution of AMH values across multiple imputed datasets.
Supplementary Material 5. Fig. S5. Mean-iteration plot of AMH during multiple imputation. The left panel displays the mean values of AMH across iterations, and the right panel shows the standard deviation (SD) of AMH values across iterations.
Supplementary Material 6. Fig. S6. Distribution of cycle orders before and after merging (n = 6,651). The figure compares the distribution of cycle orders in the study cohort. The left panel shows the original cycle orders, while the right panel shows the consolidated categories after merging. The merging process grouped orders ≥ 5 into a single category (order ≥ 4).
Supplementary Material 7. Fig. S7. Consolidation of ovarian stimulation protocol categories. Distribution of ovarian stimulation protocols in the study cohort before (left) and after (right) category merging. For analysis, original protocol categories were consolidated based on pharmacological mechanism: multiple GnRH agonist-based protocols (luteal phase long, early follicular phase long, ultra-long, and ultra-short) were merged into a single “GnRH agonist protocol” category, while progestin-primed protocols (PPOS and luteal phase stimulation) were combined into a “Progestin-related stimulation protocols (PRSP)” category. OS: ovarian stimulation.
Supplementary Material 8. Fig. S8. RTM effect boxplot. Box Plot Elements: The box represents the interquartile range (IQR, 25th to 75th percentiles), the horizontal line within the box denotes the median, and the whiskers extend to 1.5×IQR. Individual dots represent outliers. X-axis: Cycle Comparisons (Cycle1-Cycle2, Cycle1-Cycle3, Cycle1-Cycle4). Y-axis: Effect Size (Purely Due to Regression to the Mean, RTM). Cycle Comparison Groups: Blue boxes for Cycle1-Cycle2, red boxes for Cycle1-Cycle3, and green boxes for Cycle1-Cycle4. Red Dashed Line: Denotes an effect size of 0.
Supplementary Material 9. Fig. S9. Observed vs. RTM comparison. RTM-Only Effect: Gray dots, representing the effect size purely due to regression to the mean. Total Effect: Blue dots, representing the overall observed effect size. Direct Effect: Red dots, representing the direct component of the effect. Indirect Effect: Green dots, representing the indirect component of the effect. X-axis: Effect Size. Y-axis: Effect Type, stratified by three cycle comparisons (Cycle1-Cycle2, Cycle1-Cycle3, Cycle1-Cycle4).
Supplementary Material 10. Fig. S10. Distribution of pregnancy outcomes following fresh embryo transfer. The bar chart illustrates the frequencies and corresponding percentages of different pregnancy outcomes observed in the fresh embryo transfer cycles. Outcome categories include live birth (singleton and twin), early and late miscarriage, biochemical pregnancy, ectopic pregnancy, and heterotopic pregnancy. The percentage values are calculated based on the total number of transfer cycles analyzed.
Supplementary Material 11. Table S1 Assessment of multicollinearity for predictors in IVF outcome models. IVF: in vitro fertilization; GVIF: generalized variance inflation factor; BMI: body mass index; AMH: anti-Müllerian hormone; AFC: antral follicle count; FSH: follicle-stimulating hormone; OS: ovarian stimulation; Gn days: duration of gonadotropin stimulation in days; Inter-OPU interval: interval measured in menstrual cycles between the repeated and initial oocyte retrieval. *The table presents the Adjusted GVIF (GVIF^(1/(2×Df))). For variables present in multiple models, the range of values across models is shown. A common threshold of Adjusted GVIF > 2 was used to indicate potential collinearity.
Supplementary Material 12. Table S2 Descriptive Statistics and Final Model Selection for Count Outcomes. B GLMM: Binomial generalized linear mixed model; NB GLM: Negative binomial generalized linear model; NB GLMM: Negative binomial generalized linear mixed model; P GLM: Poisson generalized linear model; P GLMM: Poisson generalized linear mixed model; RE: Random effects (2RE: center + patient; 1RE: patient only).
Supplementary Material 13. Table S3 Specifications of the final models for outcome variables.
Supplementary Material 14. Table S4 IVF outcomes in subsequent cycles versus the first cycle under identical OS protocols. IVF: in vitro fertilization; OS: ovarian stimulation; IRR: incidence rate ratio; CI: confidence interval. Note. The exponentiated coefficients (95% confidence intervals) for cycle order are derived from fully adjusted multivariable mixed-effects models. Due to space limitations, only the results for cycle order are presented; results for the full set of covariates are available upon request.
Supplementary Material 15. Table S5 Stratified analysis of IVF outcomes for subsequent cycles compared to cycle 1, by age group. IVF: in vitro fertilization; IRR: incidence rate ratio; CI: confidence interval. Note. The exponentiated coefficients (95% confidence intervals) for cycle order are derived from fully adjusted multivariable mixed-effects models. Due to space limitations, only the results for cycle order are presented; results for the full set of covariates are available upon request.
Supplementary Material 16. Table S6 Comparison of observed effects vs. RTM-expected effects in overall population. RTM: Regression to the mean. Adjusted effect = Observed effect - RTM expected effect. Simulation interval represents the 25th to 75th percentiles of 1000 RTM simulations.
Supplementary Material 17. Table S7 Variance component parameters for sensitivity analysis scenarios. SD: Standard Deviation. The base case parameters were derived from empirical estimates of the study population. Alternative scenarios were selected to represent biologically plausible ranges of variance components.