Methods
In this study, a retrospective analysis was performed on real-world data prospectively collected by the German IVF-Registry (D·I·R; Deutsches IVF-Register) from women who underwent controlled ovarian stimulation (COS) in Germany between 2017 and 2022.
The D·I·R collects the results of reproductive medical treatments from all regions of Germany and publishes them in annual reports. Currently, 140 fertility centers electronically report the data required for quality assessment of each initiated treatment cycle. Patient data are pseudonymized [ 12 ]. For this study, only data from centers using both hrFSH and rFSH for COS were included (N = 74).
Data were collected from women who were treated with either follitropin delta (hrFSH) or recombinant follitropins (rFSH) for COS during their stimulation cycle of ART (IVF/ICSI). While PR and LBR per embryo transfer (ET) were calculated for women whose number of previous cycles was not specified (i.e., it could be their 1st, 2nd, 3rd, etc.), cumulative PR and LBR were calculated for naive patients who had no previous stimulation (1st oocyte pickup [OPU]). The analysis focused on women aged 24–45 years at the start of their COS cycle.
The outcomes were the number of oocytes retrieved, PR and cumulative PR (per ET; followed up to 12/31/2022 and 12/31/2021, respectively), as well as LBR and cumulative LBR (per 1st OPU; followed up to 12/31/2021 and 12/31/2020, respectively). Pregnancy was defined as a clinically determined intrauterine pregnancy, including those ending in miscarriage. Biochemical pregnancies were not counted as pregnancies, and ectopic pregnancies were excluded.
PR and LBR were calculated according to the number of ET after excluding freeze-all cycles and cycles that ended without ET. Cumulative PR and cumulative LBR were calculated for all fresh and frozen/thawed embryo transfers (FET) after the 1st OPU. All fresh cycles with 1st OPU that ended in freeze-all or without cryopreservation were excluded from this cumulative analysis.
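The difference between the two denominators can be illustrated with a minimal sketch. It assumes a hypothetical list of transfer records to which the exclusions described above have already been applied; the record layout and the function names are illustrative and do not reflect the D·I·R data model.

```python
from collections import defaultdict

# Hypothetical transfer records: (first_opu_id, clinical_pregnancy).
# Each record is one embryo transfer (fresh or frozen/thawed) linked to a first OPU.
transfers = [
    ("A", False),  # fresh transfer, no pregnancy
    ("A", True),   # FET from the same pickup, pregnancy
    ("B", False),
    ("C", True),
    ("D", False),
    ("D", False),
]

def rate_per_transfer(records):
    """Rate per ET: successful transfers divided by all transfers."""
    outcomes = [success for _, success in records]
    return sum(outcomes) / len(outcomes)

def cumulative_rate_per_pickup(records):
    """Cumulative rate: pickups with at least one success divided by all pickups."""
    any_success = defaultdict(bool)
    for opu_id, success in records:
        any_success[opu_id] |= success
    return sum(any_success.values()) / len(any_success)

print(f"per ET: {rate_per_transfer(transfers):.3f}")
print(f"cumulative per first OPU: {cumulative_rate_per_pickup(transfers):.3f}")
```

In this toy data set, two of six transfers but two of four pickups are successful, which is why a cumulative rate is generally higher than the corresponding rate per ET.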
Because the two groups differed greatly in size, propensity score matching (PSM) was used to minimize potential confounding by differences in age, pre-existing conditions, fertility factors, and other relevant variables. PSM increases the similarity between two groups by pairing observations that are as similar as possible across the chosen variables. PSM was preferred over inverse probability of treatment weighting (IPTW) because it pairs patients with similar characteristics from the two groups and thereby yields balanced comparison pairs. IPTW retains all patients but relies on weighting, which can give a misleading impression of accuracy because of the larger sample size. In this data set, IPTW would have left one group disproportionately large, making statistical testing difficult. Because the hrFSH cohort is much smaller than the rFSH group, PSM was the more appropriate method for maintaining balance between the groups. The matching process therefore identified a counterpart in the rFSH group for nearly every observation in the hrFSH cohort, so that the hrFSH group remained almost unchanged [ 13 ].
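The pairing step can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that propensity scores have already been estimated (for example by logistic regression on the matching variables) and uses greedy 1:1 nearest-neighbor matching without replacement within a caliper. The matching algorithm, the caliper value, and all identifiers are assumptions, since the text does not specify them.

```python
def greedy_match(treated, controls, caliper):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    treated, controls: dict mapping an observation id to its propensity score.
    caliper: maximum allowed absolute score difference within a pair (assumed).
    Returns a list of (treated_id, control_id) pairs.
    """
    available = dict(controls)  # controls that have not been used yet
    pairs = []
    for t_id, t_score in sorted(treated.items(), key=lambda item: item[1]):
        if not available:
            break
        # Closest remaining control on the propensity score
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]
        # Otherwise the treated observation stays unmatched and is dropped
    return pairs

# Hypothetical scores: a small treated group and a larger control pool
treated = {"t1": 0.30, "t2": 0.55, "t3": 0.95}
controls = {"c1": 0.10, "c2": 0.31, "c3": 0.50, "c4": 0.58, "c5": 0.60}

print(greedy_match(treated, controls, caliper=0.05))
```

In this toy example, "t3" has no control within the caliper and is dropped, which mirrors how a small number of hrFSH stimulations can remain unmatched.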
The following variables were used for matching:
Age
Year (2017, 2018, 2019, 2020)
Stimulation protocol (agonist short, agonist long, antagonist)
Procedure (IVF, ICS, ICSI combination, IVF/ICSI)
Treatment indication (male, female)
Pre-conditions (obesity, PCO, nicotine, malignancy)
Patient sterility factors (limited oocyte reserve, endometriosis, hyperandrogenemia, age, malignant diseases, genetics, homosexuality, social freezing)
For the cumulative dataset on a per-patient basis, the following variables were used:
Age at fresh cycle
Stimulation protocol (agonist short, agonist long, antagonist)
Procedure fresh cycle (IVF, ICS, ICSI combination, IVF/ICSI)
Treatment indication (male, female)
Only fresh
Pre-conditions (obesity, PCO, nicotine, malignancy)
Patient sterility factors (limited oocyte reserve, endometriosis, hyperandrogenemia, age, malignant diseases, genetics, homosexuality, social freezing)
The quality of the PSM was assessed by checking whether the absolute standardized mean difference (SMD) was less than 0.1 for every variable used in the PSM.
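As an illustration of this balance check, the sketch below computes the SMD of a binary variable using the conventional definition (difference in proportions divided by the pooled standard deviation). That the authors used exactly this formula is an assumption, although it reproduces the values reported for the protocol rows of Table 1; the function names are illustrative.

```python
from math import sqrt

def smd_binary(events_treated, n_treated, events_control, n_control):
    """Standardized mean difference for a binary variable (pooled-SD definition)."""
    p1 = events_treated / n_treated
    p0 = events_control / n_control
    pooled_sd = sqrt((p1 * (1 - p1) + p0 * (1 - p0)) / 2)
    return (p1 - p0) / pooled_sd

def is_balanced(smd, threshold=0.1):
    """Balance criterion: absolute SMD below the threshold."""
    return abs(smd) < threshold

# Counts taken from Table 1 (hrFSH vs. rFSH)
short_before = smd_binary(120, 4131, 1229, 109805)  # GnRH-short agonist, before matching
long_before = smd_binary(164, 4131, 9075, 109805)   # GnRH-long agonist, before matching
short_after = smd_binary(119, 4121, 108, 4121)      # GnRH-short agonist, after matching

for label, value in [("short, before", short_before),
                     ("long, before", long_before),
                     ("short, after", short_after)]:
    print(f"{label}: SMD = {value:.3f}, balanced = {is_balanced(value)}")
```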
Oocyte counts were compared using a two-tailed, two-sample t-test, while Fisher's exact test was used for PR and LBR, including the cumulative outcomes. The t-test is the standard method for comparing the means of metric variables and, in large samples, is robust against deviations from normality. Although oocyte counts are discrete, the large sample size justifies the use of the t-test. For binary variables, Fisher's exact test is widely used and offers a more conservative alternative to the chi-squared test. Statistical analysis was performed with R software (version 4.3.0).
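For readers unfamiliar with Fisher's exact test, the following sketch shows the two-sided test for a 2×2 table, computed directly from the hypergeometric distribution. The analysis itself was run in R; this Python version is only an illustration, and the counts in the usage lines are hypothetical because the text reports rates rather than event counts.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the probabilities of all tables with the same margins that are
    no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x events in the first row
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    p_value = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-7)  # tolerance for floating-point ties
    )
    return min(1.0, p_value)

# Hypothetical counts: live birth yes/no in two groups of 100 patients each
p = fisher_exact_two_sided(57, 43, 51, 49)
print(f"p = {p:.3f}")
```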
Results
A total of 113,936 stimulations were included in the study, of which 4,131 were carried out with hrFSH and 109,805 with rFSH. After 1:1 matching, 4,121 stimulations were included in each treatment group. Baseline characteristics of all included women (before and after matching) are shown in Table 1.
Table 1 Baseline characteristics before and after matching
Characteristic | Before matching: hrFSH | Before matching: rFSH | Before matching: SMD | After matching: hrFSH | After matching: rFSH | After matching: SMD
Stimulations, n | 4,131 | 109,805 | | 4,121 | 4,121 |
Age, mean ± SD | 33.9 ± 4.0 | 34.0 ± 4.3 | −0.033 | 33.9 ± 4.0 | 33.9 ± 4.0 | 0.004
BMI, mean ± SD | 24.8 ± 5.7 | 24.7 ± 5.4 | N/A | 24.8 ± 5.7 | 24.5 ± 5.3 | N/A
Protocol, n (%)
GnRH-short agonist | 120 (2.9%) | 1,229 (1.1%) | 0.127 | 119 (2.9%) | 108 (2.6%) | 0.016
GnRH-long agonist | 164 (4.0%) | 9,075 (8.3%) | −0.180 | 164 (4.0%) | 163 (4.0%) | 0.001
GnRH-antagonist | 3,394 (82.2%) | 92,220 (84.0%) | −0.049 | 3,390 (82.3%) | 3,405 (82.6%) | −0.010
Without agonist/antagonist | 453 (11.0%) | 7,539 (6.9%) | N/A | 447 (10.8%) | 429 (10.4%) | N/A
Day of transfer, n (%)
2/3 | 1,005 (35.6%) | 33,331 (41.2%) | N/A | 1,005 (35.6%) | 1,130 (37.2%) | N/A
5 | 1,580 (56.0%) | 40,625 (50.3%) | N/A | 1,580 (56.0%) | 1,608 (53.0%) | N/A
Embryo transfer, n (%)
Single embryo transfer | 1,385 (49.1%) | 31,435 (38.9%) | N/A | 1,384 (49.1%) | 1,496 (49.3%) | N/A
Double embryo transfer | 1,410 (50.0%) | 48,161 (59.6%) | N/A | 1,409 (50.0%) | 1,518 (50.0%) | N/A
Implantation rate, mean ± SD | 0.2 ± 0.4 | 0.2 ± 0.4 | N/A | 0.2 ± 0.4 | 0.2 ± 0.4 | N/A
Cycle for FET, n (%) a
Natural cycle | 423 (67.5%) | 6,460 (61.8%) | N/A | 419 (67.3%) | 256 (65.3%) | N/A
Modified natural cycle | 50 (8.0%) | 704 (6.7%) | N/A | 50 (8.0%) | 27 (6.9%) | N/A
Programmed | 154 (24.5%) | 3,284 (31.4%) | N/A | 154 (24.7%) | 109 (27.8%) | N/A
Therapy indication, n (%)
Male | 2,784 (67.4%) | 71,555 (65.2%) | 0.047 | 2,780 (67.5%) | 2,810 (68.2%) | −0.016
Female | 1,791 (43.4%) | 48,041 (43.8%) | −0.009 | 1,785 (43.3%) | 1,755 (42.6%) | 0.015
Both | 918 (22.2%) | 23,611 (21.5%) | N/A | 916 (22.2%) | 908 (22.0%) | N/A
Idiopathic | 395 (9.6%) | 9,297 (8.5%) | N/A | 393 (9.5%) | 351 (8.5%) | N/A
Other | 79 (1.9%) | 4,523 (4.1%) | N/A | 79 (1.9%) | 113 (2.7%) | N/A
Preconditions, n (%)
Obesity | 267 (6.5%) | 9,363 (8.5%) | −0.078 | 267 (6.5%) | 259 (6.3%) | 0.008
Thrombotic embolism | 68 (1.6%) | 646 (0.6%) | N/A | 68 (1.7%) | 30 (0.7%) | N/A
Mental illness | 33 (0.8%) | 1,049 (1.0%) | N/A | 33 (0.8%) | 35 (0.8%) | N/A
Thyroid disease | 396 (9.6%) | 11,982 (10.9%) | N/A | 395 (9.6%) | 482 (11.7%) | N/A
Disease of inner genitals | 22 (0.5%) | 1,702 (1.6%) | N/A | 22 (0.5%) | 62 (1.5%) | N/A
Polycystic ovary syndrome | 156 (3.8%) | 4,006 (3.6%) | 0.007 | 156 (3.8%) | 155 (3.8%) | 0.001
Nicotine consumption | 292 (7.1%) | 9,401 (8.6%) | −0.056 | 292 (7.1%) | 250 (6.1%) | 0.041
Malignancy | 15 (0.4%) | 889 (0.8%) | −0.059 | 15 (0.4%) | 13 (0.3%) | 0.008
Diabetes | 45 (1.1%) | 828 (0.8%) | N/A | 45 (1.1%) | 33 (0.8%) | N/A
Hypertonicity | 43 (1.0%) | 1,282 (1.2%) | N/A | 43 (1.0%) | 37 (0.9%) | N/A
Allergy | 28 (0.7%) | 668 (0.6%) | N/A | 28 (0.7%) | 22 (0.5%) | N/A
Hyperandrogenaemia | 37 (0.9%) | 659 (0.6%) | N/A | 37 (0.9%) | 37 (0.9%) | N/A
Other | 823 (19.9%) | 21,030 (19.2%) | N/A | 822 (19.9%) | 783 (19.0%) | N/A
Not known | 40 (1.0%) | 1,587 (1.4%) | N/A | 40 (1.0%) | 48 (1.2%) | N/A
Sterility factor, n (%)
Limited oocyte reserve | 31 (0.8%) | 346 (0.3%) | 0.060 | 31 (0.8%) | 26 (0.6%) | 0.015
Endometriosis | 365 (8.8%) | 10,896 (9.9%) | −0.037 | 365 (8.9%) | 353 (8.6%) | 0.010
Hyperandrogenaemia | 407 (9.9%) | 7,095 (6.5%) | 0.124 | 406 (9.9%) | 417 (10.1%) | −0.009
Cycle pathology | 221 (5.3%) | 10,077 (9.2%) | N/A | 221 (5.4%) | 370 (9.0%) | N/A
Tube pathology | 456 (11.0%) | 12,391 (11.3%) | N/A | 456 (11.1%) | 418 (10.1%) | N/A
Uterine sterility | 131 (3.2%) | 3,276 (3.0%) | N/A | 130 (3.2%) | 119 (2.9%) | N/A
Age | 136 (3.3%) | 4,796 (4.4%) | −0.056 | 136 (3.3%) | 127 (3.1%) | 0.012
Malignant diseases | 4 (0.1%) | 537 (0.5%) | −0.073 | 4 (0.1%) | 1 (0.0%) | 0.030
Genetic factors | 61 (1.5%) | 281 (0.3%) | 0.132 | 54 (1.3%) | 30 (0.7%) | 0.058
Homosexuality | 23 (0.6%) | 699 (0.6%) | −0.010 | 23 (0.6%) | 16 (0.4%) | 0.025
Psychosocial factors | 9 (0.2%) | 187 (0.2%) | N/A | 9 (0.2%) | 3 (0.1%) | N/A
Social freezing | 23 (0.6%) | 1,421 (1.3%) | −0.077 | 23 (0.6%) | 18 (0.4%) | 0.017
Others | 821 (19.9%) | 31,815 (29.0%) | N/A | 820 (19.9%) | 980 (23.8%) | N/A
Not known | 79 (1.9%) | 4,639 (4.2%) | N/A | 79 (1.9%) | 342 (8.3%) | N/A
a Cumulative FET 2017–2022; BMI body mass index, FET frozen embryo transfer, FSH follicle-stimulating hormone, GnRH gonadotropin-releasing hormone, IU international unit, N/A not applicable (variable not included in the calculation of the propensity score), SD standard deviation, SMD standardized mean difference (values within −0.1 and 0.1 are considered well balanced; if an SMD is provided, the variable was included in the calculation of the propensity score)
Overall, the two groups were comparable in the mean number of oocytes retrieved (hrFSH: 11.0 ± 7.2 vs. rFSH: 10.4 ± 7.1; Table 2).
Table 2 Outcomes before and after matching
Outcome | Before matching: hrFSH (N = 4,131) | Before matching: rFSH (N = 109,805) | Before matching: p-value | After matching: hrFSH (N = 4,121) | After matching: rFSH (N = 4,121) | After matching: p-value
Mean total FSH dose ± SD | 121.3 ± 113.5 µg | 1,931.9 ± 749.6 IU | N/A | 121.3 ± 113.6 µg | 1,909.2 ± 760.8 IU | N/A
Duration of stimulation (days) ± SD | 10.4 ± 2.1 | 9.8 ± 2.3 | N/A | 10.4 ± 2.1 | 9.8 ± 2.2 | N/A
Daily FSH dose ± SD | 10.6 ± 2.4 µg | 197.6 ± 63.5 IU | N/A | 10.6 ± 2.4 µg | 194.7 ± 64.5 IU | N/A
Number of oocytes, mean ± SD | 11.0 ± 7.2 | 10.4 ± 7.1 | NS | 11.0 ± 7.2 | 10.8 ± 7.3 | NS
Pregnancy rate, % | 38.0 | 38.1 | NS | 38.0 | 38.1 | NS
Live birth rate, % | 29.4 | 28.2 | NS | 29.4 | 30.5 | NS
Cumulative pregnancy rate, % | 68.0 | 64.9 | 0.0447 | 68.3 | 64.9 | NS
Cumulative live birth rate, % | 57.3 | 51.9 | 0.0093 | 57.4 | 50.7 | 0.017
Miscarriage rate, % | 21.6 | 22.3 | NS | 21.7 | 18.5 | NS
N/A not available, NS not significant, SD standard deviation
After excluding freeze-all cycles and cycles that ended without ET, there was no statistically significant difference in PR per ET between women who received hrFSH and those who received rFSH (38.0% vs. 36.8%; Table 2). Similarly, there was no significant difference in LBR per ET between the groups (29.4% vs. 28.2%; Table 2). However, cumulative PR and LBR differed significantly between the groups. The cumulative PR after the first puncture (including cryopreservation cycles generated from this cycle) was significantly higher with hrFSH than with rFSH (68.0% vs. 64.9%; p < 0.05; Fig. 1a). The cumulative LBR after the first puncture was also significantly higher with hrFSH than with rFSH (57.3% vs. 51.9%; p < 0.01; Fig. 1b).
Fig. 1 Results before propensity score matching: a) cumulative PR (n = number of punctures), b) cumulative LBR (n = number of punctures). Cumulative PR and cumulative LBR were calculated for all fresh and frozen/thawed embryo transfers after the first oocyte pickup. All fresh cycles with first oocyte pickup that ended in freeze-all or without cryopreservation were excluded from this cumulative analysis. Cumulative values were analyzed for patients with at least 1 pregnancy/live birth. a) 43.1% in the hrFSH group and 35.6% in the rFSH group had more than 2 cycles; b) 39.5% in the hrFSH group and 35.5% in the rFSH group had more than 2 cycles
Due to the non-interventional nature of the study, it cannot be ruled out that the observed differences between the two groups were influenced or caused by confounding factors. To reduce the potential imbalance in baseline characteristics between the groups, PSM was used. After 1:1 matching, 4,121 stimulations were included in each treatment group. Before matching, few differences were observed between the two treatment groups. After PSM, the imbalance between the two groups was markedly reduced, and the absolute SMDs of all matched variables were less than 0.1 (Fig. 2 and Table 1).
Fig. 2 Standardized mean difference (SMD) of variables before (blue line) and after propensity score matching (pink line)
After matching, the mean number of retrieved oocytes remained comparable between the two groups (hrFSH: 11.0 ± 7.2 vs. rFSH: 10.8 ± 7.3; Table 2).
As before matching, there was no statistically significant difference in PR per ET (hrFSH: 38.0% vs. rFSH: 38.1%; Table 2), nor was there a difference between the groups in LBR per ET (hrFSH: 29.4% vs. rFSH: 30.5%; Table 2).
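The comparison of oocyte counts after matching can be retraced from the summary statistics alone. The sketch below assumes that the group sizes given in the header of Table 2 (N = 4,121 each) apply to the oocyte counts and approximates the two-sided p-value with the normal distribution, which is reasonable only because the samples are large; with equal group sizes the pooled and unpooled t statistics coincide. Function and variable names are illustrative.

```python
from math import sqrt, erfc

def t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic computed from means, standard deviations and group sizes."""
    standard_error = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error

def two_sided_p_normal(t):
    """Two-sided p-value using the normal approximation (large samples only)."""
    return erfc(abs(t) / sqrt(2))

# Oocytes retrieved after matching: hrFSH 11.0 ± 7.2 vs. rFSH 10.8 ± 7.3
t_stat = t_from_summary(11.0, 7.2, 4121, 10.8, 7.3, 4121)
p_value = two_sided_p_normal(t_stat)
print(f"t = {t_stat:.2f}, p = {p_value:.2f}")
```

Under these assumptions the p-value lies well above 0.05, in line with the "NS" entry in Table 2.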
After matching, the cumulative PR was numerically higher with hrFSH than with rFSH (68.3% vs. 64.9%; Fig. 3a), but this difference no longer reached statistical significance. However, the cumulative LBR remained significantly higher with hrFSH than with rFSH (57.4% vs. 50.7%; p < 0.05; Fig. 3b).
Fig. 3 Results after propensity score matching: a) cumulative PR (n = number of punctures), b) cumulative LBR (n = number of punctures). Cumulative PR and cumulative LBR were calculated for all fresh and frozen/thawed embryo transfers after the first oocyte pickup. All fresh cycles with first oocyte pickup that ended in freeze-all or without cryopreservation were excluded from this cumulative analysis. Cumulative values were analyzed for patients with at least 1 pregnancy/live birth. a) 43.1% in the hrFSH group and 40.3% in the rFSH group had more than 2 cycles; b) 39.2% in the hrFSH group and 35.6% in the rFSH group had more than 2 cycles
Background
Controlled ovarian stimulation (COS) is a medical procedure designed to induce the growth of multiple ovarian follicles. It has become one of the cornerstones of assisted reproductive technologies (ART) procedures, such as in vitro fertilization (IVF) and intracytoplasmic sperm injection (ICSI). The basis of COS is an increased exposure to follicle stimulating hormone (FSH), which is mainly achieved by the administration of exogenous FSH [ 1 ]. Various FSH preparations are available, including highly purified urinary and recombinant gonadotropins.
Recombinant gonadotropins (rFSH), namely follitropin alfa and follitropin beta, have been the mainstay of COS. However, continuous efforts are being made to improve the efficacy and safety of COS protocols, and novel follitropins have been introduced in recent years. While follitropin alfa and beta are derived from Chinese hamster ovary cell lines, follitropin delta is the newest recombinant FSH available and the first to be produced in a human cell line (hrFSH; human recombinant FSH). This cell line (PER.C6) is fully characterized and is a widely used industry standard [ 2 ]. Owing to its human origin, follitropin delta closely resembles native human FSH, unlike the older preparations produced in Chinese hamster ovary cells [ 3 ]. In particular, its glycosylation pattern (consisting of α2,3- and α2,6-linked sialic acids) is more similar to that of native human FSH, making hrFSH 60% more potent than rFSH in terms of follicle recruitment, possibly due to reduced clearance [ 3 , 4 ]. Prior to COS, predictors of ovarian response should be evaluated to optimize treatment protocols. The ESHRE guidelines recommend the use of either antral follicle count (AFC) or anti-Müllerian hormone (AMH) to predict high and poor response [ 1 ]. Follitropin delta is the first recombinant FSH with a personalized dosing algorithm based on AMH and body weight to target an optimal ovarian response (8–14 oocytes) [ 5 , 6 ]. The dosing algorithm was developed specifically for follitropin delta, taking into account its unique pharmacokinetic/pharmacodynamic profile [ 3 ]. It is designed to sustain ongoing pregnancy rates while minimizing the risks associated with extreme hypo- and hyper-ovarian responses, particularly ovarian hyperstimulation syndrome (OHSS), compared to existing therapeutic dosing strategies.
The efficacy and safety of follitropin delta have been demonstrated in numerous randomized clinical trials (RCTs) in various patient populations. The phase 3 ESTHER-1 trial demonstrated that follitropin delta is a highly effective and well tolerated treatment for COS. While pregnancy rates (PR) and live birth rates (LBR) were similar for follitropin delta and conventional follitropin alfa, individualized dosing with follitropin delta was associated with a reduced risk of OHSS and a reduced need for gonadotropins to prevent OHSS [ 5 ]. In the phase 3 GRAPE trial, conducted in Asian women, treatment with follitropin delta was found to be non-inferior to follitropin alfa with respect to ongoing PR. Moreover, follitropin delta demonstrated a significantly higher LBR and a lower incidence of early OHSS [ 7 ]. In the phase 3 STORK trial in Japanese women undergoing IVF/ICSI, follitropin delta demonstrated non-inferiority to follitropin beta with respect to the number of oocytes retrieved. In addition, follitropin delta exhibited a favorable benefit-risk profile with a reduced risk of OHSS without compromising PR or LBR [ 8 ].
Despite this evidence, it can be challenging to apply findings from RCTs to real-world clinical practice because RCTs often have strict inclusion and exclusion criteria that may result in a study population that is not fully representative of the broader patient population encountered in real-world clinical practice. Accordingly, non-interventional studies can complement RCTs by providing evidence from routine clinical practice; a strategy that has also been implemented for follitropin delta. A prospective, multinational, multicenter, observational single-arm study in which all treatment protocols reflected routine clinical practice confirmed the favorable PR for follitropin delta [ 9 ], and this was also seen in a prospective real-world study from France [ 10 ]. In addition, a retrospective analysis of data from 360 women who underwent ovarian stimulation with follitropin delta across eight reproductive medicine centers in Germany was performed. The analysis showed that 42.1% of patients achieved the target number of oocytes (8–14 oocytes) using the follitropin delta dosing algorithm [ 11 ], similar to the results of the ESTHER-1 study in which 43.3% of patients achieved the target ovarian response using the follitropin delta algorithm (compared to 38.4% with conventional follitropin alfa) [ 5 ]. This success in the real-world study was observed despite variation in pretreatment AMH levels. In addition, these patients experienced very good clinical PR (49.4% cumulative PR for the first stimulation cycle). Thus, algorithm-based ovarian stimulation with follitropin delta is successful in real-world clinical practice [ 11 ].
It is important to note, however, that these real-world studies focused only on follitropin delta data and did not directly compare reproductive outcomes with other FSH preparations. Comparing different FSH preparations, such as follitropin delta and follitropin alfa/beta, is essential to understanding their relative efficacy and safety in clinical practice, so that healthcare providers can make more informed decisions and tailor treatment plans to individual needs. This study seeks to address the existing knowledge gap by retrospectively analyzing data from a large national registry to compare reproductive outcomes following COS with follitropin delta versus follitropin alfa/beta. Data for this analysis were obtained from the German IVF-Registry (D·I·R; Deutsches IVF-Register), which prospectively receives data from nearly all reproductive medicine centers in Germany. This approach helps mitigate the problem of selection bias among centers, which can be a limitation in other real-world studies.
With this in mind, the aim of the present study was to compare the effectiveness of follitropin delta versus recombinant follitropins in women aged 24–45 years at the start of their COS cycle, in terms of number of oocytes retrieved, PR and cumulative PR, as well as LBR and cumulative LBR, using data from the D·I·R. While PR and LBR were calculated for women whose number of previous cycles was not specified (i.e., their 1st, 2nd, 3rd, etc. cycle), cumulative PR and LBR were calculated for naive patients who had no previous stimulation.
Conclusion
In conclusion, this study indicates potential benefits of using hrFSH over rFSH in terms of cumulative LBR in real-world settings. These findings provide valuable evidence for clinical decision-making in assisted reproductive technology. However, limitations such as missing data on ovarian reserve biomarkers or data on embryo quality highlight the need for further investigations.
Discussion
This is the first study to directly compare the effectiveness of hrFSH and rFSH for COS in a real-world setting. The results show that women may benefit from treatment with hrFSH versus rFSH in terms of cumulative LBR. The findings of this study have the potential to impact clinical practice in ART settings by providing valuable evidence regarding the choice of FSH formulation for COS.
PR and LBR were comparable between the two groups, consistent with the results of the ESTHER-1 and STORK trials [ 5 , 8 ]. However, cumulative PR and LBR were significantly higher for hrFSH than for rFSH, with relative increases of 5% and 10%, respectively. To ensure comparability between the two treatment groups, PSM was used to control for confounding variables, increasing the reliability of the results. After matching, cumulative PR remained higher with hrFSH but lost statistical significance, probably due to the smaller sample size. In contrast, cumulative LBR remained significantly higher with hrFSH than with rFSH. As this is a retrospective analysis of real-world data, the reason for this increased cumulative LBR can only be speculated upon. One potential factor is the difference in gonadotropin doses between the groups. Previous studies have shown a decrease in LBR with higher doses of FSH [ 14 , 15 ]. In the current study, the mean daily dose of hrFSH was 10.6 µg (equivalent to approximately 159 IU follitropin alfa [ 16 ]), which is lower than the mean daily dose of rFSH (> 190 IU). This difference may explain the higher cumulative LBR observed with hrFSH. However, it is important to note that the dose equivalence factor has only been established for follitropin delta and follitropin alfa, not for follitropin beta. The rFSH group in this analysis includes women treated with either follitropin alfa or beta, which could influence the outcomes. In addition, the data set lacks information on embryo quality. It would be interesting to see whether the use of hrFSH results in better blastocysts than rFSH; this should be investigated in future studies.
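The arithmetic behind the figures in this paragraph can be retraced with a short sketch. It uses the unmatched cumulative rates reported in the Results and a conversion factor of 15 IU follitropin alfa per µg follitropin delta; this factor is not stated explicitly but is implied by the equivalence quoted above (10.6 µg ≈ 159 IU) and should be read as an assumption. The function names are illustrative.

```python
def relative_increase(rate_a, rate_b):
    """Relative increase of rate_a over rate_b, as a fraction of rate_b."""
    return (rate_a - rate_b) / rate_b

# Cumulative rates before matching (hrFSH vs. rFSH), in percent
cumulative_pr_increase = relative_increase(68.0, 64.9)
cumulative_lbr_increase = relative_increase(57.3, 51.9)
print(f"cumulative PR: +{cumulative_pr_increase:.1%}")
print(f"cumulative LBR: +{cumulative_lbr_increase:.1%}")

# Assumed equivalence factor, derived from 159 IU / 10.6 µg
IU_ALFA_PER_UG_DELTA = 15.0

def delta_ug_to_alfa_iu(dose_ug):
    """Approximate follitropin alfa equivalent (IU) of a follitropin delta dose (µg)."""
    return dose_ug * IU_ALFA_PER_UG_DELTA

print(f"10.6 µg is about {delta_ug_to_alfa_iu(10.6):.0f} IU follitropin alfa")
```

The relative increases round to the 5% and 10% quoted in the text. The conversion applies to follitropin alfa only, since no equivalence factor has been established for follitropin beta.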
While differences in gonadotropin doses may help explain the higher cumulative LBR, this is likely also linked to AMH levels. Data from an individual participant meta-analysis including data from the three phase-3 trials (ESTHER-1, STORK, and GRAPE) suggest that higher LBR using hrFSH may be associated with AMH levels. Women with high AMH levels (≥ 15 pmol/L) may benefit from using hrFSH, as higher LBR were observed compared with rFSH. For women with low AMH levels (< 15 pmol/L), no difference between the groups was observed [ 17 ]. As the D·I·R database does not include data on ovarian reserve biomarkers such as AMH, FSH or AFC, any disparity in ovarian reserve between the two groups, which could potentially account for the observations, could not be evaluated and adjusted for in this data analysis.
Notably, hrFSH is the first and only FSH used for COS that uses an individualized daily dose based on the woman’s body weight and AMH levels. In this study, the mean daily dose of hrFSH (10.6 µg) exceeded that observed in randomized clinical trials, where the mean daily dose ranged from 8.5 to 10.1 µg [ 5 , 7 , 8 ]. This discrepancy could be due to differences in AMH levels or body weight, as the mean BMI in this data set was 24.2 kg/m², higher than in the clinical trials. In comparison, the PROFILE study, which had a similar BMI (24.2 kg/m²), reported a mean starting daily dose of 10.4 µg, close to that of this study. Notably, that study showed that in the real world, nearly all patients (95%) had their starting dose calculated using the approved algorithm and most women (87%) received hrFSH within 0.33 µg of the algorithm-recommended dose [ 9 ]. Due to the registry nature of the present study, it cannot be definitively determined whether the variance in daily dose between this analysis and other studies is due to the dosing regimen of hrFSH, as the D·I·R does not collect data on whether the algorithm was used as approved. The current study shows that, in a real-world setting, PR and LBR with hrFSH were similar to, or higher than, the rates observed in RCTs, regardless of the dosing regimen, i.e., conventional or algorithm-based.
One of the strengths of the current study is the use of prospectively collected data from the D·I·R which reflects the daily practice of German health care professionals. Unlike large clinical trials, which often have strict inclusion and exclusion criteria, registry data include patient cohorts that might otherwise be overlooked in such trials. The D·I·R database encompasses large nationwide datasets, minimizing the risk of selection bias. To further reduce bias, only data from centers that used both hrFSH and rFSH for COS were included, as it could otherwise lead to distortion of data. For example, centers using only one type of FSH might have high reproductive outcomes, potentially wrongly attributing them solely to the FSH type. However, it is crucial to recognize that several factors influence reproductive outcomes. By including only data from centers using both types of FSH, the reliability and completeness of the results are improved, allowing for a more balanced assessment of treatment effectiveness.
While using the comprehensive D·I·R database is a notable strength of this study, it also comes with limitations. One limitation relates to the tracking of patient data. Although the database records the number of stimulations performed, it does not track the number of individual women who receive these stimulations. This is because the D·I·R database does not use a unique patient identification system and does not track individual patients, which could lead to inaccuracies in data interpretation and statistical analysis. Despite the adjustments made for numerous variables, the presence of residual confounders between the groups cannot be ruled out, and it is conceivable that the PSM did not fully account for all confounders, in particular unmeasured ones.