Section 4
This was a cohort study that included 124 females with infertility attributed to female factors and who were undergoing ICSI. Females with tubal obstruction were excluded from the study. Females were recruited from the Centre for Medically Assisted Procreation, Centro Materno-Infantil do Norte Dr. Albino Aroso (CMIN), Unidade Local de Saúde de Santo António (ULSSA) between January 2020 and February 2022.
This study was conducted in accordance with the Declaration of Helsinki and was approved by the Ethics Committee of the ULSSA (process number 2020.119 (097-DEFI/099-CE)). All participants provided written informed consent.
Data on age and cause of infertility were obtained for all participants. Data on markers of ovarian reserve, including follicle-stimulating hormone (FSH) at day 3, anti-Müllerian hormone (AMH), and antral follicle count (AFC), were obtained. Data regarding IVF outcomes, including response to ovarian stimulation (total gonadotrophin dose, stimulation duration, follicle count on trigger day, and the number of retrieved, immature and aberrant oocytes), oocyte maturation (number of injected metaphase II [MII] oocytes), and fertilization success (number of two pronuclei [2PN] oocytes) were also obtained for all participants. All data were obtained from clinical records.
The number of CGG repeats was determined using a fluorescent polymerase chain reaction (PCR). The forward primer P1-5′-TTCGGTTTCACTTCCGGTG-3′ and the reverse primer P2- FAM labeled—5′-CCATCTTCTCTTCAGCCCTGC-3′ were used. All the PCR components used are detailed in Supplementary Table S5 . The amplification protocol involved an initial incubation at 98 °C for 5 min, followed by 42 cycles of denaturation at 95 °C for 1 min and 10 s, annealing at 57 °C for 45 s, and extension at 68 °C for 1 min and 10 s, with a final extension of 10 min at 68 °C. PCR products were analyzed by capillary electrophoresis using ABI PRISM ® 3130xl Genetic Analyzer (Applied Biosystems™, Foster City, CA, USA) with 500 ROX™ size standard (GeneScan™, Warrington, UK) and were further analyzed using GeneMapper ® software version 4.0 (Applied Biosystems™) ( Supplementary Figure S1A ). A previously sequenced sample containing 30 CGGs was used as a control.
The categorization of FMR1 alleles followed the European Molecular Genetics Quality Network (EMQN) guidelines: normal (CGG repeats < 45), intermediate or “grey-zone” (45 ≤ CGG repeats ≤ 54), PM (55 ≤ CGG repeats < 200), and full mutation (CGG repeats ≥ 200) [ 9 ].
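The size categories above can be sketched as a small lookup function. This is only an illustration of the thresholds quoted in the text; the function name `classify_fmr1_allele` and the category labels are hypothetical and are not part of the study's tooling.

```python
def classify_fmr1_allele(cgg_repeats: int) -> str:
    """Return the size category for a total CGG repeat count.

    Thresholds are the ones quoted in the text (EMQN guidelines):
    normal < 45, intermediate 45-54, premutation 55-199, full mutation >= 200.
    """
    if cgg_repeats < 0:
        raise ValueError("CGG repeat count cannot be negative")
    if cgg_repeats < 45:
        return "normal"
    if cgg_repeats <= 54:
        return "intermediate"
    if cgg_repeats < 200:
        return "premutation"
    return "full mutation"
```

For example, `classify_fmr1_allele(30)` falls in the normal range, while a 56-repeat allele is classified as a premutation.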
AGG interruptions were determined by Triplet-Primed PCR (TP-PCR) using the forward primer P1 5′-GACGGAGGCGCCGCTGCCAGG-3′, the HEX-labeled reverse primer P2 5′-TACGCATCCCAGTTTGAGACGGGCCGCCGCCGC-3′, and the “tail” primer P3 5′-ACGCATCCCAGTTTGAGACG-3′. All the PCR components are described in Supplementary Table S6. Amplification involved an initial incubation at 98 °C for 5 min, followed by 15 cycles of denaturation at 98 °C for 1 min, annealing at 55 °C for 1 min, and extension at 68 °C for 2 min, and then 30 cycles of denaturation at 98 °C for 1 min, annealing at 55 °C for 1 min, and extension at 68 °C for 3 min. A final extension of 10 min was performed at 68 °C. PCR products were analyzed using capillary electrophoresis as previously described. The AGG interspersion pattern was analyzed using the software described above. In samples where the AGG interspersion pattern was ambiguous ( n = 34 samples), the FRAXA PCR kit LabGscan™ (Diagnostica Longwood, Zaragoza, Spain) was used, following the manufacturer’s specifications ( Supplementary Figure S1B ).
The allelic complexity for each allele was calculated using the formula described by Rodrigues et al. (2020) [ 21 ], which integrates both the number and pattern of AGG interspersions and the total CGG repeat length. The resulting value, termed the allelic score , quantitatively reflects the complexity of the FMR1 CGG/AGG substructure.
For instance, consider an allele with the AGG interspersion pattern (CGG)₁₀AGG(CGG)₉AGG(CGG)₉. To calculate its allelic score, the formula considers the length of each uninterrupted CGG repeat stretch and its position relative to the AGG interruptions. The allelic score calculation emphasizes the 5′ end of the allele by assigning higher weights to CGG repeats closer to the 5′ end and decreasing weights to repeats located further downstream towards the 3′ end from the initial AGG interruptions. Following Rodrigues et al. (2020) [ 21 ], the allelic score for this specific pattern is calculated as follows:

$$\text{Allelic score} = \left(\text{Number of CGG repeats before the 1st AGG} \times 4^{1-1}\right) + \left(\text{Number of CGG repeats between the 1st and 2nd AGG} \times 4^{2-1}\right) + \left(\text{Number of CGG repeats after the last AGG} \times 4^{2}\right)$$

$$\text{Allelic score} = \left(9 \times 4^{1-1}\right) + \left(9 \times 4^{2-1}\right) + \left(10 \times 4^{2}\right) = 205$$
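The worked example can be reproduced with a short sketch. It assumes that an allele is given as the list of its uninterrupted CGG stretch lengths in 5′ to 3′ order, and that the weights in the example generalize to any number of AGG interruptions as successive powers of 4, with the 3′-most stretch weighted by 4⁰. That generalization is an assumption drawn from the example, not a statement of the published formula, and the function name `allelic_score` is hypothetical.

```python
from typing import Sequence


def allelic_score(cgg_stretches_5p_to_3p: Sequence[int]) -> int:
    """Weighted sum of CGG stretch lengths.

    Stretches are listed 5' -> 3'. The 3'-most stretch gets weight 4**0,
    the next one 4**1, and so on, so the 5' end carries the highest weight
    (assumed generalization of the worked example in the text).
    """
    score = 0
    for power, length in enumerate(reversed(cgg_stretches_5p_to_3p)):
        score += length * 4 ** power
    return score


# (CGG)10 AGG (CGG)9 AGG (CGG)9 -> 9*1 + 9*4 + 10*16
print(allelic_score([10, 9, 9]))  # 205
```

Under the same assumption, an allele with no AGG interruption scores simply its CGG repeat count, since the only stretch is weighted by 4⁰.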
The categorization of samples according to FMR1 sub-genotypes was based on the study by Gleicher et al. (2010a) as follows [ 42 ]: normal/normal (N/N) when both alleles were within the new normal range (between 26 and 34 CGG repeats); low/normal (L/N), when one allele was below and the other within the normal range; normal/high (N/H), when one allele was within the normal range and the other was above; low/high (L/H), when one allele was below and the other was above the normal range; low/low (L/L), when both alleles were below the normal range; and high/high (H/H), when both alleles were above the normal range.
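A minimal sketch of this sub-genotype assignment follows. It assumes the 26 to 34 range is inclusive at both ends, which the text does not state explicitly; the names `zone` and `sub_genotype` are hypothetical.

```python
NORMAL_MIN, NORMAL_MAX = 26, 34  # assumed inclusive bounds of the normal range


def zone(cgg_repeats: int) -> str:
    """Label one allele as low (L), normal (N), or high (H)."""
    if cgg_repeats < NORMAL_MIN:
        return "L"
    if cgg_repeats > NORMAL_MAX:
        return "H"
    return "N"


def sub_genotype(allele_a: int, allele_b: int) -> str:
    """Combine both alleles into one of N/N, L/N, N/H, L/H, L/L, H/H."""
    order = {"L": 0, "N": 1, "H": 2}
    zones = sorted((zone(allele_a), zone(allele_b)), key=order.__getitem__)
    return "/".join(zones)
```

Sorting the two labels in the order L, N, H means the result does not depend on which allele is passed first, and it reproduces the six labels used in the text.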
Categorical variables are presented as absolute (n) and relative frequencies (%). Continuous variables are presented as the mean ± standard deviation (SD) and range, as indicated in each table. A linearized form of a logarithmic model [i.e., regression of score 2 against ln(score 1)] was used to obtain a mathematical model to predict the allelic complexity ( allelic score ) relationships between both alleles for each group. By analysis of covariance (ANCOVA), we compared the regression models between potentially fertile females (i.e., the reference set previously described by Rodrigues et al. (2020) [ 21 ]) and infertile females, following the methodology outlined by Zar [ 43 ]. Principal component analysis (PCA) was performed to test whether the markers of ovarian reserve and IVF outcomes were able to discriminate between the equivalent and dissimilar groups. For normally distributed data, a comparative analysis of the markers of ovarian reserve and IVF outcomes between the equivalent and dissimilar groups was performed using a t-test. When the data failed normality and homoscedasticity tests, comparisons were performed using the Mann-Whitney test. Pearson correlations were used to assess the relationship between the allelic score of each allele, the markers of ovarian reserve and IVF outcomes. The chi-square test was used to compare the number of samples (frequency) across the sub-genotype categories between the two groups ( equivalent and dissimilar ).
All statistical analyses were performed with SigmaPlot version 14.0 (Systat Software ® Inc., Chicago, IL, USA), except for PCA, which was performed with Past ® version 4.16c (Statistic software) [ 44 ]. A statistical significance level of 0.05 was used for all statistical tests.
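The linearized logarithmic model can be illustrated with an ordinary least-squares fit of score 2 on ln(score 1). The study ran its analyses in SigmaPlot and Past, so the sketch below is not the authors' code; it is a plain-Python illustration, and the function name `fit_log_model` is hypothetical.

```python
import math
from typing import Sequence, Tuple


def fit_log_model(score_1: Sequence[float],
                  score_2: Sequence[float]) -> Tuple[float, float, float]:
    """Fit score_2 = intercept + slope * ln(score_1) by least squares.

    Returns (intercept, slope, r), where r is the Pearson correlation
    between ln(score_1) and score_2.
    """
    if len(score_1) != len(score_2) or len(score_1) < 3:
        raise ValueError("need two equally long series with at least 3 points")
    x = [math.log(s) for s in score_1]  # linearizing transform
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(score_2) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in score_2)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, score_2))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return intercept, slope, r
```

Taking the logarithm of score 1 turns the curved relationship into a straight line, so the usual linear-regression quantities (slope, intercept, r) apply directly.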
Intro
The fragile X messenger ribonucleoprotein 1 ( FMR1 ) gene, located on the long arm of the X chromosome at Xq27.3, plays a critical role in female reproductive health [ 1 ]. The FMR1 gene is notably associated with female infertility through its relationship with fragile X-associated primary ovarian insufficiency (FXPOI; OMIM #311360) [ 2 , 3 ]. Approximately 20% of female carriers of the FMR1 premutation (PM, 55 ≤ CGG repeats < 200) develop FXPOI [ 4 , 5 ]. These carriers typically exhibit hypergonadotropic hypogonadism and present with absent or irregular menstrual cycles before age 40, thereby increasing their risk of infertility [ 6 , 7 , 8 ].
The FMR1 gene contains a polymorphic region within its 5′ untranslated region (UTR). Based on the number of CGG repeats, alleles are classified as normal (CGG repeats < 45), intermediate (45–54 CGG repeats), PM (55–199 CGG repeats), and full mutation (≥200 CGG repeats) [ 9 ]. Intermediate alleles were associated with idiopathic primary ovarian insufficiency (POI) [ 10 , 11 ]. On the other hand, some studies have implicated normal-sized alleles—particularly those with fewer than 26 CGG repeats—in reduced in vitro fertilization (IVF) pregnancy rates, poor embryonic quality, diminished ovarian response to stimulation, decreased ovarian reserve (DOR), and POI [ 12 , 13 , 14 , 15 , 16 ], although other research has failed to confirm these associations [ 17 , 18 , 19 , 20 ]. Most studies into FMR1 ’s role in DOR have primarily considered the total CGG repeat length. Considering that most non-expanded alleles (CGG repeats < 55) are typically interrupted by stabilizing AGG triplets, which commonly occur at positions 9 or 10 within the CGG stretches, we previously developed a formula that determines the allelic complexity— allelic score —of each FMR1 allele. This score integrates both the total CGG repeat length and the number and pattern of these AGG interruptions [ 21 ]. It is well established that both the length of the CGG repeat tract and the pattern of AGG interruptions influence repeat stability and the risk of expansion in offspring. Specifically, AGG interruptions are known to act as stabilizing elements, mitigating strand slippage during DNA replication. Consequently, alleles lacking these protective AGG interruptions exhibit a significantly higher propensity for expansion, particularly during maternal transmission [ 22 , 23 ]. 
While the influence of AGG interruptions on clinical outcomes in individuals with a premutation remains an area of active investigation, with conflicting findings reported regarding their role in FXPOI [ 4 , 24 ], we hypothesize that a comprehensive metric—one that considers the entire repeat region, encompassing both its length and the AGG interspersion pattern—may provide a more accurate assessment of risk and contribute to a clearer understanding of these complex relationships.
In this study, we aimed to evaluate whether FMR1 allelic complexity could serve as a predictor of ovarian reserve and IVF success. To this end, we analyzed samples from females experiencing infertility who were undergoing intracytoplasmic sperm injection (ICSI). Samples were categorized based on their FMR1 allelic complexity, and correlations with ovarian reserve markers and IVF outcomes were assessed.
Results
This study included 124 females diagnosed with infertility, with a mean age of 34.7 ± 3.7 years, ranging from 22 to 40 years. Most females presented multiple etiologies of infertility, the main causes being ovulatory dysfunction ( n = 83, 66.8%), endometriosis ( n = 17, 13.7%), and oocyte factor ( n = 12, 9.7%). Less frequently ( n = 12, 9.7%), they presented hypothyroidism, hyperprolactinemia, POI, DOR, and adenomyosis. Table 1 shows the demographic and clinical characteristics of the study cohort.
Of the 124 samples analyzed, 118 presented two normal-sized alleles, ranging from 17 to 44 CGG repeats. Three samples exhibited an intermediate allele with 48, 51, and 52 CGG repeats, while three other samples presented a PM allele with 56, 59, and 75 CGG repeats, respectively ( Figure 1 ).
Across the total of 248 alleles, the most frequent CGG repeat length was 30 CGG repeats ( n = 95, 38.3%). Most alleles presented one or two AGG interruptions ( n = 232, 93.6%), while 4% of alleles ( n = 10) presented no AGG interruptions, and 2.4% of alleles ( n = 6) presented three AGG interruptions. Overall, 62 distinct AGG interspersion patterns were identified, with the most common being (CGG)₁₀AGG(CGG)₉AGG(CGG)₉ ( n = 81, 32.7%) and (CGG)₉AGG(CGG)₉AGG(CGG)₉ ( n = 25, 10.1%). FMR1 molecular data can be found in Supplementary Table S1.
The PM alleles were excluded from further analysis, as the study focused on normal-sized and intermediate alleles.
The summary of the FMR1 allelic complexity ( allelic score ) results can be found in Table 2 . The mean allelic score was 125.5 ± 95.6 for allele 1 and 198.9 ± 135.4 for allele 2.
The combination of allelic scores resulted in two distinct groups: one containing alleles with similar allelic scores ( equivalent group) and another containing alleles with different allelic scores ( dissimilar group). These were classified as follows: when both alleles presented an allelic score > 150 or both < 150, the sample was included in the equivalent group; when one allele presented an allelic score > 150 and the other < 150, the sample was included in the dissimilar group. The correlation between the allelic scores of each group was described using a linearized logarithmic model (or mathematical model). Significant correlations were found in the equivalent group: r = 0.562; df = 67; p < 0.0001, and in the dissimilar group: r = −0.417; df = 50; p = 0.0021. Allelic scores above 700 were obtained in samples with 3 AGG interruptions ( n = 3 in both the equivalent and dissimilar groups, Supplementary Table S2 , samples 4, 14, 44, 71, 89, and 97, respectively), due to the relevance attributed to the number of AGG interruptions by the formula.
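As an illustration, the grouping can be sketched as a comparison of each allelic score against the 150 cut-off, assuming that samples whose two scores fall on the same side of 150 are equivalent and samples whose scores fall on opposite sides are dissimilar. How a score of exactly 150 is handled is not stated in the text; here it is treated as "not above", which is an assumption. The function name `complexity_group` is hypothetical.

```python
def complexity_group(score_1: float, score_2: float,
                     threshold: float = 150) -> str:
    """Assign a sample to the 'equivalent' or 'dissimilar' group.

    Both scores on the same side of the threshold -> 'equivalent';
    one above and one not above -> 'dissimilar'.
    A score equal to the threshold counts as 'not above' (assumption).
    """
    same_side = (score_1 > threshold) == (score_2 > threshold)
    return "equivalent" if same_side else "dissimilar"
```

For instance, a sample with allelic scores of 23 and 205 would be placed in the dissimilar group, whereas two alleles scoring 205 each would be equivalent.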
ANCOVA was then used to compare the regression models resulting from the combination of the allelic scores of both alleles in each group. Coincident regression lines demonstrated no statistically significant differences in equivalent (F (2, 139) = 0.3023; p = 0.7396) and dissimilar (F (2, 99) = 0.3496; p = 0.7058) groups when comparing this infertile cohort with the reference set of potentially fertile females (described in Rodrigues et al. (2020) [ 21 ]). These results enable the development of a more robust mathematical model that includes all observations: equivalent group—Score 2 = −334.6 + 106.6 × ln(score 1) (r = 0.547; df = 141; p < 0.0001) and dissimilar group—Score 2 = 482.5 − 73.6 × ln(score 1) (r = −0.874; df = 101; p < 0.0001).
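The two pooled models can be applied with a small helper. The coefficients below are the ones reported in the text; the dictionary `POOLED_MODELS` and the function `predict_score_2` are hypothetical names used only for this sketch.

```python
import math

# (intercept, slope) of Score 2 = intercept + slope * ln(score 1),
# as reported for the pooled fertile + infertile data sets.
POOLED_MODELS = {
    "equivalent": (-334.6, 106.6),
    "dissimilar": (482.5, -73.6),
}


def predict_score_2(score_1: float, group: str) -> float:
    """Predict the allelic score of allele 2 from that of allele 1."""
    if score_1 <= 0:
        raise ValueError("score 1 must be positive to take its logarithm")
    intercept, slope = POOLED_MODELS[group]
    return intercept + slope * math.log(score_1)
```

The opposite signs of the slopes reflect the reported correlations: in the equivalent group the predicted score 2 rises with score 1, while in the dissimilar group it falls.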
PCA was conducted to evaluate if the markers of ovarian reserve and IVF outcomes were able to discriminate between the equivalent and dissimilar groups. The variables analyzed did not allow a distinct separation between the two groups, as shown in Figure 2 . The first principal component (PC1) accounted for 33.4% of the total variance, while the second principal component (PC2) explained an additional 17.3% of the variance. Among the variables analyzed, the number of retrieved oocytes, the number of injected MII oocytes, and the number of 2PN oocytes were identified as the primary contributors to the variance explained by PC1.
Consistent with these findings, no statistically significant differences were found between the equivalent and dissimilar groups in any of the markers of ovarian reserve and IVF outcomes analyzed ( p > 0.05 for all the variables) ( Table 3 ).
We next explored if the FMR1 allelic score of each individual allele correlated with the markers of ovarian reserve and with the IVF outcomes. In the equivalent group, no significant correlations were observed between the allelic score of allele 1 or allele 2 and the markers of ovarian reserve and IVF outcomes ( p > 0.05) ( Table 4 and Supplementary Table S3 ). Similarly, no significant correlations were found for allele 2 in the dissimilar group ( Supplementary Table S3 ). However, in this group, a significant negative correlation was observed between the allelic score of allele 1 and the number of injected MII oocytes (Pearson correlation: r = −0.289, n = 49, p = 0.044) as well as the number of 2PN oocytes (Pearson correlation: r = −0.311, n = 48, p = 0.031) ( Table 4 ). Additionally, a significant positive correlation was found between the number of injected MII oocytes and the number of oocytes with 2PN (Pearson correlation: r = 0.859, n = 48, p < 0.001).
The most frequent CGG repeat length of allele 1 of the dissimilar group was 20 CGG repeats ( Supplementary Table S2 ). Considering that alleles with fewer than 26 CGG repeats have been previously associated with poor fertility prognosis, we analyzed the distribution pattern of the sub-genotypes previously described [ 25 ] among the two groups ( Figure 3 ). Most samples from the dissimilar group presented alleles with 26 CGG repeats ( n = 52, 75.4%). A statistically significant difference was found in the distribution of the normal/normal and low/normal sub-genotype (χ 2 = 35.9; df = 1; p < 0.001).
Discussion
In this study, we aimed to investigate whether the FMR1 gene allelic complexity can be used as a predictor of ovarian reserve and IVF success. By combining FMR1 allelic scores , we introduce a novel integrative approach that extends beyond traditional CGG repeat length assessment, offering a more comprehensive evaluation of the repetitive tract. This approach allowed us to categorize samples according to allelic complexity and explore their relationship with ovarian reserve markers and IVF outcomes.
The mathematical model derived from the combination of the allelic scores of our infertile cohort was not statistically different from the reference models calculated in the previous study using samples from potentially fertile females [ 21 ]. The fact that the combination of the allelic complexity is independent of the clinical condition in each group of samples allows the integration of data from both studies, permitting the development of a more robust model.
Notably, within the dissimilar group (where the allelic complexity of each allele is inversely related), the allelic score of allele 1 showed a negative correlation with the number of 2PN oocytes, a key indicator of successful fertilization [ 26 , 27 , 28 ]. Specifically, cases with an allelic score of allele 1 exceeding 150 exhibited very few 2PN oocytes, suggesting a higher risk of fertilization failure in this subgroup. Conversely, a female with a low allelic score in allele 1 ( allelic score = 23; Supplementary Table S2 , sample 114) demonstrated high fertilization success (12 out of 13 injected MII oocytes resulting in 2PN; Supplementary Table S4 , sample 114). In contrast, another female with a high allelic score ( allelic score = 205; Supplementary Table S2 , sample 112) had poor fertilization outcomes, with only 3 out of 7 injected MII oocytes reaching the 2PN stage, alongside three unfertilized and one degenerated oocyte ( Supplementary Table S4 , sample 112). Interestingly, the size range of 35–54 CGG repeats has been previously associated with an increased risk of DOR [ 29 , 30 ], which aligns with our observation that 5 out of 7 samples with a high allelic score (>150) in allele 1 also presented with a number of CGG repeats above 34 in allele 2. These findings suggest that patterns of allelic complexity, beyond mere CGG repeat counts, may serve as novel predictive biomarkers for fertilization failure risk in IVF, representing a significant advancement from previous studies that focused solely on CGG repeat length. This approach provides insights that may clarify the inconsistent associations previously reported between CGG repeat numbers and IVF outcomes.
Categorizing samples by FMR1 allelic complexity, using our established formula, might help explain the disparities seen in cumulus-oocyte complex (COC) and 2PN oocyte numbers in PCOS versus non-PCOS populations [ 31 ], which offers a new perspective on interpreting existing data. While elevated AMH levels in the dissimilar group, echoing findings in PCOS [ 31 ], did not reach statistical significance in our cohort, the trend suggests a potential link that warrants further investigation within the framework of allelic complexity. Consistent with prior studies linking shorter CGG repeats (<26) to poorer IVF outcomes [ 12 , 13 , 14 , 16 , 32 , 33 ], our dissimilar group was enriched with such alleles. However, our key innovation lies in demonstrating that allelic complexity allows for the identification of a specific subgroup at elevated risk of fertilization failure, even within these shorter repeat ranges. This underscores the critical importance of considering not just the number of CGG repeats but also the stabilizing pattern of AGG interruptions. This is further supported by Quilichini et al. (2024), who found that allelic complexity enhances POI risk prediction in intermediate and premutated alleles [ 34 ], extending the significance of this metric beyond fertilization outcomes to broader reproductive health. The discrepancy with the study by Nunes et al. (2024), which found no association between CGG repeat number and IVF outcomes [ 35 ], directly highlights the added value of our allelic complexity assessment. By incorporating the AGG interruption pattern, our more comprehensive approach appears to capture nuances in the FMR1 alleles that are missed by simply considering the total CGG repeat length.
Additionally, a recent publication by our team highlights the importance of molecular profiling, such as hormone and metabolite markers, in elucidating the mechanisms underlying PCOS-related reproductive dysfunctions. It demonstrates that integrating molecular data can identify biomarkers linked to mitochondrial and glycolytic impairments, which may adversely affect oocyte quality and fertilization success [ 36 ]. This supports our approach of combining genetic complexity with phenotypic indicators, thereby strengthening the case for the potential predictive value of FMR1 allelic complexity in IVF outcomes.
In summary, our study’s primary novel contribution is the identification of FMR1 allelic complexity patterns, particularly within the dissimilar group, as a promising predictive biomarker for fertilization success in IVF. This refined approach offers a more nuanced understanding compared to traditional CGG repeat analysis and may help to identify females at higher risk of fertilization failure. Furthermore, our findings reinforce the broader clinical utility of assessing FMR1 allelic complexity, as suggested by others [ 34 ].
While our primary focus was on the relationship between FMR1 allelic complexity and fertilization outcomes, our analysis of the study population also revealed an important secondary finding: an increased frequency of PM carriers (3 out of 124 cases) compared to the general population [ 37 , 38 ]. This finding underscores the critical need for the implementation of routine fragile X syndrome (FXS) screening protocols in infertile females, aligning with recommendations from other researchers [ 39 , 40 , 41 ], to facilitate informed reproductive decisions and genetic counseling.
While acknowledging the limitations of our small sample size and exclusive focus on fertilization outcomes, our findings strongly advocate for future research with larger cohorts to validate the predictive power of FMR1 allelic complexity for the entire spectrum of IVF success, including embryo development and live birth rates. Ultimately, a deeper understanding of FMR1 allelic complexity holds significant potential for developing more personalized and effective IVF treatments.