Revealing Novel Protein Biomarkers for Female Infertility through an Integrated Analysis of Plasma Proteomics and Mendelian Randomization

Research Article
Yi Fang, He Ren, Chun Wang, Liangjun Xia, Youbing Xia
This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-6313119/v1
This work is licensed under a CC BY 4.0 License
Status: Version 1 posted
Abstract
Background Female infertility is a prevalent reproductive health issue, the incidence of which has been rising in recent years. However, there remains a lack of highly effective and targeted treatments. This study employs a proteome-wide Mendelian randomization (MR) analysis to investigate the causal relationships between plasma proteins and female infertility and to identify and validate potential therapeutic targets.
Methods We utilized pQTL data from deCODE Genetics, covering 4,907 proteins in 35,559 individuals. Summary data for female infertility were extracted from the FinnGen project, including 14,759 cases and 111,583 controls. A two-sample MR analysis was conducted, using single nucleotide polymorphisms (SNPs) as genetic instruments to estimate the causal effects of plasma proteins on female infertility. Sensitivity analyses were performed to assess the stability and reliability of the MR results. PPI networks were constructed, and drug-gene interaction systems were integrated to elucidate potential links between the identified proteins and existing treatments for female infertility.
Results The MR analysis indicated significant associations between the levels of two plasma proteins and the risk of female infertility. Higher levels of Cardiotrophin-1 (CTF1) (OR = 0.68, CI 0.53–0.87, P = 2.54 × 10⁻³) were associated with a reduced risk of female infertility, whereas higher levels of Insulin-like growth factor-binding protein 5 (IGFBP5) (OR = 1.35, CI 1.09–1.66, P = 5.54 × 10⁻³) were associated with an increased risk of female infertility. Sensitivity analyses showed no evidence of pleiotropy or heterogeneity.
Conclusion This study identified two plasma proteins associated with the risk of female infertility, providing new insights into the potential pathogenesis of the condition.
Keywords: Female infertility, Plasma Proteins, Mendelian Randomization, FinnGen, Therapeutic Targets
1 Introduction
According to the World Health Organization (WHO), infertility is defined as the inability to achieve a clinical pregnancy after 12 months of regular, unprotected sexual intercourse[1]. The global prevalence of infertility is approximately 8–12%[2, 3], with female factors accounting for 33–41% of cases[4].
Female infertility is often associated with pathological characteristics such as ovulatory disorders, endometriosis, uterine abnormalities, and endocrine and genital developmental anomalies, as well as clinical manifestations such as menstrual irregularities, dysmenorrhea, acne, obesity, abnormal vaginal discharge, and decreased libido. Currently, assisted reproductive technology (ART) is the primary treatment for infertility, but its success rate is limited by various factors, including female age, egg quality, and endometrial receptivity. Additionally, the complex treatment process, uncertain outcomes, and high costs place both a physical and a financial burden on women. Owing to societal and cultural expectations, as well as traditional family roles, women are often assigned significant responsibility for childbearing, making infertility a source of considerable psychological stress. Studies have shown that prolonged psychological stress can lead to mental health disorders such as anxiety and depression, increase the risk of marital conflict, and negatively affect the marital relationship[5, 6]. Given the increasing prevalence of infertility, identifying its causes and investigating causal associations with potential risk factors is important for improving patients' reproductive outcomes and optimizing clinical treatment strategies.
Plasma proteins play a central role in a variety of biological processes and serve as key biomarkers for diagnosing specific diseases as well as potential therapeutic targets[7, 8]. These proteins enter the bloodstream through cellular leakage or active secretion, acting as critical regulators of molecular pathways. Plasma proteins encompass a wide range of immune-related proteins, such as immunoglobulins and cytokines, which play crucial roles in mediating immune responses and regulating inflammation[9, 10].
With the rapid advancement of genome-wide association studies (GWAS) of the human plasma proteome, frameworks for integrating genomic and proteomic data have matured, providing strong technical support for biomarker discovery. Against this backdrop, proteome-wide association studies (PWAS), as an extension of GWAS, are a promising direction for the study of female infertility, offering new ways to elucidate the complex genetic and molecular mechanisms underlying this condition.
Traditional study designs have well-known limitations: observational studies are susceptible to confounding and reverse causality, while randomized controlled trials (RCTs) face practical constraints such as ethical considerations, cost, and time, which limit the research questions they can address[11]. Mendelian randomization (MR) is a causal inference method based on genetic principles that uses genetic variants (GVs) as instrumental variables (IVs) to assess the causal relationship between an exposure and an outcome[12, 13]. Genetic variants are allocated randomly and remain stable, so individuals who carry a specific variant differ from those who do not in their level of exposure[14]. By comparing outcomes between these two groups, the causal effect of the exposure on the outcome can be inferred. To our knowledge, no study has yet established the association between the human plasma proteome and the risk of female infertility. Therefore, the aim of our study is to evaluate the causal relationship between the human plasma proteome and female infertility using MR analysis, and to integrate functional enrichment analysis, protein-protein interaction (PPI) network analysis, and drug-gene interactions to understand the pathogenesis of female infertility.
This could facilitate the development of personalized treatment strategies and the identification of novel biomarkers for targeted therapies in female infertility.
2 Materials and Methods
2.1 Study Design
To assess the potential causal relationship between plasma proteins and female infertility, we employed a two-sample MR analysis approach. MR analysis rests on three key assumptions: (1) Relevance assumption: the IVs are strongly associated with the exposure variable (plasma proteins); (2) Independence assumption: the IVs are independent of confounding factors; (3) Exclusivity assumption (also called the exclusion restriction): the IVs affect the outcome (female infertility) only through the exposure variable[14]. The three assumptions are depicted in Fig. 1.
2.2 Sources of exposure and outcome data
The pQTL (protein quantitative trait loci) dataset was provided by deCODE Genetics (https://www.decode.com/summarydata/) and is based on a large-scale population study of 35,559 individuals of European descent, covering 4,907 proteins[15]. The protein measurements in that dataset were generated with the SomaScan high-throughput platform, which can detect low-abundance proteins and offers a broad dynamic range. The outcome data originated from the FinnGen database. The FinnGen study (https://r10.finngen.fi/) is a large-scale genomics initiative that has analyzed over 500,000 Finnish biobank samples and correlated genetic variation with health data to understand disease mechanisms and predispositions. The project is a collaboration between research organisations and biobanks within Finland and international industry partners[16]. In this study, we obtained outcome data for female infertility from the FinnGen database, which included 14,759 cases and 111,583 controls.
2.3 Instrumental Variable Selection
We adhered to the three assumptions outlined above and applied stringent criteria to select eligible IVs: (1) SNPs associated with the exposure at P < 1 × 10⁻⁵ were selected[17]; (2) SNPs in linkage disequilibrium (LD) were removed, using a threshold of R² < 0.1 within a 100 kb window; (3) palindromic SNPs were excluded. Furthermore, to avoid bias due to weak IVs, we assessed the strength of each IV by calculating the F-statistic; IVs with an F-statistic < 10 were defined as weak instruments and excluded[18]. The F-statistic was calculated as follows:

$$R^2 = \frac{2 \times (1-\mathrm{EAF}) \times \mathrm{EAF} \times \beta^2}{2 \times (1-\mathrm{EAF}) \times \mathrm{EAF} \times \beta^2 + 2 \times (1-\mathrm{EAF}) \times \mathrm{EAF} \times \mathrm{SE}^2 \times N}$$

$$F = \frac{(N-2) \times R^2}{1-R^2}$$

where EAF is the frequency of the effect allele for the exposure, β and SE are the estimated effect and its standard error for the exposure, and N is the sample size for the exposure.
2.4 Mendelian Randomization Analysis
To establish the connection between genetically predicted protein levels and female infertility, a two-sample Mendelian randomization (TSMR) analysis was conducted. Statistical analyses were performed using the "TwoSampleMR" package (version 0.6.6) in R (version 4.3.3)[19]. The primary method for estimating causal effects was the inverse variance weighted (IVW) approach, which offers the highest statistical power under the assumption that all IVs are valid instruments[20]. We used random-effects IVW analysis in the presence of heterogeneity (P < 0.05) and fixed-effects IVW analysis in the absence of heterogeneity[21]. Additionally, supplementary analyses were conducted with the MR-Egger, weighted median, simple mode, and weighted mode methods.
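The instrument-strength formulas in Section 2.3 and the fixed-effects IVW estimate in Section 2.4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline (which used the TwoSampleMR R package): the function names and all input numbers are hypothetical, and the IVW weights use the common first-order approximation in which uncertainty in the SNP-exposure effects is ignored.

```python
import math

def instrument_f_statistic(beta, se, eaf, n):
    """Per-SNP R^2 and F-statistic, following the formulas in Section 2.3."""
    genetic_var = 2.0 * (1.0 - eaf) * eaf * beta ** 2
    residual_var = 2.0 * (1.0 - eaf) * eaf * se ** 2 * n
    r2 = genetic_var / (genetic_var + residual_var)
    f_stat = (n - 2) * r2 / (1.0 - r2)
    return r2, f_stat

def ivw_fixed_effects(beta_exp, beta_out, se_out):
    """Fixed-effects IVW: weighted mean of per-SNP Wald ratios.

    Weights are beta_exp^2 / se_out^2 (first-order approximation).
    Returns the causal estimate on the log-odds scale and its standard error.
    """
    weights = [bx ** 2 / so ** 2 for bx, so in zip(beta_exp, se_out)]
    ratios = [by / bx for bx, by in zip(beta_exp, beta_out)]
    estimate = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    return estimate, se

def to_odds_ratio(estimate, se, z=1.96):
    """Convert a log-odds estimate to an odds ratio with a Wald interval."""
    return (math.exp(estimate),
            math.exp(estimate - z * se),
            math.exp(estimate + z * se))

# Hypothetical summary statistics for three SNPs (not taken from the study).
beta_exp = [0.10, 0.20, 0.30]   # SNP -> protein effects
beta_out = [0.05, 0.10, 0.15]   # SNP -> infertility effects (log-odds)
se_out = [0.01, 0.02, 0.03]

r2, f_stat = instrument_f_statistic(beta=0.10, se=0.02, eaf=0.3, n=35559)
keep_instrument = f_stat >= 10  # instruments with F < 10 are treated as weak
estimate, se = ivw_fixed_effects(beta_exp, beta_out, se_out)
odds_ratio, ci_low, ci_high = to_odds_ratio(estimate, se)
```

Note that the allele-frequency terms cancel in the R² formula as written, so the F-statistic here reduces to approximately (β/SE)² for large N.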
The MR-Egger method detects and adjusts for horizontal pleiotropy, in which IVs influence the outcome through pathways other than the exposure of interest[22]. The weighted median method is an important complement, assuming that at least 50% of the weight in the analysis comes from valid IVs[23]. The simple mode method clusters the causal effect estimates of the individual IVs and takes the largest cluster as the final estimate, which reduces the influence of outliers and invalid IVs. The weighted mode method offers enhanced power to detect causal effects, reduced bias, and a lower type I error rate. This study reports odds ratios (ORs), which express the change in the odds of female infertility per one-standard-deviation increase in genetically predicted plasma protein level. Results from the IVW, MR-Egger, weighted median, simple mode, and weighted mode methods were considered only when the directions of the causal effect estimates were consistent. When the directions were inconsistent, we did not regard the plasma protein as associated with female infertility. The framework of the MR analysis is presented in Fig. 2.
2.5 Sensitivity Analysis
We conducted a series of sensitivity analyses to verify the robustness of our results. Specifically, Cochran's Q test based on the IVW method was employed to assess heterogeneity among IVs, with P > 0.05 indicating no significant heterogeneity[24]. The MR-Egger approach was used to evaluate horizontal pleiotropy; a P-value for the intercept greater than 0.05 suggests the absence of horizontal pleiotropy[25]. The MR-PRESSO method was applied to identify and remove outliers that might influence the results, where a global test P-value > 0.05 indicates that there are no outliers caused by horizontal pleiotropy[26].
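The heterogeneity and pleiotropy checks described in this section can be illustrated with a minimal Python sketch. It is not the TwoSampleMR implementation: the function names and inputs are hypothetical, Cochran's Q is returned without a P-value (it would be compared against a chi-square distribution with the returned degrees of freedom), and the intercept P-value uses a normal approximation as a simplifying assumption. At least three SNPs are required for the Egger regression.

```python
import math

def orient(beta_exp, beta_out):
    """Flip signs so that every SNP-exposure effect is non-negative."""
    bx = [abs(b) for b in beta_exp]
    by = [o if b >= 0 else -o for b, o in zip(beta_exp, beta_out)]
    return bx, by

def cochran_q(beta_exp, beta_out, se_out):
    """Cochran's Q around the fixed-effects IVW estimate.

    Returns (Q, degrees of freedom).
    """
    weights = [bx ** 2 / so ** 2 for bx, so in zip(beta_exp, se_out)]
    ratios = [by / bx for bx, by in zip(beta_exp, beta_out)]
    ivw = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    q = sum(w * (r - ivw) ** 2 for w, r in zip(weights, ratios))
    return q, len(ratios) - 1

def egger_intercept(beta_exp, beta_out, se_out):
    """Weighted regression of SNP-outcome on SNP-exposure effects with an intercept.

    Returns (intercept, standard error, two-sided P-value). The P-value uses a
    normal approximation, which is an assumption of this sketch.
    """
    bx, by = orient(beta_exp, beta_out)
    w = [1.0 / so ** 2 for so in se_out]
    w_sum = sum(w)
    x_bar = sum(wi * x for wi, x in zip(w, bx)) / w_sum
    y_bar = sum(wi * y for wi, y in zip(w, by)) / w_sum
    sxx = sum(wi * (x - x_bar) ** 2 for wi, x in zip(w, bx))
    sxy = sum(wi * (x - x_bar) * (y - y_bar) for wi, x, y in zip(w, bx, by))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    k = len(bx)
    resid_var = sum(wi * (y - intercept - slope * x) ** 2
                    for wi, x, y in zip(w, bx, by)) / (k - 2)
    # Residual variance is floored at 1 so that precision is not overstated.
    se_int = math.sqrt(max(1.0, resid_var) * (1.0 / w_sum + x_bar ** 2 / sxx))
    p_value = math.erfc(abs(intercept / se_int) / math.sqrt(2.0))
    return intercept, se_int, p_value

# Hypothetical summary statistics for four SNPs (not taken from the study).
beta_exp = [0.10, 0.20, 0.30, 0.40]
beta_out = [0.07, 0.12, 0.17, 0.22]   # constructed as 0.02 + 0.5 * beta_exp
se_out = [0.01, 0.01, 0.01, 0.01]

q_stat, q_df = cochran_q(beta_exp, beta_out, se_out)
intercept, se_int, p_int = egger_intercept(beta_exp, beta_out, se_out)
no_pleiotropy_signal = p_int > 0.05
```

An intercept close to zero (large P-value) is read as no evidence of directional pleiotropy, which is the criterion the section applies.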
Furthermore, we employed a "leave-one-out" approach, repeating the MR analysis after excluding each SNP in turn to determine whether any single SNP exerted an undue influence on the overall estimate[27]. Scatter plots were used to identify outliers, and funnel plots were used to detect heterogeneity among genetic variants.
2.6 Functional Enrichment Analysis
To investigate the biological pathways shared between plasma proteins and female infertility, we conducted enrichment analyses for Gene Ontology (GO) Biological Process terms, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways, and disease-gene associations (DISEASES)[28].
2.7 PPI Network Analysis and Drug-Gene Interactions
To explore the relationship between key proteins and the pathogenesis of female infertility, we conducted a protein-protein interaction (PPI) network analysis on the plasma proteins identified[29]. Using the STRING database (https://string-db.org) and Cytoscape software[30, 31], we constructed the regulatory network with a minimum required interaction score of 0.4. Additionally, we queried the DGIdb database (https://www.dgidb.org/) for potential drugs targeting the risk proteins identified in our study and retrieved detailed drug information.
3 Results
3.1 Causal Relationship Between Plasma Proteins and Female Infertility
Across the IVW, MR-Egger, weighted median, simple mode, and weighted mode analyses, 83 plasma proteins were initially identified as associated with female infertility, defined as a P-value below 0.05 in any method (Fig. 3). Further analysis retained only proteins with P < 0.05 in the IVW method, excluding those with inconsistent directions of causal effect estimates and those whose IVs showed horizontal pleiotropy.
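The selection rule just described (IVW P < 0.05, a consistent direction across all five methods, and no evidence of horizontal pleiotropy) can be written as a small filter. The sketch below is illustrative only: the class, field names, protein labels, and all numeric values are hypothetical and are not results from the study.

```python
from dataclasses import dataclass

@dataclass
class ProteinResult:
    protein: str
    estimates: dict            # method name -> causal estimate (log-odds)
    ivw_p: float               # P-value of the IVW estimate
    egger_intercept_p: float   # P-value of the MR-Egger intercept

def passes_filter(result, alpha=0.05):
    """Apply the three retention criteria to one protein."""
    directions = {est > 0 for est in result.estimates.values()}
    consistent_direction = len(directions) == 1
    significant_ivw = result.ivw_p < alpha
    no_pleiotropy = result.egger_intercept_p > alpha
    return significant_ivw and consistent_direction and no_pleiotropy

# Hypothetical candidates with made-up numbers.
candidates = [
    ProteinResult(
        protein="protein_A",
        estimates={"ivw": -0.30, "egger": -0.25, "weighted_median": -0.28,
                   "simple_mode": -0.20, "weighted_mode": -0.27},
        ivw_p=0.004,
        egger_intercept_p=0.60,
    ),
    ProteinResult(
        protein="protein_B",
        estimates={"ivw": 0.20, "egger": -0.05, "weighted_median": 0.18,
                   "simple_mode": 0.10, "weighted_mode": 0.15},
        ivw_p=0.01,
        egger_intercept_p=0.40,
    ),
]

retained = [c.protein for c in candidates if passes_filter(c)]
```

In this toy input the second candidate is dropped because its MR-Egger estimate points in the opposite direction to the other methods.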
Ultimately, two plasma proteins were identified as associated with female infertility: Cardiotrophin-1 (CTF1) and Insulin-like growth factor-binding protein 5 (IGFBP5) (Tables S1-S3). Specifically, CTF1 (OR = 0.68, CI 0.53–0.87, P = 2.54 × 10⁻³) was associated with a reduced risk of female infertility, while IGFBP5 (OR = 1.35, CI 1.09–1.66, P = 5.54 × 10⁻³) was associated with an increased risk of female infertility, as depicted in Fig. 4.
3.2 Sensitivity Analysis of Identified Proteins
The sensitivity analyses supported the robustness of our main MR results (Table S4). The MR analysis included 26 SNPs for CTF1 and 23 SNPs for IGFBP5 (Fig. 5a-d). No heterogeneity was detected in the associations of the two plasma proteins, as determined by Cochran’s Q test via the IVW method (P > 0.05). No evidence of horizontal pleiotropy among the IVs was found, as assessed by either the MR-Egger intercept (P_Egger-intercept > 0.05) or the MR-PRESSO global test (P_Global-test > 0.05). Leave-one-out analysis indicated that the causal effect estimate did not deviate significantly from the overall effect after excluding any single SNP (Fig. 5e, f). Additionally, funnel plots were used to visually assess heterogeneity among SNPs, showing that most SNP effect sizes were symmetrically distributed within a concentrated area of the plot without obvious outliers (Fig. 5g, h). Together, these results suggest that the estimated causal relationships between the identified proteins and female infertility are unlikely to be driven by heterogeneity, horizontal pleiotropy, or any single influential SNP.
3.3 Functional Enrichment Analysis Results
Through an integrated multi-layered bioinformatics analysis, we found that CTF1 and IGFBP5 are involved in multiple biological processes and disease pathways.
GO enrichment analysis indicates that these two proteins participate in key biological processes including the ciliary neurotrophic factor-mediated signaling pathway, regulation of tyrosine phosphorylation of STAT protein, and the cytokine-mediated signaling pathway, which play central regulatory roles in cellular signaling and immune responses (Fig. 6i). Notably, the cytokine-mediated signaling pathway shows significant enrichment in the GO analysis, consistent with a role for CTF1 and IGFBP5 in cellular signaling. KEGG pathway analysis reveals a significant association between CTF1 and IGFBP5 and the JAK-STAT signaling pathway, which plays an important role in cell proliferation, differentiation, and survival (Fig. 6j). Additionally, cytokine-cytokine receptor interaction also shows significant enrichment, suggesting that these two proteins may play a role in intercellular communication. Disease-gene association analysis using DISEASES enrichment shows a significant association between CTF1 and IGFBP5 and cold-induced sweating syndrome (Fig. 6k). These disease associations, along with the proteins' roles in biological processes, point to potential functions of CTF1 and IGFBP5 in thermoregulation and sweat gland function. Integrating these findings, we hypothesize that CTF1 and IGFBP5 may act as key molecular nodes regulating cellular signaling, intercellular communication, and thermoregulation, thereby contributing to cellular function and physiological homeostasis.
3.4 Protein-Protein Interaction (PPI) Network Analysis and Drug-Gene Interaction
We loaded the two identified potential drug target proteins into the STRING database for network construction. The generated file was imported into Cytoscape to visualize the PPI network.
Ultimately, we identified 20 proteins interacting with CTF1 and IGFBP5, namely CLCF1, PAPPA2, PAPPA, GHR, CNTFR, IL11RA, PIR, IGFALS, CLEC17A, OSMR, IL6ST, IGFBP6, IL27RA, IL22RA2, IL20RA, LIFR, CRLF1, FHL1, CNTF, and OSM (Fig. 7). Subsequently, using a drug-gene interaction database, we identified 14 potential therapeutic targets for female infertility from the PPI network of 22 proteins, including CTF1 and IGFBP5 themselves. Specifically, CNTFR, IL11RA, OSMR, IL6ST, LIFR, CNTF, and OSM were associated with CTF1, while FHL1, IGFBP6, PAPPA, and PAPPA2 were associated with IGFBP5, and GHR was associated with both CTF1 and IGFBP5. We identified seven drugs that directly interact with CTF1 (CARDIOTROPHIN-1, GO6976, NIK SMI1, COMPOUND 13C, KB-NB142-70, CRT 0066101, and BPKDI) and two drugs that directly interact with IGFBP5 (THERAPEUTIC ANDROSTANOLONE and RECOMBINANT TRANSFORMING GROWTH FACTOR-BETA 1) (Table S5).
4 Discussion
In the present study, our MR analysis revealed causal relationships between plasma proteins and female infertility, identifying potential diagnostic biomarkers and therapeutic targets for this condition. We identified two plasma proteins significantly associated with female infertility: CTF1, which is associated with a reduced risk, and IGFBP5, which is associated with an increased risk. These findings highlight the complexity of female infertility pathogenesis. The sensitivity analyses supported the robustness of our results, with no significant horizontal pleiotropy observed, which strengthens confidence in the causal relationships. Through multi-level bioinformatics analyses, we found that CTF1 and IGFBP5 are not only key molecular nodes in female infertility but are also significantly associated with diseases such as cold-induced sweating syndrome.
CTF1 is a cytokine belonging to the IL-6 family, named for its role in cardiomyocyte growth and survival[32].
Its receptor is a trimeric complex composed of glycoprotein 130 (gp130), leukemia inhibitory factor receptor (LIFR), and an undescribed third component[33]. While CTF1 has been extensively studied in cardiac function, recent research has shown that it also exerts protective effects on other organs, such as the liver, kidneys, and nervous system[34]. Zhu et al. demonstrated that CTF1-modified BMSCs produce exosomes that more effectively promote endometrial and myometrial tissue regeneration, enhancing uterine receptivity and driving neovascularization in a rat model to improve embryo implantation rates[35]. This finding suggests that CTF1 expression is negatively correlated with female infertility, consistent with our results. CTF1 is emerging as a promising therapeutic and diagnostic target. María et al. identified CTF1 as a key regulator of glucose and lipid metabolism, with recombinant CTF1 (rCTF-1) reducing body weight and correcting insulin resistance in ob/ob and high-fat-fed obese mice[36]. Additionally, studies have shown that CTF1 protects MIN6B1 cells and mouse islets from apoptosis under serum deprivation, preserving β-cell viability and preventing streptozotocin-induced diabetes[37]. CTF1 is upregulated during liver regeneration, where it is actively produced by stressed hepatocytes in an autocrine manner to activate survival signals and maintain cell viability, exerting a robust cytoprotective effect in models of acute severe liver injury[38]. Furthermore, CTF1 treatment has been shown to improve embryo implantation in ICR and B6 mice by activating transcriptional activity in the uterine luminal epithelium[39].
IGFBP5 is a binding protein with high affinity for insulin-like growth factors (IGFs) and belongs to the insulin-like growth factor-binding protein family[40]. IGFBP5 has previously been studied in relation to osteoblast function and diabetes[41, 42].
However, its association with female infertility remains poorly understood. Our results provide evidence that IGFBP5 is a risk factor for female infertility. Normal follicle development and ovulation are fundamental to female fertility, and follicle developmental abnormalities or ovulation disorders are common causes of female infertility. Granulosa cells (GCs), as important somatic cells within the follicle, determine follicle fate. Recently, Zhang et al. observed that overexpression of IGFBP5 inhibits granulosa cell proliferation and differentiation and induces their degeneration[43]. Considering the close relationship between follicle development and infertility, our finding appears biologically plausible. Further investigation is needed to elucidate the mechanisms by which IGFBP5 contributes to female infertility.
A strength of this work is that, to our knowledge, it is the first study to apply proteome-wide MR analysis of plasma proteins to female infertility. We examined the causal relationships between plasma proteins and female infertility from a genetic perspective and explored the underlying molecular mechanisms through integrated bioinformatics analyses. The two-sample MR approach is less susceptible to the confounding and reverse causality that are common in traditional observational studies and provides a replicable framework for future causal inference research. Additionally, the robustness of the MR results was assessed using several sensitivity analysis methods, such as MR-Egger and MR-PRESSO, which showed low heterogeneity and no significant horizontal pleiotropy. This supports the view that the identified causal associations between the plasma proteins and female infertility were not driven by pleiotropy or heterogeneity.
However, several limitations should be noted. First, the female infertility outcome data were primarily derived from individuals of European descent.
Significant differences in genetic background, environmental factors, and lifestyle across races and geographical regions may lead to population-specific biases, affecting the associations between protein levels and female infertility. Second, the current sample size of the female infertility dataset remains limited. The number of genetic loci identified through GWAS and other methods is restricted, which may result in false-negative findings because susceptibility loci with moderate or weak effects go undetected. The effect sizes and generalizability of these loci need further validation. Third, although no significant horizontal pleiotropy was detected in our sensitivity analyses, it cannot be entirely ruled out, as IVs may influence the risk of female infertility through pathways other than the target proteins. Finally, we used a more stringent LD threshold (R² < 0.1) and a shorter linkage distance (100 kb), which ensured the independence of instrumental variables but may have reduced the number of available IVs, leading to decreased statistical power and more conservative results.
5 Conclusion
In summary, this study systematically evaluated the potential genetic associations between plasma proteins and female infertility using MR analysis and identified two plasma proteins significantly associated with the risk of female infertility. These findings provide new insights into the complex pathogenesis of female infertility and open new avenues for future research and clinical applications. Future studies should expand the sample size and validate the findings in diverse racial and geographical populations to ensure the generalizability and reliability of the results.
Integrating genetic data with clinical phenotypes and other multi-omics information may further elucidate the specific roles of these proteins in female infertility, thereby promoting the development of personalized prevention and treatment strategies to improve patient outcomes and quality of life.
Declarations
Acknowledgements
We express our sincere gratitude to deCODE Genetics for providing the summary data. We also acknowledge the contributions of the participants and researchers of the FinnGen study to the GWAS dataset on female infertility used in this study.
Author Contribution
Y.F and C.W: Project development; H.R and C.W: Data collection; Y.F and H.R: Manuscript writing. All authors read and approved the final manuscript.
Funding
This study was funded by the Postgraduate Research & Practice Innovation Program of Jiangsu Province (KYCX24_2187) and the National Natural Science Foundation of China (82274638).
Data availability
This study analyzed publicly available datasets. Plasma protein data were obtained from the deCODE Genetics database (https://www.decode.com/summarydata/), while the GWAS summary data for female infertility were sourced from FinnGen (https://r10.finngen.fi/).
Conflict of interest
All authors certify that they have no affiliations with or involvement in any organization or entity with any financial interest or non-financial interest in the subject matter or materials discussed in this manuscript.
Ethics approval
As this study utilized data from publicly available databases, no ethical approval was required.
Consent for publication
Not applicable.
Clinical trial number
Not applicable.
References
1. Starc A, Trampuš M, Pavan Jukić D, Rotim C, Jukić T, Polona Mivšek A (2019) Infertility and sexual dysfunctions: a systematic literature review. Acta clinica Croatica 58:508-515. https://doi.org/10.20471/acc.2019.58.03.15
2. Bala R, Singh V, Rajender S, Singh K (2021) Environment, Lifestyle, and Female Infertility. Reproductive Sciences 28:617-638. https://doi.org/10.1007/s43032-020-00279-3
3. Vander Borght M, Wyns C (2018) Fertility and infertility: Definition and epidemiology. Clinical Biochemistry 62:2-10. https://doi.org/10.1016/j.clinbiochem.2018.03.012
4. Mo W, Zhang J, Peng X, Wang Y (2024) Causal relationship between genetically predicted antibody-Mediated Immune Responses and female infertility. Journal of Reproductive Immunology 166:104319. https://doi.org/10.1016/j.jri.2024.104319
5. Zarif Golbar Yazdi H, Aghamohammadian Sharbaf H, Kareshki H, Amirian M (2020) Psychosocial Consequences of Female Infertility in Iran: A Meta-Analysis. Frontiers in psychiatry 11:518961. https://doi.org/10.3389/fpsyt.2020.518961
6. Stanhiser J, Steiner AZ (2018) Psychosocial Aspects of Fertility and Assisted Reproductive Technology. Obstetrics and gynecology clinics of North America 45:563-574. https://doi.org/10.1016/j.ogc.2018.04.006
7. Santos R, Ursu O, Gaulton A, Bento AP, Donadi RS, Bologa CG, et al (2017) A comprehensive map of molecular drug targets. Nature reviews Drug discovery 16:19-34. https://doi.org/10.1038/nrd.2016.230
8. Suhre K, McCarthy MI, Schwenk JM (2021) Genetics meets proteomics: perspectives for large population-based studies. Nature reviews Genetics 22:19-37. https://doi.org/10.1038/s41576-020-0268-2
9. Anderson NL, Anderson NG (2002) The human plasma proteome: history, character, and diagnostic prospects. Molecular & cellular proteomics : MCP 1:845-867. https://doi.org/10.1074/mcp.r200007-mcp200
10. Zhang J, Li Y, Gong A, Wang J (2024) From proteome to pathogenesis: investigating polycystic ovary syndrome with Mendelian randomization analysis. Frontiers in endocrinology 15:1442483. https://doi.org/10.3389/fendo.2024.1442483
11. Sekula P, Del Greco MF, Pattaro C, Köttgen A (2016) Mendelian Randomization as an Approach to Assess Causality Using Observational Data. Journal of the American Society of Nephrology : JASN 27:3253-3265. https://doi.org/10.1681/ASN.2016010098
12. Davey Smith G, Hemani G (2014) Mendelian randomization: genetic anchors for causal inference in epidemiological studies. Human molecular genetics 23:R89-98. https://doi.org/10.1093/hmg/ddu328
13. Zhu Z, Zhang F, Hu H, Bakshi A, Robinson MR, Powell JE, et al (2016) Integration of summary data from GWAS and eQTL studies predicts complex trait gene targets. Nature genetics 48:481-487. https://doi.org/10.1038/ng.3538
14. Emdin CA, Khera AV, Kathiresan S (2017) Mendelian Randomization. Jama 318:1925-1926. https://doi.org/10.1001/jama.2017.17219
15. Ferkingstad E, Sulem P, Atlason BA, Sveinbjornsson G, Magnusson MI, Styrmisdottir EL, et al (2021) Large-scale integration of the plasma proteome with genetics and disease. Nature genetics 53:1712-1721. https://doi.org/10.1038/s41588-021-00978-w
16. Kurki MI, Karjalainen J, Palta P, Sipilä TP, Kristiansson K, Donner KM, et al (2023) FinnGen provides genetic insights from a well-phenotyped isolated population. Nature 613:508-518. https://doi.org/10.1038/s41586-022-05473-8
17. Bentham J, Morris DL, Graham DSC, Pinder CL, Tombleson P, Behrens TW, et al (2015) Genetic association analyses implicate aberrant regulation of innate and adaptive immunity genes in the pathogenesis of systemic lupus erythematosus. Nature genetics 47:1457-1464. https://doi.org/10.1038/ng.3434
18. Burgess S, Thompson SG (2011) Avoiding bias from weak instruments in Mendelian randomization studies. International journal of epidemiology 40:755-764. https://doi.org/10.1093/ije/dyr036
19. Woolf B, Di Cara N, Moreno-Stokoe C, Skrivankova V, Drax K, Higgins JPT, et al (2022) Investigating the transparency of reporting in two-sample summary data Mendelian randomization studies using the MR-Base platform. International journal of epidemiology 51:1943-1956. https://doi.org/10.1093/ije/dyac074
20. Burgess S, Butterworth A, Thompson SG (2013) Mendelian randomization analysis with multiple genetic variants using summarized data. Genetic epidemiology 37:658-665. https://doi.org/10.1002/gepi.21758
21. Birney E (2022) Mendelian Randomization. Cold Spring Harbor perspectives in medicine 12. https://doi.org/10.1101/cshperspect.a041302
22. Burgess S, Thompson SG (2017) Interpreting findings from Mendelian randomization using the MR-Egger method. European journal of epidemiology 32:377-389. https://doi.org/10.1007/s10654-017-0255-x
23. Bowden J, Davey Smith G, Haycock PC, Burgess S (2016) Consistent Estimation in Mendelian Randomization with Some Invalid Instruments Using a Weighted Median Estimator. Genetic epidemiology 40:304-314. https://doi.org/10.1002/gepi.21965
24. Greco MF, Minelli C, Sheehan NA, Thompson JR (2015) Detecting pleiotropy in Mendelian randomisation studies with summary data and a continuous outcome. Statistics in medicine 34:2926-2940. https://doi.org/10.1002/sim.6522
25. Bowden J, Davey Smith G, Burgess S (2015) Mendelian randomization with invalid instruments: effect estimation and bias detection through Egger regression. International journal of epidemiology 44:512-525. https://doi.org/10.1093/ije/dyv080
26. Verbanck M, Chen CY, Neale B, Do R (2018) Detection of widespread horizontal pleiotropy in causal relationships inferred from Mendelian randomization between complex traits and diseases. Nature genetics 50:693-698. https://doi.org/10.1038/s41588-018-0099-7
27. Burgess S, Bowden J, Fall T, Ingelsson E, Thompson SG (2017) Sensitivity Analyses for Robust Causal Inference from Mendelian Randomization Analyses with Multiple Genetic Variants. Epidemiology 28:30-42. https://doi.org/10.1097/EDE.0000000000000559
28. Chen JY, Wang JF, Hu Y, Li XH, Qian YR, Song CL (2025) Evaluating the advancements in protein language models for encoding strategies in protein function prediction: a comprehensive review. Frontiers in bioengineering and biotechnology 13:1506508. https://doi.org/10.3389/fbioe.2025.1506508
29. Doncheva NT, Morris JH, Holze H, Kirsch R, Nastou KC, Cuesta-Astroz Y, et al (2023) Cytoscape stringApp 2.0: Analysis and Visualization of Heterogeneous Biological Networks. Journal of proteome research 22:637-646. https://doi.org/10.1021/acs.jproteome.2c00651
30. Szklarczyk D, Gable AL, Lyon D, Junge A, Wyder S, Huerta-Cepas J, et al (2019) STRING v11: protein-protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic acids research 47:D607-D613. https://doi.org/10.1093/nar/gky1131
31. Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, et al (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome research 13:2498-2504. https://doi.org/10.1101/gr.1239303
32. Pennica D, King KL, Shaw KJ, Luis E, Rullamas J, Luoh SM, et al (1995) Expression cloning of cardiotrophin 1, a cytokine that induces cardiac myocyte hypertrophy. Proceedings of the National Academy of Sciences of the United States of America 92:1142-1146. https://www.pnas.org/doi/abs/10.1073/pnas.92.4.1142
33. Robledo O, Fourcin M, Chevalier S, Guillet C, Auguste P, Pouplard-Barthelaix A, et al (1997) Signaling of the Cardiotrophin-1 Receptor: Evidence for a third receptor component. Journal of Biological Chemistry 272:4855-4863. https://doi.org/10.1074/jbc.272.8.4855
34. López-Yoldi M, Moreno-Aliaga MJ, Bustos M (2015) Cardiotrophin-1: A multifaceted cytokine. Cytokine & growth factor reviews 26:523-532. https://doi.org/10.1016/j.cytogfr.2015.07.009
35. Zhu Q, Tang S, Zhu Y, Chen D, Huang J, Lin J (2022) Exosomes Derived From CTF1-Modified Bone Marrow Stem Cells Promote Endometrial Regeneration and Restore Fertility.
Frontiers in bioengineering and biotechnology 10:868734.https://doi.org/10.3389/fbioe.2022.868734 Moreno-Aliaga MJ, Pérez-Echarri N, Marcos-Gómez B, Larequi E, Gil-Bea FJ, Viollet B, et al (2011) Cardiotrophin-1 is a key regulator of glucose and lipid metabolism. Cell metabolism 14:242-253.https://doi.org/10.1016/j.cmet.2011.05.013 Jiménez-González M, Jaques F, Rodríguez S, Porciuncula A, Principe RM, Abizanda G, et al (2013) Cardiotrophin 1 protects beta cells from apoptosis and prevents streptozotocin-induced diabetes in a mouse model. Diabetologia 56:838-846.https://doi.org/10.1007/s00125-012-2822-8 Bustos M, Beraza N, Lasarte J-J, Baixeras E, Alzuguren P, Bordet T, Prieto J (2003) Protection against liver damage by cardiotrophin-1: a hepatocyte survival factor up-regulated in the regenerating liver in rats. Gastroenterology 125:192-201.https://doi.org/10.1016/S0016-5085(03)00698-X Kobayashi R, Terakawa J, Kato Y, Azimi S, Inoue N, Ohmori Y, Hondo E (2014) The contribution of leukemia inhibitory factor (LIF) for embryo implantation differs among strains of mice. Immunobiology 219:512-521.https://doi.org/10.1016/j.imbio.2014.03.011 Tripathi G, Salih DA, Drozd AC, Cosgrove RA, Cobb LJ, Pell JM (2009) IGF-independent effects of insulin-like growth factor binding protein-5 (Igfbp5) in vivo. FASEB journal : official publication of the Federation of American Societies for Experimental Biology 23:2616-2626.https://doi.org/10.1096/fj.08-114124 Kanatani M, Sugimoto T, Nishiyama K, Chihara K (2000) Stimulatory effect of insulin-like growth factor binding protein-5 on mouse osteoclast formation and osteoclastic bone-resorbing activity. Journal of bone and mineral research : the official journal of the American Society for Bone and Mineral Research 15:902-910.https://doi.org/10.1359/jbmr.2000.15.5.902 Li X, Tang J, Lin S, Liu X, Li Y (2024) Mendelian randomization analysis demonstrates the causal effects of IGF family members in diabetes. 
Frontiers in medicine 11:1332162. https://doi.org/10.3389/fmed.2024.1332162
Zhang W, Chen X, Nie R, Guo A, Ling Y, Zhang B, Zhang H (2024) Single-cell transcriptomic analysis reveals regulative mechanisms of follicular selection and atresia in chicken granulosa cells. Food research international (Ottawa, Ont) 198:115368. https://doi.org/10.1016/j.foodres.2024.115368
Additional Declarations No competing interests reported.
Supplementary Files SupplementaryFigure.docx SupplementaryTable.xlsx
© Research Square 2026 | ISSN 2693-5015 (online) {"props":{"pageProps":{"initialData":{"identity":"rs-6313119","acceptedTermsAndConditions":true,"allowDirectSubmit":true,"archivedVersions":[],"articleType":"Research Article","associatedPublications":[],"authors":[{"id":451909800,"identity":"1af388b4-c840-4d4c-be15-3881d6aaaebc","order_by":0,"name":"Yi Fang","email":"","orcid":"","institution":"Nanjing University of Chinese Medicine","correspondingAuthor":false,"prefix":"","firstName":"Yi","middleName":"","lastName":"Fang","suffix":""},{"id":451909801,"identity":"adf4052d-6e09-4c24-b2ac-3454ca025765","order_by":1,"name":"He Ren","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAAA+UlEQVRIie3RsWrDMBCA4RMCZbnEq4RL+woqgkAhkFc5E/DkoW8QQyFZGroqb5HJs4qHLn4AQTq0FDoHPGVpm2IonWSPgegHLeI+JDiAWOwMSziv3wi/l08A1F25HqLWq1wfrjjblkOJbhqt7Iyz3d9kHwFP2mAhuNmXn/JYwPXEE2vvA4JZog9sUExfXa42FRjliac2QLgkZ8aPEqee8pRVkO08CY4BImRWpuMvLY3tyLKXINagLJLWsiOk+4gcrYQ+oCPpaXG3qeTttnl/SENkXiftaZWOEltk/ljNbiYvi+c2RP7/kX5fPR1WDgMAIzd0MhaLxS6sH+fYTIJYPF4pAAAAAElFTkSuQmCC","orcid":"","institution":"Nanjing University of Chinese Medicine","correspondingAuthor":true,"prefix":"","firstName":"He","middleName":"","lastName":"Ren","suffix":""},{"id":451909802,"identity":"784c2a02-05dd-4a8c-86ea-49f8cbf09b61","order_by":2,"name":"Chun Wang","email":"","orcid":"","institution":"Nanjing University of Chinese Medicine","correspondingAuthor":false,"prefix":"","firstName":"Chun","middleName":"","lastName":"Wang","suffix":""},{"id":451909806,"identity":"e9912a21-5b63-463a-9544-54c293ea00c2","order_by":3,"name":"Liangjun xia","email":"","orcid":"","institution":"Nanjing University of 
Chinese Medicine","correspondingAuthor":false,"prefix":"","firstName":"Liangjun","middleName":"","lastName":"xia","suffix":""},{"id":451909807,"identity":"34b07f95-05e5-4d36-bbf1-7fac8ed4beb1","order_by":4,"name":"Youbing Xia","email":"","orcid":"","institution":"Nanjing University of Chinese Medicine","correspondingAuthor":false,"prefix":"","firstName":"Youbing","middleName":"","lastName":"Xia","suffix":""}],"badges":[],"createdAt":"2025-03-26 14:08:11","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-6313119/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-6313119/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":82310428,"identity":"012b8489-e7e2-480f-b3c1-5b2c19ee7baa","added_by":"auto","created_at":"2025-05-09 01:51:46","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":75593,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eOverview of Mendelian Randomization Analysis and Its Three Key Assumptions\u003c/strong\u003e. IVs, Instrumental Variables; SNP, Single Nucleotide Polymorphism\u003c/p\u003e","description":"","filename":"1.png","url":"https://assets-eu.researchsquare.com/files/rs-6313119/v1/3f0745389a7d00603c433e6b.png"},{"id":82310421,"identity":"c4dabdcd-c478-45bb-a401-c29729f5db95","added_by":"auto","created_at":"2025-05-09 01:51:46","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":185064,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eResearch Framework Diagram\u003c/strong\u003e. 
MR, Mendelian randomization\u003c/p\u003e","description":"","filename":"2.png","url":"https://assets-eu.researchsquare.com/files/rs-6313119/v1/2431d48b626e5632a5b22c05.png"},{"id":82310419,"identity":"5cec47ea-7d10-4a5a-a122-7a2e126cee65","added_by":"auto","created_at":"2025-05-09 01:51:46","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":163890,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eCircular Heatmap Reveals the Association Between Plasma Proteins and Female Infertility\u003c/strong\u003e. A total of 83 unique plasma proteins were preliminarily identified as significantly associated with female infertility\u003c/p\u003e","description":"","filename":"3.png","url":"https://assets-eu.researchsquare.com/files/rs-6313119/v1/6fa259280e8717733be9f8da.png"},{"id":82310432,"identity":"5e33517c-4833-4dd4-9e13-abdcdb052467","added_by":"auto","created_at":"2025-05-09 01:51:46","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":170320,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eMendelian Randomization Analysis Results of the Genetic Association Between Plasma Proteins and Female Infertility\u003c/strong\u003e. MR, Mendelian randomization; OR, Odds Ratio; 95% CI, 95% Confidence Interval. Circles represent the estimated causal effects, and the lines represent the 95% CIs for these ORs\u003c/p\u003e","description":"","filename":"4.png","url":"https://assets-eu.researchsquare.com/files/rs-6313119/v1/ee52408839d97175e6e377ff.png"},{"id":82311690,"identity":"f9252dff-deee-4128-9f59-68e63f97b1b7","added_by":"auto","created_at":"2025-05-09 01:59:46","extension":"png","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":113326,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eSensitivity analysis results of identified proteins\u003c/strong\u003e. 
Scatter plots: Different regression lines represent effect sizes calculated through various Mendelian Randomization (MR) tests, (a) Scatter plot for CTF1, (b) Scatter plot for IGFBP5. Forest plots: Unidirectional causal inference for SNPs, where each horizontal solid line represents the effect size and its 95% confidence interval of an individual SNP, (c) Forest plot for CTF1, (d) Forest plot for IGFBP5. Leave-one-out sensitivity analysis: All SNPs represented by black dots are located on the same side of the \"0\" boundary, indicating that no single SNP significantly interferes with the overall effect, (e) Leave-one-out sensitivity analysis for CTF1, (f) Leave-one-out sensitivity analysis for IGFBP5. Funnel plots: In the absence of bias, the funnel plot exhibits a symmetrical shape, meaning there is no systematic bias between the study effect and its precision, (g) Funnel plot for CTF1, (h) Funnel plot for IGFBP5. CTF1=Cardiotrophin-1; IGFBP5=Insulin-like growth factor-binding protein 5; MR:Mendelian randomization;SNP: single nucleotide polymorphism;β, Effect estimate; SE, Standard error\u003c/p\u003e","description":"","filename":"5.png","url":"https://assets-eu.researchsquare.com/files/rs-6313119/v1/3b45746a2c4511ff1820ebd0.png"},{"id":82311689,"identity":"03357fff-2c48-4c08-9cf6-b87ce54a2e3d","added_by":"auto","created_at":"2025-05-09 01:59:46","extension":"png","order_by":6,"title":"Figure 6","display":"","copyAsset":false,"role":"figure","size":148809,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eFunctional Enrichment Visualization Analysis\u003c/strong\u003e. KEGG,Kyoto Encyclopedia of Genes and Genomes. FDR,False Discovery Rate. Intensity of Enrichment Signals Indicated by Color Gradient: Lighter Shades Correspond to Lower FDR Values, Indicating Stronger Enrichment and Higher Statistical Significance; Darker Shades Represent Higher FDR Values and Weaker Enrichment. 
Circle Size Reflects the Number of Genes Participating in the Biological Process, with Larger Circles Indicating Greater Gene Involvement. (i) Biological Process (Gene Ontology) enrichment, (j) KEGG Pathways enrichment, (k) Disease-gene Associations (DISEASES) enrichment\u003c/p\u003e","description":"","filename":"6.png","url":"https://assets-eu.researchsquare.com/files/rs-6313119/v1/3653df603702054d2e04a577.png"},{"id":82310425,"identity":"2e508148-77fa-43bb-be78-1ca2a0fbbc1b","added_by":"auto","created_at":"2025-05-09 01:51:46","extension":"png","order_by":7,"title":"Figure 7","display":"","copyAsset":false,"role":"figure","size":161242,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eProtein-Protein Interaction Network Between Identified and Putative Protein Targets\u003c/strong\u003e. Yellow circles represent plasma proteins (CTF1 and IGFBP5), while green circles represent potential protein targets associated with the currently identified proteins. The thickness of the black lines indicates the strength of the supporting data, with thicker lines representing stronger protein-protein associations\u003c/p\u003e","description":"","filename":"7.png","url":"https://assets-eu.researchsquare.com/files/rs-6313119/v1/1a602b80249cad1bc1ae5a4d.png"},{"id":87411793,"identity":"1e23d4a3-7c0d-40f6-a3f8-99f539be9988","added_by":"auto","created_at":"2025-07-23 13:53:48","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":1702012,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-6313119/v1/9087112c-cdf0-4e59-94b7-20289b742ecb.pdf"},{"id":82310423,"identity":"dc260136-844a-45b3-82d2-45acbe4692bf","added_by":"auto","created_at":"2025-05-09 
01:51:46","extension":"docx","order_by":0,"title":"","display":"","copyAsset":false,"role":"supplement","size":552568,"visible":true,"origin":"","legend":"","description":"","filename":"SupplementaryFigure.docx","url":"https://assets-eu.researchsquare.com/files/rs-6313119/v1/16e43949b16e34e93f001192.docx"},{"id":82311685,"identity":"e2127b58-d4bc-4724-aa5f-87d2ab54e741","added_by":"auto","created_at":"2025-05-09 01:59:46","extension":"xlsx","order_by":1,"title":"","display":"","copyAsset":false,"role":"supplement","size":2143283,"visible":true,"origin":"","legend":"","description":"","filename":"SupplementaryTable.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-6313119/v1/c184d3c84661d6eb44f7e4ea.xlsx"}],"financialInterests":"No competing interests reported.","formattedTitle":"Revealing Novel Protein Biomarkers for Female Infertility through an Integrated Analysis of Plasma Proteomics and Mendelian Randomization","fulltext":[{"header":"1 Introduction","content":"\u003cp\u003eAccording to the World Health Organization (WHO), infertility is defined as the inability to achieve a clinical pregnancy after 12 months of regular, unprotected sexual intercourse[\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e]. The global prevalence of infertility is approximately 8\u0026ndash;12%[\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e, \u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e], with female factors accounting for 33\u0026ndash;41% of cases[\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e]. Female infertility is often associated with pathological characteristics such as ovulatory disorders, endometriosis, uterine abnormalities, endocrine and genital developmental anomalies, as well as clinical manifestations like menstrual irregularities, dysmenorrhea, acne, obesity, abnormal vaginal discharge, and decreased libido. 
Currently, assisted reproductive technology (ART) is the primary treatment for infertility, but its success rate is limited by various factors, including female age, egg quality, and endometrial receptivity. Additionally, the complex treatment process and uncertain outcomes, along with high costs, can place a double burden of physical and financial stress on women. Due to societal and cultural expectations, as well as traditional family roles, women are often assigned significant responsibility for childbearing, making infertility a source of considerable psychological stress. Studies have shown that prolonged psychological stress can lead to mental health disorders such as anxiety and depression, increase the risk of marital conflicts, and negatively impact the marital relationship[\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e, \u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e]. Given the increasing prevalence of infertility, identifying its causes and deeply investigating the causal associations with potential risk factors is of significant importance for improving patients' reproductive outcomes and optimizing clinical treatment strategies.\u003c/p\u003e \u003cp\u003ePlasma proteins play a central role in a variety of biological processes and serve as key biomarkers for diagnosing specific diseases as well as potential therapeutic targets[\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e, \u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e]. These proteins enter the bloodstream through cellular leakage or active secretion, acting as critical regulators of molecular pathways. 
Plasma proteins encompass a wide range of immune-related proteins, such as immunoglobulins and cytokines, which play crucial roles in mediating immune responses and regulating inflammation[\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e, \u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e]. With the rapid advancement of genome-wide association studies (GWAS) in human plasma proteomics, the integration framework of genomics and proteomics data has gradually been perfected, providing strong technical support for the discovery of biomarkers. Against this backdrop, proteome-wide association studies (PWAS), as an extension of GWAS, hold promise to become an important direction for the study of female infertility, offering new research pathways for elucidating the complex genetic and molecular mechanisms underlying this condition.\u003c/p\u003e \u003cp\u003eDue to the limitations of traditional study designs, observational studies are susceptible to confounding factors and reverse causality, while randomized controlled trials (RCTs) face practical constraints such as ethical considerations, cost, and time, which limit their ability to address all research questions and may lead to biased results[\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e]. Mendelian Randomization (MR) is a causal inference method based on genetic principles, utilizing genetic variants (GVs) as instrumental variables (IVs) to assess the causal relationship between exposure and outcome[\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e, \u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e]. The distribution of genetic variants possesses inherent randomness and stability, which naturally differentiates individuals carrying specific genetic variants from those who do not in terms of exposure[\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e]. 
By comparing the outcome differences between these two groups, the causal effect of exposure on the outcome can be inferred. Currently, no studies have established the association between the human plasma proteome and the risk of female infertility. Therefore, the aim of our study is to evaluate the causal relationship between the human plasma proteome and female infertility using MR analysis, and to integrate functional enrichment analysis, protein-protein interaction (PPI) network analysis, and drug-gene interactions to understand the pathogenesis of female infertility. This could facilitate the development of personalized treatment strategies and the identification of novel biomarkers for targeted therapies in female infertility.\u003c/p\u003e"},{"header":"2 Materials and Methods","content":"\u003cdiv id=\"Sec3\"\u003e\n \u003ch2\u003e2.1 Study Design\u003c/h2\u003e\n \u003cp\u003eTo assess the potential causal relationship between plasma proteins and female infertility, we employed a two-sample MR analysis approach. The MR analysis is guided by three key assumptions: (1) Relevance assumption: the IVs are strongly associated with the exposure variable (plasma proteins); (2) Independence assumption: the IVs are independent of confounding factors; (3) Exclusivity assumption: the IVs affect the outcome (female infertility) only through the exposure variable[14]. The three main assumptions of MR analysis in this project are depicted in Fig. 1.\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec4\"\u003e\n \u003ch2\u003e2.2 Sources of exposure and outcome data\u003c/h2\u003e\n \u003cp\u003eThe pQTL (protein quantitative trait loci) dataset was provided by DECODE Genetics (https://www.decode.com/summarydata/) and is based on a large-scale population study, including 35,559 samples of individuals of European descent and covering 4,907 proteins[15]. The outcome data originated from the FinnGen database. 
The FinnGen study (https://r10.finngen.fi/) is a large-scale genomics initiative that has analyzed over 500,000 Finnish biobank samples and correlated genetic variation with health data to understand disease mechanisms and predispositions. The project is a collaboration between research organisations and biobanks within Finland and international industry partners[16]. The FinnGen database utilizes the SomaScan high-throughput detection platform for proteomics analysis. This technology not only enables the detection of low-abundance proteins but also offers a broader dynamic detection range, significantly enhancing the efficiency and accuracy of proteomics research. In this study, we obtained outcome data for female infertility from the FinnGen database, which included 14,759 cases and 111,583 controls.\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec5\"\u003e\n \u003ch2\u003e2.3 Instrumental Variable Selection\u003c/h2\u003e\n \u003cp\u003eWe adhered to the three fundamental assumptions outlined previously and applied stringent selection criteria to identify eligible IVs. (1) SNPs significantly associated with the exposure factor were selected at a threshold of \u003cem\u003eP\u003c/em\u003e \u0026lt; 1×10\u003csup\u003e−5\u003c/sup\u003e[17]; (2) SNPs in linkage disequilibrium (LD) were removed by clumping within a 100 kb window at a threshold of R² \u0026lt; 0.1; (3) Palindromic SNPs were excluded. Furthermore, to avoid bias due to weak IVs, we assessed the strength of the IVs by calculating the F-statistic, and IVs with an F-statistic \u0026lt; 10 were defined as weak instruments and excluded[18]. 
The F-statistic was calculated as follows:\u003c/p\u003e\n \u003cp\u003eR\u003csup\u003e2\u003c/sup\u003e = 2 × (1-EAF) × EAF × β\u003csup\u003e2\u003c/sup\u003e / [2 × (1-EAF) × EAF × β\u003csup\u003e2\u003c/sup\u003e + 2 × (1-EAF) × EAF × SE\u003csup\u003e2\u003c/sup\u003e × N]\u003c/p\u003e\n \u003cp\u003eF-statistic = (N-2) × R\u003csup\u003e2\u003c/sup\u003e / (1-R\u003csup\u003e2\u003c/sup\u003e)\u003c/p\u003e\n \u003cp\u003eEAF represents the frequency of the effect allele for the exposure, while β and SE denote the estimated effect and standard error of the exposure, respectively. N indicates the sample size for the exposure.\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec6\"\u003e\n \u003ch2\u003e2.4 Mendelian Randomization Analysis\u003c/h2\u003e\n \u003cp\u003eTo establish the connection between genetically predicted protein levels and female infertility, a two-sample Mendelian Randomization (TSMR) analysis was conducted. Statistical analyses were performed using the \"TwoSampleMR\" package (version 0.6.6) in R 4.3.3[19]. The primary method for estimating causal effects was the inverse variance weighted (IVW) approach, which offers the highest statistical power under the assumption that all IVs are valid instruments[20]. We utilized random-effects IVW analysis in the presence of heterogeneity (\u003cem\u003eP\u003c/em\u003e \u0026lt; 0.05) and fixed-effects IVW analysis in the absence of heterogeneity[21]. Additionally, supplementary analyses were conducted through MR Egger, weighted median, simple mode, and weighted mode methods. The MR Egger method effectively detects and addresses horizontal pleiotropy, where IVs may influence the outcome variable through pathways independent of the causal pathway of the exposure of interest[22]. The weighted median method serves as an important complementary tool, assuming that at least 50% of the IVs are valid[23]. 
The simple mode method clusters the causal effect estimates of IVs and selects the most frequently occurring cluster as the final estimate, effectively reducing the interference of outliers or invalid IVs through a clustering and mode selection mechanism. The weighted mode estimation method offers enhanced causal effect detection capabilities, reduced bias, and a lower Type I error rate. This study reports odds ratios (ORs), each representing the change in the odds of female infertility per one standard deviation increase in the genetically predicted plasma protein level. Results from IVW, MR-Egger, weighted median, simple mode, and weighted mode were considered only when the directions of their causal effect estimates were consistent. When the directions were inconsistent, we did not consider the plasma protein to be associated with female infertility. The research framework diagram of MR analysis is presented in Fig. 2.\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec7\"\u003e\n \u003ch2\u003e2.5 Sensitivity Analysis\u003c/h2\u003e\n \u003cp\u003eWe systematically conducted a series of sensitivity analyses to verify the robustness of our study results. Specifically, Cochran's Q test based on the IVW method was employed to assess the heterogeneity of IVs, with a \u003cem\u003eP\u003c/em\u003e \u0026gt; 0.05 indicating no significant heterogeneity[24]. The MR-Egger approach was utilized to identify and evaluate horizontal pleiotropy; a p-value for the intercept greater than 0.05 suggests the absence of horizontal pleiotropy[25]. The MR-PRESSO method was applied to identify and remove outliers that might influence the results, where a Global test \u003cem\u003eP\u003c/em\u003e-value \u0026gt; 0.05 indicates that there are no outliers caused by horizontal pleiotropy[26]. Furthermore, we employed a \"leave-one-out\" approach, performing MR analysis after sequentially excluding each SNP to determine whether a single SNP exerted an undue influence on the overall estimation[27]. 
Scatter plots were used to identify outliers, and funnel plots were employed to detect heterogeneity among genetic variants.\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec8\"\u003e\n \u003ch2\u003e2.6 Functional Enrichment Analysis\u003c/h2\u003e\n \u003cp\u003eTo investigate the shared biological pathways between plasma proteins and female infertility, we conducted enrichment analyses for Biological Process (Gene Ontology, GO), Kyoto Encyclopedia of Genes and Genomes (KEGG) Pathways, and Disease-gene Associations (DISEASES)[28].\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec9\"\u003e\n \u003ch2\u003e2.7 PPI Network Analysis and Drug-Gene Interactions\u003c/h2\u003e\n \u003cp\u003eTo explore the intrinsic relationship between key proteins and the pathogenesis of female infertility, we conducted a protein-protein interaction (PPI) network analysis on the differentially expressed plasma proteins identified[29]. Utilizing the STRING database (https://string-db.org) and Cytoscape software[30, 31], we constructed the regulatory network with a minimum interaction score requirement set at 0.4. Additionally, we referred to the DGIdb database (https://www.dgidb.org/) to explore potential drugs targeting the risk proteins identified in our study and retrieved detailed drug information.\u003c/p\u003e\n\u003c/div\u003e"},{"header":"3 Results","content":"\u003cdiv id=\"Sec11\" class=\"Section2\"\u003e \u003ch2\u003e3.1 Causal Relationship Between Plasma Proteins and Female Infertility\u003c/h2\u003e \u003cp\u003eWe initially identified 83 plasma proteins associated with female infertility, defined as a P-value less than 0.05 in any of the IVW, MR-Egger, weighted median, simple mode, or weighted mode results, as shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e. 
Further analysis considered only the results from the IVW method (\u003cem\u003eP\u003c/em\u003e\u0026thinsp;\u0026lt;\u0026thinsp;0.05), excluding results with inconsistent directions of causal effect estimates and IVs with horizontal pleiotropy. Ultimately, two plasma proteins were identified as associated with female infertility: Cardiotrophin-1 (CTF1) and Insulin-like growth factor-binding protein 5 (IGFBP5) (Tables \u003cspan refid=\"MOESM1\" class=\"InternalRef\"\u003eS1\u003c/span\u003e-S3). Specifically, CTF1 (OR\u0026thinsp;=\u0026thinsp;0.68, 95% CI 0.53\u0026ndash;0.87, \u003cem\u003eP\u003c/em\u003e\u0026thinsp;=\u0026thinsp;2.54\u0026times;10\u003csup\u003e\u0026minus;3\u003c/sup\u003e) was associated with a reduced risk of female infertility, while IGFBP5 (OR\u0026thinsp;=\u0026thinsp;1.35, 95% CI 1.09\u0026ndash;1.66, \u003cem\u003eP\u003c/em\u003e\u0026thinsp;=\u0026thinsp;5.54\u0026times;10\u003csup\u003e\u0026minus;3\u003c/sup\u003e) was associated with an increased risk of female infertility, as depicted in Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec12\" class=\"Section2\"\u003e \u003ch2\u003e3.2 Sensitivity Analysis of Identified Proteins\u003c/h2\u003e \u003cp\u003eThe results of the sensitivity analyses confirmed the robustness of our main MR analysis results (Table S4). The MR analysis included 26 SNPs for CTF1 and 23 SNPs for IGFBP5 (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003ea, b, c, d). Heterogeneity was not detected in the associations of the two plasma proteins, as determined by the Cochran\u0026rsquo;s Q test via the IVW method (\u003cem\u003eP\u003c/em\u003e\u0026thinsp;\u0026gt;\u0026thinsp;0.05). 
No evidence of horizontal pleiotropy among the IVs was found, as assessed by either the MR-Egger intercept (\u003cem\u003eP\u003c/em\u003e\u003csub\u003eEgger\u0026minus;Intercept\u003c/sub\u003e \u0026gt; 0.05) or the MR-PRESSO global pleiotropy test (\u003cem\u003eP\u003c/em\u003e\u003csub\u003eGlobal\u0026minus;Test\u003c/sub\u003e \u0026gt; 0.05). Leave-one-out sensitivity analysis indicated that the causal effect estimates of the remaining SNPs did not significantly deviate from the overall effect after excluding any single SNP (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003ee, f). Additionally, funnel plots were used to further visually demonstrate the heterogeneity among SNPs, showing that most SNP effect sizes were symmetrically distributed within a concentrated area of the funnel plot without obvious outliers or anomalies (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003eg, h). Therefore, our findings indicate that the causal relationship between the identified proteins and female infertility is not compromised by potential risk factors.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec13\" class=\"Section2\"\u003e \u003ch2\u003e3.3 Functional Enrichment Analysis Results\u003c/h2\u003e \u003cp\u003eThrough an integrated multi-layered bioinformatics analysis, we have identified that CTF1 and IGFBP5 are involved in multiple biological processes and disease pathways. GO enrichment analysis indicates that these two proteins significantly participate in key biological processes including the Ciliary neurotrophic factor mediated signaling pathway, Regulation of tyrosine phosphorylation of STAT protein, and Cytokine mediated signaling pathway, which play central regulatory roles in cellular signaling and immune responses (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003ei). 
Notably, the Cytokine mediated signaling pathway shows significant enrichment in the GO analysis, further confirming the pivotal role of CTF1 and IGFBP5 in cellular signaling. KEGG pathway analysis reveals a significant association between CTF1 and IGFBP5 and the JAK-STAT signaling pathway, which plays an important role in cell proliferation, differentiation, and survival (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003ej). Additionally, the Cytokine-cytokine receptor interaction also shows significant enrichment, suggesting that these two proteins may play a crucial role in intercellular communication. Disease-gene association analysis using the DISEASES enrichment demonstrates a significant correlation between CTF1 and IGFBP5 and Cold induced sweating syndrome (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003ek). These disease associations, along with their roles in biological processes, collectively point to the potential mechanisms of CTF1 and IGFBP5 in thermoregulation and sweat gland function. Integrating these findings, we hypothesize that CTF1 and IGFBP5 may act as key molecular nodes, regulating cellular signaling, intercellular communication, and thermoregulation, thereby exerting multiple biological functions in cellular function and physiological homeostasis.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec14\" class=\"Section2\"\u003e \u003ch2\u003e3.4 Protein-Protein Interaction (PPI) Network Analysis and Drug-Gene Interaction\u003c/h2\u003e \u003cp\u003eWe loaded the two identified potential drug target proteins into the STRING database for network construction. The generated file was imported into Cytoscape for visualizing the PPI network. 
Ultimately, we identified 20 protein genes interacting with CTF1 and IGFBP5, namely CLCF1, PAPPA2, PAPPA, GHR, CNTFR, IL11RA, PIR, IGFALS, CLEC17A, OSMR, IL6ST, IGFBP6, IL27RA, IL22RA2, IL20RA, LIFR, CRLF1, FHL1, CNTF, and OSM (Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e7\u003c/span\u003e). Subsequently, utilizing a drug-gene interaction database, we identified 14 potential therapeutic targets for female infertility treatment from the PPI network comprising 22 proteins, including CTF1 and IGFBP5. Specifically, CNTFR, IL11RA, OSMR, IL6ST, LIFR, CNTF, and OSM were associated with CTF1, while FHL1, IGFBP6, PAPPA, and PAPPA2 were associated with IGFBP5, and GHR was associated with both CTF1 and IGFBP5. We identified seven drugs that directly interact with CTF1 (CARDIOTROPHIN-1, GO6976, NIK SMI1, COMPOUND 13C, KB-NB142-70, CRT 0066101, and BPKDI) and two drugs that directly interact with IGFBP5 (THERAPEUTIC ANDROSTANOLONE and RECOMBINANT TRANSFORMING GROWTH FACTOR-BETA 1) (Table S5).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e"},{"header":"4 Discussion","content":"\u003cp\u003eIn the present study, our MR analysis revealed causal relationships between plasma proteins and Female infertility, identifying potential diagnostic biomarkers and therapeutic targets for this condition. We identified two plasma proteins significantly associated with Female infertility: CTF1, which is associated with a reduced risk, and IGFBP5, which is associated with an increased risk. These findings highlight the complexity of Female infertility pathogenesis. The application of various sensitivity analyses enhanced the robustness of our results, with no significant horizontal pleiotropy observed, ensuring the reliability of the causal relationships. 
Through multi-level bioinformatics analyses, we found that CTF1 and IGFBP5 are not only key molecular nodes in female infertility but are also significantly associated with diseases such as cold-induced sweating syndrome.\u003c/p\u003e \u003cp\u003eCTF1 is a cytokine belonging to the IL-6 family, named for its role in cardiomyocyte growth and survival[\u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e]. Its receptor is a trimeric complex composed of glycoprotein 130 (gp130), leukemia inhibitory factor receptor (LIFR), and an undescribed third component[\u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e33\u003c/span\u003e]. While CTF1 has been extensively studied in cardiac function, recent research has shown that it also exerts protective effects on other organs, such as the liver, kidneys, and nervous system[\u003cspan citationid=\"CR34\" class=\"CitationRef\"\u003e34\u003c/span\u003e]. Zhu et al. demonstrated that CTF1-modified bone marrow stem cells (BMSCs) produce exosomes that more effectively promote endometrial and myometrial tissue regeneration, enhancing uterine receptivity and driving neovascularization in a rat model system to improve embryo implantation rates[\u003cspan citationid=\"CR35\" class=\"CitationRef\"\u003e35\u003c/span\u003e]. This finding suggests that CTF1 expression is negatively correlated with female infertility, consistent with our results. CTF1 is emerging as a promising therapeutic and diagnostic target. Moreno-Aliaga et al. identified CTF1 as a key regulator of glucose and lipid metabolism, with recombinant CTF1 (rCTF-1) reducing body weight and correcting insulin resistance in ob/ob and high-fat-fed obese mice[\u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e36\u003c/span\u003e]. 
Additionally, studies have shown that CTF1 protects MIN6B1 cells and mouse islets from apoptosis under serum deprivation, preserving β-cell viability and preventing streptozotocin-induced diabetes[\u003cspan citationid=\"CR37\" class=\"CitationRef\"\u003e37\u003c/span\u003e]. CTF1 is upregulated during liver regeneration, where it is actively produced by stressed hepatocytes in an autocrine manner to activate survival signals and maintain cell viability, exerting a robust cytoprotective effect in acute severe liver injury models[\u003cspan citationid=\"CR38\" class=\"CitationRef\"\u003e38\u003c/span\u003e]. Furthermore, CTF1 treatment has been shown to improve embryo implantation in ICR and B6 mice by activating transcriptional activity in the uterine luminal epithelium[\u003cspan citationid=\"CR39\" class=\"CitationRef\"\u003e39\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eIGFBP5 is a binding protein with high affinity for insulin-like growth factors (IGFs) and belongs to the insulin-like growth factor-binding protein family[\u003cspan citationid=\"CR40\" class=\"CitationRef\"\u003e40\u003c/span\u003e]. IGFBP5 has previously been studied in relation to osteoblast function and diabetes[\u003cspan citationid=\"CR41\" class=\"CitationRef\"\u003e41\u003c/span\u003e, \u003cspan citationid=\"CR42\" class=\"CitationRef\"\u003e42\u003c/span\u003e]. However, its association with Female infertility remains poorly understood. Our results provide evidence that IGFBP5 is a risk factor for Female infertility. Studies have shown that normal follicle development and ovulation are fundamental to female fertility, with follicle developmental abnormalities or ovulation disorders being common causes of female infertility. Granulosa cells (GCs), as important somatic cells within the follicle, determine follicle fate. Recently, Zhang et al. 
observed that overexpression of IGFBP5 inhibits granulosa cell proliferation and differentiation and induces their degeneration[\u003cspan citationid=\"CR43\" class=\"CitationRef\"\u003e43\u003c/span\u003e]. Considering the close relationship between follicle development and infertility, this finding appears reasonable. Further investigation is needed to elucidate the mechanisms by which IGFBP5 induces Female infertility.\u003c/p\u003e \u003cp\u003eOur strengths lie in the fact that this is the first study to apply genome-wide MR analysis of plasma proteins to the investigation of female infertility. We have revealed the causal relationships between plasma proteins and female infertility from a genetic perspective and explored the underlying molecular mechanisms through integrated bioinformatics analyses. The two-sample MR approach avoids the confounding factors and reverse causality that are common in traditional observational studies and provides a replicable framework for future causal inference research. Additionally, the robustness of the MR results was validated using various sensitivity analysis methods, such as MR-Egger and MR-PRESSO, which showed low heterogeneity and no significant horizontal pleiotropy. This ensures that the identified causal associations between the plasma proteins and female infertility were not confounded by pleiotropy or heterogeneity.\u003c/p\u003e \u003cp\u003eHowever, several limitations should be noted. First, the female infertility outcome data were primarily derived from individuals of European descent. Significant differences in genetic background, environmental factors, and lifestyle across different races and geographical regions may lead to population-specific biases, affecting the associations between protein expression levels and female infertility. Second, the current sample size of the female infertility dataset remains limited. 
The number of genetic loci identified through GWAS and other methods is restricted, which may result in false-negative findings due to the failure to detect susceptibility loci with moderate or weak effects. The effect sizes and generalizability of these loci need further validation. Third, although no significant horizontal pleiotropy was detected in our sensitivity analyses, it cannot be entirely ruled out, as IVs may influence the risk of female infertility through pathways other than the target proteins. Finally, we used a more stringent LD threshold (R\u0026sup2; \u0026lt; 0.1) and a shorter linkage distance (100 kb), which ensured the independence of instrumental variables but may have reduced the number of available IVs, leading to decreased statistical power and more conservative results.\u003c/p\u003e"},{"header":"5 Conclusion","content":"\u003cp\u003eIn summary, this study systematically evaluated the potential genetic associations between plasma proteins and female infertility using MR analysis and identified two plasma proteins significantly associated with the risk of female infertility. These findings provide new insights into the complex pathogenesis of female infertility and open up new avenues for future research and clinical applications. Future studies should expand the sample size and validate the findings in diverse racial and geographical populations to ensure the generalizability and reliability of the results. 
Integrating genetic data with clinical phenotypes and other multi-omics information may further elucidate the specific roles of these proteins in female infertility, thereby promoting the development of personalized prevention and treatment strategies to improve patient outcomes and quality of life.\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eAcknowledgements\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eWe express our sincere gratitude to deCODE Genetics for providing the summary data. We also acknowledge the contributions of the participants and researchers of the FinnGen study to the GWAS dataset on female infertility used in this study.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAuthor Contribution\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eY.F. and C.W.: project development; H.R. and C.W.: data collection; Y.F. and H.R.: manuscript writing. All authors read and approved the final manuscript.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eFunding\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis study was funded by the Postgraduate Research \u0026amp; Practice Innovation Program of Jiangsu Province (KYCX24_2187) and the National Natural Science Foundation of China (82274638).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eData availability\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis study analyzed publicly available datasets. 
Plasma protein data were obtained from the deCODE Genetics database (https://www.decode.com/summarydata/), while the GWAS summary data for female infertility were sourced from FinnGen (https://r10.finngen.fi/).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eConflict of interest\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eAll authors certify that they have no affiliations with or involvement in any organization or entity with any financial interest or non-financial interest in the subject matter or materials discussed in this manuscript.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eEthics approval\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eAs this study utilized data from publicly available databases, no ethical approval was required.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eConsent for publication\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eNot applicable.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eClinical trial number\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eNot applicable.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n\u003cli\u003eStarc A, Trampu\u0026scaron; M, Pavan Jukić D, Rotim C, Jukić T, Polona Miv\u0026scaron;ek A (2019) INFERTILITY AND SEXUAL DYSFUNCTIONS: A SYSTEMATIC LITERATURE REVIEW. Acta clinica Croatica 58:508-515.https://doi.org/10.20471/acc.2019.58.03.15\u003c/li\u003e\n\u003cli\u003eBala R, Singh V, Rajender S, Singh K (2021) Environment, Lifestyle, and Female Infertility. Reproductive Sciences 28:617-638.https://doi.org/10.1007/s43032-020-00279-3\u003c/li\u003e\n\u003cli\u003eVander Borght M, Wyns C (2018) Fertility and infertility: Definition and epidemiology. Clinical Biochemistry 62:2-10.https://doi.org/10.1016/j.clinbiochem.2018.03.012\u003c/li\u003e\n\u003cli\u003eMo W, Zhang J, Peng X, Wang Y (2024) Causal relationship between genetically predicted antibody-Mediated Immune Responses and female infertility. 
Journal of Reproductive Immunology 166:104319.https://doi.org/10.1016/j.jri.2024.104319\u003c/li\u003e\n\u003cli\u003eZarif Golbar Yazdi H, Aghamohammadian Sharbaf H, Kareshki H, Amirian M (2020) Psychosocial Consequences of Female Infertility in Iran: A Meta-Analysis. Frontiers in psychiatry 11:518961.https://doi.org/10.3389/fpsyt.2020.518961\u003c/li\u003e\n\u003cli\u003eStanhiser J, Steiner AZ (2018) Psychosocial Aspects of Fertility and Assisted Reproductive Technology. Obstetrics and gynecology clinics of North America 45:563-574.https://doi.org/10.1016/j.ogc.2018.04.006\u003c/li\u003e\n\u003cli\u003eSantos R, Ursu O, Gaulton A, Bento AP, Donadi RS, Bologa CG, et al (2017) A comprehensive map of molecular drug targets. Nature reviews Drug discovery 16:19-34.https://doi.org/10.1038/nrd.2016.230\u003c/li\u003e\n\u003cli\u003eSuhre K, McCarthy MI, Schwenk JM (2021) Genetics meets proteomics: perspectives for large population-based studies. Nature reviews Genetics 22:19-37.https://doi.org/10.1038/s41576-020-0268-2\u003c/li\u003e\n\u003cli\u003eAnderson NL, Anderson NG (2002) The human plasma proteome: history, character, and diagnostic prospects. Molecular \u0026amp; cellular proteomics : MCP 1:845-867.https://doi.org/10.1074/mcp.r200007-mcp200\u003c/li\u003e\n\u003cli\u003eZhang J, Li Y, Gong A, Wang J (2024) From proteome to pathogenesis: investigating polycystic ovary syndrome with Mendelian randomization analysis. Frontiers in endocrinology 15:1442483.https://doi.org/10.3389/fendo.2024.1442483\u003c/li\u003e\n\u003cli\u003eSekula P, Del Greco MF, Pattaro C, K\u0026ouml;ttgen A (2016) Mendelian Randomization as an Approach to Assess Causality Using Observational Data. Journal of the American Society of Nephrology : JASN 27:3253-3265.https://doi.org/10.1681/ASN.2016010098\u003c/li\u003e\n\u003cli\u003eDavey Smith G, Hemani G (2014) Mendelian randomization: genetic anchors for causal inference in epidemiological studies. 
Human molecular genetics 23:R89-98.https://doi.org/10.1093/hmg/ddu328\u003c/li\u003e\n\u003cli\u003eZhu Z, Zhang F, Hu H, Bakshi A, Robinson MR, Powell JE, et al (2016) Integration of summary data from GWAS and eQTL studies predicts complex trait gene targets. Nature genetics 48:481-487.https://doi.org/10.1038/ng.3538\u003c/li\u003e\n\u003cli\u003eEmdin CA, Khera AV, Kathiresan S (2017) Mendelian Randomization. Jama 318:1925-1926.https://doi.org/10.1001/jama.2017.17219\u003c/li\u003e\n\u003cli\u003eFerkingstad E, Sulem P, Atlason BA, Sveinbjornsson G, Magnusson MI, Styrmisdottir EL, et al (2021) Large-scale integration of the plasma proteome with genetics and disease. Nature genetics 53:1712-1721.https://doi.org/10.1038/s41588-021-00978-w\u003c/li\u003e\n\u003cli\u003eKurki MI, Karjalainen J, Palta P, Sipil\u0026auml; TP, Kristiansson K, Donner KM, et al (2023) FinnGen provides genetic insights from a well-phenotyped isolated population. Nature 613:508-518.https://doi.org/10.1038/s41586-022-05473-8\u003c/li\u003e\n\u003cli\u003eBentham J, Morris DL, Graham DSC, Pinder CL, Tombleson P, Behrens TW, et al (2015) Genetic association analyses implicate aberrant regulation of innate and adaptive immunity genes in the pathogenesis of systemic lupus erythematosus. Nature genetics 47:1457-1464.https://doi.org/10.1038/ng.3434\u003c/li\u003e\n\u003cli\u003eBurgess S, Thompson SG (2011) Avoiding bias from weak instruments in Mendelian randomization studies. International journal of epidemiology 40:755-764.https://doi.org/10.1093/ije/dyr036\u003c/li\u003e\n\u003cli\u003eWoolf B, Di Cara N, Moreno-Stokoe C, Skrivankova V, Drax K, Higgins JPT, et al (2022) Investigating the transparency of reporting in two-sample summary data Mendelian randomization studies using the MR-Base platform. 
International journal of epidemiology 51:1943-1956.https://doi.org/10.1093/ije/dyac074\u003c/li\u003e\n\u003cli\u003eBurgess S, Butterworth A, Thompson SG (2013) Mendelian randomization analysis with multiple genetic variants using summarized data. Genetic epidemiology 37:658-665.https://doi.org/10.1002/gepi.21758\u003c/li\u003e\n\u003cli\u003eBirney E (2022) Mendelian Randomization. Cold Spring Harbor perspectives in medicine 12https://doi.org/10.1101/cshperspect.a041302\u003c/li\u003e\n\u003cli\u003eBurgess S, Thompson SG (2017) Interpreting findings from Mendelian randomization using the MR-Egger method. European journal of epidemiology 32:377-389.https://doi.org/10.1007/s10654-017-0255-x\u003c/li\u003e\n\u003cli\u003eBowden J, Davey Smith G, Haycock PC, Burgess S (2016) Consistent Estimation in Mendelian Randomization with Some Invalid Instruments Using a Weighted Median Estimator. Genetic epidemiology 40:304-314.https://doi.org/10.1002/gepi.21965\u003c/li\u003e\n\u003cli\u003eGreco MF, Minelli C, Sheehan NA, Thompson JR (2015) Detecting pleiotropy in Mendelian randomisation studies with summary data and a continuous outcome. Statistics in medicine 34:2926-2940.https://doi.org/10.1002/sim.6522\u003c/li\u003e\n\u003cli\u003eBowden J, Davey Smith G, Burgess S (2015) Mendelian randomization with invalid instruments: effect estimation and bias detection through Egger regression. International journal of epidemiology 44:512-525.https://doi.org/10.1093/ije/dyv080\u003c/li\u003e\n\u003cli\u003eVerbanck M, Chen CY, Neale B, Do R (2018) Detection of widespread horizontal pleiotropy in causal relationships inferred from Mendelian randomization between complex traits and diseases. Nature genetics 50:693-698.https://doi.org/10.1038/s41588-018-0099-7\u003c/li\u003e\n\u003cli\u003eBurgess S, Bowden J, Fall T, Ingelsson E, Thompson SG (2017) Sensitivity Analyses for Robust Causal Inference from Mendelian Randomization Analyses with Multiple Genetic Variants. 
Epidemiology 28:30-42.https://doi.org/10.1097/EDE.0000000000000559\u003c/li\u003e\n\u003cli\u003eChen JY, Wang JF, Hu Y, Li XH, Qian YR, Song CL (2025) Evaluating the advancements in protein language models for encoding strategies in protein function prediction: a comprehensive review. Frontiers in bioengineering and biotechnology 13:1506508.https://doi.org/10.3389/fbioe.2025.1506508\u003c/li\u003e\n\u003cli\u003eDoncheva NT, Morris JH, Holze H, Kirsch R, Nastou KC, Cuesta-Astroz Y, et al (2023) Cytoscape stringApp 2.0: Analysis and Visualization of Heterogeneous Biological Networks. Journal of proteome research 22:637-646.https://doi.org/10.1021/acs.jproteome.2c00651\u003c/li\u003e\n\u003cli\u003eSzklarczyk D, Gable AL, Lyon D, Junge A, Wyder S, Huerta-Cepas J, et al (2019) STRING v11: protein-protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic acids research 47:D607-D613.https://doi.org/10.1093/nar/gky1131\u003c/li\u003e\n\u003cli\u003eShannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, et al (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome research 13:2498-2504.https://doi.org/10.1101/gr.1239303\u003c/li\u003e\n\u003cli\u003ePennica D, King KL, Shaw KJ, Luis E, Rullamas J, Luoh SM, et al (1995) Expression cloning of cardiotrophin 1, a cytokine that induces cardiac myocyte hypertrophy. Proceedings of the National Academy of Sciences of the United States of America 92:1142-1146.https://www.pnas.org/doi/abs/10.1073/pnas.92.4.1142\u003c/li\u003e\n\u003cli\u003eRobledo O, Fourcin M, Chevalier S, Guillet C, Auguste P, Pouplard-Barthelaix A, et al (1997) Signaling of the Cardiotrophin-1 Receptor: Evidence for a Third Receptor Component. 
Journal of Biological Chemistry 272:4855-4863.https://doi.org/10.1074/jbc.272.8.4855\u003c/li\u003e\n\u003cli\u003eL\u0026oacute;pez-Yoldi M, Moreno-Aliaga MJ, Bustos M (2015) Cardiotrophin-1: A multifaceted cytokine. Cytokine \u0026amp; growth factor reviews 26:523-532.https://doi.org/10.1016/j.cytogfr.2015.07.009\u003c/li\u003e\n\u003cli\u003eZhu Q, Tang S, Zhu Y, Chen D, Huang J, Lin J (2022) Exosomes Derived From CTF1-Modified Bone Marrow Stem Cells Promote Endometrial Regeneration and Restore Fertility. Frontiers in bioengineering and biotechnology 10:868734.https://doi.org/10.3389/fbioe.2022.868734\u003c/li\u003e\n\u003cli\u003eMoreno-Aliaga MJ, P\u0026eacute;rez-Echarri N, Marcos-G\u0026oacute;mez B, Larequi E, Gil-Bea FJ, Viollet B, et al (2011) Cardiotrophin-1 is a key regulator of glucose and lipid metabolism. Cell metabolism 14:242-253.https://doi.org/10.1016/j.cmet.2011.05.013\u003c/li\u003e\n\u003cli\u003eJim\u0026eacute;nez-Gonz\u0026aacute;lez M, Jaques F, Rodr\u0026iacute;guez S, Porciuncula A, Principe RM, Abizanda G, et al (2013) Cardiotrophin 1 protects beta cells from apoptosis and prevents streptozotocin-induced diabetes in a mouse model. Diabetologia 56:838-846.https://doi.org/10.1007/s00125-012-2822-8\u003c/li\u003e\n\u003cli\u003eBustos M, Beraza N, Lasarte J-J, Baixeras E, Alzuguren P, Bordet T, Prieto J (2003) Protection against liver damage by cardiotrophin-1: a hepatocyte survival factor up-regulated in the regenerating liver in rats. Gastroenterology 125:192-201.https://doi.org/10.1016/S0016-5085(03)00698-X\u003c/li\u003e\n\u003cli\u003eKobayashi R, Terakawa J, Kato Y, Azimi S, Inoue N, Ohmori Y, Hondo E (2014) The contribution of leukemia inhibitory factor (LIF) for embryo implantation differs among strains of mice. 
Immunobiology 219:512-521.https://doi.org/10.1016/j.imbio.2014.03.011\u003c/li\u003e\n\u003cli\u003eTripathi G, Salih DA, Drozd AC, Cosgrove RA, Cobb LJ, Pell JM (2009) IGF-independent effects of insulin-like growth factor binding protein-5 (Igfbp5) in vivo. FASEB journal : official publication of the Federation of American Societies for Experimental Biology 23:2616-2626.https://doi.org/10.1096/fj.08-114124\u003c/li\u003e\n\u003cli\u003eKanatani M, Sugimoto T, Nishiyama K, Chihara K (2000) Stimulatory effect of insulin-like growth factor binding protein-5 on mouse osteoclast formation and osteoclastic bone-resorbing activity. Journal of bone and mineral research : the official journal of the American Society for Bone and Mineral Research 15:902-910.https://doi.org/10.1359/jbmr.2000.15.5.902\u003c/li\u003e\n\u003cli\u003eLi X, Tang J, Lin S, Liu X, Li Y (2024) Mendelian randomization analysis demonstrates the causal effects of IGF family members in diabetes. Frontiers in medicine 11:1332162.https://doi.org/10.3389/fmed.2024.1332162\u003c/li\u003e\n\u003cli\u003eZhang W, Chen X, Nie R, Guo A, Ling Y, Zhang B, Zhang H (2024) Single-cell transcriptomic analysis reveals regulative mechanisms of follicular selection and atresia in chicken granulosa cells. 
Food research international (Ottawa, Ont) 198:115368.https://doi.org/10.1016/j.foodres.2024.115368\u003c/li\u003e\n\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":true,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"Female infertility, Plasma Proteins, Mendelian Randomization, FinnGen, Therapeutic Targets","lastPublishedDoi":"10.21203/rs.3.rs-6313119/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-6313119/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003ch2\u003eBackground\u003c/h2\u003e \u003cp\u003eFemale infertility is a prevalent reproductive health issue, the incidence of which has been rising in recent years. However, there remains a lack of highly effective and targeted treatments. This study employs a proteome-wide Mendelian randomization (MR) analysis to investigate the causal relationships between plasma proteins and female infertility and to identify and validate potential therapeutic targets.\u003c/p\u003e\u003ch2\u003eMethods\u003c/h2\u003e \u003cp\u003eWe utilized pQTL data from deCODE Genetics, covering 4,907 proteins in 35,559 individuals. 
Summary data for female infertility were extracted from the FinnGen project, including 14,759 cases and 111,583 controls. A two-sample MR analysis was conducted, using single nucleotide polymorphisms (SNPs) as genetic instruments to estimate the causal effects of plasma proteins on female infertility. Sensitivity analyses were performed to assess the stability and reliability of the MR results. PPI networks were constructed, and drug-gene interaction systems were integrated to elucidate potential links between the identified proteins and existing treatments for female infertility.\u003c/p\u003e\u003ch2\u003eResults\u003c/h2\u003e \u003cp\u003eThe MR analysis indicated significant associations between the expression levels of two plasma proteins and the risk of female infertility. Higher levels of Cardiotrophin-1 (CTF1) (OR\u0026thinsp;=\u0026thinsp;0.68, CI 0.53\u0026ndash;0.87, P\u0026thinsp;=\u0026thinsp;2.54\u0026times;10\u003csup\u003e\u0026minus;3\u003c/sup\u003e) were associated with a reduced risk of female infertility, whereas higher levels of Insulin-like growth factor-binding protein 5 (IGFBP5) (OR\u0026thinsp;=\u0026thinsp;1.35, CI 1.09\u0026ndash;1.66, P\u0026thinsp;=\u0026thinsp;5.54\u0026times;10\u003csup\u003e\u0026minus;3\u003c/sup\u003e) were associated with an increased risk of female infertility. 
Sensitivity analyses showed no evidence of pleiotropy or heterogeneity.\u003c/p\u003e\u003ch2\u003eConclusion\u003c/h2\u003e \u003cp\u003eThis study identified two plasma proteins associated with the risk of female infertility, providing new insights into the potential pathogenesis of the condition.\u003c/p\u003e","manuscriptTitle":"Revealing Novel Protein Biomarkers for Female Infertility through an Integrated Analysis of Plasma Proteomics and Mendelian Randomization","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2025-05-09 01:51:41","doi":"10.21203/rs.3.rs-6313119/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"91ff4d42-0266-4862-8d8e-7f27b11f538a","owner":[],"postedDate":"May 9th, 2025","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[],"tags":[],"updatedAt":"2025-07-23T13:53:17+00:00","versionOfRecord":[],"versionCreatedAt":"2025-05-09 01:51:41","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-6313119","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-6313119","identity":"rs-6313119","version":["v1"]},"buildId":"8U1c8b4HqxoKbykW_rLl7","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}
