Exploring the mechanism of bacterial lipopolysaccharide-related genes involved in polycystic ovary syndrome and its significance in diagnosis

Research Article, Research Square

Yang Li, Chunmei Bai, Xumin Zhang, Haixia Song, Caixia Yuan, Ziwei Huang, and 1 more

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-7357877/v1
This work is licensed under a CC BY 4.0 License
Status: Posted. Version 1 posted.

Abstract

Bacterial lipopolysaccharide (LPS), a critical component of the outer membrane of Gram-negative bacteria, activates the host immune system via pattern recognition receptors (PRRs), triggering inflammatory responses. However, the role of LPS-related genes (BLRGs) in polycystic ovary syndrome (PCOS) remains unclear.
This study integrated PCOS transcriptomic and single-cell RNA sequencing (scRNA-seq) data with BLRGs from the Comparative Toxicogenomics Database (CTD). Differential expression analysis, weighted gene co-expression network analysis (WGCNA), and consensus clustering identified candidate genes, while extreme gradient boosting (XGBoost) and random forest algorithms further screened C11orf68 and EVI5L as key biomarkers. Both genes were significantly downregulated in PCOS patients and linked to functions such as iron metabolism and heme clearance. Immune infiltration analysis revealed a significant negative correlation between activated mast cells and these biomarkers. Notably, the proportion of T cells was altered in PCOS samples, and scRNA-seq highlighted a dynamic "rising-plateau" expression pattern of C11orf68 and EVI5L during T-cell differentiation. A nomogram confirmed the predictive efficacy of these biomarkers for PCOS. Drug prediction and molecular regulatory network analysis provided insights into targeted therapies. This study is the first to uncover the regulatory role of LPS-related genes in PCOS, offering novel perspectives for early diagnosis and intervention strategies.

Keywords: Polycystic ovarian syndrome; Bacterial lipopolysaccharide; Single-cell RNA sequencing; Machine learning; Biomarkers; Diagnostic performance

1 Introduction

Polycystic ovary syndrome (PCOS) is the most common endocrine disorder affecting women of childbearing age, with effects throughout the life cycle from puberty to postmenopause; associated features include irregular or absent menstruation, hyperandrogenemia, and related metabolic and psychological sequelae [1]. It is the most common cause of anovulatory infertility [2].
It affects 5–10% of women of reproductive age; in the United States alone, 5 million women are affected, and 105 million women are affected worldwide [3]. PCOS is characterized by a wide range of clinical phenotypes, an unknown etiology, and a complex pathogenesis that includes hypothalamic and ovarian dysfunction, excessive androgen exposure, insulin resistance and obesity-related mechanisms, and genetic predisposition. The diagnosis of PCOS is heterogeneous, and there is no single criterion [4]. Current practice follows the Rotterdam criteria established at the 2003 Rotterdam Consensus Conference, whose core features are hyperandrogenemia and polycystic ovarian morphology (PCOM); the diagnosis requires two of the three features (hyperandrogenemia, menstrual disorders, and PCOM) and the exclusion of secondary etiologies (e.g., adult-onset congenital adrenocortical hyperplasia, hyperprolactinemia, and androgen-secreting tumors) [5]. Additionally, PCOS is closely associated with the development of a variety of diseases. For example, because of ovulatory dysfunction and the lack of cyclic progesterone secretion in PCOS patients, the endometrium is stimulated by high estrogen levels for a long period and continues to proliferate, which makes it prone to hyperplasia, abnormal hyperplasia, or even atypical hyperplasia and endometrial cancer [6]. PCOS is considered an important risk factor for type 2 diabetes, cardiovascular disease, gestational diabetes, preeclampsia, preterm labor, and gestational hypertension [7]. A population-based study of singleton births among 3,787 women with PCOS and more than 1 million women without PCOS in Sweden from 1995 to 2007 supports this association [8]. Emerging evidence from epidemiological and genetic studies indicates that the phenotypic expression of PCOS is modulated by a complex interplay of genetic predisposition and environmental determinants [9].
Accordingly, studying biomarkers of PCOS is necessary to support its prevention and early diagnosis.

Lipopolysaccharide (LPS), a structurally complex amphipathic molecule, is a fundamental structural component of the outer membrane of Gram-negative bacteria. LPS consists of three parts: lipid A, core polysaccharide, and O-antigen repeats. Lipid A is the bioactive component of LPS [10]. Tremellen et al. proposed the gut-microbiota-dysbiosis hypothesis as a potential mechanism underlying the pathogenesis of polycystic ovary syndrome [11]. In PCOS, there are varying degrees of intestinal dysbiosis, and the impaired intestinal microecology leads to increased intestinal permeability and Gram-negative bacterial LPS-associated endotoxemia. When LPS production exceeds the liver's capacity to clear it, gut-derived LPS enters the systemic circulation and activates Toll-like receptor 4 (TLR4)-mediated inflammatory pathways, leading to persistent chronic low-grade inflammation, insulin resistance, and ultimately exacerbation of PCOS [12]. Therefore, LPS plays an important role in the disease process of PCOS, but the LPS-related genes that contribute to PCOS have not been fully elucidated, prompting us to further investigate LPS-related biomarkers.

Single-cell RNA sequencing (scRNA-seq) is a high-throughput sequencing method for studying the transcriptome of individual cells. Unlike RNA sequencing of bulk samples (RNA-seq), scRNA-seq detects transcriptome heterogeneity among individual cells and can reveal dynamic changes in cellular status during disease processes [13].
In this study, we screened and identified LPS-related biomarkers in PCOS using PCOS transcriptome data from public databases, explored the mechanisms of these biomarkers through a series of bioinformatics analyses, and verified their expression levels and changes at the single-cell level, with a view to providing a theoretical reference for the clinical diagnosis and preventive treatment of PCOS.

2 Results

2.1 Identification of 8,640 DEGs and 527 key modular genes associated with PCOS

In the training cohort, differential expression analysis identified 8,640 differentially expressed genes (DEGs) between PCOS and control samples, with 8,581 genes upregulated and 59 genes downregulated (Fig. 1A, B). The expression matrix of the training set samples was then subjected to WGCNA. Sample clustering showed no notable outliers, so no samples were removed (Fig. 1C). The optimal soft threshold (β) was determined to be 5, at which the scale-free fit index (R²) exceeded 0.85 and the average connectivity was close to zero (Fig. 1D). A total of 16 modules were identified by the dynamic tree-cutting algorithm (Fig. 1E). Following correlation analyses, two key modules were identified: the salmon module (cor = -0.56, p < 0.05) had the strongest negative correlation with PCOS, and the pink module (cor = 0.61, p < 0.05) had the strongest positive correlation. The salmon and pink modules contained 185 and 484 genes, respectively, for a total of 669 genes (Fig. 1F). Filtering the genes within the key modules (|MM| > 0.3 and |GS| > 0.3) yielded 527 key modular genes associated with PCOS (Fig. 1G, H).

2.2 Differential expression analysis between the two subtypes yielded 3,562 DE-BLRGs

Consensus clustering of the PCOS samples based on the 50 BLRGs showed that partitioning into two subtypes (K = 2) was the most appropriate (Fig. 2A, B, C).
Differential expression analysis between the two subtypes identified 3,562 DE-BLRGs, of which 3,182 were upregulated and 380 were downregulated (Fig. 2D).

2.3 Exploring the biological functions of the 67 candidate genes

Intersecting the 8,640 DEGs, 527 key modular genes, and 3,562 DE-BLRGs (shown as BLRGs in the figure) yielded 67 candidate genes associated with bacterial LPS in PCOS (Fig. 3A). GO and KEGG analyses were performed on these candidate genes. Among the 70 enriched GO-BP entries, candidate genes were significantly associated with functions such as 'cellular response to insulin stimulus' and 'regulation of Arp2/3 complex-mediated actin nucleation' (Fig. 3B). Among the 24 enriched GO-MF categories, candidate genes were associated with 'pre-mRNA binding' and '2-oxoglutarate-dependent dioxygenase activity' (Fig. 3C). Among the 38 enriched GO-CC entries, the candidate genes were associated with 'neuromuscular junction' and 'MLL1 complex', among others (Fig. 3D). Three KEGG pathways were further identified: 'spliceosome', 'phosphatidylinositol signaling system', and 'glycerophospholipid metabolism' (Fig. 3E).

2.4 Biomarkers C11orf68 and EVI5L: effective diagnostic indicators for PCOS

A total of 12 genes were screened using the XGBoost algorithm, namely SMYD4, GRPEL1, EVI5L, SHPRH, RNU1-1, RSBN1, TMEM216, UBE2Z, C11orf68, RPP14, RELCH, and FRYL (Fig. 4A). Meanwhile, the RF algorithm screened eight genes: C11orf68, RELCH, EVI5L, STYXL1, RSBN1, MYO15B, DCAF1, and GAB1 (Fig. 4B). The results of the two algorithms were then intersected to obtain four feature genes: C11orf68, RELCH, EVI5L, and RSBN1 (Fig. 4C). Subsequent expression analyses showed that C11orf68 and EVI5L expression was significantly downregulated in PCOS samples from both datasets (P < 0.05) (Fig. 5A, B). Consequently, C11orf68 and EVI5L were identified as the biomarkers in this study.
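The intersection step in Section 2.4 can be illustrated with a short sketch. It uses only the two gene lists reported above; the variable names are illustrative, and the code is not the authors' pipeline (which used R packages).

```python
# Gene lists as reported in Section 2.4 (Fig. 4A, B).
xgboost_genes = {
    "SMYD4", "GRPEL1", "EVI5L", "SHPRH", "RNU1-1", "RSBN1",
    "TMEM216", "UBE2Z", "C11orf68", "RPP14", "RELCH", "FRYL",
}
rf_genes = {
    "C11orf68", "RELCH", "EVI5L", "STYXL1",
    "RSBN1", "MYO15B", "DCAF1", "GAB1",
}

# Feature genes are those selected by both algorithms.
feature_genes = sorted(xgboost_genes & rf_genes)
print(feature_genes)  # ['C11orf68', 'EVI5L', 'RELCH', 'RSBN1']
```

The result matches the four feature genes reported in the text; only C11orf68 and EVI5L then passed the expression check in both datasets.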
Next, a diagnostic nomogram for PCOS was constructed based on C11orf68 and EVI5L (Fig. 5C). The calibration curve demonstrated the nomogram's strong predictive accuracy for PCOS (Fig. 5D), while the ROC curve supported the diagnostic value of the nomogram, with an AUC of 0.824 (Fig. 5E).

2.5 Validation of C11orf68 and EVI5L downregulation in PCOS

Bioinformatic analysis identified C11orf68 and EVI5L as significantly downregulated genes in PCOS. To validate these findings, we examined their expression levels in granulosa cells (GCs) from PCOS patients compared to healthy controls (CT). The baseline information of the women recruited in this study is shown in Table 2. Consistent with our computational predictions, qPCR analysis confirmed a significant reduction in both C11orf68 and EVI5L mRNA levels in PCOS-derived GCs (Fig. 6A, B).

Table 2 Baseline characteristics of study participants.

Basic parameters          Control (n = 20)   PCOS (n = 30)    T        P value
Age (year)                32.25 ± 4.28       32.00 ± 3.84     -0.215   0.830
BMI (kg/m²)               22.49 ± 2.17       25.11 ± 3.88     3.061    0.004
Basal serum E2 (pg/ml)    30.63 ± 10.72      36.80 ± 12.99    1.76     0.085
Basal serum P (ng/ml)     1.43 ± 3.92        0.61 ± 0.51      -0.93    0.364
Basal serum T (ng/ml)     35.57 ± 16.21      47.21 ± 20.11    2.16     0.036
AMH (ng/ml)               4.22 ± 2.38        6.41 ± 3.72      2.534    0.015
LH/FSH                    0.60 ± 0.27        1.64 ± 0.82      6.441    < 0.001

To further investigate whether androgen excess, a hallmark of PCOS, contributes to this downregulation, we treated the KGN granulosa cell line with dihydrotestosterone (DHT) to mimic hyperandrogenic conditions. DHT treatment significantly suppressed the expression of both C11orf68 and EVI5L compared to untreated KGN cells (Fig. 6C, D). These results suggest that hyperandrogenism may play a key role in the dysregulation of these genes in PCOS.
2.6 Functional and tissue-specific analysis of biomarkers

Tissue-specific analysis revealed that C11orf68 was most highly expressed in skeletal muscle and EVI5L was most highly expressed in the cerebral cortex (Fig. 7A, B). GSEA of the biomarkers revealed that the top five pathways significantly enriched for C11orf68 were 'scavenging of heme from plasma', 'biocarta ahsp pathway', 'erythrocytes take up carbon dioxide and release oxygen', 'endosomal vacuolar pathway', and 'iron metabolism disorders' (Fig. 7C). This suggests that C11orf68 may influence the onset and progression of PCOS through pathways such as oxidative stress and metabolism. In contrast, no significantly enriched pathways were found for EVI5L. Prediction of SUMO modification sites for the biomarkers detected 11 SUMO sites in EVI5L and 3 SUMO sites in C11orf68 (Table 1).

Table 1 Predicted SUMOylation sites and SUMO interaction motifs in EVI5L and C11orf68.

Gene       Position   Peptide               Score    Cut-off   Type
EVI5L      40         DELELLAKLEEQNRL       0.9406   0.82      SUMOylation
EVI5L      583–587    GRELRQRVVELETQDHIHR   0.9236   0.85      SUMO interaction
EVI5L      700        KDQIEELKAEVRLLK       0.8897   0.82      SUMOylation
EVI5L      137        SATDMPVKNQYSELL       0.8774   0.82      SUMOylation
EVI5L      202–206    YCQGSAFIVGLLLMQMPEE   0.8753   0.85      SUMO interaction
EVI5L      771        RRLERPAKDSEGSSD       0.8747   0.82      SUMOylation
EVI5L      610        ERAALQEKLQYLAAQ       0.8723   0.82      SUMOylation
EVI5L      509        AQLQEELKALKVREG       0.8578   0.82      SUMOylation
EVI5L      354        VLKAYQVKYNPKKMK       0.854    0.82      SUMOylation
EVI5L      619        QYLAAQNKGLQTQLS       0.8434   0.82      SUMOylation
EVI5L      53         RLLEADSKSMRSMNG       0.8413   0.82      SUMOylation
C11orf68   213–217    AKEGGRQVICVYTDDFTDR   0.9405   0.85      SUMO interaction
C11orf68   225–229    TDDFTDRLGVLEADSAIRA   0.8953   0.85      SUMO interaction
C11orf68   246        IKCLLTYKPDVYTYL       0.8367   0.82      SUMOylation

2.7 Infiltration of immune cells in PCOS patients

In all training set samples, the degree of infiltration of 22 immune cell types was evaluated.
This analysis revealed significantly different distributions of seven immune cell types between the PCOS and control groups (P < 0.05) (Fig. 8A, B). Specifically, PCOS samples showed higher infiltration of M2 macrophages, activated mast cells, resting NK cells, and activated CD4 memory T cells. Conversely, M0 macrophages, resting mast cells, and activated NK cells had lower infiltration in PCOS samples. Furthermore, C11orf68 showed a substantial negative correlation with both activated mast cells and resting NK cells, whereas EVI5L showed a negative correlation with activated mast cells and a positive correlation with resting mast cells (Fig. 8C). This suggests that these biomarkers may influence PCOS through interactions with the immune environment.

2.8 TF-miRNA-mRNA regulatory network construction and drug prediction

First, 25 miRNAs associated with C11orf68 and 15 miRNAs associated with EVI5L were predicted using the ENCORI and miRWalk databases; hsa-miR-331-3p, hsa-miR-671-5p, hsa-miR-34a-5p, and hsa-miR-34c-5p were associated with both biomarkers. The JASPAR database predicted 18 TFs for C11orf68 and 23 TFs for EVI5L, of which ZNF148, PATZ1, and ZNF460 were co-regulators of both genes. Based on these predictions, a TF-miRNA-mRNA regulatory network was built (Fig. 9A). In addition, 33 and 36 drugs targeting C11orf68 and EVI5L, respectively, were predicted in CTD. The drug-biomarker network shows that 13 drugs can interact with both biomarkers (Fig. 9B). Among them, benzo(a)pyrene, valproic acid, bisphenol A, and cisplatin may be promising for the treatment of PCOS and its related complications.

2.9 T cells could be key cells in PCOS

The pre- and post-QC results of the single-cell dataset are shown in Fig. S1A and Fig. S1B.
The top 2,000 highly variable genes (HVGs) were identified as the focus of subsequent analyses (Fig. S1C). After PCA dimensionality reduction, the top 30 principal components (PCs) were selected for downstream analysis (Fig. S1D, E). UMAP clustering grouped the cells into 10 clusters, which were annotated by marker genes into seven cell types (Fig. 10A): T cells, NK cells, macrophages, neutrophils, smooth muscle cells, endothelial cells, and granulosa cells (GCs) (Fig. 10B, C). We also assessed the proportions of the seven cell types in PCOS and normal samples (Fig. 10D, E); all cell types except NK cells differed significantly between PCOS and normal samples. Comparing biomarker expression across cell types in PCOS and normal samples (Fig. 10F, G) showed that EVI5L did not differ significantly in any cell type, whereas the mean expression of C11orf68 differed significantly in all cell types except macrophages. Based on these results, and because a critical function of T cells in PCOS has been documented in the literature, T cells were used as the key cells in this study for subsequent analyses.

Pseudotime inference of the key cells showed that cells differentiated over time from right to left, and five cell groups mapped to different stages of differentiation (Fig. 11A, B, C). We also observed the trend of biomarker expression in key cells along pseudotime (Fig. 11D): expression of both C11orf68 and EVI5L rose and then plateaued as the cells differentiated.

Compared with the control group, the number and strength of communications between cells were increased in the PCOS group (Fig. 12A-D). T cells communicated with other cells, such as NK cells, endothelial cells, and smooth muscle cells, in significantly increased numbers and strength.
In the control group, the ligand-receptor pair with the strongest communication probability was SEMA3A − (NRP1 + PLXNA3) (Fig. 12E). In the PCOS group, the ligand-receptor pair with the strongest communication probability was MDK − NCL (Fig. 12F).

3 Materials and methods

3.1 Data source

Transcriptome and single-cell datasets pertaining to PCOS were obtained from the Gene Expression Omnibus (GEO) database (https://www.ncbi.nlm.nih.gov/gds). The GSE84958 dataset (GPL16791 platform), which included subcutaneous adipose tissue samples from 15 PCOS patients and 23 controls [14], served as the training set for this study. The GSE43264 dataset (GPL15362 platform), consisting of subcutaneous adipose tissue samples from 8 PCOS patients and 7 controls [14], was used as the validation set. The single-cell dataset GSE240688 (GPL24676 platform) contained ovarian granulosa cell samples from 3 PCOS patients and 3 controls. In addition, 50 bacterial lipopolysaccharide-related genes (BLRGs) were retrieved from the Comparative Toxicogenomics Database (CTD) (http://ctdbase.org/) using the keyword 'lipopolysaccharide' and included in this study.

3.2 Identification of differentially expressed genes (DEGs)

Differential expression analysis was performed using the DESeq2 package (v 1.34.0) [15] to identify DEGs (|log2FC| > 2 and adjusted P value < 0.05). The ggplot2 package (v 3.4.1) [16] was used to draw a volcano plot of the DEGs; the DEGs were sorted by log2FC, and the top 10 up- and downregulated genes in the disease group were labelled on the plot. Subsequently, an expression heatmap was drawn for the top 10 up- and downregulated DEGs using the pheatmap package (v 1.0.12) [17].

3.3 Weighted gene co-expression network analysis (WGCNA)

The expression matrix of samples within the training set was subjected to WGCNA using the WGCNA package (v 1.71) [18].
Hierarchical clustering based on Euclidean distance was conducted to identify potential outlier samples for removal. The optimal soft threshold was then selected so that the constructed network better conformed to a scale-free topology. A hierarchical clustering tree of genes was generated by computing the adjacency between genes, calculating the similarity between genes from the adjacency, and deriving the dissimilarity between genes. Then, following the standard hybrid dynamic tree-cutting algorithm, the minimum number of genes per module was set to 50, and MEDissThres was set to 0.2 to merge similar modules. Using 'PCOS and control' as the traits, Spearman's correlation analysis between the module eigengenes and the traits was performed using the psych package (v 2.4.3) [19], with a threshold of |r| > 0.3 and a P value < 0.05, and a heatmap was created to visualize the correlations. The modules with the strongest positive and negative correlations with PCOS were selected as the key modules. Lastly, genes within these key modules were further filtered using thresholds of module membership (|MM|) greater than 0.3 and gene significance (|GS|) greater than 0.3. These filtered genes were defined as key modular genes associated with PCOS.

3.4 Consensus clustering analysis and identification of differentially expressed BLRGs (DE-BLRGs)

In summary, our investigation reveals that lipopolysaccharide (LPS)-associated mechanisms form an intricate regulatory network that plays a pivotal role in the initiation and progression of PCOS. These interconnected mechanisms, spanning immune activation, metabolic dysfunction, and endocrine disruption, provide valuable insights into the pathophysiological basis of PCOS. The elucidation of these pathways not only advances our comprehension of the molecular etiology of PCOS but also highlights potential therapeutic targets for clinical intervention.
Nevertheless, given the current limitations in research methodologies and technological capabilities, conducting exhaustive experimental investigations remains challenging. In subsequent investigations, we will maintain our focus on elucidating the precise roles of these mechanisms in PCOS pathogenesis, with particular emphasis on their potential as diagnostic biomarkers and therapeutic targets.

3.5 Identification and functional enrichment analysis of candidate genes

Candidate genes were obtained by taking the intersection of DEGs, key modular genes, and DE-BLRGs using the VennDiagram package (v 1.7.1) [20]. Gene Ontology (GO) analysis, covering biological processes (BP), cellular components (CC), and molecular functions (MF), and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis were then carried out on the candidate genes using the clusterProfiler package (v 4.7.1.003) [21] (p value < 0.05).

3.6 Recognition of feature genes using machine learning algorithms

After identifying the candidate genes, two machine learning algorithms, XGBoost and RF, were used to screen the feature genes. The XGBoost model was constructed using the XGBoost package (v 1.6.2.1) [22], and genes were selected according to their feature importance ranking. The RF model was built with the randomForest package (v 4.7-1) [23], and the top 10 genes ranked by MeanDecreaseAccuracy were intersected with the top 10 genes ranked by MeanDecreaseGini. The genes screened by the two algorithms were then intersected and displayed in a Venn diagram using the ggvenn package (v 1.7.3) [24], and the intersecting genes were used as feature genes.

3.7 Identification of biomarkers and construction of nomogram

The expression of these feature genes was analyzed in PCOS and control samples from the training and validation sets.
Feature genes that showed significant differences (p < 0.05) and consistent expression trends in both datasets were selected as biomarkers for this study. Next, a nomogram was created based on the biomarkers in all samples of the training set using the rms package (v 6.5-1) [25]. To evaluate the nomogram's predictive performance, we constructed calibration curves; an ideal model has a calibration curve with a slope approaching 1, indicating high prediction accuracy. In addition, a receiver operating characteristic (ROC) curve was generated with the pROC package (v 1.18.0) [26] to quantify the diagnostic performance of the nomogram (AUC > 0.7).

3.8 Tissue-specific enrichment analysis of biomarkers

Tissue-specific enrichment of biomarkers was performed in the training set using the "multi-gene query" function provided by The Human Protein Atlas (https://www.proteinatlas.org). The transcripts per million values of the genes in different tissues indicate the expression level of each biomarker in each tissue.

3.9 Gene enrichment analysis

Next, in the training dataset, Gene Set Enrichment Analysis (GSEA) was performed to further explore the pathways significantly enriched by the biomarkers. First, "c2.cp.v2023.2.Hs.symbols.gmt" from the MSigDB database was chosen as the reference gene set. Samples were divided into high- and low-expression groups according to each biomarker's expression level, and the resulting ranked gene list was tested against the reference gene set by GSEA using the clusterProfiler package (FDR < 0.05 and p < 0.05). The top 5 enriched pathways of the biomarkers were shown in order of significance.

3.10 Immune infiltration analysis

To explore immune infiltration in PCOS, the CIBERSORT algorithm was employed to estimate the abundance of 22 infiltrating immune cell types in the PCOS and normal groups of the training set, and samples with p > 0.05 were excluded.
The Wilcoxon test was then used to identify the immune cells that differed notably (p < 0.05) between the PCOS and normal groups, and the results were presented in a box plot using the ggplot2 package. In addition, Spearman correlation analysis between the differential immune cells and the biomarkers was carried out using the psych package.

3.11 Construction of molecular regulatory network

To explore the molecular regulatory mechanisms of the biomarkers, their upstream miRNAs were predicted using the ENCORI (https://starbase.sysu.edu.cn/) and miRWalk (http://mirwalk.umm.uni-heidelberg.de/) databases, and the intersection of the two predictions was taken as the final miRNA set. The transcription factors (TFs) regulating the biomarkers were then predicted using the JASPAR database (https://jaspar.elixir.no/). Finally, the TF-miRNA-mRNA regulatory network was visualized with Cytoscape software.

3.12 Small ubiquitin-like modifier (SUMO) and drug prediction analyses of biomarkers

To explore the SUMO modification sites of the biomarkers, we first searched for the proteins corresponding to the biomarkers in NCBI (https://www.ncbi.nlm.nih.gov/) and retrieved their FASTA files. The FASTA sequences were then entered into the GPS-SUMO 2.0 database (https://sumo.biocuckoo.cn/) to obtain the SUMO interaction motifs and SUMO consensus sites of the biomarkers at the protein level. To further explore potential therapeutic agents for PCOS, drugs or molecular compounds that potentially interact with the biomarkers were predicted using the CTD, and these interactions were visualized with Cytoscape software.
3.13 Quality control (QC) of the scRNA-seq dataset

We next examined the molecular regulatory mechanisms of PCOS at the cellular level. The scRNA-seq analysis, including QC and statistical analysis, was performed using the Seurat package (v5.0.1) [27]. The QC criteria were as follows: (a) genes detected in fewer than three cells were excluded; (b) cells with fewer than 200 or more than 6,000 detected genes were excluded; and (c) cells in which mitochondrial genes accounted for more than 20% of expression were excluded. Violin plots of nFeature_RNA, nCount_RNA, and percent.mt before and after QC were drawn using the ggplot2 package.

3.14 Dimensional reduction and clustering

After filtering out the cells and genes that did not meet the criteria, the data were normalized with the NormalizeData function using the "LogNormalize" method and scale.factor = 10000. The FindVariableFeatures function was then used to extract the genes with the highest variability among cells, and the top 2,000 highly variable genes (HVGs) were retained for subsequent analysis. The LabelPoints function was used to label the top 10 most variable genes in the resulting plot. The JackStrawPlot and ElbowPlot functions were then used to decide which principal components (PCs) to retain for downstream analysis.

3.15 Cellular annotation

To determine the identity of the cell clusters in the single-cell dataset GSE240688, the PCs selected in the previous step were first embedded using uniform manifold approximation and projection (UMAP).
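As an illustration of the QC thresholds in Section 3.13 and the log-normalization in Section 3.14, the following is a minimal pure-Python sketch rather than the Seurat implementation used in the study. It assumes that mitochondrial genes carry an "MT-" prefix and that the thresholds are inclusive; the function names and the toy matrix are hypothetical.

```python
import math

def qc_filter(counts, gene_names, min_cells=3, min_genes=200,
              max_genes=6000, max_mt_pct=20.0):
    """counts: one row per gene, each row a list of raw counts per cell."""
    n_cells = len(counts[0])
    # (a) drop genes detected in fewer than min_cells cells
    kept = [(g, row) for g, row in zip(gene_names, counts)
            if sum(1 for c in row if c > 0) >= min_cells]
    genes = [g for g, _ in kept]
    rows = [row for _, row in kept]
    keep_cells = []
    for j in range(n_cells):
        col = [row[j] for row in rows]
        n_detected = sum(1 for c in col if c > 0)
        total = sum(col)
        # Assumption: mitochondrial genes are identified by an "MT-" prefix.
        mt = sum(c for g, c in zip(genes, col) if g.startswith("MT-"))
        mt_pct = 100.0 * mt / total if total > 0 else 0.0
        # (b) detected-gene bounds and (c) mitochondrial percentage cap
        keep_cells.append(min_genes <= n_detected <= max_genes
                          and mt_pct <= max_mt_pct)
    filtered = [[c for c, k in zip(row, keep_cells) if k] for row in rows]
    return filtered, genes, keep_cells

def log_normalize(counts, scale_factor=10000):
    """Per cell: divide by total counts, multiply by scale_factor, take log1p."""
    n_cells = len(counts[0])
    totals = [sum(row[j] for row in counts) for j in range(n_cells)]
    return [[math.log1p(row[j] / totals[j] * scale_factor)
             for j in range(n_cells)] for row in counts]

# Toy data (hypothetical), with thresholds scaled down to fit a 4 x 5 matrix.
genes = ["MT-CO1", "GENE_A", "GENE_B", "GENE_C"]
toy = [[1, 1, 50, 1, 1],
       [10, 10, 10, 10, 0],
       [10, 10, 10, 0, 0],
       [0, 0, 0, 0, 5]]
filtered, kept_genes, keep_cells = qc_filter(toy, genes, min_cells=3,
                                             min_genes=2, max_genes=3)
normalized = log_normalize(filtered)
```

In the toy data, GENE_C is dropped because it is detected in only one cell, the third cell is dropped for its high mitochondrial fraction, and the fifth cell is dropped for having too few detected genes.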
The PCA-reduced data were clustered using the FindClusters function in the Seurat package (resolution = 0.4) to identify cell clusters and determine their number. Cell types were then annotated based on marker genes, and histograms were plotted to show the proportions of each cell type in the different samples. The cell types with significant differences were determined using the chi-square test. The expression patterns of the two previously identified biomarkers in different cell types were further compared between PCOS and normal samples. In this study, in conjunction with the published literature [28], the cell type that showed remarkable differences in the mean expression of the two biomarkers between samples was defined as the key cell.

3.16 Analyses of pseudo-time and cellular communication

To explore the temporal dynamics of gene expression during key cell state changes in the single-cell dataset GSE240688, single-cell pseudotime trajectories were constructed using the Monocle2 package (v 2.26.0) [29]; all cells within a cell population were projected onto a root and multiple branches, and a single-cell trajectory map was constructed through dimensionality reduction and clustering. To investigate the interactions among all annotated cell types, cell-cell communication networks were analyzed in the PCOS and control samples of the GSE240688 dataset using the CellChat package (v 1.6.1) [30], and the ligand-receptor pair interactions were visualized.

3.17 Cell line culture

The human granulosa-like tumour cell line KGN was purchased from Wuhan Punosai Biotechnology Company (Hubei, China). Cells were cultured in DMEM/F12 (GIBCO, Carlsbad, CA, USA) supplemented with 10% foetal bovine serum (CellMax, Lanz Hangzhou, China) and 1% penicillin-streptomycin (BOSTER, Hangzhou, China).
Cells were maintained at 37°C in a humidified atmosphere containing 5% CO₂. To establish an in vitro granulosa cell model of PCOS, KGN cells were treated with 500 nM dihydrotestosterone (DHT) for 24 hours.

3.18 Patients and sample acquisition

The study on primary granulosa cells (GCs) was approved by the medical ethics committee of Shanxi Provincial People's Hospital (Approval No. V1.02025818). All participating patients attended the Reproductive Medicine Centre and provided informed consent. Between January and April 2025, 30 women recently diagnosed with PCOS and 15 infertile women with normal ovulatory menstrual cycles were enrolled in the study. Inclusion criteria were age 20–40 years, body mass index (BMI) 20–28 kg/m², infertility duration exceeding 1 year, and no use of hormonal medications within the past 3 months. Patients with a history of thyroid dysfunction, diabetes, Cushing’s syndrome, hyperprolactinemia, cardiovascular disease, or androgen-secreting tumours were excluded. A gonadotropin-releasing hormone antagonist protocol was used for controlled ovarian hyperstimulation. When two or more follicles reached a diameter of ≥ 18 mm, 0.1 mg of triptorelin acetate (FERRING, Wittland, Germany) and 2,000–5,000 IU of human chorionic gonadotropin (hCG) were administered. Follicular aspiration was performed 36 hours later. The follicular aspirate from each patient was pooled and centrifuged at 2,500 rpm for 10 minutes; the supernatant was removed, and the pellet was resuspended in PBS. The suspension was slowly added onto Ficoll-Paque (Cytiva, 17144002) and centrifuged, and the white flocculent material in the intermediate layer was resuspended in 1 mL PBS, mixed thoroughly, and centrifuged again, after which the supernatant was removed. After treatment with red blood cell lysis buffer and trypsin, the cells were resuspended in 3 mL of culture medium and seeded into a 6 cm culture dish.
The cells were cultured in DMEM/F12 supplemented with 5% FBS and antibiotics (100 U/mL penicillin, 0.1 mg/mL streptomycin; Gibco, USA) at 37°C. After 48 hours, the cells were collected and RNA was extracted.

3.19 RNA extraction and quantitative real-time PCR (RT-qPCR)

RNA was extracted from KGN cells using an RNA extraction kit (Mei5bio, MF036). The purity and quantity of the extracted RNA were assessed using an ND-2000 spectrophotometer (Thermo, USA). Reverse transcription was performed using a commercial kit (PrimeScript™ RT Kit with gDNA Eraser, Takara, Japan), which includes a step to remove potential genomic DNA contamination. The mRNA expression levels of the differentially expressed genes were quantified by PCR on the CFX96 real-time PCR system (Bio-Rad) with a commercial kit (TB Green Premix Ex Taq II Fast qPCR (2X); Takara, Dalian, China). Each PCR reaction (25 µL) contained 12.5 µL SYBR Green (CN830S, Takara), 10 ng cDNA, and 400 nmol/L specific primers. The primer sequences for each gene are listed in Table 3. The PCR programme consisted of 2 minutes at 95°C followed by 40 cycles of amplification; the amplification programme comprised 30 seconds of activation at 95°C followed by 40 cycles of 5 seconds at 95°C and 10 seconds at 60°C. The quality of the primers and of the reaction was assessed by a final melting curve analysis. The β-Actin gene was used as the reference gene. mRNA expression levels were calculated using the 2^(−ΔΔCT) method.

Table 3 Sequence of primers used in the study.
Gene name | Forward primer | Reverse primer
C11orf68 | 5′-TGTCTACACCTACCTGGGCA-3′ | 5′-GTCAGTTCCACGTTGTTGGC-3′
EVI5L | 5′-TCCTCCGCCTCCTCCAACC-3′ | 5′-GCCGCCATTCCTCCCACTC-3′
β-Actin | 5′-GCTCTGGCTCCTAGCACCAT-3′ | 5′-GCCACCGATCCACACAGAGT-3′

3.20 Statistical Analysis

R software (v4.2.2) was used for the bioinformatic analyses. The Wilcoxon test was used to evaluate group differences.
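The 2^(−ΔΔCT) calculation used for the RT-qPCR data in Section 3.19 can be made explicit with a short sketch. The Ct values below are invented for illustration and are not data from this study, and the function name `relative_expression` is hypothetical. For each sample, ΔCt = Ct(target) − Ct(β-Actin); ΔΔCt = ΔCt(PCOS or DHT-treated) − ΔCt(control); the relative expression is 2 raised to −ΔΔCt.

```python
def relative_expression(ct_target_case, ct_ref_case, ct_target_ctrl, ct_ref_ctrl):
    """Relative expression of a target gene by the 2^(-ddCt) method."""
    # Normalise the target gene to the reference gene within each group
    delta_case = ct_target_case - ct_ref_case
    delta_ctrl = ct_target_ctrl - ct_ref_ctrl
    # Compare the case group with the control group
    delta_delta = delta_case - delta_ctrl
    return 2.0 ** (-delta_delta)

# Hypothetical mean Ct values (illustration only, not study data)
fold = relative_expression(
    ct_target_case=26.0, ct_ref_case=18.0,   # target gene and reference gene, case group
    ct_target_ctrl=25.0, ct_ref_ctrl=18.0,   # target gene and reference gene, control group
)
print(fold)
```

In this toy case the target gene crosses the threshold one cycle later in the case group after normalisation, so the relative expression is 0.5; a value below 1 corresponds to downregulation relative to the control.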
Statistical significance was defined as P < 0.05. Clinical data were analysed using SPSS 29.0 software (SPSS, Inc., Chicago, IL, USA). Results are expressed as mean ± standard error. Categorical data are presented as frequencies and percentages, and intergroup comparisons were performed using the chi-square test. Granulosa cell expression data obtained by RT-qPCR were analysed using t-tests in GraphPad Prism version 8.0.2.

4 Discussion

Intestinal-derived LPS has been shown to be a pathophysiological nexus linking low-grade systemic inflammation, insulin resistance induced by serine phosphorylation of insulin receptor substrate-1, and the clinical manifestations of PCOS [ 31 ]. However, how LPS affects the development of PCOS at the genetic level is unclear. Bioinformatics offers a practical way to study and predict the role of LPS in PCOS. In this study, we based our analysis on LPS-related genes curated from CTD and PCOS-related datasets from the GEO database. We then used bioinformatic methods to analyze the biological pathways in which the biomarkers participate and to explore their immune microenvironment, potential regulatory mechanisms, and related drugs. EVI5L has been shown to bind and activate the small GTPase Rab10, which modulates the sustained replenishment of Toll-like receptor 4 (TLR4) from the Golgi to the plasma membrane, a prerequisite for optimal macrophage activation after LPS stimulation [ 32 ]. C11orf68 (chromosome 11 open reading frame 68) is a relatively recently identified human gene located on chromosome 11. It has been found to be upregulated in human cancer samples and associated with cell invasion [ 33 ].
We hypothesized that this gene may influence ovarian development and function by regulating the cell cycle; however, the biological function of C11orf68 under LPS stimulation is unknown. We combined the two screened biomarkers with the risk score, and the calibration curve showed only a small error between the actual and predicted PCOS risk (P = 0.817). To further assess the prognosis of PCOS patients using the differential genes, we evaluated the risk model built from the key genes by constructing ROC curves based on diagnostic coefficients and gene expression levels. The results showed that the key genes, C11orf68 and EVI5L, predicted the prognosis of PCOS more accurately than in previous studies [ 34 ]. To delineate the pathophysiological roles of the candidate biomarkers in PCOS progression, gene set enrichment analysis (GSEA) was used to explore the potential mechanisms of C11orf68 and EVI5L. C11orf68 was significantly associated with iron metabolism disorders, the Reactome pathway "scavenging of heme from plasma", and other pathways in PCOS, whereas no significant enrichment was found for EVI5L. In patients with PCOS, hyperandrogenism is associated with abnormal ferritin levels, and studies have found a negative correlation between serum ferritin and testosterone levels [ 35 ]. Clinical evidence further supports the dysregulation of iron metabolism in PCOS. A case-control study by Liu et al. involving 149 PCOS patients and 108 healthy controls demonstrated significantly elevated serum ferritin levels in the PCOS cohort, independent of obesity status (p < 0.01) [ 36 ]. This persistent iron overload phenotype predisposes PCOS patients to oxidative stress-mediated cellular damage through multiple mechanisms: (1) excessive generation of reactive oxygen species (ROS) via Fenton reactions, (2) disruption of cellular redox homeostasis, and (3) induction of lipid peroxidation cascades.
These pathological processes ultimately culminate in cellular membrane destabilization and programmed cell death, contributing to the systemic manifestations of PCOS [ 37 ]. Meanwhile, free heme is an abundant reservoir of ferrous iron (Fe(II)), which drives the Fenton reaction, a process that produces ROS; free heme therefore has profound cytotoxic effects. Heme oxygenase (HO) is the rate-limiting enzyme in heme catabolism, and a recent study demonstrated that aberrant Nrf2/HO-1 signaling promotes the development of PCOS and leads to pregnancy loss in gestating PCOS rats [ 38 ], whereas activation of normal Nrf2/HO-1 signaling can have a protective effect by reducing oxidative stress. In conclusion, C11orf68 may be involved in the disease process of PCOS through the above pathways, which provides new insight into the pathogenesis of PCOS. Bioinformatic screening identified C11orf68 and EVI5L as significantly downregulated genes in PCOS. This computational prediction was subsequently confirmed by qPCR analysis of human granulosa cells (GCs) obtained from PCOS patients, which showed marked reductions in both genes compared with controls. The concordance between our in silico predictions and the clinical sample analysis supports the reliability of our bioinformatic approach and the biological relevance of these findings. To further investigate the potential mechanisms underlying this dysregulation, we used the KGN granulosa cell line treated with dihydrotestosterone (DHT), a well-established in vitro model for studying androgen excess in PCOS. DHT treatment recapitulated the expression patterns observed in clinical samples, with both C11orf68 and EVI5L showing significant downregulation.
This parallel between patient-derived data and experimental models supports our findings in two ways. First, the consistency across different experimental systems (clinical samples and cell lines) enhances the robustness of our conclusions. Second, the androgen responsiveness of these genes suggests that they may mediate some of the hyperandrogenic effects characteristic of PCOS. Through systematic analysis of the GSE84958 dataset, we characterized immune cell infiltration patterns in PCOS by quantifying the relative abundance of 22 distinct immune cell populations. Our investigation revealed significant alterations in six specific immune cell subtypes: M0 macrophages, M2 macrophages, activated and resting NK cells, activated and resting mast cells, and activated memory CD4+ T cells. This profiling of immune cell heterogeneity provides insight into the immunopathological mechanisms underlying PCOS. Trigunaite et al. reported that androgens exhibit anti-inflammatory properties and can inhibit immune cell function [ 39 ]. In PCOS, hyperandrogenemia may exacerbate chronic inflammatory processes by modulating macrophage population density and phenotype. Notably, elevated ratios of M1 to M2 macrophages have been documented in the ovarian tissue of female rat models of PCOS exposed to 5α-dihydrotestosterone (DHT) [ 40 ]. Estrogen contributes to the modulation of macrophage immune phenotypes via estrogen receptor alpha (ERα), which governs metabolic reprogramming in macrophages and facilitates their coordination with diverse activated signaling pathways across heterogeneous microenvironments. Under supraphysiological estrogen concentrations, the LPS-binding capacity of macrophages is enhanced, which may lead to a more severe inflammatory response following intestinal microecological disturbances.
Furthermore, obesity, a prevalent phenotypic manifestation of PCOS, has been mechanistically linked to immune dysregulation and chronic low-grade inflammation. Trim et al. documented substantial numbers of neutrophils, pro-inflammatory M1 macrophages, and T lymphocytes within adipose tissue [ 41 ]. These aggregated immune effector cells release the pro-inflammatory signaling molecules tumor necrosis factor-alpha, interleukin-6, and interleukin-8, which subsequently activate the nuclear factor kappa-light-chain-enhancer of activated B cells (NF-κB) signaling cascade, thereby perpetuating a persistent systemic inflammatory condition. Our investigation revealed that C11orf68 was negatively correlated with resting NK cells and activated mast cells, whereas EVI5L was positively correlated with resting mast cells and negatively correlated with activated mast cells. These results suggest that the biomarkers may be associated with the promotion or inhibition of immune cell infiltration and thereby influence PCOS disease progression. To explore the molecular regulatory mechanisms of the key genes, we predicted their upstream miRNAs; among these, hsa-miR-331-3p is predicted to participate in the cytosolic stress response and in metabolic pathways, especially nucleotide metabolism. This suggests that the function of hsa-miR-331-3p may be directly or indirectly related to the pathophysiology of PCOS. hsa-miR-331-3p has been reported to prevent neuron-associated inflammation, and its knockdown decreases neuronal viability and promotes the expression of pro-inflammatory cytokines [ 42 ]. In addition, hsa-miR-331-3p expression levels are associated with polyunsaturated fatty acids other than linoleic acid (LA), so it may affect the metabolic syndrome by regulating lipid metabolism [ 43 ].
We then predicted the transcription factors regulating the key genes and found 18 transcription factors associated with C11orf68 and 23 associated with EVI5L. Notably, multiple polymorphisms within the locus of the transcription factor ZNF148 have shown genetic linkage to fasting insulin concentrations, glycemic parameters, and insulin resistance indices. Counterintuitively, ZNF148 transcriptional activity is inversely related to glucose-stimulated insulin secretion (GSIS), suggesting that ZNF148 may be a new therapeutic target for enhancing insulin secretion [ 44 ]. The transcription factor PATZ1 promotes adipogenesis by interacting with the promoter regions of key early adipogenic factors, and knockdown of PATZ1 in adipose tissue protects mice from obesity [ 45 ]. In summary, the miRNAs and TFs predicted above can affect insulin levels and obesity and may thereby contribute to PCOS through the regulation of the biomarkers. In our study, the mean expression of the two key genes differed significantly in all cell types except macrophages, and T cells were identified as the key cells through single-cell analysis. Dysregulation of immunological mechanisms is a critical factor in the pathophysiology of PCOS. Nevertheless, the phenotypic profile of T-lymphocyte subpopulations in individuals with PCOS remains inadequately characterized. Individuals with PCOS exhibit a persistent subclinical inflammatory state characterized by elevated leukocyte concentrations, vascular endothelial impairment, and disturbances in pro-inflammatory cytokines [ 46 ]. Substantial infiltration of immunocompetent effector populations, comprising T cells, B cells, macrophages, and dendritic cells, has been detected in human preovulatory follicles [ 47 ].
Accumulating evidence from immunological studies has established the critical involvement of T lymphocytes in the pathogenesis of PCOS. As the main component of the lymphocyte population, T lymphocytes have a variety of biological functions and are mainly involved in the cellular immune response. They can kill target cells directly or enhance and expand the immune effect by releasing lymphokines [ 48 ]. In addition, dysregulation of T lymphocytes and antigen-presenting cells has been found in the follicular fluid microenvironment of patients with PCOS, resulting in a markedly pro-inflammatory environment, as evidenced by an increase in reactive oxygen species, an accumulation of lipid peroxidation by-products, and an impaired antioxidant defense capacity [ 49 ]. The diagnosis and treatment of PCOS continue to present opportunities and challenges. Although research on lipopolysaccharide (LPS) in PCOS pathogenesis is at an early stage, emerging evidence suggests that it is a promising avenue of investigation. The multifaceted role of LPS in immune modulation and metabolic regulation warrants further exploration in the context of PCOS pathophysiology. Future research directions may encompass: (1) the development of microbial-derived biomarkers for PCOS diagnosis, (2) the identification of LPS-mediated therapeutic targets, and (3) the advancement of immunotherapeutic strategies targeting LPS signaling pathways. These potential applications, though currently preliminary, represent significant opportunities for advancing both the understanding and the clinical management of PCOS.

5 Conclusion

In summary, our investigation reveals that lipopolysaccharide (LPS)-associated mechanisms form an intricate regulatory network that plays a pivotal role in the initiation and progression of PCOS.
These interconnected mechanisms, spanning immune activation, metabolic dysfunction, and endocrine disruption, provide valuable insights into the pathophysiological basis of PCOS. The elucidation of these pathways not only advances our comprehension of the molecular etiology of PCOS but also highlights potential therapeutic targets for clinical intervention. Nevertheless, given the current limitations in research methodologies and technological capabilities, conducting exhaustive experimental investigations remains challenging. In subsequent investigations, we will continue to focus on elucidating the precise roles of these mechanisms in PCOS pathogenesis, with particular emphasis on their potential as diagnostic biomarkers and therapeutic targets.

Declarations

The study protocol was approved by the medical ethics committee of Shanxi Provincial People's Hospital (V1.02025818). All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Declaration of Helsinki. All patients participating in this study signed informed consent forms.

Competing Interests. The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Funding. The research reported in this project was supported by the Shanxi Provincial Central Guidance Local Science and Technology Development Project under grant agreement number YDZJSX2022A069. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Author Contribution. Yang Li: Conceptualization, Data curation, Validation, Visualization, Writing–original draft, Writing–review & editing; Chunmei Bai: Data curation, Validation, Visualization, Writing–review & editing; Xumin Zhang: Validation, Writing–review & editing; Haixia Song: Visualization, Writing–review & editing; Caixia Yuan: Conceptualization, Supervision, Writing–review & editing; Ziwei Huang: Conceptualization, Supervision, Writing–review & editing; Jianrong Liu: Conceptualization, Project administration, Supervision, Writing–review & editing. All authors read and approved the final manuscript.

Acknowledgements. We would like to express our sincere gratitude to all individuals and organizations who supported and assisted us throughout this research. Special thanks to the following authors: Chunmei Bai, Xumin Zhang, Haixia Song, Caixia Yuan, Ziwei Huang, Jianrong Liu. Without your support, this research would not have been possible.

Data Availability. Microarray data in this work are available in the GEO online database (http://www.ncbi.nlm.nih.gov/geo).

References

1. Polycystic ovary syndrome - PubMed [Internet]. [cited 2024 Sept 10]. Available from: https://pubmed.ncbi.nlm.nih.gov/35934017/
2. Criteria, phenotypes and prevalence of polycystic ovary syndrome - PubMed [Internet]. [cited 2024 Sept 10]. Available from: https://pubmed.ncbi.nlm.nih.gov/31089072/
3. Asunción M, Calvo RM, San Millán JL, Sancho J, Avila S, Escobar-Morreale HF. A Prospective Study of the Prevalence of the Polycystic Ovary Syndrome in Unselected Caucasian Women from Spain. The Journal of Clinical Endocrinology & Metabolism [Internet]. 2000 [cited 2024 Sept 10];85:2434–8. Available from: https://doi.org/10.1210/jcem.85.7.6682
4. Khan MJ, Ullah A, Basit S. Genetic Basis of Polycystic Ovary Syndrome (PCOS): Current Perspectives. Appl Clin Genet. 2019;12:249–60.
5. Rotterdam ESHRE/ASRM-Sponsored PCOS consensus workshop group. Revised 2003 consensus on diagnostic criteria and long-term health risks related to polycystic ovary syndrome (PCOS). Hum Reprod. 2004;19:41–7.
6. A brief insight into the etiology, genetics, and immunology of polycystic ovarian syndrome (PCOS) - PubMed [Internet]. [cited 2024 Dec 18]. Available from: https://pubmed.ncbi.nlm.nih.gov/36190593/
7. Boomsma CM, Eijkemans MJC, Hughes EG, Visser GHA, Fauser BCJM, Macklon NS. A meta-analysis of pregnancy outcomes in women with polycystic ovary syndrome. Hum Reprod Update. 2006;12:673–83.
8. Roos N, Kieler H, Sahlin L, Ekman-Ordeberg G, Falconer H, Stephansson O. Risk of adverse pregnancy outcomes in women with polycystic ovary syndrome: population based cohort study. BMJ. 2011;343:d6309.
9. Lerchbaum E, Schwetz V, Giuliani A, Obermayer-Pietsch B. Influence of a positive family history of both type 2 diabetes and PCOS on metabolic and endocrine parameters in a large cohort of PCOS women. European Journal of Endocrinology [Internet]. 2014 [cited 2024 Sept 12];170:727–39. Available from: https://doi.org/10.1530/EJE-13-1035
10. Wang X, Quinn PJ. Lipopolysaccharide: Biosynthetic pathway and structure modification. Prog Lipid Res. 2010;49:97–107.
11. Tremellen K, Pearce K. Dysbiosis of Gut Microbiota (DOGMA)--a novel theory for the development of Polycystic Ovarian Syndrome. Med Hypotheses. 2012;79:104–12.
12. Guerville M, Boudry G. Gastrointestinal and hepatic mechanisms limiting entry and dissemination of lipopolysaccharide into the systemic circulation. Am J Physiol Gastrointest Liver Physiol. 2016;311:G1–15.
13. Nip KM, Chiu R, Yang C, Chu J, Mohamadi H, Warren RL, et al. RNA-Bloom enables reference-free and reference-guided sequence assembly for single-cell transcriptomes. Genome Res. 2020;30:1191–200.
14. Wang M, An K, Huang J, Mprah R, Ding H. A novel model based on necroptosis to assess progression for polycystic ovary syndrome and identification of potential therapeutic drugs. Front Endocrinol (Lausanne). 2023;14:1193992.
15. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2 - PubMed [Internet]. [cited 2025 Feb 21]. Available from: https://pubmed.ncbi.nlm.nih.gov/25516281/
16. Gustavsson EK, Zhang D, Reynolds RH, Garcia-Ruiz S, Ryten M. ggtranscript: an R package for the visualization and interpretation of transcript isoforms using ggplot2. Bioinformatics. 2022;38:3844–6.
17. Gu Z, Hübschmann D. Make Interactive Complex Heatmaps in R. Bioinformatics. 2022;38:1460–2.
18. Langfelder P, Horvath S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics. 2008;9:559.
19. Orifjon S, Jammatov J, Sousa C, Barros R, Vasconcelos O, Rodrigues P. Translation and Adaptation of the Adult Developmental Coordination Disorder/Dyspraxia Checklist (ADC) into Asian Uzbekistan. Sports (Basel). 2023;11:135.
20. VennDiagram: a package for the generation of highly-customizable Venn and Euler diagrams in R - PubMed [Internet]. [cited 2025 Feb 21]. Available from: https://pubmed.ncbi.nlm.nih.gov/21269502/
21. Chen H, Boutros PC. VennDiagram: a package for the generation of highly-customizable Venn and Euler diagrams in R. BMC Bioinformatics. 2011;12:35.
22. Hou N, Li M, He L, Xie B, Wang L, Zhang R, et al. Predicting 30-days mortality for MIMIC-III patients with sepsis-3: a machine learning approach using XGboost. J Transl Med. 2020;18:462.
23. Predicting Pressure Injury in Critical Care Patients: A Machine-Learning Model - PubMed [Internet]. [cited 2025 Feb 21]. Available from: https://pubmed.ncbi.nlm.nih.gov/30385537/
24. Ferroptosis and Autophagy-Related Genes in the Pathogenesis of Ischemic Cardiomyopathy - PubMed [Internet]. [cited 2025 Feb 21]. Available from: https://pubmed.ncbi.nlm.nih.gov/35845045/
25. Sachs MC. plotROC: A Tool for Plotting ROC Curves. J Stat Softw. 2017;79:2.
26. pROC: an open-source package for R and S+ to analyze and compare ROC curves - PubMed [Internet]. [cited 2025 Feb 21]. Available from: https://pubmed.ncbi.nlm.nih.gov/21414208/
27. Spatial reconstruction of single-cell gene expression data - PubMed [Internet]. [cited 2025 Feb 21]. Available from: https://pubmed.ncbi.nlm.nih.gov/25867923/
28. Disturbed Follicular Microenvironment in Polycystic Ovary Syndrome: Relationship to Oocyte Quality and Infertility - PubMed [Internet]. [cited 2025 Feb 21]. Available from: https://pubmed.ncbi.nlm.nih.gov/38375912/
29. The dynamics and regulators of cell fate decisions are revealed by pseudotemporal ordering of single cells - PubMed [Internet]. [cited 2025 Feb 21]. Available from: https://pubmed.ncbi.nlm.nih.gov/24658644/
30. Jin S, Guerrero-Juarez CF, Zhang L, Chang I, Ramos R, Kuan C-H, et al. Inference and analysis of cell-cell communication using CellChat. Nat Commun. 2021;12:1088.
31. Li W, Hakkak R. Soy Protein Concentrate Diets Inversely Affect LPS-Binding Protein Expression in Colon and Liver, Reduce Liver Inflammation, and Increase Fecal LPS Excretion in Obese Zucker Rats. Nutrients. 2024;16:982.
32. Wang D, Lou J, Ouyang C, Chen W, Liu Y, Liu X, et al. Ras-related protein Rab10 facilitates TLR4 signaling by promoting replenishment of TLR4 onto the plasma membrane. Proceedings of the National Academy of Sciences of the United States of America [Internet]. 2010 [cited 2024 Oct 30];107:13806. Available from: https://pmc.ncbi.nlm.nih.gov/articles/PMC2922283/
33. Cheishvili D, Stefanska B, Yi C, Li CC, Yu P, Arakelian A, et al. A common promoter hypomethylation signature in invasive breast, liver and prostate cancer cell lines reveals novel targets involved in cancer invasiveness. Oncotarget [Internet]. 2015 [cited 2024 Oct 30];6:33253. Available from: https://pmc.ncbi.nlm.nih.gov/articles/PMC4741763/
34. Wang M, An K, Huang J, Mprah R, Ding H. A novel model based on necroptosis to assess progression for polycystic ovary syndrome and identification of potential therapeutic drugs. Front Endocrinol (Lausanne). 2023;14:1193992.
35. 1,25-Dihydroxyvitamin D3 alleviates hyperandrogen-induced ferroptosis in KGN cells - PubMed [Internet]. [cited 2024 Nov 8]. Available from: https://pubmed.ncbi.nlm.nih.gov/36884209/
36. Liu M, Wu K, Wu Y. The emerging role of ferroptosis in female reproductive disorders. Biomed Pharmacother. 2023;166:115415.
37. Ferroptosis: mechanisms, biology and role in disease - PubMed [Internet]. [cited 2024 Nov 11]. Available from: https://pubmed.ncbi.nlm.nih.gov/33495651/
38. Wang Y, Li N, Zeng Z, Tang L, Zhao S, Zhou F, et al. Humanin regulates oxidative stress in the ovaries of polycystic ovary syndrome patients via the Keap1/Nrf2 pathway. Mol Hum Reprod. 2021;27:gaaa081.
39. Suppressive effects of androgens on the immune system - PubMed [Internet]. [cited 2024 Nov 19]. Available from: https://pubmed.ncbi.nlm.nih.gov/25708485/
40. Lima PDA, Nivet A-L, Wang Q, Chen Y-A, Leader A, Cheung A, et al. Polycystic ovary syndrome: possible involvement of androgen-induced, chemerin-mediated ovarian recruitment of monocytes/macrophages. Biol Reprod. 2018;99:838–52.
41. Trim WV, Lynch L. Immune and non-immune functions of adipose tissue leukocytes. Nat Rev Immunol. 2022;22:371–86.
42. Liu Q, Lei C. Neuroprotective effects of miR-331-3p through improved cell viability and inflammatory marker expression: Correlation of serum miR-331-3p levels with diagnosis and severity of Alzheimer’s disease. Exp Gerontol. 2021;144:111187.
43. Raitoharju E, Seppälä I, Oksala N, Lyytikäinen L-P, Raitakari O, Viikari J, et al. Blood microRNA profile associates with the levels of serum lipids and metabolites associated with glucose metabolism and insulin resistance and pinpoints pathways underlying metabolic syndrome: the cardiovascular risk in Young Finns Study. Mol Cell Endocrinol. 2014;391:41–9.
44. de Klerk E, Xiao Y, Emfinger CH, Keller MP, Berrios DI, Loconte V, et al. Loss of ZNF148 enhances insulin secretion in human pancreatic β cells. JCI Insight [Internet]. 2023 [cited 2024 Nov 22];8:e157572. Available from: https://pmc.ncbi.nlm.nih.gov/articles/PMC10393241/
45. Patel S, Ganbold K, Cho CH, Siddiqui J, Yildiz R, Sparman N, et al. Transcription factor PATZ1 promotes adipogenesis by controlling promoter regulatory loci of adipogenic factors. Nat Commun. 2024;15:8533.
46. Petríková J, Lazúrová I, Yehuda S. Polycystic ovary syndrome and autoimmunity. Eur J Intern Med. 2010;21:369–71.
47. Li N, Wang X, Wang X, Yu H, Lin L, Sun C, et al. Upregulation of FoxO 1 Signaling Mediates the Proinflammatory Cytokine Upregulation in the Macrophage from Polycystic Ovary Syndrome Patients. Clin Lab. 2017;63:301–11.
48. Detection of T lymphocyte subsets and related functional molecules in follicular fluid of patients with polycystic ovary syndrome - PubMed [Internet]. [cited 2024 Nov 26]. Available from: https://pubmed.ncbi.nlm.nih.gov/30988342/
49. Dai M, Hong L, Yin T, Liu S. Disturbed Follicular Microenvironment in Polycystic Ovary Syndrome: Relationship to Oocyte Quality and Infertility. Endocrinology. 2024;165:bqae023.

Additional Declarations

No competing interests reported.

Supplementary Files

floatimage13.png

Fig. S1. Quality control and feature selection in single-cell RNA sequencing analysis. (A) Elbow plot showing the standard deviation of principal components (PCs). The "elbow" point indicates the optimal number of PCs for downstream analysis. (B) Feature selection plot displaying genes ranked by standardized variance (y-axis) and average expression (x-axis). Highly variable genes (e.g., ITM2A, CXCL8) are highlighted, with 2,000 selected for further analysis. (C) Violin plots before QC showing the distribution of detected genes (nFeature_RNA) and total UMIs (nCount_RNA) per cell. (D) JackStraw plot assessing the significance of PCs, with dashed lines indicating statistically meaningful components. (E) Violin plots after QC displaying filtered metrics (nFeature_RNA, nCount_RNA, mitochondrial percentage).
(scRNA-seq: single-cell RNA sequencing; UMAP: uniform manifold approximation and projection)

© Research Square 2026 | ISSN 2693-5015 (online)
06:49:14","extension":"png","order_by":16,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":88425,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage1.png","url":"https://assets-eu.researchsquare.com/files/rs-7357877/v1/876279fc0f488585c6dc308f.png"},{"id":94761184,"identity":"d5b3dd43-cc0c-496c-9386-5d26cd158eb7","added_by":"auto","created_at":"2025-10-30 12:07:16","extension":"png","order_by":17,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":95795,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage10.png","url":"https://assets-eu.researchsquare.com/files/rs-7357877/v1/d3ef202c6281fde71dba6768.png"},{"id":94761198,"identity":"abcca8f4-7d9e-46cc-aa9e-ec76863d8f09","added_by":"auto","created_at":"2025-10-30 12:07:16","extension":"png","order_by":18,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":193145,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage11.png","url":"https://assets-eu.researchsquare.com/files/rs-7357877/v1/e57c9e4496d995cf093831e8.png"},{"id":94824066,"identity":"6799e359-42cd-4765-bae2-c254304896a6","added_by":"auto","created_at":"2025-10-31 06:48:25","extension":"png","order_by":19,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":329349,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage12.png","url":"https://assets-eu.researchsquare.com/files/rs-7357877/v1/98c6ba573232adcfa9b9f92c.png"},{"id":94761210,"identity":"25808134-9763-432b-afbd-8545a5d70716","added_by":"auto","created_at":"2025-10-30 
12:07:16","extension":"png","order_by":20,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":224150,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage13.png","url":"https://assets-eu.researchsquare.com/files/rs-7357877/v1/5dd0fa751e66062102ba55b0.png"},{"id":94761195,"identity":"46ebe178-3b05-4722-af97-3b15e4fa7370","added_by":"auto","created_at":"2025-10-30 12:07:16","extension":"png","order_by":21,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":153033,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage2.png","url":"https://assets-eu.researchsquare.com/files/rs-7357877/v1/67eea73c745d2ad518cfc62a.png"},{"id":94824483,"identity":"e8386081-17e6-4f43-b35c-a525ce2638b6","added_by":"auto","created_at":"2025-10-31 06:49:02","extension":"png","order_by":22,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":231065,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage3.png","url":"https://assets-eu.researchsquare.com/files/rs-7357877/v1/db4fa8d8cb9d7a9513c08819.png"},{"id":94823723,"identity":"4c0019eb-171b-42c5-9891-f79d071ef550","added_by":"auto","created_at":"2025-10-31 06:47:54","extension":"png","order_by":23,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":172109,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage4.png","url":"https://assets-eu.researchsquare.com/files/rs-7357877/v1/11988f6d097f2e496ba0ca63.png"},{"id":94824565,"identity":"e758dacd-606d-4e7e-9afa-1b660cd3cf23","added_by":"auto","created_at":"2025-10-31 
06:49:07","extension":"png","order_by":24,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":159122,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage5.png","url":"https://assets-eu.researchsquare.com/files/rs-7357877/v1/10875b7dcfa37ae765840264.png"},{"id":94761206,"identity":"983ba78c-2d56-4f10-86e7-92665a7ee702","added_by":"auto","created_at":"2025-10-30 12:07:16","extension":"png","order_by":25,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":115137,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage6.png","url":"https://assets-eu.researchsquare.com/files/rs-7357877/v1/c7e249d072fddf388bc4d3c8.png"},{"id":94761204,"identity":"491f73e3-0449-40f0-9f63-d8d7f4361802","added_by":"auto","created_at":"2025-10-30 12:07:16","extension":"png","order_by":26,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":158516,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage7.png","url":"https://assets-eu.researchsquare.com/files/rs-7357877/v1/177dde1666de490010af871c.png"},{"id":94761200,"identity":"4bd1e583-ec43-433c-9719-7dbc10807c0d","added_by":"auto","created_at":"2025-10-30 12:07:16","extension":"png","order_by":27,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":54457,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage8.png","url":"https://assets-eu.researchsquare.com/files/rs-7357877/v1/b1e0ef882c59354454a10d3d.png"},{"id":94823967,"identity":"eb58fa9b-b718-411d-a295-5347504800eb","added_by":"auto","created_at":"2025-10-31 
06:48:20","extension":"png","order_by":28,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":126412,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage9.png","url":"https://assets-eu.researchsquare.com/files/rs-7357877/v1/5c446454bec1b7dfb198f36a.png"},{"id":94823555,"identity":"a9cd04ee-173c-490b-8edd-0bb5dd34071a","added_by":"auto","created_at":"2025-10-31 06:47:37","extension":"xml","order_by":29,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":151586,"visible":true,"origin":"","legend":"","description":"","filename":"f5c76e5666ac4ed799628e2040c237d61structuring.xml","url":"https://assets-eu.researchsquare.com/files/rs-7357877/v1/acfc3c1197da4dc003781976.xml"},{"id":94761208,"identity":"9e383474-5a60-4ace-b04f-dd78f5af6e76","added_by":"auto","created_at":"2025-10-30 12:07:16","extension":"html","order_by":30,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":166214,"visible":true,"origin":"","legend":"","description":"","filename":"earlyproof.html","url":"https://assets-eu.researchsquare.com/files/rs-7357877/v1/9e9daffc3eceb496cdb29a3f.html"},{"id":94823358,"identity":"5e42407a-2854-4b47-bb3b-cf4f6115c762","added_by":"auto","created_at":"2025-10-31 06:47:13","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":482772,"visible":true,"origin":"","legend":"\u003cp\u003eTranscriptomic and co-expression network analysis reveals key modules and associations.(A) Volcano plot displaying differentially expressed genes (DEGs) between experimental groups. Points represent genes with adjusted P-values (log10-transformed) versus log2 fold change. Highlighted genes (e.g., LINC02668, ATP2B3, OLR1) are labeled. The \"DOWN NOT UP\" category indicates downregulated genes.(B) Heatmap of gene expression distribution across groups (CT, PCOS). Rows represent genes (e.g., METTL27, WNT3, CSMD3), and columns denote samples. 
Color intensity reflects normalized expression levels.(C) Hierarchical clustering dendrogram of samples to identify outliers. Branch height reflects dissimilarity.(D) Scale independence and mean connectivity analysis for weighted gene co-expression network construction. Left: Scale-free fit index (R²) versus soft thresholding power. Right: Mean connectivity versus power. Optimal power (dotted line) balances network connectivity and scale-free topology.(E) Cluster dendrogram of genes grouped into co-expression modules, colored by assigned module identity (e.g., turquoise, salmon).(F) Module-group relationships. Heatmap shows correlation coefficients (and P-values) between modules (rows) and traits (columns, e.g., CT, PCOS). Key modules (e.g., MEpink, MEsalmon) show significant associations (P \u0026lt; 0.05).(G–H) Scatterplots of module membership versus gene significance for the salmon (G) and pink (H) modules. Correlation coefficients (cor) and significance (P) indicate strong module-trait relationships. Gene significance color scale: 0.2 (low) to 1.0 (high).\u003c/p\u003e","description":"","filename":"floatimage1.png","url":"https://assets-eu.researchsquare.com/files/rs-7357877/v1/10a733e5e00b9832e3fac1a3.png"},{"id":94761181,"identity":"1431c439-668f-43ef-958c-b3cce9ca91ed","added_by":"auto","created_at":"2025-10-30 12:07:16","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":391619,"visible":true,"origin":"","legend":"\u003cp\u003eConsensus clustering and differential expression analysis.(A) Consensus cumulative distribution function (CDF) plot for evaluating clustering stability across different cluster numbers (k). The x-axis represents the consensus index (ranging 0–1), and the y-axis shows the cumulative distribution. Curves flattening at higher k indicate optimal cluster stability.(B) Delta area analysis to determine the optimal number of clusters. 
The delta area (y-axis) quantifies the relative change in CDF area between consecutive k values. A significant drop (e.g., at k=2) suggests the most stable clustering solution.(C) Consensus matrix heatmap for k=2, visualizing sample similarity. Rows and columns represent samples; color intensity reflects consensus values (0–1, white to dark blue), with higher values indicating stronger co-clustering agreement.(D) Volcano plot of differentially expressed genes (DEGs) between groups. Dots represent genes with log2 fold change (x-axis) versus −log10 adjusted P-values (y-axis). Thresholds: |log2FC| ≥ 2.0 and P-adjust \u0026lt; 0.05. Highlighted genes (e.g., LBHD2, COLSA1, FOX11) are labeled. Red/blue dots denote upregulated (n=3182) and downregulated (n=380) genes, respectively.\u003c/p\u003e","description":"","filename":"floatimage2.png","url":"https://assets-eu.researchsquare.com/files/rs-7357877/v1/75a22e2a9aafe870bf5c16ed.png"},{"id":94823496,"identity":"437f758e-f396-4b73-a5f2-37f9e7fd3d70","added_by":"auto","created_at":"2025-10-31 06:47:30","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":778392,"visible":true,"origin":"","legend":"\u003cp\u003eFunctional enrichment and pathway analysis of key gene sets.(A) Overlap analysis of differentially expressed genes (DEGs), WGCNA modules, and BLRGs (biological process-related genes). Numbers indicate overlapping gene counts (with percentages). A total of 2,220 genes were identified as BLRGs, with 1,198 (64.2%) overlapping DEGs and 1,162 (11.3%) overlapping WGCNA modules.(B) Enriched pathways related to insulin signaling and actin dynamics. Bar plot displays pathways (e.g., \"cellular response to insulin stimulus,\" \"regulation of Arp2/3-mediated actin nucleation\") and associated genes (e.g., TBC1D4, PIK3C2A). Size scale reflects gene count or enrichment significance (2.0–4.0).(C) Molecular function enrichment highlighting RNA-binding and oxidoreductase activities. 
Key terms include \"pre-mRNA binding\" (e.g., STRBP) and \"2-oxoglutarate-dependent dioxygenase activity\" (e.g., OAS1). Size scale (2.00–3.00) indicates functional enrichment strength. (D) Cellular component and complex associations, including \"neuromuscular junction\" (e.g., LAMA5) and chromatin-modifying complexes (e.g., MLL1/2 complex). Size values (2.00–3.00) denote pathway relevance.(E) Enriched metabolic and splicing pathways, such as \"Glycerophospholipid metabolism\" (e.g., DGKD) and \"Spliceosome\" (e.g., U2AF1). Size scale (2.0–4.0) represents pathway significance.\u003c/p\u003e","description":"","filename":"floatimage3.png","url":"https://assets-eu.researchsquare.com/files/rs-7357877/v1/387d1598cba227c43bd6b75a.png"},{"id":94823447,"identity":"74d60d15-d3e4-481b-8df1-978c1e2f0af2","added_by":"auto","created_at":"2025-10-31 06:47:24","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":330565,"visible":true,"origin":"","legend":"\u003cp\u003eA Machine learning-based feature importance and model comparison.(A) XGBoost feature importance analysis. Genes (e.g., SMYD4, GRPEL1, SHPRH) are ranked by their \"Gain\" scores (x-axis), reflecting their contribution to improving model accuracy during decision tree splits.(B) Random Forest feature importance assessed by two metrics: MeanDecreaseAccuracy (left, impact on model accuracy) and MeanDecreaseGini (right, impact on node purity). Top-ranking genes (e.g., C11orf68, RELCH, SMYD4) are labeled. 
Values on the x-axis indicate importance magnitude.(C) Venn diagram of the intersection for the results of two algorithms.\u003c/p\u003e","description":"","filename":"floatimage4.png","url":"https://assets-eu.researchsquare.com/files/rs-7357877/v1/6da2f91d1d3babcb3dd946a7.png"},{"id":94761191,"identity":"e71aa8eb-54ec-47a2-bd39-c2e9e7d2a553","added_by":"auto","created_at":"2025-10-30 12:07:16","extension":"png","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":349007,"visible":true,"origin":"","legend":"\u003cp\u003eRisk model development and validation for polycystic ovary syndrome (PCOS).(A) Cluster analysis of CT and PCOS groups. Bars represent clusters, with \"*\" indicating statistical significance between groups. (B)Differential expression of key genes (e.g., C11orf68, RELCH, EVI5L) in CT vs. PCOS groups. (C) Risk score distribution for PCOS based on gene signatures. Points represent individual samples; higher \"Total Points\" (x-axis) correlate with increased predicted risk (y-axis). Genes such as C11orf68 and EVI5L contribute to scoring.(D) Hosmer–Lemeshow goodness-of-fit test for the logistic regression model (p=0.817). (E) ROC curve demonstrating model performance (AUC=0.824). \u003cem\u003e(CT: control; AUC: area under curve)\u003c/em\u003e\u003c/p\u003e","description":"","filename":"floatimage5.png","url":"https://assets-eu.researchsquare.com/files/rs-7357877/v1/032083e8d26da7ed97be3fa3.png"},{"id":94761180,"identity":"3729e85c-9bb3-4d26-842e-e55f96f8b698","added_by":"auto","created_at":"2025-10-30 12:07:16","extension":"png","order_by":6,"title":"Figure 6","display":"","copyAsset":false,"role":"figure","size":380839,"visible":true,"origin":"","legend":"\u003cp\u003eRelative mRNA expression levels of C11orf68 and EV15L in KGN cells and human granulosa cells (GCs). 
(A)Expression in human granulosa cells (GCs):EVISL and C11orf68 show significantly lower expression in the PCOS group compared to the CT group (***p \u0026lt; 0.001). (B) Validation in KGN cell model: Consistent with the findings in primary GCs, C11orf68 and EV15L show similar expression trends in KGN cells(****p \u0026lt; 0.001).\u003c/p\u003e","description":"","filename":"floatimage6.png","url":"https://assets-eu.researchsquare.com/files/rs-7357877/v1/7c4ea71751e18f379cb4daec.png"},{"id":94824580,"identity":"3d1d20f1-04ea-489b-9628-e64fe21c561e","added_by":"auto","created_at":"2025-10-31 06:49:08","extension":"png","order_by":7,"title":"Figure 7","display":"","copyAsset":false,"role":"figure","size":405836,"visible":true,"origin":"","legend":"\u003cp\u003eExternal validation and pathway enrichment analysis.(A) Expression profile of C11orf68 in the HPA dataset. Each organ is represented by a distinct colored bar, with the intensity of the color indicating the level of expression. C11orf68 was expressed at the highest level in skeletal muscle.(B) Expression profile of EVI5L in the HPA dataset. (C) GSEA for C11orf68. The running enrichment score plot (top) reflects cumulative enrichment (y-axis) across ranked genes (x-axis, 0–20,000). Negative scores indicate pathway suppression. 
Significantly enriched pathways (e.g., \"BIOCARTA_AHSP_PATHWAY,\" \"REACTOME_ERYTHROCYTES_TAKE_UP_CARBON_DIOXIDE\") are labeled, highlighting roles in iron metabolism and cellular transport processes.\u003cem\u003e (HPA: Human Protein Atlas; GSEA: Gene Set Enrichment Analysis)\u003c/em\u003e\u003c/p\u003e","description":"","filename":"floatimage7.png","url":"https://assets-eu.researchsquare.com/files/rs-7357877/v1/f57c78b8b13ff2982c328c98.png"},{"id":94761186,"identity":"469e52b0-243e-4a93-b6ab-74fb30dac04f","added_by":"auto","created_at":"2025-10-30 12:07:16","extension":"png","order_by":8,"title":"Figure 8","display":"","copyAsset":false,"role":"figure","size":290810,"visible":true,"origin":"","legend":"\u003cp\u003eImmune cell composition and gene expression in CT vs PCOS.(A) Immune cell proportions showing significant differences between CT and PCOS groups (bar plot).(B) Temporal changes in immune cell composition (2007-2521 samples), highlighting distinct patterns between groups (line plot).(C) Correlation heatmap of key genes (C11orf68, EVI5L) with specific immune cell types.\u003c/p\u003e\n\u003cp\u003e\u003cem\u003e(CT: control group; PCOS: polycystic ovary syndrome)\u003c/em\u003e\u003c/p\u003e","description":"","filename":"floatimage8.png","url":"https://assets-eu.researchsquare.com/files/rs-7357877/v1/c42fe98a9a6e9acc1d6d0fa1.png"},{"id":94761189,"identity":"4d79806b-1029-47fd-9821-347c6844ab8f","added_by":"auto","created_at":"2025-10-30 12:07:16","extension":"png","order_by":9,"title":"Figure 9","display":"","copyAsset":false,"role":"figure","size":753297,"visible":true,"origin":"","legend":"\u003cp\u003eRegulatory networks and drug interactions for C11orf68 and EVI5L.(A) TF-miRNA-mRNA regulatory network for C11orf68 and EVI5L. 
(B) Drug-biomarker interaction network.\u003c/p\u003e","description":"","filename":"floatimage9.png","url":"https://assets-eu.researchsquare.com/files/rs-7357877/v1/907f52d6e1c007cdc69395c3.png"},{"id":94824809,"identity":"3f9cc9e3-cf1f-455b-b23b-4d393ae548c3","added_by":"auto","created_at":"2025-10-31 06:49:20","extension":"png","order_by":10,"title":"Figure 10","display":"","copyAsset":false,"role":"figure","size":461493,"visible":true,"origin":"","legend":"\u003cp\u003eSingle-cell RNA sequencing analysis of immune and stromal cell populations in normal and PCOS samples. (A) UMAP visualization of single-cell clusters.(B) UMAP plot annotated by cell type. Distinct cell populations are spatially separated, highlighting cellular heterogeneity.(C) Cell type composition analysis. Bar plots show the relative proportions of cell types in normal and PCOS samples. Percentages reflect cell type abundance.(D) Comparison of cell type proportions between normal and PCOS groups. Key differences include increased macrophages (40.65% vs. 44.05%) and decreased endothelial cells (13.37% vs. 18.99%) in PCOS.(E) Statistical comparison of cell type percentages between normal and PCOS groups. Asterisks denote significant differences (*p \u0026lt; 0.05, **p \u0026lt; 0.01, ***p \u0026lt; 0.001, ns = not significant).(F) Expression profile of C1tor68 gene across different cell types..(G) Expression pattern of EVISL gene in major ovarian cell populations. 
\u003cem\u003e(UMAP: Uniform Manifold Approximation and Projection; GC: granulosa cells)\u003c/em\u003e\u003c/p\u003e","description":"","filename":"floatimage10.png","url":"https://assets-eu.researchsquare.com/files/rs-7357877/v1/58297a0807d4ef12e75e04b1.png"},{"id":94823682,"identity":"2590788e-d378-4b76-a3b9-c0752789bc37","added_by":"auto","created_at":"2025-10-31 06:47:49","extension":"png","order_by":11,"title":"Figure 11","display":"","copyAsset":false,"role":"figure","size":594453,"visible":true,"origin":"","legend":"\u003cp\u003ePseudotime analysis of cellular trajectories and gene expression dynamics.(A) Pseudotime trajectory analysis. Cells are projected onto a two-dimensional space (Component 1 vs. Component 2), with pseudotime progression indicated by color gradients. The trajectory represents inferred cellular states or transitions.(B) Central clusters identified in the pseudotime analysis. Cells are grouped into central clusters based on their position in the trajectory, highlighting key transitional states.(C) State transitions along pseudotime. Cells are colored by inferred states, with arrows indicating potential differentiation or transition paths. Component 1 and Component 2 represent the primary dimensions of variation.(D) Relative gene expression dynamics along pseudotime. Line plots show the expression levels of selected genes (y-axis) as a function of pseudotime (x-axis). 
Key genes exhibit dynamic expression patterns, suggesting roles in cellular differentiation or state transitions.\u003c/p\u003e","description":"","filename":"floatimage11.png","url":"https://assets-eu.researchsquare.com/files/rs-7357877/v1/9b38c017394e5f3bf9343119.png"},{"id":94761192,"identity":"510da80e-6bac-4c18-bce3-062a645e5654","added_by":"auto","created_at":"2025-10-30 12:07:16","extension":"png","order_by":12,"title":"Figure 12","display":"","copyAsset":false,"role":"figure","size":1940195,"visible":true,"origin":"","legend":"\u003cp\u003eCell-cell interaction analysis in Normal and PCOS conditions.(A, B) Network analysis showing the \u0026nbsp;interactions between smooth muscle cells and other immune/stromal cell types in\u003cstrong\u003e \u003c/strong\u003enormal samples.(C,D) Corresponding interaction number\u003cstrong\u003e \u003c/strong\u003e(C) and weight (D) in PCOS samples, highlighting altered connectivity (e.g., reduced neutrophil interactions in(PCOS).(E,F) Bubble plots of ligand-receptor pairs in normal (E) and PCOS (F), with significant interactions labeled. 
Dot size/color indicates communication probability and p-value.\u003c/p\u003e","description":"","filename":"floatimage12.png","url":"https://assets-eu.researchsquare.com/files/rs-7357877/v1/ee1741737e47da4f5c87feee.png"},{"id":104401302,"identity":"a5723b59-45bf-4ba8-a76c-df0f25b3222f","added_by":"auto","created_at":"2026-03-11 12:12:19","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":7334310,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-7357877/v1/88ea41f2-d6f3-489c-a25a-5d3cd2b5836c.pdf"},{"id":94761182,"identity":"fefe2524-efb5-444b-9e5d-a197af22be85","added_by":"auto","created_at":"2025-10-30 12:07:16","extension":"png","order_by":1,"title":"","display":"","copyAsset":false,"role":"supplement","size":1049385,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eFig.S 1.\u003c/strong\u003e Quality control and feature selection in single-cell RNA sequencing analysis.(A) Elbow plot showing the standard deviation of principal components (PCs). The \"elbow\" point indicates the optimal number of PCs for downstream analysis.(B) Feature selection plot displaying genes ranked by standardized variance (y-axis) and average expression (x-axis). 
Highly variable genes (e.g., ITM2A, CXCL8) are highlighted, with 2,000 selected for further analysis.(C) Violin plots before QC showing the distribution of detected genes (nFeature_RNA) and total UMIs (nCount_RNA) per cell.(D) JackStraw plot assessing the significance of PCs, with dashed lines indicating statistically meaningful components.(E) Violin plots after QC displaying filtered metrics (nFeature_RNA, nCount_RNA, mitochondrial percentage).\u003cem\u003e (scRNA-seq: single-cell RNA sequencing; UMAP: uniform manifold approximation and projection)\u003c/em\u003e\u003c/p\u003e","description":"","filename":"floatimage13.png","url":"https://assets-eu.researchsquare.com/files/rs-7357877/v1/311717fbd351f4e7418fd586.png"}],"financialInterests":"No competing interests reported.","formattedTitle":"Exploring the mechanism of bacterial lipopolysaccharide-related genes involved in polycystic ovary syndrome and its significance in diagnosis","fulltext":[{"header":"1 Introduction","content":"\u003cp\u003ePolycystic ovary syndrome (PCOS) is the most common endocrine disorder affecting women of childbearing age, with effects throughout the life cycle from puberty to postmenopause, and associated features include irregular or absent menstruation, hyperandrogenemia, and related metabolic and psychological sequelae [\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e].It is the most common cause of anovulatory infertility [\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e]. It affects 5\u0026ndash;10% of women of reproductive age, only in the United States 5\u0026nbsp;million women are affected, with a total of 105\u0026nbsp;million women affected worldwide[\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e]. 
PCOS is characterized by a wide range of clinical phenotypes of unknown etiology and complex pathogenesis, including hypothalamic and ovarian dysfunction, excessive androgen exposure, insulin resistance and obesity-related mechanisms, and genetic predisposition. The diagnosis of PCOS is heterogeneous and there is no single criterion[\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e]. Currently follows the Rotterdam criteria as stated in the 2003 Rotterdam Consensus Conference, whose core features are hyperandrogenemia and polycystic ovarian morphology (PCOM), and the diagnosis requires the fulfillment of two of the three features (hyperandrogenemia, menstrual disorders, and PCOM), and the exclusion of secondary etiologies (e.g. adult-onset congenital adrenocortical hyperplasia, hyperprolactinemia, and androgen-secreting tumors) [\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e]. Additionally, there is a close association between PCOS and the development of a variety of diseases. For example, ovulation dysfunction and lack of cyclic progesterone secretion in PCOS patients, the endometrium is stimulated by high estrogen for a long period of time, and the endometrium continues to proliferate, which is prone to hyperplasia, abnormal hyperplasia or even atypical hyperplasia and endometrial cancer[\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e]. PCOS is considered an important risk factor for Type 2 diabetes, cardiovascular disease, gestational diabetes, preeclampsia, preterm labor and gestational hypertension[\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e]. A population-based study of single births among 3,787 women with PCOS and more than 1\u0026nbsp;million women without PCOS in Sweden from 1995 to 2007 proved this point[\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e]. 
Emerging evidence from epidemiological and genetic studies indicates that the phenotypic expression of PCOS is modulated by a complex interplay of genetic predisposition and environmental determinants[\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e]. Accordingly, the study of corresponding biomarkers in PCOS is highly necessary to explore the prevention and early diagnosis of PCOS.\u003c/p\u003e\u003cp\u003eLipopolysaccharide (LPS), a structurally complex amphipathic molecule, constitutes the fundamental architectural component of the outer membrane in Gram-negative bacteria.LPS consists of three parts: lipid A, core polysaccharide and O-antigen repeats.Lipid A is the bioactive component of LPS[\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e]. Tremellen et al. postulated the gut-microbiota-dysbiosis hypothesis as a potential mechanism underlying the pathogenesis of polycystic ovary syndrome[\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e]. In PCOS, there are degrees of intestinal dysbiosis, and the impaired intestinal microecology leads to increased intestinal permeability and Gram-negative bacterial lipopolysaccharide (LPS)-associated endotoxemia. When LPS production exceeds hepatic capacity, intestinal-sourced LPS enters the systemic circulation and induces the activation of Toll Like Receptor 4(TLR4)-mediated inflammatory pathways, leading to persistent chronic low-grade inflammation, insulin resistance, and ultimately exacerbation of PCOS [\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e]. 
Therefore, LPS plays an important role in the disease process of PCOS, but the LPS-related genes that contribute to these PCOS have not been fully elucidated, prompting us to further investigate LPS-related biomarkers.\u003c/p\u003e\u003cp\u003eSingle-cell RNA sequencing (scRNA-seq) refers to a high-throughput sequencing method for studying the transcriptome of individual cells. Different from RNA sequencing of bulk samples (RNA-seq), scRNA-seq detects transcriptome heterogeneity of specific cells, suggesting dynamic changes in cellular status during disease processes[\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e].\u003c/p\u003e\u003cp\u003eIn this study, we screened and identified LPS-related biomarkers in PCOS based on the PCOS transcriptome data from public databases, and explored the mechanism of biomarkers in PCOS using a series of bioinformatics analyses, as well as verified the expression levels and changes of biomarkers at the single-cell level, with a view to providing theoretical reference bases for clinical diagnosis and preventive treatments of PCOS.\u003c/p\u003e"},{"header":"2 Results","content":"\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e\u003ch2\u003e2.1 Identified 8640 DEGs and 527 key modular genes associated with PCOS\u003c/h2\u003e\u003cp\u003eIn the training cohort, differential expression analysis identified 8,640 DEGs between PCOS and control samples, with 8,581 genes upregulated and 59 genes downregulated (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eA,B). The expression matrix of the samples in the training set was then subjected to WGCNA. According to cluster analysis, with no notable outliers and no need to eliminate any samples (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eC). 
The optimal soft threshold (β) was determined to be 5, at which the scale-free fit index (R\u003csup\u003e2\u003c/sup\u003e) exceeded 0.85 and the average connectivity was close to zero (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eD). A total of 16 modules were identified using the dynamic tree-cutting algorithm (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eE). Following correlation analyses, two key modules were identified: the salmon module (cor\u0026thinsp;=\u0026thinsp;\u0026minus;0.56, p\u0026thinsp;\u0026lt;\u0026thinsp;0.05) had the strongest negative correlation with PCOS, and the pink module (cor\u0026thinsp;=\u0026thinsp;0.61, p\u0026thinsp;\u0026lt;\u0026thinsp;0.05) had the strongest positive correlation. The salmon and pink modules contained 185 and 484 genes, respectively, for a total of 669 genes (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eF). Further filtering of the genes within the key modules yielded 527 key modular genes associated with PCOS (|MM|\u0026thinsp;\u0026gt;\u0026thinsp;0.3 and |GS|\u0026thinsp;\u0026gt;\u0026thinsp;0.3) (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eG, H).\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec4\" class=\"Section2\"\u003e\u003ch2\u003e2.2 Differential expression analysis between the two subtypes yielded 3,562 DE-BLRGs\u003c/h2\u003e\u003cp\u003eConsensus clustering analysis of the PCOS samples based on the 50 BLRGs showed that clustering into two subtypes (K\u0026thinsp;=\u0026thinsp;2) was the most appropriate (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eA, B, C). 
Differential expression analysis between the two subtypes identified 3,562 DE-BLRGs, of which 3,182 were up-regulated and 380 were down-regulated (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eD).\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec5\" class=\"Section2\"\u003e\u003ch2\u003e2.3 Exploring the biological functions of 67 candidate genes\u003c/h2\u003e\u003cp\u003eIntersecting the 8,640 DEGs, 527 key modular genes, and 3,562 DE-BLRGs (shown as BLRGs) yielded 67 candidate genes associated with bacterial LPS in PCOS (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eA). GO and KEGG analyses were performed on these candidate genes. Among the 70 enriched GO-BP entries, the candidate genes were significantly associated with functions such as \u0026lsquo;cellular response to insulin stimulus\u0026rsquo; and \u0026lsquo;regulation of Arp2/3 complex-mediated actin nucleation\u0026rsquo; (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eB). Among the 24 enriched GO-MF entries, the candidate genes were associated with \u0026lsquo;pre-mRNA binding\u0026rsquo; and \u0026lsquo;2-oxoglutarate-dependent dioxygenase activity\u0026rsquo; (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eC). Among the 38 enriched GO-CC entries, the candidate genes were associated with terms such as \u0026lsquo;neuromuscular junction\u0026rsquo; and \u0026lsquo;MLL1 complex\u0026rsquo; (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eD). 
Three KEGG pathways were also enriched: \u0026lsquo;spliceosome\u0026rsquo;, \u0026lsquo;phosphatidylinositol signaling system\u0026rsquo; and \u0026lsquo;glycerophospholipid metabolism\u0026rsquo; (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eE).\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec6\" class=\"Section2\"\u003e\u003ch2\u003e2.4 Biomarkers \u003cem\u003eC11orf68\u003c/em\u003e and \u003cem\u003eEVI5L\u003c/em\u003e: effective diagnostic indicators for PCOS\u003c/h2\u003e\u003cp\u003eThe XGBoost algorithm selected 12 genes, namely \u003cem\u003eSMYD4\u003c/em\u003e, \u003cem\u003eGRPEL1\u003c/em\u003e, \u003cem\u003eEVI5L\u003c/em\u003e, \u003cem\u003eSHPRH\u003c/em\u003e, \u003cem\u003eRNU1-1\u003c/em\u003e, \u003cem\u003eRSBN1\u003c/em\u003e, \u003cem\u003eTMEM216\u003c/em\u003e, \u003cem\u003eUBE2Z\u003c/em\u003e, \u003cem\u003eC11orf68\u003c/em\u003e, \u003cem\u003eRPP14\u003c/em\u003e, \u003cem\u003eRELCH\u003c/em\u003e and \u003cem\u003eFRYL\u003c/em\u003e (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003eA). Meanwhile, the random forest (RF) algorithm selected eight genes, namely \u003cem\u003eC11orf68\u003c/em\u003e, \u003cem\u003eRELCH\u003c/em\u003e, \u003cem\u003eEVI5L\u003c/em\u003e, \u003cem\u003eSTYXL1\u003c/em\u003e, \u003cem\u003eRSBN1\u003c/em\u003e, \u003cem\u003eMYO15B\u003c/em\u003e, \u003cem\u003eDCAF1\u003c/em\u003e and \u003cem\u003eGAB1\u003c/em\u003e (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003eB). 
Finally, the results of these two algorithms were intersected to obtain four feature genes: \u003cem\u003eC11orf68\u003c/em\u003e, \u003cem\u003eRELCH\u003c/em\u003e, \u003cem\u003eEVI5L\u003c/em\u003e and \u003cem\u003eRSBN1\u003c/em\u003e (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003eC).\u003c/p\u003e\u003cp\u003eSubsequent expression analyses showed that \u003cem\u003eC11orf68\u003c/em\u003e and \u003cem\u003eEVI5L\u003c/em\u003e expression was significantly downregulated in PCOS samples from both datasets (\u003cem\u003eP\u003c/em\u003e\u0026thinsp;\u0026lt;\u0026thinsp;0.05) (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003eA, B). Consequently, \u003cem\u003eC11orf68\u003c/em\u003e and \u003cem\u003eEVI5L\u003c/em\u003e were identified as the biomarkers in this study. Next, a diagnostic nomogram for PCOS was constructed based on \u003cem\u003eC11orf68\u003c/em\u003e and \u003cem\u003eEVI5L\u003c/em\u003e (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003eC). The calibration curve indicated that the nomogram predicted PCOS accurately (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003eD), while the ROC curve supported its diagnostic value, with an AUC of 0.824 (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003eE).\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec7\" class=\"Section2\"\u003e\u003ch2\u003e2.5 Validation of C11orf68 and EVI5L Downregulation in PCOS\u003c/h2\u003e\u003cp\u003eBioinformatic analysis identified C11orf68 and EVI5L as significantly downregulated genes in PCOS. To validate these findings, we compared their expression levels in granulosa cells (GCs) from PCOS patients with those from healthy controls (CT). 
The baseline characteristics of the women recruited for this study are shown in Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e2\u003c/span\u003e. Consistent with our computational predictions, qPCR analysis confirmed a significant reduction in both C11orf68 and EVI5L mRNA levels in PCOS-derived GCs (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003eA, B).\u003c/p\u003e\u003cp\u003e\u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 2\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003eBaseline characteristics of study participants.\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e\u003ccolgroup cols=\"5\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\"\u0026plusmn;\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\"\u0026plusmn;\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u003cp\u003eBasic parameters\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u003cp\u003eControl (n\u0026thinsp;=\u0026thinsp;20)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c3\"\u003e\u003cp\u003ePCOS (n\u0026thinsp;=\u0026thinsp;30)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c4\"\u003e\u003cp\u003eT\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c5\"\u003e\u003cp\u003eP value\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" 
colname=\"c1\"\u003e\u003cp\u003eAge (year)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c2\"\u003e\u003cp\u003e32.25\u0026thinsp;\u0026plusmn;\u0026thinsp;4.28\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c3\"\u003e\u003cp\u003e32.00\u0026thinsp;\u0026plusmn;\u0026thinsp;3.84\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e-0.215\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e0.830\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eBMI (kg/m\u003csup\u003e2\u003c/sup\u003e)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c2\"\u003e\u003cp\u003e22.49\u0026thinsp;\u0026plusmn;\u0026thinsp;2.17\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c3\"\u003e\u003cp\u003e25.11\u0026thinsp;\u0026plusmn;\u0026thinsp;3.88\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e3.061\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e0.004\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eBasal serum E2 (pg/ml)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c2\"\u003e\u003cp\u003e30.63\u0026thinsp;\u0026plusmn;\u0026thinsp;10.72\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c3\"\u003e\u003cp\u003e36.80\u0026thinsp;\u0026plusmn;\u0026thinsp;12.99\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e1.76\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e0.085\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" 
colname=\"c1\"\u003e\u003cp\u003eBasal serum P (ng/ml)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c2\"\u003e\u003cp\u003e1.43\u0026thinsp;\u0026plusmn;\u0026thinsp;3.92\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c3\"\u003e\u003cp\u003e0.61\u0026thinsp;\u0026plusmn;\u0026thinsp;0.51\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e-0.93\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e0.364\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eBasal serum T (ng/ml)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c2\"\u003e\u003cp\u003e35.57\u0026thinsp;\u0026plusmn;\u0026thinsp;16.21\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c3\"\u003e\u003cp\u003e47.21\u0026thinsp;\u0026plusmn;\u0026thinsp;20.11\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e2.16\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e0.036\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eAMH(ng/ml)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c2\"\u003e\u003cp\u003e4.22\u0026thinsp;\u0026plusmn;\u0026thinsp;2.38\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c3\"\u003e\u003cp\u003e6.41\u0026thinsp;\u0026plusmn;\u0026thinsp;3.72\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e2.534\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e0.015\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" 
colname=\"c1\"\u003e\u003cp\u003eLH/FSH\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c2\"\u003e\u003cp\u003e0.60\u0026thinsp;\u0026plusmn;\u0026thinsp;0.27\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c3\"\u003e\u003cp\u003e1.64\u0026thinsp;\u0026plusmn;\u0026thinsp;0.82\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e6.441\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e\u0026lt;\u0026thinsp;0.001\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/colgroup\u003e\u003c/table\u003e\u003c/div\u003e\u003c/p\u003e\u003cp\u003eTo further investigate whether androgen excess, a hallmark of PCOS, contributes to this downregulation, we treated the KGN granulosa cell line with dihydrotestosterone (DHT) to mimic hyperandrogenic conditions. DHT treatment significantly suppressed the expression of both C11orf68 and EVI5L compared with untreated KGN cells (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003eC, D). These results suggest that hyperandrogenism may play a key role in the dysregulation of these genes in PCOS.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec8\" class=\"Section2\"\u003e\u003ch2\u003e2.6 Functional and tissue-specific analysis of biomarkers\u003c/h2\u003e\u003cp\u003eTissue-specific analysis revealed that \u003cem\u003eC11orf68\u003c/em\u003e was expressed at the highest level in skeletal muscle and \u003cem\u003eEVI5L\u003c/em\u003e was expressed at the highest level in the cerebral cortex (Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e7\u003c/span\u003eA, B). 
GSEA of the biomarkers revealed that the top five pathways significantly enriched for \u003cem\u003eC11orf68\u003c/em\u003e were \u0026lsquo;scavenging of heme from plasma\u0026rsquo;, \u0026lsquo;biocarta ahsp pathway\u0026rsquo;, \u0026lsquo;erythrocytes take up carbon dioxide and release oxygen\u0026rsquo;, \u0026lsquo;endosomal vacuolar pathway\u0026rsquo; and \u0026lsquo;iron metabolism disorders\u0026rsquo; (Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e7\u003c/span\u003eC). This suggests that \u003cem\u003eC11orf68\u003c/em\u003e may influence the onset and progression of PCOS through pathways such as oxidative stress and metabolism. In contrast, no pathways were significantly enriched for \u003cem\u003eEVI5L\u003c/em\u003e. Prediction of SUMO modification sites for the biomarkers detected 11 SUMO sites in \u003cem\u003eEVI5L\u003c/em\u003e and 3 SUMO sites in \u003cem\u003eC11orf68\u003c/em\u003e (Table\u0026nbsp;\u003cspan refid=\"Tab2\" class=\"InternalRef\"\u003e1\u003c/span\u003e).\u003c/p\u003e\u003cp\u003e\u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab2\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003ePredicted SUMOylation sites and SUMO interaction motifs in EVI5L and C11orf68.\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e\u003ccolgroup cols=\"6\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" 
class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u003cp\u003eGene\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u003cp\u003ePosition\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c3\"\u003e\u003cp\u003ePeptide\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c4\"\u003e\u003cp\u003eScore\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c5\"\u003e\u003cp\u003eCut-off\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c6\"\u003e\u003cp\u003eType\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eEVI5L\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e40\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eDELELLAKLEEQNRL\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e0.9406\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e0.82\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eSUMOylation\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eEVI5L\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e583\u0026ndash;587\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eGRELRQRVVELETQDHIHR\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e0.9236\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e0.85\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eSUMO interaction\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" 
colname=\"c1\"\u003e\u003cp\u003eEVI5L\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e700\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eKDQIEELKAEVRLLK\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e0.8897\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e0.82\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eSUMOylation\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eEVI5L\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e137\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eSATDMPVKNQYSELL\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e0.8774\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e0.82\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eSUMOylation\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eEVI5L\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e202\u0026ndash;206\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eYCQGSAFIVGLLLMQMPEE\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e0.8753\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e0.85\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eSUMO interaction\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eEVI5L\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c2\"\u003e\u003cp\u003e771\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eRRLERPAKDSEGSSD\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e0.8747\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e0.82\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eSUMOylation\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eEVI5L\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e610\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eERAALQEKLQYLAAQ\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e0.8723\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e0.82\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eSUMOylation\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eEVI5L\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e509\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eAQLQEELKALKVREG\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e0.8578\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e0.82\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eSUMOylation\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eEVI5L\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e354\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c3\"\u003e\u003cp\u003eVLKAYQVKYNPKKMK\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e0.854\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e0.82\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eSUMOylation\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eEVI5L\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e619\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eQYLAAQNKGLQTQLS\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e0.8434\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e0.82\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eSUMOylation\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eEVI5L\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e53\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eRLLEADSKSMRSMNG\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e0.8413\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e0.82\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eSUMOylation\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eC11orf68\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e213\u0026ndash;217\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eAKEGGRQVICVYTDDFTDR\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" 
colname=\"c4\"\u003e\u003cp\u003e0.9405\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e0.85\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eSUMO interaction\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eC11orf68\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e225\u0026ndash;229\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eTDDFTDRLGVLEADSAIRA\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e0.8953\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e0.85\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eSUMO interaction\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eC11orf68\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e246\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eIKCLLTYKPDVYTYL\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e0.8367\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e0.82\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eSUMOylation\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/colgroup\u003e\u003c/table\u003e\u003c/div\u003e\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec9\" class=\"Section2\"\u003e\u003ch2\u003e2.7 Infiltration of immune cells in PCOS patients\u003c/h2\u003e\u003cp\u003eIn all training set samples, the degree of infiltration of 22 immune cell types was evaluated. 
This analysis revealed significantly different infiltration of seven immune cell types between the PCOS and control groups (\u003cem\u003eP\u003c/em\u003e\u0026thinsp;\u0026lt;\u0026thinsp;0.05) (Fig.\u0026nbsp;\u003cspan refid=\"Fig8\" class=\"InternalRef\"\u003e8\u003c/span\u003eA, B). Specifically, PCOS samples showed higher infiltration abundances of M2 macrophages, activated mast cells, resting NK cells, and activated memory CD4 T cells. Conversely, M0 macrophages, resting mast cells and activated NK cells had lower infiltration abundances in PCOS samples. Furthermore, \u003cem\u003eC11orf68\u003c/em\u003e was significantly negatively correlated with both activated mast cells and resting NK cells. \u003cem\u003eEVI5L\u003c/em\u003e, on the other hand, showed a negative correlation with activated mast cells and a positive correlation with resting mast cells (Fig.\u0026nbsp;\u003cspan refid=\"Fig8\" class=\"InternalRef\"\u003e8\u003c/span\u003eC). This suggests that these biomarkers may influence PCOS through interactions with the immune environment.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec10\" class=\"Section2\"\u003e\u003ch2\u003e2.8 TF-miRNA-mRNA regulatory network construction and drug prediction\u003c/h2\u003e\u003cp\u003eFirst, 25 miRNAs associated with \u003cem\u003eC11orf68\u003c/em\u003e and 15 miRNAs associated with \u003cem\u003eEVI5L\u003c/em\u003e were predicted using the ENCORI and miRWalk databases; hsa-miR-331-3p, hsa-miR-671-5p, hsa-miR-34a-5p, and hsa-miR-34c-5p were associated with both biomarkers. The JASPAR database predicted 18 TFs for \u003cem\u003eC11orf68\u003c/em\u003e and 23 TFs for \u003cem\u003eEVI5L\u003c/em\u003e, and ZNF148, PATZ1, and ZNF460 were co-regulators of \u003cem\u003eC11orf68\u003c/em\u003e and \u003cem\u003eEVI5L\u003c/em\u003e. Based on these predictions, a TF-miRNA-mRNA regulatory network was constructed (Fig.\u0026nbsp;\u003cspan refid=\"Fig9\" class=\"InternalRef\"\u003e9\u003c/span\u003eA). 
In addition, 33 and 36 drugs targeting \u003cem\u003eC11orf68\u003c/em\u003e and \u003cem\u003eEVI5L\u003c/em\u003e, respectively, were predicted in CTD. The constructed drug-biomarker network showed that 13 drugs can interact with both biomarkers (Fig.\u0026nbsp;\u003cspan refid=\"Fig9\" class=\"InternalRef\"\u003e9\u003c/span\u003eB). Among them, benzo(a)pyrene, valproic acid, bisphenol A and cisplatin may be promising for the treatment of PCOS and its related complications.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec11\" class=\"Section2\"\u003e\u003ch2\u003e2.9 T cells could be key cells in PCOS\u003c/h2\u003e\u003cp\u003eThe pre- and post-QC results of the single-cell dataset are shown in \u003cb\u003eFig. S1A\u003c/b\u003e and \u003cb\u003eFig. S1B\u003c/b\u003e. The top 2000 highly variable genes (HVGs) were then identified as the focus of subsequent analyses (\u003cb\u003eFig. S1C\u003c/b\u003e). After PCA dimensionality reduction, the top 30 principal components (PCs) were selected for downstream analysis (\u003cb\u003eFig. S1D, E\u003c/b\u003e). UMAP-based clustering grouped the cells into 10 clusters (Fig.\u0026nbsp;\u003cspan refid=\"Fig10\" class=\"InternalRef\"\u003e10\u003c/span\u003eA), which were annotated by marker genes into seven cell types, namely T cells, NK cells, macrophages, neutrophils, smooth muscle cells, endothelial cells, and GCs (Fig.\u0026nbsp;\u003cspan refid=\"Fig10\" class=\"InternalRef\"\u003e10\u003c/span\u003eB, C). We also assessed the proportions of the seven cell types in PCOS and normal samples (Fig.\u0026nbsp;\u003cspan refid=\"Fig10\" class=\"InternalRef\"\u003e10\u003c/span\u003eD, E); except for NK cells, the proportions of the other six cell types differed significantly between PCOS and normal samples. 
Comparison of biomarker expression across cell types in PCOS and normal samples (Fig.\u0026nbsp;\u003cspan refid=\"Fig10\" class=\"InternalRef\"\u003e10\u003c/span\u003eF, G) showed that \u003cem\u003eEVI5L\u003c/em\u003e expression did not differ significantly in any cell type, whereas mean expression of \u003cem\u003eC11orf68\u003c/em\u003e differed significantly in all cell types except macrophages. Based on these results, and because the critical function of T cells in PCOS has been documented in the literature, T cells were selected as the key cells for subsequent analyses. Pseudo-time inference of the key cells showed that cells differentiated over time from right to left. Five cell subgroups mapped to different stages of differentiation (Fig.\u0026nbsp;\u003cspan refid=\"Fig11\" class=\"InternalRef\"\u003e11\u003c/span\u003eA, B, C). We also examined the trend of biomarker expression in the key cells along pseudotime (Fig.\u0026nbsp;\u003cspan refid=\"Fig11\" class=\"InternalRef\"\u003e11\u003c/span\u003eD): expression of both \u003cem\u003eC11orf68\u003c/em\u003e and \u003cem\u003eEVI5L\u003c/em\u003e rose and then plateaued as the cells differentiated.\u003c/p\u003e\u003cp\u003eCompared with the control group, the number and strength of communications between cells in the PCOS group were enhanced (Fig.\u0026nbsp;\u003cspan refid=\"Fig12\" class=\"InternalRef\"\u003e12\u003c/span\u003eA-D). Both the number and the strength of communications between T cells and other cells, such as NK cells, endothelial cells, and smooth muscle cells, increased significantly. In the control group, the ligand-receptor pair with the strongest communication probability was SEMA3A\u0026thinsp;\u0026minus;\u0026thinsp;(NRP1\u0026thinsp;+\u0026thinsp;PLXNA3) (Fig.\u0026nbsp;\u003cspan refid=\"Fig12\" class=\"InternalRef\"\u003e12\u003c/span\u003eE). 
In the PCOS group, the ligand-receptor pair with the strongest communication probability was MDK\u0026thinsp;\u0026minus;\u0026thinsp;NCL (Fig.\u0026nbsp;\u003cspan refid=\"Fig12\" class=\"InternalRef\"\u003e12\u003c/span\u003eF).\u003c/p\u003e\u003c/div\u003e"},{"header":"3 Materials and methods","content":"\u003cdiv id=\"Sec13\" class=\"Section2\"\u003e\u003ch2\u003e3.1 Data source\u003c/h2\u003e\u003cp\u003eTranscriptome and single-cell datasets pertaining to PCOS were obtained from the Gene Expression Omnibus (GEO) database (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.ncbi.nlm.nih.gov/gds\u003c/span\u003e\u003cspan address=\"https://www.ncbi.nlm.nih.gov/gds\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e). The GSE84958 dataset (GPL16791 platform) comprised subcutaneous adipose tissue samples from 15 PCOS patients and 23 controls [\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e] and served as the training set for this study. The GSE43264 dataset (GPL15362 platform), consisting of subcutaneous adipose tissue samples from 8 PCOS patients and 7 controls [\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e], was used as the validation set. The single-cell dataset GSE240688 (GPL24676 platform) contained ovarian granulosa cell samples from 3 PCOS patients and 3 controls. 
In addition, 50 bacterial lipopolysaccharide-related genes (BLRGs) were retrieved from the Comparative Toxicogenomics Database (CTD) (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://ctdbase.org/\u003c/span\u003e\u003cspan address=\"http://ctdbase.org/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e) using the keyword \u0026lsquo;lipopolysaccharide\u0026rsquo; and included in this study.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec14\" class=\"Section2\"\u003e\u003ch2\u003e3.2 Identification of differentially expressed genes (DEGs)\u003c/h2\u003e\u003cp\u003eDifferential expression analysis was performed using the DESeq2 package (v 1.34.0) [\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e] to identify DEGs (|log\u003csub\u003e2\u003c/sub\u003eFC| \u0026gt;2 and adjusted P value\u0026thinsp;\u0026lt;\u0026thinsp;0.05). The ggplot2 package (v 3.4.1) [\u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e] was used to draw a volcano plot of the DEGs; the DEGs were sorted by log\u003csub\u003e2\u003c/sub\u003eFC, and the top 10 up- and down-regulated genes in the disease group were labelled on the plot. Subsequently, an expression heatmap was drawn for the top 10 up- and down-regulated DEGs using the pheatmap package (v 1.0.12) [\u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e].\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec15\" class=\"Section2\"\u003e\u003ch2\u003e3.3 Weighted gene co-expression network analysis (WGCNA)\u003c/h2\u003e\u003cp\u003eThe expression matrix of samples within the training set was subjected to WGCNA using the WGCNA package (v 1.71) [\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e]. Hierarchical clustering was conducted using Euclidean distance to examine potential outliers, which were subsequently removed. 
The optimal soft threshold was then selected so that the constructed network approximated a scale-free topology. A hierarchical clustering tree of genes was generated by computing the adjacency between genes, deriving the similarity between genes from the adjacency, and converting it into a dissimilarity coefficient. Then, according to the standard hybrid dynamic tree cutting algorithm, the minimum number of genes per module was set to 50, and MEDissThres was set to 0.2 to merge similar modules. Using 'PCOS and control' as the traits, Spearman's correlation analysis between the module eigengenes and the traits was performed using the psych package (v 2.4.3) [\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e], with thresholds of |r| \u0026gt;0.3 and P value\u0026thinsp;\u0026lt;\u0026thinsp;0.05, and a heatmap was created to visualize the correlations. The modules with the strongest positive and the strongest negative correlation with PCOS were selected as the key modules. Lastly, genes within these key modules were further filtered using thresholds of module membership greater than 0.3 and gene significance greater than 0.3. These filtered genes were defined as key modular genes associated with PCOS.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec16\" class=\"Section2\"\u003e\u003ch2\u003e3.4 Consistent cluster analysis and identification of differentially expressed BLRGs (DE-BLRGs)\u003c/h2\u003e\u003cp\u003eIn summary, our investigation reveals that lipopolysaccharide (LPS)-associated mechanisms form an intricate regulatory network that plays a pivotal role in the initiation and progression of PCOS. These interconnected mechanisms, spanning immune activation, metabolic dysfunction, and endocrine disruption, provide valuable insights into the pathophysiological basis of PCOS. 
The elucidation of these pathways not only advances our comprehension of the molecular etiology of PCOS but also highlights potential therapeutic targets for clinical intervention. Nevertheless, given the current limitations in research methodologies and technological capabilities, conducting exhaustive experimental investigations remains challenging. In subsequent investigations, we will maintain our focus on elucidating the precise roles of these mechanisms in PCOS pathogenesis, with particular emphasis on their potential as diagnostic biomarkers and therapeutic targets.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec17\" class=\"Section2\"\u003e\u003ch2\u003e3.5 Identification and functional enrichment analysis of candidate genes\u003c/h2\u003e\u003cp\u003eCandidate genes were obtained by taking the intersection of DEGs, key modular genes and DE-BLRGs using the VennDiagram package (v 1.7.1) [\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e]. Gene Ontology (GO) enrichment analysis, covering biological processes (BP), cellular components (CC) and molecular functions (MF), and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis were then carried out on the candidate genes using the clusterProfiler package (v 4.7.1.003) [\u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e] (p.value\u0026thinsp;\u0026lt;\u0026thinsp;0.05).\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec18\" class=\"Section2\"\u003e\u003ch2\u003e3.6 Recognition of feature genes using machine learning algorithms\u003c/h2\u003e\u003cp\u003eAfter identifying the candidate genes, two machine learning algorithms, XGBoost and random forest (RF), were used to screen the feature genes. The predictive models were constructed using the XGBoost package (v 1.6.2.1) [\u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e], and genes were identified according to their rank of feature importance. 
Modeling with the RF algorithm was performed using the randomForest package (v 4.7-1) [\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e], and the top 10 genes ranked by MeanDecreaseAccuracy and by MeanDecreaseGini were intersected. The genes screened by the two machine learning algorithms were then intersected, and the overlap was shown in a Venn diagram drawn with the ggvenn package (v 1.7.3) [\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e]; the intersecting genes were used as feature genes.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec19\" class=\"Section2\"\u003e\u003ch2\u003e3.7 Identification of biomarkers and construction of nomogram\u003c/h2\u003e\u003cp\u003eThe expression of these feature genes was analyzed in PCOS and control samples from the training and validation sets. Feature genes that showed significant differences (p\u0026thinsp;\u0026lt;\u0026thinsp;0.05) and consistent expression trends in both datasets were selected as biomarkers for this study. Next, a nomogram was created based on the biomarkers in all samples of the training set using the rms package (v 6.5-1) [\u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e]. To evaluate the nomogram's predictive performance, we constructed calibration curves; an ideal model has a calibration curve with a slope approaching 1, indicating high prediction accuracy. 
In addition, a receiver operating characteristic (ROC) curve was created using the pROC package (v 1.18.0) [\u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e] to quantify the diagnostic performance of the nomogram (AUC\u0026thinsp;\u0026gt;\u0026thinsp;0.7).\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec20\" class=\"Section2\"\u003e\u003ch2\u003e3.8 Tissue-specific enrichment analysis of biomarkers\u003c/h2\u003e\u003cp\u003eTissue-specific enrichment of biomarkers was performed in the training set using the \"multi-gene query\" function provided by The Human Protein Atlas (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.proteinatlas.org\u003c/span\u003e\u003cspan address=\"https://www.proteinatlas.org\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e). The transcripts per million values of the genes were used to assess the expression level of each biomarker in different tissues.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec21\" class=\"Section2\"\u003e\u003ch2\u003e3.9 Gene enrichment analysis\u003c/h2\u003e\u003cp\u003eNext, Gene Set Enrichment Analysis (GSEA) was performed in the training dataset to further explore the pathways significantly enriched by the biomarkers. First, \"c2.cp.v2023.2.Hs.symbols.gmt\" from the MSigDB database was chosen as the reference gene set. For each biomarker, samples were divided into high- and low-expression groups based on its expression level; genes were then ordered according to the comparison between the two groups and subjected to GSEA against the reference gene set via the clusterProfiler package (FDR\u0026thinsp;\u0026lt;\u0026thinsp;0.05 and p\u0026thinsp;\u0026lt;\u0026thinsp;0.05). 
The top 5 enriched pathways of each biomarker were shown in order of significance.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec22\" class=\"Section2\"\u003e\u003ch2\u003e3.10 Immune infiltration analysis\u003c/h2\u003e\u003cp\u003eTo explore immune infiltration in PCOS, the CIBERSORT algorithm was employed to estimate the abundance of 22 infiltrating immune cell types in the PCOS and normal groups of the training set, and samples with p\u0026thinsp;\u0026gt;\u0026thinsp;0.05 were excluded. The Wilcoxon test was then used to identify immune cells that differed significantly (p\u0026thinsp;\u0026lt;\u0026thinsp;0.05) between the PCOS and normal groups, and the results were presented in a box plot using the ggplot2 package. In addition, Spearman correlation analysis between the differential immune cells and the biomarkers was carried out using the psych package.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec23\" class=\"Section2\"\u003e\u003ch2\u003e3.11 Construction of molecular regulatory network\u003c/h2\u003e\u003cp\u003eTo explore the molecular regulatory mechanisms of the biomarkers, their upstream miRNAs were predicted using the ENCORI (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://starbase.sysu.edu.cn/\u003c/span\u003e\u003cspan address=\"https://starbase.sysu.edu.cn/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e) and miRWalk (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://mirwalk.umm.uni-heidelberg.de/\u003c/span\u003e\u003cspan address=\"http://mirwalk.umm.uni-heidelberg.de/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e) databases, and the intersection of the two predictions was taken as the final miRNA set; the transcription factors (TFs) regulating the biomarkers were then predicted using the JASPAR (\u003cspan class=\"ExternalRef\"\u003e\u003cspan 
class=\"RefSource\"\u003ehttps://jaspar.elixir.no/\u003c/span\u003e\u003cspan address=\"https://jaspar.elixir.no/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e) database; finally, the TF-miRNA-mRNA regulatory network was visualized using the Cytoscape software.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec24\" class=\"Section2\"\u003e\u003ch2\u003e3.12 Small ubiquitin-like modifier (SUMO) and drug prediction analyses of biomarkers\u003c/h2\u003e\u003cp\u003eTo explore the SUMO modification sites of the biomarkers, we first searched for the proteins corresponding to the biomarkers in NCBI (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.ncbi.nlm.nih.gov/\u003c/span\u003e\u003cspan address=\"https://www.ncbi.nlm.nih.gov/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e) and retrieved their FASTA files. The FASTA sequences of these proteins were then entered into the GPS-SUMO 2.0 database (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://sumo.biocuckoo.cn/\u003c/span\u003e\u003cspan address=\"https://sumo.biocuckoo.cn/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e) to obtain the SUMO interaction motifs and SUMO consensus sites of the biomarkers at the protein level. 
To further explore potential therapeutic agents for PCOS, drugs or molecular compounds potentially interacting with the biomarkers were predicted using the CTD, and these interactions were visualized with the Cytoscape software.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec25\" class=\"Section2\"\u003e\u003ch2\u003e3.13 Quality control (QC) of the scRNA-seq dataset\u003c/h2\u003e\u003cp\u003eWe next investigated the molecular regulatory mechanisms of PCOS at the cellular level. The Seurat package (v5.0.1) [\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e] was used for QC and statistical analysis of the scRNA-seq data. The quality control criteria were as follows: (a) genes detected in fewer than three cells were excluded; (b) cells with fewer than 200 or more than 6,000 detected genes were excluded; and (c) cells with a mitochondrial gene percentage (percent.mt) above 20% were excluded. Violin plots of nFeature_RNA, nCount_RNA and percent.mt before and after QC of the single-cell data were drawn using the ggplot2 package.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec26\" class=\"Section2\"\u003e\u003ch2\u003e3.14 Dimensional reduction and clustering\u003c/h2\u003e\u003cp\u003eAfter filtering out the cells and genes that did not meet the criteria, the data were normalized using the NormalizeData function with the \"LogNormalize\" method and scale.factor\u0026thinsp;=\u0026thinsp;10000. The FindVariableFeatures function was then used to extract the genes with the highest variability among cells, and the top 2000 highly variable genes (HVGs) were retained for subsequent analysis. The LabelPoints function was used to label the 10 most variable genes in the resulting plot. 
The JackStrawPlot and ElbowPlot functions were then used to decide which principal components (PCs) to retain for subsequent analysis.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec27\" class=\"Section2\"\u003e\u003ch2\u003e3.15 Cellular annotation\u003c/h2\u003e\u003cp\u003eTo identify the types of the cell clusters in the single-cell dataset GSE240688, the PCs selected in the previous step were first subjected to overall dimensionality reduction using uniform manifold approximation and projection (UMAP). The FindClusters function in the Seurat package (resolution\u0026thinsp;=\u0026thinsp;0.4) was then used to cluster the Principal Component Analysis (PCA) dimensionality reduction data, identify small cell clusters and determine the number of cell clusters.\u003c/p\u003e\u003cp\u003eCell types were then annotated based on marker genes, and histograms were plotted to show the proportion of each cell type in different samples. The cell types with significant differences were determined using the chi-square test. The expression patterns of the two previously identified biomarkers in different cell types were further compared between PCOS and normal samples. 
In this study, in conjunction with the published literature [\u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e], the cell type that showed remarkable differences in the mean expression of the two biomarkers between samples was defined as the key cell type.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec28\" class=\"Section2\"\u003e\u003ch2\u003e3.16 Analyses of pseudo-time and cellular communication\u003c/h2\u003e\u003cp\u003eTo explore the temporal dynamics of gene expression during state changes of the key cells in the single-cell dataset GSE240688, single-cell pseudotime trajectories were constructed using the Monocle2 package (v 2.26.0) [\u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e], in which all the cells within a single cell population were projected onto a root and multiple branches, and a single-cell trajectory map was constructed through dimensionality reduction and clustering.\u003c/p\u003e\u003cp\u003eTo investigate the interactions among all annotated cell types, cell-cell communication networks between cell types were analyzed in PCOS and control samples from the GSE240688 dataset using the CellChat package (v 1.6.1) [\u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e30\u003c/span\u003e]. The ligand-receptor pair interactions were visualized.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec29\" class=\"Section2\"\u003e\u003ch2\u003e3.17 Cell line culture\u003c/h2\u003e\u003cp\u003eThe human ovarian granulosa-like tumour cell line KGN was purchased from Wuhan Punosai Biotechnology Company (Hubei, China). Cells were cultured in DMEM/F12 (GIBCO, Carlsbad, CA, USA) supplemented with 10% foetal bovine serum (CellMax, Lanz Hangzhou, China) and 1% penicillin-streptomycin (BOSTER, Hangzhou, China). All cells were maintained in a humidified atmosphere of 5% CO₂ at 37\u0026deg;C. 
To establish an in vitro granulosa cell model of PCOS, KGN cells were treated with 500 nM DHT for 24 hours.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec30\" class=\"Section2\"\u003e\u003ch2\u003e3.18 Patients and sample acquisition\u003c/h2\u003e\u003cp\u003eThe study on primary granulosa cells (GCs) was approved by the institutional medical ethics committee of Shanxi Provincial People's Hospital (Approval No. V1.02025818). All patients who visited the Reproductive Medicine Centre provided informed consent. Between January and April 2025, 30 women recently diagnosed with PCOS and 15 infertile women with normal ovulatory menstrual cycles were enrolled in the study. Inclusion criteria were age 20\u0026ndash;40 years, body mass index (BMI) 20\u0026ndash;28 kg/m\u0026sup2;, infertility duration exceeding 1 year, and no use of hormonal medications within the past 3 months. Patients with a history of thyroid dysfunction, diabetes, Cushing\u0026rsquo;s syndrome, hyperprolactinemia, cardiovascular disease, or androgen-secreting tumours were excluded. A gonadotropin-releasing hormone antagonist protocol was used for controlled ovarian hyperstimulation. When two or more follicles reached a diameter of \u0026ge;\u0026thinsp;18 mm, 0.1 mg of triptorelin acetate (FERRING, Wittland, Germany) and 2,000\u0026ndash;5,000 IU of human chorionic gonadotropin (hCG) were administered. Follicular aspiration was performed 36 hours later. The follicular aspirate from each patient was pooled and centrifuged at 2,500 rpm for 10 minutes, the supernatant was removed, and the precipitate was resuspended in PBS. The suspension was slowly added to Ficoll-Paque (Cytiva, 17144002) and centrifuged, and the intermediate white flocculent material was resuspended in 1 mL PBS, mixed thoroughly, and centrifuged, after which the supernatant was removed. 
After treatment with red blood cell lysis buffer and trypsin, the cells were resuspended in 3 mL of culture medium and seeded into a 6 cm culture dish. They were cultured in DMEM/F12 supplemented with 5% FBS and antibiotics (100 U/mL penicillin, 0.1 mg/mL streptomycin; Gibco, USA) at 37\u0026deg;C. After 48 hours, the cells were collected and RNA was extracted.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec31\" class=\"Section2\"\u003e\u003ch2\u003e3.19 RNA extraction and quantitative real‑time PCR (RT‑qPCR)\u003c/h2\u003e\u003cp\u003eRNA was extracted from KGN cells using an RNA extraction kit (Mei5bio, MF036). The purity and quantity of the extracted RNA were assessed using an ND-2000 spectrophotometer (Thermo, USA). Reverse transcription was performed using a commercial kit (Prime Script\u0026trade; RT Kit with gDNA Eraser, Takara, Japan), which removes potential genomic DNA contamination. The mRNA expression levels of differentially expressed genes were quantified using quantitative polymerase chain reaction (PCR). PCR was performed using the CFX96 real-time PCR system (Bio-Rad) and a commercial kit (TB Green Premix Ex Taq II Fast qPCR (2X), Takara, Dalian, China). Each PCR reaction (25 \u0026micro;L) contained 12.5 \u0026micro;L SYBR Green (CN830S, Takara), 10 ng cDNA, and 400 nmol/L specific primers. The primer sequences for each gene are listed in Table\u0026nbsp;\u003cspan refid=\"Tab3\" class=\"InternalRef\"\u003e3\u003c/span\u003e. The PCR programme consisted of the following: 2 minutes at 95\u0026deg;C, followed by 40 cycles of PCR amplification. The amplification programme included 30 seconds of activation at 95\u0026deg;C, followed by 40 PCR cycles (5 seconds at 95\u0026deg;C and 10 seconds at 60\u0026deg;C). The quality of the primers and reaction was assessed using a final melting curve analysis. The β-Actin gene was used as the reference gene. 
mRNA expression levels were calculated using the 2\u003csup\u003e\u0026minus;ΔΔCT\u003c/sup\u003e method.\u003c/p\u003e\u003cp\u003e\u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab3\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 3\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003eSequence of primers used in the study.\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e\u003ccolgroup cols=\"3\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u003cp\u003eGene name\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u003cp\u003eForward primer\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c3\"\u003e\u003cp\u003eReverse primer\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eC11orf68\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e5\u0026prime;-TGTCTACACCTACCTGGGCA-3\u0026prime;\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e5\u0026prime;-GTCAGTTCCACGTTGTTGGC-3\u0026prime;\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eEVI5L\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e5\u0026prime;-TCCTCCGCCTCCTCCAACC-3\u0026prime;\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e5\u0026prime;-GCCGCCATTCCTCCCACTC-3\u0026prime;\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eβ-Actin\u003c/p\u003e\u003c/td\u003e\u003ctd 
align=\"left\" colname=\"c2\"\u003e\u003cp\u003e5\u0026prime;-GCTCTGGCTCCTAGCACCAT-3\u0026prime;\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e5\u0026prime;-GCCACCGATCCACACAGAGT-3\u0026prime;\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/colgroup\u003e\u003c/table\u003e\u003c/div\u003e\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec32\" class=\"Section2\"\u003e\u003ch2\u003e3.20 Statistical Analysis\u003c/h2\u003e\u003cp\u003eR software (v4.2.2) was used for all bioinformatics analyses. The Wilcoxon test was used to evaluate group differences. Statistical significance was defined as P\u0026thinsp;\u0026lt;\u0026thinsp;0.05. Clinical data were analysed using SPSS 29.0 software (SPSS, Inc., Chicago, IL, USA). All results are expressed as mean\u0026thinsp;\u0026plusmn;\u0026thinsp;standard error. Categorical data are presented as frequencies and percentages, and intergroup comparisons were performed using the chi-square test. Granulosa cell expression data obtained by RT-qPCR were analysed using t-tests with GraphPad Prism version 8.0.2 software. Differences were considered statistically significant when the P value was \u0026lt;\u0026thinsp;0.05.\u003c/p\u003e\u003c/div\u003e"},{"header":"4 Discussion","content":"\u003cp\u003eIntestinal-derived LPS has been shown to be a pathophysiological nexus between low-grade systemic inflammation, insulin resistance induced by insulin receptor substrate-1 serine phosphorylation, and the clinical manifestations of PCOS [\u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e]. However, how LPS affects the development of PCOS at the genetic level is unclear. Bioinformatics offers a promising approach to studying and predicting the role of LPS in PCOS. In this study, we used LPS-related genes curated from the CTD and PCOS-related datasets from the GEO database as the basis of our analysis. 
The biological pathways involving the biomarkers were then analyzed with bioinformatics methods, together with their immune microenvironment, potential regulatory mechanisms and related drugs.\u003c/p\u003e\u003cp\u003e\u003cem\u003eEVI5L\u003c/em\u003e has been demonstrated to bind to and activate the small GTPase Rab10, which modulates the sustained replenishment of Toll Like Receptor 4 (TLR4) from the Golgi to the plasma membrane and serves as a prerequisite for optimal macrophage activation after LPS stimulation [\u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e]. \u003cem\u003eC11orf68\u003c/em\u003e (Chromosome 11 Open Reading Frame 68) is a relatively recently identified gene located on chromosome 11. \u003cem\u003eC11orf68\u003c/em\u003e was found to be upregulated in human cancer samples and associated with cell invasion [\u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e33\u003c/span\u003e]. We hypothesized that this gene may influence ovarian development and function by regulating the cell cycle; however, the biological function of \u003cem\u003eC11orf68\u003c/em\u003e under LPS stimulation is unknown. We combined the two screened biomarkers and the risk score, and the calibration curve showed only a small error between actual and predicted PCOS risk (P\u0026thinsp;=\u0026thinsp;0.817). 
To further evaluate the risk model constructed from the key genes, ROC curves were constructed based on the diagnostic coefficients and gene expression levels, and the results showed that the key genes, \u003cem\u003eC11orf68\u003c/em\u003e and \u003cem\u003eEVI5L\u003c/em\u003e, were more accurate than those reported in previous studies in terms of the prognosis of PCOS [\u003cspan citationid=\"CR34\" class=\"CitationRef\"\u003e34\u003c/span\u003e].\u003c/p\u003e\u003cp\u003eTo delineate the pathophysiological roles of the candidate biomarkers in polycystic ovary syndrome (PCOS) progression, GSEA was employed to explore the potential mechanisms of C11orf68 and EVI5L. C11orf68 was significantly associated with iron metabolism disorders, the Reactome pathway for scavenging of heme from plasma, and other pathways in PCOS, whereas no significant enrichment was found for EVI5L. In patients with PCOS, hyperandrogenism is associated with abnormal levels of ferritin, and studies have found a negative correlation between serum ferritin and testosterone levels [\u003cspan citationid=\"CR35\" class=\"CitationRef\"\u003e35\u003c/span\u003e]. Moreover, clinical evidence further substantiates the dysregulation of iron metabolism in PCOS. A case-control study conducted by Liu et al. involving 149 PCOS patients and 108 healthy controls demonstrated significantly elevated serum ferritin levels in the PCOS cohort, independent of obesity status (p\u0026thinsp;\u0026lt;\u0026thinsp;0.01) [\u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e36\u003c/span\u003e]. This persistent iron overload phenotype in PCOS patients predisposes to oxidative stress-mediated cellular damage through multiple mechanisms: (1) excessive generation of reactive oxygen species (ROS) via Fenton reactions, (2) disruption of cellular redox homeostasis, and (3) induction of lipid peroxidation cascades. 
These pathological processes ultimately culminate in cellular membrane destabilization and programmed cell death, contributing to the systemic manifestations of PCOS [\u003cspan citationid=\"CR37\" class=\"CitationRef\"\u003e37\u003c/span\u003e]. Meanwhile, free heme is an abundant reservoir of ferrous iron (Fe(II)), which drives the Fenton reaction, a process that produces ROS; free heme therefore has profound cytotoxic effects. Heme oxygenase (HO) is the rate-limiting enzyme in heme catabolism, and a recent study demonstrated that aberrant Nrf2/HO-1 signaling promotes the development of PCOS and leads to pregnancy loss in gestating PCOS rats [\u003cspan citationid=\"CR38\" class=\"CitationRef\"\u003e38\u003c/span\u003e], while activation of the normal Nrf2/HO-1 signaling pathway can have a protective effect by reducing oxidative stress. In conclusion, \u003cem\u003eC11orf68\u003c/em\u003e may be involved in the disease process of PCOS through the above pathways, providing new insights for further understanding of the pathogenesis of PCOS.\u003c/p\u003e\u003cp\u003eBioinformatic screening identified C11orf68 and EVI5L as significantly downregulated genes in PCOS. This computational prediction was subsequently confirmed through qPCR analysis of human granulosa cells (GCs) obtained from PCOS patients, which demonstrated marked reductions in both genes compared to controls. The concordance between our in silico predictions and clinical sample analysis supports the reliability of our bioinformatic approach and the biological relevance of these findings. To further investigate the potential mechanisms underlying this dysregulation, we employed the KGN granulosa cell line treated with dihydrotestosterone (DHT), a well-established in vitro model for studying androgen excess in PCOS. DHT treatment recapitulated the expression patterns observed in clinical samples, with both C11orf68 and EVI5L showing significant downregulation. 
This parallel between patient-derived data and experimental models provides multiple layers of evidence supporting our findings: the consistency across different experimental systems (clinical samples and cell lines) enhances the robustness of our conclusions, and the androgen-responsiveness of these genes suggests they may mediate some of the hyperandrogenic effects characteristic of PCOS.\u003c/p\u003e\u003cp\u003eThrough systematic analysis of the GSE84958 dataset, we conducted a comprehensive characterization of immune cell infiltration patterns in PCOS, quantifying the relative abundance of 22 distinct immune cell populations. Our investigation revealed significant alterations in six specific immune cell subtypes: M0 macrophages, M2 macrophages, activated and resting NK cells, activated and resting mast cells, and activated memory CD4\u0026thinsp;+\u0026thinsp;T cells. This detailed profiling of immune cell heterogeneity provides valuable insights into the immunopathological mechanisms underlying PCOS. The investigation by Abhishek Trigunaite revealed that androgens exhibit anti-inflammatory properties and are capable of inhibiting immune cell functionality [\u003cspan citationid=\"CR39\" class=\"CitationRef\"\u003e39\u003c/span\u003e]. In PCOS, hyperandrogenemia may exacerbate chronic inflammatory processes through modulation of macrophage population density and phenotypic characteristics. Notably, elevated ratios of M1 to M2 macrophage subtypes have been documented within ovarian tissues of female rat models of PCOS exposed to 5α-dihydrotestosterone (DHT) [\u003cspan citationid=\"CR40\" class=\"CitationRef\"\u003e40\u003c/span\u003e]. Estrogen contributes to the modulation of macrophage immune phenotypes via estrogen receptor alpha (ERα), which governs metabolic reprogramming in macrophages and facilitates their coordination with diverse activated signaling pathways across heterogeneous microenvironments. 
Under supraphysiological estrogen concentrations, the ability of macrophages to bind LPS is enhanced, which may lead to a more severe inflammatory response following intestinal microecological disturbances. Furthermore, obesity, a prevalent phenotypic manifestation of PCOS, has been mechanistically linked to immune dysregulation and chronic low-grade inflammation. Trim and Lynch documented substantial quantities of neutrophil granulocytes, pro-inflammatory M1 phenotype macrophages, and T lymphocytes within adipose tissue [\u003cspan citationid=\"CR41\" class=\"CitationRef\"\u003e41\u003c/span\u003e]. These aggregated immune effector cells release the pro-inflammatory signaling molecules tumor necrosis factor-alpha, interleukin-6, and interleukin-8, which subsequently activate the nuclear factor kappa-light-chain-enhancer of activated B cells (NF-κB) signaling cascade, thereby perpetuating a persistent systemic inflammatory condition. Our investigation revealed that C11orf68 was inversely associated with resting NK cells and activated mast cells, whereas EVI5L correlated positively with resting mast cells and negatively with activated mast cells. These results suggest that the biomarkers are associated with the promotion or inhibition of immune cell infiltration and may thereby influence PCOS progression.\u003c/p\u003e\u003cp\u003eTo explore the molecular regulatory mechanisms of the key genes, we predicted their upstream miRNAs, including hsa-miR-331-3p, which participates in the cytosolic stress response and in metabolic pathways, especially nucleotide metabolism. 
This suggests that the function of hsa-miR-331-3p may be directly or indirectly related to the pathophysiology of PCOS. The miRNA hsa-miR-331-3p has been documented to prevent neuron-associated inflammation, and its knockdown decreases neuronal viability and promotes the expression of pro-inflammatory cytokines [\u003cspan citationid=\"CR42\" class=\"CitationRef\"\u003e42\u003c/span\u003e]. In addition, hsa-miR-331-3p expression levels are associated with polyunsaturated fatty acids other than linoleic acid (LA), so it may affect metabolic syndrome by regulating lipid metabolism [\u003cspan citationid=\"CR43\" class=\"CitationRef\"\u003e43\u003c/span\u003e]. We then predicted transcription factors regulating the key genes and identified 18 transcription factors associated with \u003cem\u003eC11orf68\u003c/em\u003e and 23 transcription factors associated with \u003cem\u003eEVI5L\u003c/em\u003e. Notably, multiple polymorphisms within the locus of the transcription factor ZNF148 have been genetically linked to fasting insulin concentrations, glycemic parameters, and insulin resistance indices. ZNF148 transcriptional activity is inversely related to glucose-stimulated insulin secretion (GSIS), suggesting that ZNF148 may be a new therapeutic target for enhancing insulin secretion [\u003cspan citationid=\"CR44\" class=\"CitationRef\"\u003e44\u003c/span\u003e]. The transcription factor PATZ1 promotes adipogenesis by interacting with the promoter regions of key early adipogenic factors. Knockdown of PATZ1 in adipose tissue protects mice from obesity [\u003cspan citationid=\"CR45\" class=\"CitationRef\"\u003e45\u003c/span\u003e]. 
In summary, the co-predicted miRNAs and TFs described above may affect insulin levels and obesity and thereby contribute to PCOS through regulation of the biomarkers.\u003c/p\u003e\u003cp\u003eIn our study, the average expression of the two key genes differed significantly in every cell type except macrophages, and single-cell analysis identified T cells as the key cells. The dysregulation of immunological mechanisms is a critical factor in the pathophysiology of PCOS. Nevertheless, the phenotypic profile of T-lymphocyte subpopulations in individuals with PCOS remains inadequately characterized. Individuals with PCOS exhibit a persistent subclinical inflammatory state characterized by elevated leukocyte concentrations, vascular endothelial impairment, and disturbances in pro-inflammatory cytokines [\u003cspan citationid=\"CR46\" class=\"CitationRef\"\u003e46\u003c/span\u003e]. Substantial infiltration of immunocompetent effector populations, comprising T cells, B cells, macrophages, and dendritic cells, has been detected in human preovulatory follicles [\u003cspan citationid=\"CR47\" class=\"CitationRef\"\u003e47\u003c/span\u003e]. Accumulating evidence from immunological studies has established the critical involvement of T lymphocytes in the pathogenesis of PCOS. As the main component of lymphocytes, T cells have a variety of biological functions and are mainly involved in the cellular immune response of the organism. They can kill target cells directly or enhance and expand the immune effect by releasing lymphokines [\u003cspan citationid=\"CR48\" class=\"CitationRef\"\u003e48\u003c/span\u003e]. 
In addition, dysregulation of T lymphocytes and antigen-presenting cells has been found in the follicular fluid of patients with PCOS, resulting in a markedly pro-inflammatory microenvironment, as evidenced by increased reactive oxygen species, accumulation of lipid peroxidation by-products, and impaired antioxidant defense capacity [\u003cspan citationid=\"CR49\" class=\"CitationRef\"\u003e49\u003c/span\u003e].\u003c/p\u003e\u003cp\u003eThe diagnosis and treatment of PCOS continue to present opportunities and challenges. Although lipopolysaccharide (LPS)-related research in PCOS pathogenesis is at a nascent stage, emerging evidence suggests that it is a promising investigative avenue. The multifaceted role of LPS in immune modulation and metabolic regulation warrants further exploration in the context of PCOS pathophysiology. Future research directions may encompass: (1) the development of microbial-derived biomarkers for PCOS diagnosis, (2) the identification of LPS-mediated therapeutic targets, and (3) the advancement of immunotherapeutic strategies targeting LPS signaling pathways. These potential applications, though currently in their preliminary stages, represent significant opportunities for advancing both the understanding and clinical management of PCOS.\u003c/p\u003e"},{"header":"5 Conclusion","content":"\u003cp\u003eIn summary, our investigation reveals that lipopolysaccharide (LPS)-associated mechanisms form an intricate regulatory network that plays a pivotal role in the initiation and progression of PCOS. These interconnected mechanisms, spanning immune activation, metabolic dysfunction, and endocrine disruption, provide valuable insights into the pathophysiological basis of PCOS. The elucidation of these pathways not only advances our comprehension of the molecular etiology of PCOS but also highlights potential therapeutic targets for clinical intervention. 
Nevertheless, given the current limitations in research methodologies and technological capabilities, conducting exhaustive experimental investigations remains challenging. In subsequent investigations, we will maintain our focus on elucidating the precise roles of these mechanisms in PCOS pathogenesis, with particular emphasis on their potential as diagnostic biomarkers and therapeutic targets.\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003eThe study protocol was approved by the medical ethics committee of Shanxi Provincial People's Hospital (V1.02025818). All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Declaration of Helsinki. All patients participating in this study signed informed consent forms.\u003c/p\u003e\n\u003ch2\u003eCompeting Interests.\u003c/h2\u003e\u003cp\u003eThe authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.\u003c/p\u003e\u003ch2\u003eFunding.\u003c/h2\u003e\u003cp\u003eThe research reported in this project was supported by the Shanxi Provincial Central Guidance Local Science and Technology Development Project under grant agreement number YDZJSX2022A069. 
The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.\u003c/p\u003e\u003ch2\u003eAuthor Contribution\u003c/h2\u003e\u003cp\u003eYang Li: Conceptualization, Data curation, Validation, Visualization, Writing\u0026ndash;original draft, Writing\u0026ndash;review \u0026amp; editing; Chunmei Bai: Data curation, Validation, Visualization, Writing\u0026ndash;review \u0026amp; editing; Xumin Zhang: Validation, Writing\u0026ndash;review \u0026amp; editing; Haixia Song: Visualization, Writing\u0026ndash;review \u0026amp; editing; Caixia Yuan: Conceptualization, Supervision, Writing\u0026ndash;review \u0026amp; editing; Ziwei Huang: Conceptualization, Supervision, Writing\u0026ndash;review \u0026amp; editing; Jianrong Liu: Conceptualization, Project administration, Supervision, Writing\u0026ndash;review \u0026amp; editing. All authors read and approved the final manuscript.\u003c/p\u003e\u003ch2\u003eAcknowledgements.\u003c/h2\u003e\u003cp\u003eWe would like to express our sincere gratitude to all individuals and organizations who supported and assisted us throughout this research. Special thanks to the following authors: Chunmei Bai, Xumin Zhang, Haixia Song, Caixia Yuan, Ziwei Huang, and Jianrong Liu. In conclusion, we extend our thanks to everyone who has supported and assisted us along the way. Without your support, this research would not have been possible.\u003c/p\u003e\u003ch2\u003eData Availability\u003c/h2\u003e\u003cp\u003eMicroarray data in this work are available in the GEO online database (http://www.ncbi.nlm.nih.gov/geo).\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003ePolycystic ovary syndrome - PubMed [Internet]. [cited 2024 Sept 10]. 
Available from: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://pubmed.ncbi.nlm.nih.gov/35934017/\u003c/span\u003e\u003cspan address=\"https://pubmed.ncbi.nlm.nih.gov/35934017/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eCriteria. phenotypes and prevalence of polycystic ovary syndrome - PubMed [Internet]. [cited 2024 Sept 10]. Available from: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://pubmed.ncbi.nlm.nih.gov/31089072/\u003c/span\u003e\u003cspan address=\"https://pubmed.ncbi.nlm.nih.gov/31089072/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eAsunci\u0026oacute;n M, Calvo RM, San Mill\u0026aacute;n JL, Sancho J, Avila S, Escobar-Morreale HF. A Prospective Study of the Prevalence of the Polycystic Ovary Syndrome in Unselected Caucasian Women from Spain1. The Journal of Clinical Endocrinology \u0026amp; Metabolism [Internet]. 2000 [cited 2024 Sept 10];85:2434\u0026ndash;8. Available from: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1210/jcem.85.7.6682\u003c/span\u003e\u003cspan address=\"10.1210/jcem.85.7.6682\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eKhan MJ, Ullah A, Basit S. Genetic Basis of Polycystic Ovary Syndrome (PCOS): Current Perspectives. Appl Clin Genet. 2019;12:249\u0026ndash;60.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eRotterdam ESHRE, ASRM-Sponsored PCOS consensus workshop group. Revised 2003 consensus on diagnostic criteria and long-term health risks related to polycystic ovary syndrome (PCOS). Hum Reprod. 2004;19:41\u0026ndash;7.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eA brief insight. 
into the etiology, genetics, and immunology of polycystic ovarian syndrome (PCOS) - PubMed [Internet]. [cited 2024 Dec 18]. Available from: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://pubmed.ncbi.nlm.nih.gov/36190593/\u003c/span\u003e\u003cspan address=\"https://pubmed.ncbi.nlm.nih.gov/36190593/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBoomsma CM, Eijkemans MJC, Hughes EG, Visser GHA, Fauser BCJM, Macklon NS. A meta-analysis of pregnancy outcomes in women with polycystic ovary syndrome. Hum Reprod Update. 2006;12:673\u0026ndash;83.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eRoos N, Kieler H, Sahlin L, Ekman-Ordeberg G, Falconer H, Stephansson O. Risk of adverse pregnancy outcomes in women with polycystic ovary syndrome: population based cohort study. BMJ. 2011;343:d6309.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLerchbaum E, Schwetz V, Giuliani A, Obermayer-Pietsch B. Influence of a positive family history of both type 2 diabetes and PCOS on metabolic and endocrine parameters in a large cohort of PCOS women. European Journal of Endocrinology [Internet]. 2014 [cited 2024 Sept 12];170:727\u0026ndash;39. Available from: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1530/EJE-13-1035\u003c/span\u003e\u003cspan address=\"10.1530/EJE-13-1035\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWang X, Quinn PJ, Lipopolysaccharide. Biosynthetic pathway and structure modification. Prog Lipid Res. 2010;49:97\u0026ndash;107.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eTremellen K, Pearce K. Dysbiosis of Gut Microbiota (DOGMA)--a novel theory for the development of Polycystic Ovarian Syndrome. Med Hypotheses. 
2012;79:104\u0026ndash;12.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eGuerville M, Boudry G. Gastrointestinal and hepatic mechanisms limiting entry and dissemination of lipopolysaccharide into the systemic circulation. Am J Physiol Gastrointest Liver Physiol. 2016;311:G1\u0026ndash;15.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eNip KM, Chiu R, Yang C, Chu J, Mohamadi H, Warren RL, et al. RNA-Bloom enables reference-free and reference-guided sequence assembly for single-cell transcriptomes. Genome Res. 2020;30:1191\u0026ndash;200.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWang M, An K, Huang J, Mprah R, Ding H. A novel model based on necroptosis to assess progression for polycystic ovary syndrome and identification of potential therapeutic drugs. Front Endocrinol (Lausanne). 2023;14:1193992.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eModerated estimation of fold change. and dispersion for RNA-seq data with DESeq2 - PubMed [Internet]. [cited 2025 Feb 21]. Available from: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://pubmed.ncbi.nlm.nih.gov/25516281/\u003c/span\u003e\u003cspan address=\"https://pubmed.ncbi.nlm.nih.gov/25516281/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eGustavsson EK, Zhang D, Reynolds RH, Garcia-Ruiz S, Ryten M. ggtranscript: an R package for the visualization and interpretation of transcript isoforms using ggplot2. Bioinformatics. 2022;38:3844\u0026ndash;6.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eGu Z, H\u0026uuml;bschmann D. Make Interactive Complex Heatmaps in R. Bioinformatics. 2022;38:1460\u0026ndash;2.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLangfelder P, Horvath S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics. 
2008;9:559.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eOrifjon S, Jammatov J, Sousa C, Barros R, Vasconcelos O, Rodrigues P. Translation and Adaptation of the Adult Developmental Coordination Disorder/Dyspraxia Checklist (ADC) into Asian Uzbekistan. Sports (Basel). 2023;11:135.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eVennDiagram. a package for the generation of highly-customizable Venn and Euler diagrams in R - PubMed [Internet]. [cited 2025 Feb 21]. Available from: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://pubmed.ncbi.nlm.nih.gov/21269502/\u003c/span\u003e\u003cspan address=\"https://pubmed.ncbi.nlm.nih.gov/21269502/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eChen H, Boutros PC. VennDiagram: a package for the generation of highly-customizable Venn and Euler diagrams in R. BMC Bioinformatics. 2011;12:35.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eHou N, Li M, He L, Xie B, Wang L, Zhang R, et al. Predicting 30-days mortality for MIMIC-III patients with sepsis-3: a machine learning approach using XGboost. J Transl Med. 2020;18:462.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003ePredicting Pressure Injury in Critical Care Patients. A Machine-Learning Model - PubMed [Internet]. [cited 2025 Feb 21]. Available from: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://pubmed.ncbi.nlm.nih.gov/30385537/\u003c/span\u003e\u003cspan address=\"https://pubmed.ncbi.nlm.nih.gov/30385537/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eFerroptosis. and Autophagy-Related Genes in the Pathogenesis of Ischemic Cardiomyopathy - PubMed [Internet]. [cited 2025 Feb 21]. 
Available from: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://pubmed.ncbi.nlm.nih.gov/35845045/\u003c/span\u003e\u003cspan address=\"https://pubmed.ncbi.nlm.nih.gov/35845045/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSachs MC. plotROC: A Tool for Plotting ROC Curves. J Stat Softw. 2017;79:2.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003epROC. an open-source package for R and S\u0026thinsp;+\u0026thinsp;to analyze and compare ROC curves - PubMed [Internet]. [cited 2025 Feb 21]. Available from: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://pubmed.ncbi.nlm.nih.gov/21414208/\u003c/span\u003e\u003cspan address=\"https://pubmed.ncbi.nlm.nih.gov/21414208/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSpatial reconstruction of. single-cell gene expression data - PubMed [Internet]. [cited 2025 Feb 21]. Available from: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://pubmed.ncbi.nlm.nih.gov/25867923/\u003c/span\u003e\u003cspan address=\"https://pubmed.ncbi.nlm.nih.gov/25867923/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eDisturbed Follicular Microenvironment in Polycystic Ovary Syndrome. Relationship to Oocyte Quality and Infertility - PubMed [Internet]. [cited 2025 Feb 21]. Available from: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://pubmed.ncbi.nlm.nih.gov/38375912/\u003c/span\u003e\u003cspan address=\"https://pubmed.ncbi.nlm.nih.gov/38375912/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eThe dynamics and. 
regulators of cell fate decisions are revealed by pseudotemporal ordering of single cells - PubMed [Internet]. [cited 2025 Feb 21]. Available from: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://pubmed.ncbi.nlm.nih.gov/24658644/\u003c/span\u003e\u003cspan address=\"https://pubmed.ncbi.nlm.nih.gov/24658644/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eJin S, Guerrero-Juarez CF, Zhang L, Chang I, Ramos R, Kuan C-H, et al. Inference and analysis of cell-cell communication using CellChat. Nat Commun. 2021;12:1088.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLi W, Hakkak R. Soy Protein Concentrate Diets Inversely Affect LPS-Binding Protein Expression in Colon and Liver, Reduce Liver Inflammation, and Increase Fecal LPS Excretion in Obese Zucker Rats. Nutrients. 2024;16:982.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWang D, Lou J, Ouyang C, Chen W, Liu Y, Liu X et al. Ras-related protein Rab10 facilitates TLR4 signaling by promoting replenishment of TLR4 onto the plasma membrane. Proceedings of the National Academy of Sciences of the United States of America [Internet]. 2010 [cited 2024 Oct 30];107:13806. Available from: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://pmc.ncbi.nlm.nih.gov/articles/PMC2922283/\u003c/span\u003e\u003cspan address=\"https://pmc.ncbi.nlm.nih.gov/articles/PMC2922283/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eCheishvili D, Stefanska B, Yi C, Li CC, Yu P, Arakelian A et al. A common promoter hypomethylation signature in invasive breast, liver and prostate cancer cell lines reveals novel targets involved in cancer invasiveness. Oncotarget [Internet]. 2015 [cited 2024 Oct 30];6:33253. 
Available from: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://pmc.ncbi.nlm.nih.gov/articles/PMC4741763/\u003c/span\u003e\u003cspan address=\"https://pmc.ncbi.nlm.nih.gov/articles/PMC4741763/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWang M, An K, Huang J, Mprah R, Ding H. A novel model based on necroptosis to assess progression for polycystic ovary syndrome and identification of potential therapeutic drugs. Front Endocrinol (Lausanne). 2023;14:1193992.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003e1,25-. Dihydroxyvitamin D3 alleviates hyperandrogen-induced ferroptosis in KGN cells - PubMed [Internet]. [cited 2024 Nov 8]. Available from: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://pubmed.ncbi.nlm.nih.gov/36884209/\u003c/span\u003e\u003cspan address=\"https://pubmed.ncbi.nlm.nih.gov/36884209/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLiu M, Wu K, Wu Y. The emerging role of ferroptosis in female reproductive disorders. Biomed Pharmacother. 2023;166:115415.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eFerroptosis. mechanisms, biology and role in disease - PubMed [Internet]. [cited 2024 Nov 11]. Available from: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://pubmed.ncbi.nlm.nih.gov/33495651/\u003c/span\u003e\u003cspan address=\"https://pubmed.ncbi.nlm.nih.gov/33495651/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWang Y, Li N, Zeng Z, Tang L, Zhao S, Zhou F, et al. Humanin regulates oxidative stress in the ovaries of polycystic ovary syndrome patients via the Keap1/Nrf2 pathway. Mol Hum Reprod. 
2021;27:gaaa081.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSuppressive effects of androgens on the immune system. - PubMed [Internet]. [cited 2024 Nov 19]. Available from: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://pubmed.ncbi.nlm.nih.gov/25708485/\u003c/span\u003e\u003cspan address=\"https://pubmed.ncbi.nlm.nih.gov/25708485/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLima PDA, Nivet A-L, Wang Q, Chen Y-A, Leader A, Cheung A, et al. Polycystic ovary syndrome: possible involvement of androgen-induced, chemerin-mediated ovarian recruitment of monocytes/macrophages. Biol Reprod. 2018;99:838\u0026ndash;52.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eTrim WV, Lynch L. Immune and non-immune functions of adipose tissue leukocytes. Nat Rev Immunol. 2022;22:371\u0026ndash;86.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLiu Q, Lei C. Neuroprotective effects of miR-331-3p through improved cell viability and inflammatory marker expression: Correlation of serum miR-331-3p levels with diagnosis and severity of Alzheimer\u0026rsquo;s disease. Exp Gerontol. 2021;144:111187.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eRaitoharju E, Sepp\u0026auml;l\u0026auml; I, Oksala N, Lyytik\u0026auml;inen L-P, Raitakari O, Viikari J, et al. Blood microRNA profile associates with the levels of serum lipids and metabolites associated with glucose metabolism and insulin resistance and pinpoints pathways underlying metabolic syndrome: the cardiovascular risk in Young Finns Study. Mol Cell Endocrinol. 2014;391:41\u0026ndash;9.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003ede Klerk E, Xiao Y, Emfinger CH, Keller MP, Berrios DI, Loconte V et al. Loss of ZNF148 enhances insulin secretion in human pancreatic β cells. JCI Insight [Internet]. 2023 [cited 2024 Nov 22];8:e157572. 
Available from: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://pmc.ncbi.nlm.nih.gov/articles/PMC10393241/\u003c/span\u003e\u003cspan address=\"https://pmc.ncbi.nlm.nih.gov/articles/PMC10393241/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003ePatel S, Ganbold K, Cho CH, Siddiqui J, Yildiz R, Sparman N, et al. Transcription factor PATZ1 promotes adipogenesis by controlling promoter regulatory loci of adipogenic factors. Nat Commun. 2024;15:8533.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003ePetr\u0026iacute;kov\u0026aacute; J, Laz\u0026uacute;rov\u0026aacute; I, Yehuda S. Polycystic ovary syndrome and autoimmunity. Eur J Intern Med. 2010;21:369\u0026ndash;71.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLi N, Wang X, Wang X, Yu H, Lin L, Sun C, et al. Upregulation of FoxO 1 Signaling Mediates the Proinflammatory Cytokine Upregulation in the Macrophage from Polycystic Ovary Syndrome Patients. Clin Lab. 2017;63:301\u0026ndash;11.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eDetection of T lymphocyte subsets. and related functional molecules in follicular fluid of patients with polycystic ovary syndrome - PubMed [Internet]. [cited 2024 Nov 26]. Available from: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://pubmed.ncbi.nlm.nih.gov/30988342/\u003c/span\u003e\u003cspan address=\"https://pubmed.ncbi.nlm.nih.gov/30988342/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eDai M, Hong L, Yin T, Liu S. Disturbed Follicular Microenvironment in Polycystic Ovary Syndrome: Relationship to Oocyte Quality and Infertility. Endocrinology. 
2024;165:bqae023.\u003c/span\u003e\u003c/li\u003e\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":true,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"
[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"Polycystic ovarian syndrome, Bacterial lipopolysaccharide, Single-cell RNA sequencing, Machine learning, Biomarkers, Diagnostic performance","lastPublishedDoi":"10.21203/rs.3.rs-7357877/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-7357877/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eBacterial lipopolysaccharide (LPS), a critical component of the outer membrane of Gram-negative bacteria, activates the host immune system via pattern recognition receptors (PRRs), triggering inflammatory responses. However, the role of LPS-related genes (BLRGs) in polycystic ovary syndrome (PCOS) remains unclear. This study integrated PCOS transcriptomic and single-cell RNA sequencing (scRNA-seq) data with BLRGs from the Comparative Toxicogenomics Database (CTD). Differential expression analysis, weighted gene co-expression network analysis (WGCNA), and consensus clustering identified candidate genes, while extreme gradient boosting (XGBoost) and random forest algorithms further screened C11orf68 and EVI5L as key biomarkers. Both genes were significantly downregulated in PCOS patients and linked to functions such as iron metabolism and heme clearance. Immune infiltration analysis revealed a significant negative correlation between activated mast cells and these biomarkers. Notably, the proportion of T cells was altered in PCOS samples, and scRNA-seq highlighted a dynamic \"rising-plateau\" expression pattern of C11orf68 and EVI5L during T-cell differentiation. 
A nomogram confirmed the predictive efficacy of these biomarkers for PCOS. Drug prediction and molecular regulatory network analysis provided insights into targeted therapies. This study is the first to uncover the regulatory role of LPS-related genes in PCOS, offering novel perspectives for early diagnosis and intervention strategies.\u003c/p\u003e","manuscriptTitle":"Exploring the mechanism of bacterial lipopolysaccharide-related genes involved in polycystic ovary syndrome and its significance in diagnosis","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2025-10-30 12:07:11","doi":"10.21203/rs.3.rs-7357877/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"
[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"7d46a43c-3cf4-48d0-9e25-4a6add0abb04","owner":[],"postedDate":"October 30th, 2025","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[],"tags":[],"updatedAt":"2026-03-03T15:26:33+00:00","versionOfRecord":[],"versionCreatedAt":"2025-10-30 12:07:11","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-7357877","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-7357877","identity":"rs-7357877","version":["v1"]},"buildId":"8U1c8b4HqxoKbykW_rLl7","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}