Are Combined Risk Factors Linked to In Vitro Fertilization Failure in Polycystic Ovary Syndrome? An Association Rule Mining Approach

Research Square
Research Article

Xuehong Zhu, Lina Ge, Guanghui Dong, Zhong Lin, Feng Han

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-7020094/v1
This work is licensed under a CC BY 4.0 License
Status: Posted
Version 1 posted

Abstract

Background Polycystic ovary syndrome (PCOS) is a common indication for in vitro fertilization (IVF). Previous studies have often assessed risk factors independently, overlooking the multifactorial interactions that frequently influence clinical pregnancy outcomes. We retrospectively analyzed electronic medical record (EMR) data from PCOS patients who were undergoing IVF.
Methods Key clinical variables (age, body mass index [BMI], duration of infertility, hormonal or metabolic disorders, tubal or uterine abnormalities, ovarian conditions including luteinized unruptured follicle syndrome [LUFS], and treatment details) were one-hot-encoded. Apriori association rule mining (ARM) was applied to identify patterns linked to clinical pregnancy failure, using thresholds of support ≥ 0.05, confidence ≥ 0.60, and lift > 1. This approach was intended to surface multifactorial risk associations that are not evident in traditional single-variable analyses.

Results The overall clinical pregnancy rate in the cohort was approximately 18%. ARM revealed several clinically meaningful patterns; notably, maternal age > 35 years frequently appeared in high-risk combinations, often with metabolic or anatomical abnormalities. For instance, the combination of LUFS and tubal obstruction was strongly associated with failure, suggesting a synergistic negative effect. Many of these multifactorial associations may be overlooked using a traditional single-variable analysis approach.

Conclusions Apriori rule mining effectively identified complex combinations of risk factors for in vitro fertilization failure in polycystic ovary syndrome, informing individualized strategies. Clinically, identifying advanced-age patients with specific reproductive or metabolic abnormalities may support targeted interventions. This study shows the broader potential of combining ARM with electronic medical record data to reveal hidden patterns for personalized clinical decision-making.

Keywords: Association Rule Mining; In Vitro Fertilization; Polycystic Ovary Syndrome; Medical Data Analyst; Risk Factor Combinations

1. Background

Polycystic ovary syndrome (PCOS) is a common endocrine and metabolic disorder affecting women of reproductive age [ 1 ].
It is characterized by hyperandrogenism, ovulatory dysfunction, and polycystic ovarian morphology, and affects approximately 5–20% of women worldwide. PCOS is the leading cause of anovulatory infertility [ 2 ]. While lifestyle modifications and ovulation induction enable many patients to conceive, a substantial proportion ultimately requires assisted reproductive technology (ART) such as in vitro fertilization (IVF). When women with PCOS undergo IVF, they face distinct clinical challenges. Controlled ovarian stimulation in PCOS often yields a large cohort of oocytes, but many are of suboptimal quality, resulting in poor-quality embryos [ 3 ]. This response also increases the risk of ovarian hyperstimulation syndrome (OHSS). Consequently, patients with PCOS tend to have lower implantation and clinical pregnancy rates per cycle. Despite advances in IVF techniques and a deeper understanding of PCOS pathophysiology, many patients fail to conceive following treatment. A variety of patient factors influence IVF success in PCOS. Advanced maternal age, obesity (high body mass index [BMI]), prolonged infertility, hormonal imbalances, tubal disease, and uterine abnormalities have all been linked to poorer outcomes. In particular, metabolic and hormonal disturbances inherent to PCOS, especially insulin resistance and hyperandrogenism, can synergistically impair reproduction. For example, elevated androgen levels combined with insulin resistance have been shown to negatively affect oocyte development and endometrial receptivity [ 4 ]. A recent review has highlighted that these pathophysiological mechanisms result in abnormal follicle growth, poor oocyte maturation, and dysfunctional endometrial lining, all contributing to infertility in PCOS. As the interplay of these factors is complex, predicting IVF success in PCOS is challenging using traditional statistical methods. Recently, machine learning techniques have shown promise in this field. 
Advanced algorithms such as random forests and gradient boosting decision trees can manage numerous variables and nonlinear interactions. Studies indicate that machine learning models incorporating patient age, BMI, and other clinical features often outperform simple logistic models for forecasting IVF outcomes [ 5 ]. These data-driven models can capture subtle patterns and enable individualized risk prediction.

Association rule mining (ARM) is a powerful data-mining technique for uncovering hidden “if–then” patterns across multiple variables. It has been widely applied in healthcare to identify interpretable relationships among clinical factors [ 6 ]. For example, association rules have been used to construct diagnostic knowledge bases and summarize how combinations of patient characteristics jointly influence outcomes [ 7 ]. Previous studies have often assessed risk factors independently, overlooking the multifactorial interactions that frequently influence clinical pregnancy outcomes. In this study, we applied ARM to clinical data from women with PCOS who were undergoing IVF, aiming to identify combinations of risk factors that predict treatment failure. By revealing such multifactorial patterns, we aim to provide clinicians with more precise decision support, and ultimately, improve IVF success rates in this population [ 4 ].

2. Methods and Materials

2.1 Data Source

In this study, we retrospectively analyzed de-identified electronic medical records (EMRs) from the Reproductive Hospital of Guangxi, encompassing all IVF treatment cycles between 2018 and 2023 for females diagnosed with PCOS. Inclusion was restricted to patients diagnosed with PCOS according to standard clinical criteria, who were undergoing IVF with or without intracytoplasmic sperm injection during the study period. The dataset included each patient’s baseline characteristics, relevant diagnoses, treatment details, and IVF outcomes.
Clinical pregnancy, the primary outcome, was defined as the presence of a fetal heartbeat on ultrasound approximately 6–7 weeks after embryo transfer, and this binary outcome (pregnant or not pregnant) was recorded for each cycle. Only records with complete information on key variables and outcomes were retained for analysis. The study was approved by the hospital’s institutional ethics board; owing to its retrospective nature and use of de-identified data, informed consent was waived.

2.2 Data Preprocessing

The raw EMR extracts were rigorously preprocessed to ensure high-quality inputs for analysis. The following steps were implemented:

Terminology standardization: Clinical descriptions and diagnoses were normalized using consistent vocabulary. In the raw EMRs, identical concepts were often documented using different terms or abbreviations. For example, “polycystic ovary syndrome” appeared as “polycystic ovarian syndrome,” “PCOS,” or “poly-ovary syndrome,” depending on the physician’s documentation style. To address this, all symptoms and diagnosis labels were mapped to a standardized terminology, harmonizing synonyms and abbreviations into a single descriptor. This normalization ensured that each clinical concept, such as hyperprolactinemia, was represented uniformly across records, preventing duplicate features caused by inconsistent naming.

Record filtering and de-duplication: Duplicate entries and records with substantial missing or incomplete data were removed. Secondary-use healthcare data often contain redundancies or omissions, which can introduce bias. Repeated patient records and cases lacking essential fields, such as clinical outcome or key diagnostic information, were identified and excluded. This step enhances data integrity and reliability by restricting the analysis to unique, complete patient records.
One-hot encoding: One-hot encoding is a standard transformation technique for converting categorical variables into a numerical format suitable for machine learning [ 8 ]. Each distinct feature value, such as primary infertility, secondary infertility, or specific stimulation protocols, was converted into a separate column with a binary value: “1” indicating presence and “0” indicating absence in a given record. This process produced a Boolean feature matrix, where each row represented a patient’s IVF cycle and each column represented a specific condition or attribute. Continuous variables (e.g., age, BMI, and duration of infertility) were discretized into clinically meaningful categories based on standards from the World Health Organization [ 9 , 10 ].

All data cleaning and preprocessing steps followed best practices for secondary analysis of clinical data. This workflow resulted in a curated dataset of PCOS IVF cases that was standardized, high-quality, and ready for ARM.

2.3 Feature Selection

In this study, a comprehensive set of clinical and treatment variables associated with IVF outcomes in patients with PCOS, based on prior research, was extracted from the cleaned EMRs. These variables included demographic characteristics, comorbidities, anatomical abnormalities, and treatment parameters, as detailed in Table 1 .
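The discretization and one-hot encoding steps described in Section 2.2 can be illustrated with a minimal, dependency-free sketch. It is not the authors' code (the study reports using pandas and scikit-learn); the field names, item labels, and the two example records are hypothetical, and only the age, BMI, and infertility-duration cut-offs are taken from Table 1.

```python
# Minimal sketch of binning + one-hot encoding; NOT the authors' pipeline.
# Field names, item labels, and the records below are hypothetical.
# Cut-offs (age 35, BMI 24, infertility 5 years) follow Table 1.

def encode_cycle(record):
    """Turn one IVF-cycle record (a dict) into a set of binary items."""
    items = set()
    # Continuous variables are binned into two categories each.
    items.add("age>35" if record["age"] > 35 else "age<=35")
    items.add("bmi>24" if record["bmi"] > 24 else "bmi<=24")
    items.add("infertility_years>5" if record["infertility_years"] > 5
              else "infertility_years<=5")
    # Categorical variables become one item per category value.
    items.add(f"infertility_type={record['infertility_type']}")
    # Each standardized diagnosis label is already a presence/absence item.
    items.update(record["diagnoses"])
    # The outcome is encoded as an item so it can appear as a rule consequent.
    items.add("pregnancy_failure" if record["clinical_pregnancy"] == 0
              else "pregnancy_success")
    return items


def one_hot(records):
    """Build the Boolean feature matrix: one row per cycle, one column per item."""
    transactions = [encode_cycle(r) for r in records]
    columns = sorted(set().union(*transactions))
    matrix = [[int(col in t) for col in columns] for t in transactions]
    return columns, matrix


records = [  # hypothetical example cycles
    {"age": 37, "bmi": 26.1, "infertility_years": 6,
     "infertility_type": "secondary",
     "diagnoses": ["LUFS", "bilateral_tubal_obstruction"],
     "clinical_pregnancy": 0},
    {"age": 29, "bmi": 21.4, "infertility_years": 2,
     "infertility_type": "primary",
     "diagnoses": ["LUFS"],
     "clinical_pregnancy": 1},
]

columns, matrix = one_hot(records)
for row in matrix:
    print(dict(zip(columns, row)))
```

Encoding the outcome as an item alongside the clinical features is what later allows rules to be filtered by a fixed consequent.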
Table 1 Features extracted from EMRs

| Domain | Extracted Variables | Reference |
|---|---|---|
| Demographics | Female age: discretized ≤ 35, > 35 | [ 9 , 10 ] |
| Anthropometry | Female BMI (kg/m²): discretized ≤ 24, > 24 | |
| Reproductive history | Years of infertility: discretized ≤ 5, > 5; infertility type: primary/secondary | [ 11 , 12 ] |
| Hormonal and metabolic disorders | Hyperprolactinemia, sub-clinical hypothyroidism, insulin resistance | [ 13 – 16 ] |
| Tubo-uterine factors | Unilateral tubal obstruction, bilateral tubal obstruction, hydrosalpinx, pelvic adhesion, intra-uterine adhesion | [ 17 , 18 ] |
| Ovarian and endocrine factors | Luteinized unruptured follicle syndrome (LUFS), ovarian cysts | [ 17 , 18 ] |
| Uterine malformations | Septate uterus, uterine fibroids (leiomyomas), adenomyosis | [ 19 ] |
| ART treatment data | Stimulation protocol (categorical), number of oocytes retrieved (binned), fertilization method (IVF vs. ICSI) | [ 20 ] |
| Outcome | Clinical pregnancy: 1 = success, 0 = failure | |

BMI: body mass index, ART: assisted reproductive technology, IVF: in vitro fertilization, ICSI: intracytoplasmic sperm injection, EMR: electronic medical record

All features were converted into binary (0/1) variables following preprocessing. Continuous measures, such as age, BMI, duration of infertility, and oocyte count, were binned into clinically relevant categories to enable their inclusion as categorical items in the ARM. Feature selection was informed by clinical expertise and existing literature on PCOS and infertility, ensuring that the dataset captured the most relevant factors potentially influencing IVF success in this population.

2.4 ARM

ARM is an unsupervised, data-driven, pattern discovery technique. It automatically identifies frequent “if–then” relationships among features without requiring a predefined outcome variable [ 21 ]. In ARM, an association rule is a rule-based implication between feature sets (antecedent ⇒ consequent) discovered from the data.
As ARM does not rely on a target variable, it is well-suited for exploratory analysis: it can scan all factor combinations and uncover multi-factor interactions in an unbiased manner [ 7 ]. In contrast, conventional supervised approaches, such as logistic regression or other classifiers, require labeled outcomes and hypothesis-driven modeling. Supervised models focus on predicting a specific response and typically assess individual variables or pre-specified interactions. Exhaustively testing all possible feature combinations in regression becomes combinatorially prohibitive, as the number of candidate models grows exponentially with additional interactions, offering no guarantee of interpretability or clinical utility [ 22 ]. In summary, ARM’s unsupervised pattern-mining strategy allows the detection of complex, potentially unexpected feature associations, whereas standard models must be guided by a predefined outcome and model structure [ 23 ].

This study applied the classical Apriori ARM algorithm to the one-hot encoded dataset prepared during the data preprocessing stage. The Apriori algorithm involves two main steps: generation of frequent itemsets and extraction of association rules.

Frequent itemset generation: The one-hot-encoded dataset was used to identify frequent itemsets. The Apriori algorithm systematically explores combinations of features and counts their occurrences across patient records. A minimum support threshold of 0.05 was applied, meaning that an itemset had to occur in at least 5% of patient records to qualify as frequent. This threshold balances the detection of non-trivial patterns against the exclusion of extremely rare combinations, so that any reported association reflects a clinically meaningful subset of the cohort.

Association rule extraction: After identifying frequent itemsets, the association rule generation process calculates rule metrics.
A minimum confidence threshold of 0.60 was applied to filter rules. Confidence measures the conditional probability of the consequent given the antecedent and is used here to assess the likelihood of the outcome (clinical pregnancy failure) under specific clinical conditions. Rules with confidence ≥ 60% were retained, ensuring that among patients with the antecedent feature set, at least 60% exhibited the corresponding outcome. Additionally, a lift criterion > 1.0 was applied to all rules. Lift, defined as the ratio of the rule’s confidence to the baseline probability of the consequent, reflects how much more frequently the antecedent and outcome co-occur than expected by chance. The lift > 1 filter ensured that all retained rules represented positive associations, that is, the consequent was more likely in the presence of the antecedent than in the cohort overall.

After generating the initial set of rules, we retained only those whose consequent was clinical pregnancy failure. This yielded rules structured as antecedent feature sets associated with failed conception, highlighting combinations of patient characteristics linked to IVF failure in this PCOS cohort. The resulting rules were then evaluated for clinical plausibility and ranked by their associated metrics. Ultimately, only rules that met all predefined thresholds and demonstrated statistical significance based on Fisher’s exact test were retained for interpretation.

2.5 Computational Environment

All analyses were conducted using Python (version 3.12) in a Jupyter Notebook environment. Data handling and preprocessing were performed using the pandas and scikit-learn libraries, while ARM was implemented using the mlxtend library. The tools and models used in this study are summarized in Table 2 .
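The two Apriori steps and the thresholds described in Section 2.4 can be sketched without external libraries as follows. This is an illustrative, simplified re-implementation, not the mlxtend code the study reports using; the function name, item labels, and the ten toy transactions are hypothetical, and only the default thresholds (support ≥ 0.05, confidence ≥ 0.60, lift > 1) come from the text.

```python
from itertools import combinations

# Illustrative Apriori sketch; NOT the mlxtend implementation used in the study.
# Function name, item labels, and toy data are hypothetical.

def apriori_failure_rules(transactions, outcome="pregnancy_failure",
                          min_support=0.05, min_confidence=0.60, min_lift=1.0):
    """Return rules (antecedent => outcome) as (antecedent, support, confidence, lift)."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    # Step 1: level-wise frequent itemset generation.
    items = {i for t in transactions for i in t}
    level = {frozenset([i]): support(frozenset([i])) for i in items}
    level = {s: v for s, v in level.items() if v >= min_support}
    frequent, k = {}, 1
    while level:
        frequent.update(level)
        k += 1
        # Join: merge frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in level for s in combinations(c, k - 1))}
        level = {c: support(c) for c in candidates}
        level = {c: v for c, v in level.items() if v >= min_support}

    # Step 2: rule extraction with the consequent fixed to the outcome item.
    baseline = frequent.get(frozenset([outcome]))
    if baseline is None:
        return []
    rules = []
    for itemset, supp in frequent.items():
        if outcome not in itemset or len(itemset) < 2:
            continue
        antecedent = itemset - {outcome}
        confidence = supp / frequent[antecedent]   # P(outcome | antecedent)
        lift = confidence / baseline               # confidence / P(outcome)
        if confidence >= min_confidence and lift > min_lift:
            rules.append((sorted(antecedent), supp, confidence, lift))
    # Rank by lift, then confidence, then support (all descending).
    rules.sort(key=lambda r: (r[3], r[2], r[1]), reverse=True)
    return rules


# Hypothetical toy cohort: 10 cycles, 8 of which end in failure.
transactions = (
    [{"LUFS", "tubal_obstruction", "pregnancy_failure"}] * 4
    + [{"LUFS", "pregnancy_failure"}] * 3
    + [{"tubal_obstruction", "pregnancy_failure"}] * 1
    + [{"LUFS", "pregnancy_success"}] * 2
)

rules = apriori_failure_rules(transactions)
for antecedent, supp, conf, lift in rules:
    print(f"{antecedent} => pregnancy_failure  "
          f"support={supp:.2f} confidence={conf:.2f} lift={lift:.2f}")
```

In this toy cohort the rule with LUFS alone as antecedent passes the confidence threshold but is dropped by the lift filter, because its confidence falls below the baseline failure rate. When the baseline probability of the consequent is high, lift values are necessarily compressed toward 1.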
Table 2 Computing environment for this study

| Category | Specification | Purpose |
|---|---|---|
| Programming language | Python 3.12 | Core scripting/analysis |
| Interactive IDE | Jupyter Notebook | Reproducible, stepwise workflow |
| Key libraries | pandas, scikit-learn, mlxtend | Data wrangling; preprocessing; Apriori and rule generation |
| Plotting | matplotlib, seaborn | Descriptive and network visualization |
| Hardware | 2 × Intel Xeon (48 physical cores), 128 GB RAM | Parallel support-counting and rule filtering |
| Operating system | Ubuntu 22.04 LTS | Stable Linux environment for multi-threaded tasks |

IDE: integrated development environment, RAM: random access memory, LTS: long-term support

Throughout the analysis, we followed an academically rigorous approach: ensuring high-quality clinical data through systematic cleaning, encoding features appropriately for pattern discovery, and applying validated data mining methods with well-justified parameter settings. The overall methodology is illustrated in Fig. 1 .

3. Results

3.1 Clinical Pregnancy Outcomes

The overall clinical pregnancy rate among patients diagnosed with PCOS who were undergoing their first IVF treatment was approximately 18.1%, meaning that roughly one-fifth of IVF cycles resulted in confirmed clinical pregnancy. This result aligns with previously reported clinical pregnancy rates for PCOS populations who were undergoing IVF [ 24 ].

3.1.1 Demographic and Clinical Characteristics

The demographic data indicated a relatively young patient population, consistent with the typical age profile of PCOS. The mean age was 31.39 ± 4.45 years (range: 20–46 years). The BMI averaged 23.53 ± 3.43 kg/m², ranging between 14.42 and 35.58 kg/m², suggesting that most patients fell within the normal-to-overweight range. The average duration of infertility was 4.45 ± 3.17 years, with a wide range from 0.1 to 22 years, reflecting heterogeneous fertility histories among participants (Fig.
2). Clinical conditions were analyzed based on their prevalence among successful and unsuccessful IVF cycles, as presented in Fig. 3 . Conditions such as bilateral tubal obstruction and recurrent miscarriage appeared predominantly in cases without clinical pregnancy, suggesting a negative association with IVF success. Pelvic adhesions and undiagnosed adnexal masses were frequently recorded in both outcome groups, indicating high prevalence but limited predictive specificity. In contrast, conditions such as hypertension, insulin resistance, and hyperprolactinemia were less prevalent overall, thus limiting their individual interpretative value.

Comparison of Pregnancy Outcomes by Demographic and Clinical Groups

The relative prevalence of each clinical feature between the clinical pregnancy and non-pregnancy groups is illustrated in Fig. 4 . Conditions such as luteinized unruptured follicle syndrome (LUFS), secondary infertility, and pelvic adhesion were highly prevalent in both successful and unsuccessful outcomes, indicating their common presence among patients with PCOS who were undergoing IVF. Features notably more frequent in the clinical pregnancy failure group included bilateral tubal obstruction (39.3% in failure vs. 31.2% in success) and maternal age of > 35 years (18.4% in failure vs. 17.7% in success). Conversely, some conditions, such as undiagnosed adnexal mass, hyperprolactinemia, and hypertension, demonstrated very low overall prevalence, limiting their discriminatory value. These findings suggest that specific clinical factors, particularly tubal obstruction and advanced maternal age, may serve as meaningful indicators of IVF prognosis, highlighting their relevance in clinical assessment and pre-treatment counseling.

3.1.2 Correlation of Features with Pregnancy Outcome

Pearson correlation analysis was conducted to quantify relationships between individual clinical features and IVF pregnancy outcomes, as shown in Fig.
5:

Bilateral tubal obstruction showed the strongest negative correlation (−0.064), aligning with the clinical understanding of impaired fertility due to tubal pathology.

Pelvic adhesion (0.048) and undiagnosed adnexal mass (0.045) demonstrated a small positive correlation with pregnancy success, an unexpected finding, possibly influenced by confounding clinical interventions.

Habitual abortion (−0.033) and secondary infertility (−0.024) correlated negatively with pregnancy success, consistent with their role as adverse prognostic indicators.

Other factors, including BMI, insulin resistance, hypertension, and maternal age, exhibited negligible or minimal correlations, suggesting limited predictive value when evaluated individually in this cohort.

Overall, the correlation analysis indicated that no single factor exerted a strong effect on IVF outcomes in patients with PCOS. Advanced data mining methods are required to uncover interactive, multifactorial patterns that may better explain pregnancy outcomes.

ARM Outcomes

ARM was employed to identify clinical diagnoses and patient characteristics significantly associated with an increased risk of IVF clinical pregnancy failure in patients with PCOS.

3.2 Overall Rule Summary

ARM analysis was specifically directed toward uncovering associations where the consequent was fixed as clinical pregnancy failure, thereby enabling the identification of clinical factors and conditions that increase the risk of unsuccessful IVF outcomes. After applying stringent thresholds (minimum support ≥ 0.05, confidence ≥ 0.60, and lift > 1.0) and statistical validation via the chi-square (χ²) test, a total of 26 significant rules were generated, as shown in Fig. 6 . These rules frequently involved antecedents including ovarian dysfunction factors (e.g., LUFS), structural uterine anomalies, tubal factors (e.g., bilateral tubal obstruction), and advanced maternal age.
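The statistical validation of a single rule can be illustrated on a 2 × 2 table of antecedent presence against outcome. The text names a chi-square test here and Fisher's test in Section 2.4; the sketch below computes both from the same table using only the standard library. It is an assumption about how such a check could be set up, not the authors' procedure, and the cell counts are hypothetical.

```python
from math import comb

# Sketch of rule-level significance testing on a 2x2 table; the exact
# procedure used in the study is not specified, and the counts are hypothetical.
#
#                      failure   success
# antecedent present      a         b
# antecedent absent       c         d

def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher exact p-value: P(X >= a) under the hypergeometric null."""
    n = a + b + c + d
    row1 = a + b          # cycles in which the antecedent is present
    col1 = a + c          # cycles ending in failure
    upper = min(row1, col1)
    tail = sum(comb(col1, k) * comb(n - col1, row1 - k)
               for k in range(a, upper + 1))
    return tail / comb(n, row1)


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (1 degree of freedom, no continuity correction)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Hypothetical counts for one candidate rule.
a, b, c, d = 45, 5, 120, 35
p_value = fisher_exact_greater(a, b, c, d)
chi2 = chi_square_2x2(a, b, c, d)
print(f"Fisher one-sided p = {p_value:.4f}, chi-square = {chi2:.3f}")
```

The one-sided alternative matches the direction of the lift > 1 filter, since both ask whether failure is more frequent when the antecedent is present. With 26 rules tested, a multiple-comparison adjustment would normally be considered; the text does not state whether one was applied.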
The frequency and strength of associations of individual clinical features (single itemsets) with IVF clinical pregnancy failure in patients with PCOS are summarized in Table 3 :

The most frequent clinical features associated with IVF failure were LUFS and secondary infertility, each identified in 13 instances. LUFS had a high support rate (79.67%), indicating its high prevalence, although it demonstrated a relatively weak lift (1.0017), suggesting limited discriminative power when considered alone.

Bilateral tubal obstruction exhibited the strongest association with clinical pregnancy failure, showing the highest lift value (1.0422) and a confidence of 85.25%, emphasizing its significance as a risk factor.

Other important features, such as infertility duration of > 5 years and BMI > 24, showed moderate frequencies and lifts (1.0104 and 1.0196, respectively), suggesting modest but relevant individual contributions.

Pelvic adhesion lacked complete metrics in this analysis, limiting assessment of its standalone predictive impact.

Table 3 Frequency and association metrics of individual clinical features

| Itemset | Frequency | Support | Confidence | Lift |
|---|---|---|---|---|
| Luteinized Unruptured Follicle Syndrome (LUFS) | 13 | 0.7967 | 0.8194 | 1.0017 |
| Secondary Infertility | 13 | 0.2505 | 0.8251 | 1.0086 |
| Pelvic Adhesion | 7 | - | - | - |
| Years of Infertility > 5 | 7 | 0.1751 | 0.8266 | 1.0104 |
| Bilateral Tubal Obstruction | 6 | 0.3122 | 0.8525 | 1.0422 |
| BMI > 24 | 4 | 0.1711 | 0.8341 | 1.0196 |
| Age > 35 years | 4 | 0.1502 | 0.8234 | 1.0065 |

BMI: body mass index

In summary, the statistical results indicated that most individual features had limited predictive power for clinical pregnancy outcomes in patients with PCOS who were undergoing IVF. This underscores that no single factor sufficiently explains treatment success or failure, emphasizing the need to examine multifactorial risk combinations for more accurate prediction.

3.3 Top-ranked Association Rules

The top 10 association rules, ranked by lift and confidence, are summarized in Fig. 7 .
All rules share a common consequent of clinical pregnancy failure and were identified using the Apriori algorithm, applying thresholds of support ≥ 0.05, confidence ≥ 0.60, and lift > 1.0. These top-ranked rules demonstrate that combinations of anatomical, ovarian, and demographic risk factors significantly increase the likelihood of IVF failure:

Bilateral tubal obstruction appears in 7 of the top 10 rules, highlighting its central role as a structural barrier to successful implantation or embryo transport.

LUFS commonly co-occurs with both tubal factors and metabolic risks (e.g., BMI > 24), suggesting that ovarian dysfunction and metabolic dysregulation synergistically compromise reproductive outcomes.

Secondary infertility and prolonged infertility (> 5 years) frequently appear in multifactorial rule sets, reflecting the added difficulty of achieving pregnancy in patients with prior conception failure.

These insights underscore the clinical utility of multi-feature pattern recognition, which enables a more refined risk stratification than evaluating single risk factors in isolation. They also lay the groundwork for developing AI-driven decision support tools to predict IVF failure and guide personalized treatment planning.

4. Discussion

4.1 Comparative Analysis

4.1.1 Comparison with Previous Literature and Known Clinical Evidence

The findings of this research provide a multidimensional perspective on IVF outcomes in patients with PCOS that complements prior knowledge derived from more reductionist analyses. Several of our key associations, such as the detrimental effects of hydrosalpinx (bilateral tubal obstruction) and prolonged infertility, are strongly supported by the literature. For example, Ou et al. [ 25 ] reported that hydrosalpinx reduces implantation rates and increases early pregnancy loss in IVF due to embryotoxic fluid and impaired endometrial receptivity.
This aligns with our observation that patients with PCOS with untreated bilateral tubal blockage (likely hydrosalpinx) rarely achieve conception through IVF unless the tubal pathology is corrected. The appropriate clinical intervention is salpingectomy or proximal tubal occlusion prior to IVF, an approach supported by multiple studies showing improved IVF outcomes following hydrosalpinx treatment [ 26 , 27 ]. The strong influence of infertility duration on IVF outcome in this study reinforces a consistent theme: the sooner, the better. An early meta-analysis [ 28 ] identified a negative association between infertility duration and IVF success, a finding supported by more recent analyses showing that beyond approximately 3–5 years of trying, each additional year further reduces pregnancy odds [ 29 , 30 ]. Our study’s PCOS-specific data indicate that even in a relatively young cohort, patients with infertility lasting ≥ 5 years experienced significantly lower success rates. While this may partly reflect associations with older age or other confounders, infertility duration remained an independent risk factor in several multi-feature rules, even when age was accounted for. One interpretation is that prolonged infertility reflects underlying intractable conditions, such as poor egg/embryo quality or endometrial dysfunction, that persist despite IVF. It might also reflect that these patients have undergone multiple prior treatments or IVF cycles without success, hinting at recurrent implantation failure scenarios. Clinically, this emphasizes that practitioners might consider escalating treatment or exploring adjunct therapies (immunological work-ups, use of donor gametes, etc.) when faced with a patient who has had many years of unexplained infertility. This research reaffirms the critical impact of female age, which remains the single strongest determinant of IVF success across all populations. 
Advanced age (≥ 35, especially ≥ 40) dramatically elevated failure risk in our PCOS cohort, consistent with general IVF outcomes. Likewise, obesity and metabolic factors are well-documented to impair fertility treatment outcomes [ 31 ]. A 2024 systematic review noted that, in women with PCOS, high BMI independently lowers clinical pregnancy and live birth rates and raises miscarriage risk. In this study, obesity featured in some rules, though not the top rule, implying that while obesity is indeed harmful, other factors, such as tubal status or duration, were even more dominant in our dataset. It is possible that as the majority of our patients with PCOS were overweight, BMI did not differentiate outcomes as sharply, reflecting a type of range restriction effect. However, we observed that lean patients with PCOS had slightly better success rates than obese patients with PCOS, aligning with the consensus that weight management can improve IVF outcomes in PCOS.

4.1.2 Unexpected Findings in the Current Analysis

The data-driven discovery of the LUFS plus tubal factor combination as a high-failure profile appears to be a novel insight with limited direct precedent in the literature. LUFS is a subtle form of ovulatory dysfunction, and while it is known to cause infertility [ 32 ], it is not commonly discussed in the context of IVF outcomes because ovarian stimulation with a human chorionic gonadotropin trigger is expected to circumvent follicle rupture problems. However, the results suggest that some patients with PCOS may still experience issues analogous to LUFS even in IVF (e.g., follicles that luteinize without yielding an egg). A recent study by Li et al. [ 33 ] noted that LUF cycles negatively affected pregnancy outcomes in natural-cycle frozen embryo transfer, highlighting that luteinization without ovulation can disrupt timing and endometrial preparation.
In stimulated IVF cycles, an argument can be made that if a patient tends to have LUFS, careful monitoring and trigger timing are crucial, or alternatives such as a dual trigger (human chorionic gonadotropin plus gonadotropin-releasing hormone agonist) might be beneficial to ensure oocyte release. The combination with the tubal factor is likely a proxy for the overall severity of infertility: these patients effectively have two strikes against them, and indeed, our analysis shows that they fare poorly. While not previously reported as a combined risk in the literature, this finding is intuitive and underscores the importance of addressing all known factors in a multidisciplinary management approach to give the best chance of success.

Another novel discovery is that our approach identified interactions that traditional multivariable models might miss or not emphasize. For instance, using logistic regression, Liu et al. [ 34 ] found that PCOS per se was not an independent predictor of live birth after adjusting for confounders, meaning that if age, BMI, etc., were controlled, patients with PCOS performed as well as others. However, that same analysis showed that within the PCOS group, factors such as younger age, shorter infertility duration, and good embryo quality were associated with higher live birth rates. The results of our study complement this by explicitly highlighting combinations (e.g., older age plus fewer good embryos) that lead to failure. Essentially, ARM provides a clinically interpretable set of rules that align with what an experienced clinician might surmise through years of practice. The benefit is that ARM can systematically scan through dozens of features to flag combinations that merit clinical attention, possibly revealing less obvious patterns. In addition, no individual predictor exhibited a robust, statistically significant effect.
However, a small subset of variables, such as bilateral tubal obstruction, displayed modest associations (lift = 1.042).

4.2 Strengths and Innovation of this Study

A strength of this study is the demonstration of how data mining techniques such as Apriori can be applied in reproductive medicine. This approach can handle many variables and uncover associations without the need to pre-specify an outcome model. The rules generated are intuitively understandable (“IF X and Y, THEN Z”), which could aid clinical decision-making more directly than a complex predictive model. To our knowledge, this is the first study to report association rules in the context of IVF outcomes for PCOS. It may therefore encourage the use of similar methods on larger IVF databases to discover phenotype-specific patterns (for example, does a combination of certain hormone levels predict OHSS risk in PCOS?).

Additionally, prior literature largely focuses on singular risk factors or additive models. For example, large reviews emphasize the need to individualize IVF protocols for patients with PCOS and note common comorbidities such as obesity and metabolic syndrome [2], but they do not indicate which specific factor pairs are the most dangerous. Likewise, regression studies typically identify age, BMI (obesity), hormone levels, and similar variables as independent predictors [35], without assessing their combined impact. In contrast, our ARM approach identified novel synergistic rules. To our knowledge, no previous study has described the combined effect of LUFS and tubal obstruction on IVF failure. Hydrosalpinx, a classic example of tubal pathology, is known to halve IVF pregnancy rates, but its interaction with ovulatory dysfunction (such as LUFS) has not been explored [36].
By pointing out this hidden synergy, we fill a gap in the literature: existing meta-analyses and reviews simply document the problems (age/BMI, OHSS risk, tubal factors) separately, whereas our findings show that certain combinations (e.g., LUFS + tubal factor) confer especially high risk. This underscores ARM’s strength: revealing clinically meaningful patterns beyond standard models.

4.3 Limitations of this Study

Despite its insights, this research has certain limitations. First, the study was retrospective and observational; association does not imply causation. The rules we found do not prove that LUFS causes IVF failure, but only show that they occur together frequently. There could be underlying confounders (for example, perhaps women with LUFS also had poor ovarian reserve, which was the real driver of failure). We attempted to mitigate spurious findings by requiring relatively high support and by conducting statistical tests, but some associations might still be coincidental or attributed to biases in the dataset. Second, the dataset size is moderate; a larger sample would allow the detection of associations with lower support (rarer but potentially important scenarios). Our choice of a 10% support threshold meant we likely missed associations involving very rare conditions (e.g., uncommon genetic factors or severe male factor cases); these might still be clinically significant for individual patients but were not detectable in our analysis. Third, our feature encoding, while comprehensive, was limited to what was recorded. We did not include some potentially relevant variables such as Anti-Müllerian hormone levels, insulin resistance indices, or detailed embryo morphology scores. The inclusion of such data might yield additional associations (for instance, a combination of low Anti-Müllerian hormone plus PCOS could predict poor ovarian response).
Fourth, the analysis was confined to patients with PCOS at a single center; thus, the associations reflect that specific population and practice (e.g., the stimulation protocols used and the prevalence of certain issues in that clinic). Caution is needed in generalizing the results to all patients with PCOS or other IVF centers. What holds true in our data (e.g., proportion of patients with hydrosalpinx) may vary across other populations. Validating these findings in external datasets would strengthen their generalizability and clinical reliability. Some results were inconsistent with established conclusions; for example, the influence of BMI on IVF clinical outcomes was not observed, despite being widely recognized by researchers [37, 38]. This discrepancy may stem from over 80% of patients having a BMI below the 24 kg/m² threshold (World Health Organization Asian standard), which reduces the discriminatory power of BMI. Future multi-center datasets with broader BMI distributions may better clarify obesity-related effects. Finally, similar to other data mining studies, there is a risk of overfitting or detecting patterns lacking biological plausibility. We focused on interpreting only rules that were clinically meaningful and supported by external evidence. It is reassuring that our top findings align with known mechanisms (e.g., tubal fluid impairing implantation, prolonged infertility indicating more difficult cases). Nonetheless, caution is warranted to avoid overinterpreting combinations that might be artifacts.

In conclusion, ARM was applied to identify key combinations of factors associated with IVF failure in patients with PCOS. The results highlight that it is often the convergence of multiple adverse factors, such as ovulatory dysfunction, tubal pathology, and long-term infertility, that dramatically lowers the chances of pregnancy in this high-risk group.
These findings largely corroborate the existing literature on individual risk factors while also providing a novel, integrated view of their interactions. For clinicians, the insights underscore the importance of comprehensive infertility work-ups in PCOS: patients with both PCOS and an additional infertility factor (e.g., tubal obstruction) should be counseled about the reduced probability of success and the potential need to address remediable factors before IVF. Similarly, earlier and more aggressive management may be warranted for those with prolonged infertility, rather than continuing extended attempts with less intensive treatments.

5. Conclusions

This study demonstrates the utility of data-driven approaches in reproductive medicine. By uncovering patterns that might be overlooked by traditional analyses, ARM can generate hypotheses for further research (e.g., investigating the mechanistic link between LUFS and IVF outcomes in PCOS) and potentially inform clinical decision support systems. Future studies should validate these rules in larger, multi-center cohorts and assess their predictive value prospectively. It would also be worthwhile to extend this analysis to identify IVF success profiles (patients who succeed) and to compare PCOS with non-PCOS infertile populations to determine whether different rules apply. Ultimately, translating these findings into practice, for example, developing a risk score or checklist based on the presence of multiple factors, could help personalize IVF counseling and treatment for patients with PCOS. In summary, women with PCOS are a heterogeneous group, and our study highlights that the cumulative burden of their reproductive challenges determines IVF outcomes. Recognizing and addressing each element of that burden holds promise for improving fertility success in this prevalent and challenging condition.
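For readers less familiar with the rule metrics used throughout this paper, the following minimal sketch shows how support, confidence, and lift can be computed for one "IF X and Y, THEN failure" rule, and how a minimum support threshold filters rules. It is an illustration under stated assumptions, not the notebook code supplied with this preprint: the ten toy cycles, the feature names, and the helper functions `support`, `rule_metrics`, and `frequent_rules` are all hypothetical. A lift close to 1 (such as the 1.042 reported for bilateral tubal obstruction) indicates that the antecedent and the outcome are nearly independent, whereas values well above 1 indicate co-occurrence beyond what chance would produce.

```python
# Hypothetical toy data: each set lists the one-hot features present in one IVF cycle.
# Names and values are illustrative assumptions, not the study dataset.
cycles = [
    {"LUFS", "tubal_obstruction", "failure"},
    {"LUFS", "tubal_obstruction", "failure"},
    {"LUFS", "failure"},
    {"tubal_obstruction", "failure"},
    {"tubal_obstruction"},
    {"LUFS"},
    {"failure"},
    set(),
    {"age_ge_35", "failure"},
    {"age_ge_35"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def rule_metrics(antecedent, consequent, transactions):
    """Support, confidence, and lift of the rule antecedent -> consequent."""
    s_a = support(antecedent, transactions)
    s_c = support(consequent, transactions)
    s_ac = support(set(antecedent) | set(consequent), transactions)
    confidence = s_ac / s_a if s_a else 0.0
    lift = confidence / s_c if s_c else 0.0
    return {"support": s_ac, "confidence": confidence, "lift": lift}


def frequent_rules(antecedents, consequent, transactions, min_support):
    """Keep only rules whose joint support reaches `min_support`."""
    kept = {}
    for antecedent in antecedents:
        metrics = rule_metrics(antecedent, consequent, transactions)
        if metrics["support"] >= min_support:
            kept[frozenset(antecedent)] = metrics
    return kept


if __name__ == "__main__":
    candidates = [{"LUFS", "tubal_obstruction"}, {"age_ge_35"}]
    for antecedent, metrics in frequent_rules(
        candidates, {"failure"}, cycles, min_support=0.2
    ).items():
        print(sorted(antecedent), metrics)
```

In this toy example the rule with the rarer antecedent is dropped by the support filter, which mirrors the limitation noted in Section 4.3: a fixed support threshold hides associations that involve uncommon conditions, regardless of how strong they are.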
Abbreviations

AI: artificial intelligence, ARM: association rule mining, BMI: body mass index, IVF: in vitro fertilization, LUFS: luteinized unruptured follicle syndrome, OHSS: ovarian hyperstimulation syndrome, PCOS: polycystic ovary syndrome

Declarations

Ethics approval and consent to participate
Not Applicable

Consent for publication
Not Applicable

Availability of data and materials
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

Competing interests
The authors declare that they have no competing interests.

Funding
This research was funded by the National Natural Science Foundation of China, [grant number: 82460309], and the APC was funded by the Reproductive Hospital of Guangxi.

Authors' contributions
Xuehong Zhu performed data curation, formal analysis, investigation, and wrote the original draft. Guanghui Dong contributed equally to project administration and also participated in writing the original draft. Zhong Lin provided supervision and secured funding. Lina Ge was involved in the investigation and formal analysis. Feng Han conceived the study, constructed the framework, and contributed to writing, reviewing, and editing the manuscript. All authors read and approved the final manuscript.

Acknowledgement
We thank the Reproductive Hospital of Guangxi for data management.

References

1. Azziz R, Carmina E, Chen Z, Dunaif A, Laven JS, Legro RS, et al. Polycystic ovary syndrome. Nat Rev Dis Primer. 2016;2:1–18.
2. Kotlyar AM, Seifer DB. Women with PCOS who undergo IVF: a comprehensive review of therapeutic strategies for successful outcomes. Reprod Biol Endocrinol. 2023;21:70. doi:10.1186/s12958-023-01120-7
3. Tan X, Wen Y, Chen H, Zhang L, Wang B, Wen H, et al. Follicular output rate tends to improve clinical pregnancy outcomes in patients with polycystic ovary syndrome undergoing in vitro fertilization-embryo transfer treatment. J Int Med Res. 2019;47:5146–54. doi:10.1177/0300060519860680
4. Kicińska AM, Maksym RB, Zabielska-Kaczorowska MA, Stachowska A, Babińska A. Immunological and metabolic causes of infertility in polycystic ovary syndrome. Biomedicines. 2023;11:1567. doi:10.3390/biomedicines11061567
5. Fu K, Li Y, Lv H, Wu W, Song J, Xu J. Development of a model predicting the outcome of in vitro fertilization cycles by a robust decision tree method. Front Endocrinol. 2022;13. doi:10.3389/fendo.2022.877518
6. Veroneze R, Cruz Tfaile Corbi S, Roque da Silva B, de Rocha S, Maurer-Morelli C, Perez Orrico CV. Using association rule mining to jointly detect clinical features and differentially expressed genes related to chronic inflammatory diseases. PLoS ONE. 2020;15:e0240269. doi:10.1371/journal.pone.0240269
7. Stilou S, Bamidis PD, Maglaveras N, Pappas C. Mining association rules from clinical databases: an intelligent diagnostic process in healthcare. Stud Health Technol Inf. 2001;84:1399–403.
8. Samuels J. In One-hot encoding and two-hot encoding: an introduction; 2024.
9. World Health Organization. Standards for maternal and neonatal care; 2007.
10. World Health Organization. WHO Recommendations on antenatal care for a positive pregnancy experience; 2016.
11. Ombelet W. WHO fact sheet on infertility gives hope to millions of infertile couples worldwide. Facts Views Vis Obgyn. 2020;12:249.
12. World Health Organization. Infertility prevalence estimates, 1990–2021. World Health Organization. 2023. ISBN 92-4-006831-7.
13. Coussa A, Hasan HA, Barber TM. Impact of contraception and IVF hormones on metabolic, endocrine, and inflammatory status. J Assist Reprod Genet. 2020;37:1267–72. doi:10.1007/s10815-020-01756-z
14. Herman T, Csehely S, Orosz M, Bhattoa HP, Deli T, Torok P, et al. Impact of endocrine disorders on IVF outcomes: results from a large, single-centre, prospective study. Reprod Sci. 2023;30:1878–90. doi:10.1007/s43032-022-01137-0
15. Vannuccini S, Clifton VL, Fraser IS, Taylor HS, Critchley H, Giudice LC, Petraglia F. Infertility and reproductive disorders: impact of hormonal and inflammatory mechanisms on pregnancy outcome. Hum Reprod Update. 2016;22:104–15.
16. Wang L, Yu X, Xiong D, Leng M, Liang M, Li R, et al. Hormonal and metabolic influences on outcomes in PCOS undergoing assisted reproduction: the role of BMI in fresh embryo transfers. BMC Pregnancy Childbirth. 2025;25:368. doi:10.1186/s12884-025-07480-9
17. Harrison RF, Bonnar J, Thompson W. Diagnosis and management of tubo-uterine factors in infertility. Volume 4. Springer Science & Business Media; 2012.
18. Ozgur K, Bulut H, Berkkanoglu M, Coetzee K, Kaya G. ICSI pregnancy outcomes following hysteroscopic placement of Essure devices for hydrosalpinx in laparoscopic contraindicated patients. Reprod Biomed Online. 2014;29:113–8.
19. Qiu J, Du T, Chen C, Lyu Q, Mol BW, Zhao M, Kuang Y. Impact of uterine malformations on pregnancy and neonatal outcomes of IVF/ICSI–frozen embryo transfer. Hum Reprod. 2022;37:428–46.
20. Tournaye H. Male factor infertility and ART. Asian J Androl. 2011;14:103.
21. Kavakiotis I, Tsave O, Salifoglou A, Maglaveras N, Vlahavas I, Chouvarda I. Machine learning and data mining methods in diabetes research. Comput Struct Biotechnol J. 2017;15:104–16.
22. Genga L, Allodi L, Zannone N. Association rule mining meets regression analysis: an automated approach to unveil systematic biases in decision-making processes. J Cybersecur Priv. 2022;2:191–219. doi:10.3390/jcp2010011
23. Pradhan GN, Prabhakaran B. Association rule mining in multiple, multi-dimensional time series medical data. J Healthc Inf Res. 2017;1:92–118. doi:10.1007/s41666-017-0001-x
24. Melo AS, Ferriani RA, Navarro PA. Treatment of infertility in women with polycystic ovary syndrome: approach to clinical practice. Clinics. 2015;70:765–9. doi:10.6061/clinics/2015(11)09
25. Ou H, Sun J, Lin L, Ma X. Ovarian response, pregnancy outcomes, and complications between salpingectomy and proximal tubal occlusion in hydrosalpinx patients before in vitro fertilization: a meta-analysis. Front Surg. 2022;9:830612. doi:10.3389/fsurg.2022.830612
26. Capmas P, Suarthana E, Tulandi T. Management of hydrosalpinx in the era of assisted reproductive technology: a systematic review and meta-analysis. J Minim Invasive Gynecol. 2021;28:418–41. doi:10.1016/j.jmig.2020.08.017
27. Xu B, Zhang Q, Zhao J, Wang Y, Xu D, Li Y. Pregnancy outcome of in vitro fertilization after Essure and laparoscopic management of hydrosalpinx: a systematic review and meta-analysis. Fertil Steril. 2017;108:84–e955. doi:10.1016/j.fertnstert.2017.05.005
28. Zhang L, Cai H, Li W, Tian L, Shi J. Duration of infertility and assisted reproductive outcomes in non-male factor infertility: can use of ICSI turn the tide? BMC Womens Health. 2022;22:480. doi:10.1186/s12905-022-02062-9
29. Huang C, Shi Q, Xing J, Yan Y, Shen X, Shan H, Sun H, Mei J. The relationship between duration of infertility and clinical outcomes of intrauterine insemination for younger women: a retrospective clinical study. BMC Pregnancy Childbirth. 2024;24:199. doi:10.1186/s12884-024-06398-y
30. Wang X, Tian P, Zhao Y, Lu J, Dong C, Zhang C. The association between female age and pregnancy outcomes in patients receiving first elective single embryo transfer cycle: a retrospective cohort study. Sci Rep. 2024;14:19216. doi:10.1038/s41598-024-70249-1
31. Alenezi SA, Khan R, Amer S. The impact of high BMI on pregnancy outcomes and complications in women with PCOS undergoing IVF—a systematic review and meta-analysis. J Clin Med. 2024;13:1578. doi:10.3390/jcm13061578
32. Azmoodeh A, Pejman Manesh M, Akbari Asbagh F, Ghaseminejad A, Hamzehgardeshi Z. Effects of letrozole-HMG and clomiphene-HMG on incidence of luteinized unruptured follicle syndrome in infertile women undergoing induction ovulation and intrauterine insemination: a randomised trial. Glob J Health Sci. 2015;8:244. doi:10.5539/gjhs.v8n4p244
33. Li S, Liu L, Meng T, Miao B, Sun M, Zhou C, Xu Y. Impact of luteinized unruptured follicles on clinical outcomes of natural cycles for frozen/thawed blastocyst transfer. Front Endocrinol. 2021;12:738005.
34. Liu S, Mo M, Xiao S, Li L, Hu X, Hong L, Wang L, Lian R, Huang C, Zeng Y, et al. Pregnancy outcomes of women with polycystic ovary syndrome for the first in vitro fertilization treatment: a retrospective cohort study with 7,678 patients. Front Endocrinol. 2020;11:575337. doi:10.3389/fendo.2020.575337
35. Jiang X, Liu R, Liao T, He Y, Li C, Guo P, Zhou P, Cao Y, Wei Z. A predictive model of live birth based on obesity and metabolic parameters in patients with PCOS undergoing frozen-thawed embryo transfer. Front Endocrinol. 2021;12:799871. doi:10.3389/fendo.2021.799871
36. Palagiano A, Cozzolino M, Ubaldi FM, Palagiano C, Coccia ME. Effects of hydrosalpinx on endometrial implantation failures: evaluating salpingectomy in women undergoing in vitro fertilization. RBGO Gynecol Obstet. 2021;43:304–10. doi:10.1055/s-0040-1722155
37. Vågenes H, Pranić SM. Analysis of the quality, accuracy, and readability of patient information on polycystic ovarian syndrome (PCOS) on the Internet available in English: a cross-sectional study. Reprod Biol Endocrinol. 2023;21:44. doi:10.1186/s12958-023-01100-x
38. Shang J, Wang S, Wang A, Li F, Zhang J, Wang J, Lv R, Chen H, Mu X, Zhang K, et al. Intra-ovarian inflammatory states and their associations with embryo quality in normal-BMI PCOS patients undergoing IVF treatment. Reprod Biol Endocrinol. 2024;22:11. doi:10.1186/s12958-023-01183-6

Additional Declarations
No competing interests reported.

Supplementary Files
finalstandardizedonehotrule.xlsx
rulesvisualization.ipynb
aprioriclinicalfailure.ipynb
transactionsforarm.csv
StatisticPregance.ipynb
GraphicAbstract.tif
align=\"left\" colname=\"c2\"\u003e \u003cp\u003eFemale age: discretized\u0026thinsp;\u0026le;\u0026thinsp;35, \u0026gt;\u0026thinsp;35\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\" morerows=\"1\" rowspan=\"2\"\u003e \u003cp\u003e[\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e, \u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e]\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eAnthropometry\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eFemale BMI (kg m⁻\u003csup\u003e2\u003c/sup\u003e): discretized\u0026thinsp;\u0026le;\u0026thinsp;24, \u0026gt;\u0026thinsp;24\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eReproductive history\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eYears of infertility: discretized\u0026thinsp;\u0026le;\u0026thinsp;5, \u0026gt;\u0026thinsp;5; infertility type: primary/secondary\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e[\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e, \u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e]\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eHormonal and metabolic disorders\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eHyperprolactinemia, sub-clinical hypothyroidism, insulin resistance\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e[\u003cspan additionalcitationids=\"CR14 CR15\" citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR16\" 
class=\"CitationRef\"\u003e16\u003c/span\u003e]\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eTubo-uterine factors\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eUnilateral tubal obstruction, bilateral tubal obstruction, hydrosalpinx, pelvic adhesion, intra-uterine adhesion\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e[\u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e, \u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e]\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eOvarian and endocrine factors\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eLuteinized unruptured follicle syndrome (LUFS), ovarian cysts\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e[\u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e, \u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e]\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eUterine malformations\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eSeptate uterus, uterine fibroids (leiomyomas), adenomyosis\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e[\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e]\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eART treatment data\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eStimulation protocol 
(categorical), number of oocytes retrieved (binned), fertilization method (IVF vs. ICSI)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e[\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e]\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eOutcome\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eClinical pregnancy: 1\u0026thinsp;=\u0026thinsp;success, 0\u0026thinsp;=\u0026thinsp;failure\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003ctfoot\u003e \u003ctr\u003e\u003ctd colspan=\"3\"\u003eBMI: body mass index, ART: assisted reproductive technology, IVF: in vitro fertilization, ICSI: intracytoplasmic sperm injection, EMR: electronic medical record\u003c/td\u003e\u003c/tr\u003e \u003c/tfoot\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003e \u003cdiv class=\"BlockQuote\"\u003e \u003cp\u003eAll features were converted into binary (0/1) variables following preprocessing. Continuous measures, such as age, BMI, duration of infertility, and oocyte count, were binned into clinically relevant categories to enable their inclusion as categorical items in the ARM. Feature selection was informed by clinical expertise and existing literature on PCOS and infertility, ensuring that the dataset captured the most relevant factors potentially influencing IVF success in this population.\u003c/p\u003e \u003c/div\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec6\" class=\"Section2\"\u003e \u003ch2\u003e2.4 ARM\u003c/h2\u003e \u003cp\u003eARM is an unsupervised, data-driven, pattern discovery technique. 
It automatically identifies frequent \u0026ldquo;if\u0026ndash;then\u0026rdquo; relationships among features without requiring a predefined outcome variable [\u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e]. In ARM, an association rule is a rule-based implication between feature sets (antecedent \u0026rArr; consequent) discovered from the data. As ARM does not rely on a target variable, it is well-suited for exploratory analysis: it can scan all factor combinations and uncover multi-factor interactions in an unbiased manner [\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e]. In contrast, conventional supervised approaches, such as logistic regression or other classifiers, require labeled outcomes and hypothesis-driven modeling. Supervised models focus on predicting a specific response and typically assess individual variables or pre-specified interactions. Exhaustively testing all possible feature combinations in regression becomes combinatorially prohibitive, as the number of candidate models grows exponentially with additional interactions, offering no guarantee of interpretability or clinical utility [\u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e]. In summary, ARM\u0026rsquo;s unsupervised pattern-mining strategy allows the detection of complex, potentially unexpected feature associations, whereas standard models must be guided by a predefined outcome and model structure [\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e].\u003cdiv class=\"BlockQuote\"\u003e\u003cp\u003eThis study applied the classical Apriori ARM algorithm to the one-hot encoded dataset prepared during the data preprocessing stage. 
The Apriori algorithm involves two main steps: generation of frequent itemsets and extraction of association rules.\u003c/p\u003e\u003c/div\u003e\u003c/p\u003e \u003cp\u003e \u003col\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003e \u003cb\u003eFrequent itemset generation\u003c/b\u003e: The one-hot-encoded dataset was used to identify frequent itemsets. The Apriori algorithm systematically explores combinations of features and counts their occurrences within the dataset. A minimum support threshold of 0.05 was applied, meaning that an itemset had to occur in at least 5% of patient records to be considered frequent. This threshold balances detecting non-trivial patterns against excluding extremely rare combinations, so that any reported association reflects a clinically meaningful subset of the cohort.\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003e \u003cb\u003eAssociation rule extraction\u003c/b\u003e: After identifying frequent itemsets, the association rule generation process calculates rule metrics. A minimum confidence threshold of 0.60 was applied to filter rules. Confidence measures the conditional probability of the consequent given the antecedent and is used to assess the likelihood of a given pregnancy outcome under specific clinical conditions. Rules with confidence\u0026thinsp;\u0026ge;\u0026thinsp;60% were retained, ensuring that among patients with the antecedent feature set, at least 60% exhibited the corresponding outcome. Additionally, a lift criterion\u0026thinsp;\u0026gt;\u0026thinsp;1.0 was applied to all rules. Lift, defined as the ratio of the rule\u0026rsquo;s confidence to the baseline probability of the consequent, reflects how much more frequently the antecedent and outcome co-occur than expected by chance. 
The lift\u0026thinsp;\u0026gt;\u0026thinsp;1 filter ensured that all retained rules represented positive associations, that is, the consequent was more likely given the antecedent than at baseline.\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003c/ol\u003e \u003cdiv class=\"BlockQuote\"\u003e \u003cp\u003eAfter generating the initial set of rules, we retained only those whose consequent was clinical pregnancy failure (antecedent \u0026rArr; clinical pregnancy failure). This yielded rules structured as antecedent feature sets associated with failed conception, highlighting combinations of patient characteristics linked to IVF failure in this PCOS cohort. The resulting rules were then evaluated for clinical plausibility and ranked by their associated metrics. Ultimately, only rules that met all predefined thresholds and demonstrated statistical significance based on Fisher\u0026rsquo;s test were retained for interpretation.\u003c/p\u003e \u003c/div\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec7\" class=\"Section2\"\u003e \u003ch2\u003e2.5 Computational Environment\u003c/h2\u003e \u003cp\u003e \u003cdiv class=\"BlockQuote\"\u003e \u003cp\u003eAll analyses were conducted using Python (version 3.12) in a Jupyter Notebook environment. Data handling and preprocessing were performed using the pandas and scikit-learn libraries, while ARM was implemented using the mlxtend library. 
The tools and models used in this study are summarized in Table\u0026nbsp;\u003cspan refid=\"Tab2\" class=\"InternalRef\"\u003e2\u003c/span\u003e.\u003c/p\u003e \u003c/div\u003e \u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab2\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 2\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eComputing environment for this study\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"3\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eCategory\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eSpecification\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003ePurpose\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eProgramming language\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003ePython 3.12\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eCore scripting/analysis\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eInteractive IDE\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eJupyter Notebook\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eReproducible, stepwise workflow\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e 
\u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eKey libraries\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003epandas, scikit-learn, mlxtend\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eData wrangling; preprocessing; Apriori and rule generation\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003ePlotting\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003ematplotlib, seaborn\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eDescriptive and network visualization\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eHardware\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e2 \u0026times; Intel Xeon (48 physical cores), 128 GB RAM\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eParallel support-counting and rule filtering\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eOperating system\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eUbuntu 22.04 LTS\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eStable Linux environment for multi-threaded tasks\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003ctfoot\u003e \u003ctr\u003e\u003ctd colspan=\"3\"\u003eIDE: integrated development environment, RAM: random access memory, LTS: long-term support\u003c/td\u003e\u003c/tr\u003e \u003c/tfoot\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003e \u003cdiv class=\"BlockQuote\"\u003e 
\u003cp\u003eThroughout the analysis, we followed an academically rigorous approach: ensuring high-quality clinical data through systematic cleaning, encoding features appropriately for pattern discovery, and applying validated data mining methods with well-justified parameter settings. The overall methodology is illustrated in Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e.\u003c/p\u003e \u003c/div\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e"},{"header":"3. Results","content":"\u003cdiv id=\"Sec9\" class=\"Section2\"\u003e \u003ch2\u003e3.1 Clinical Pregnancy Outcomes\u003c/h2\u003e \u003cp\u003e \u003cdiv class=\"BlockQuote\"\u003e \u003cp\u003eThe overall clinical pregnancy rate among patients diagnosed with PCOS who were undergoing their first IVF treatment was approximately 18.1%, indicating that approximately one-fifth of IVF cycles resulted in confirmed clinical pregnancy. This result aligns with previously reported clinical pregnancy rates for PCOS populations who were undergoing IVF [\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e].\u003c/p\u003e \u003c/div\u003e \u003c/p\u003e \u003cdiv id=\"Sec10\" class=\"Section3\"\u003e \u003ch2\u003e3.1.1 Demographic and Clinical Characteristics\u003c/h2\u003e \u003cp\u003eThe demographic data indicated a relatively young patient population, consistent with the typical age profile of PCOS. The mean age was 31.39\u0026thinsp;\u0026plusmn;\u0026thinsp;4.45 years (range: 20\u0026ndash;46 years). The BMI averaged 23.53\u0026thinsp;\u0026plusmn;\u0026thinsp;3.43 kg/m\u003csup\u003e2\u003c/sup\u003e, ranging between 14.42 and 35.58 kg/m\u003csup\u003e2\u003c/sup\u003e, suggesting that most patients fell within the normal-to-overweight range. 
The average duration of infertility was 4.45\u0026thinsp;\u0026plusmn;\u0026thinsp;3.17 years, ranging widely from 0.1 to 22 years, reflecting heterogeneous fertility histories among participants (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e). Clinical conditions were analyzed based on their prevalence among successful and unsuccessful IVF cycles, as presented in Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e. Conditions such as bilateral tubal obstruction and recurrent miscarriage appeared predominantly in cases without clinical pregnancy, suggesting a negative association with IVF success. Pelvic adhesions and undiagnosed adnexal masses were frequently recorded in both outcomes, indicating high prevalence but limited predictive specificity. In contrast, conditions such as hypertension, insulin resistance, and hyperprolactinemia were less prevalent overall, thus limiting their individual interpretative value.\u003c/p\u003e \u003cp\u003eComparison of Pregnancy Outcomes by Demographic and Clinical Groups\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e \u003cdiv class=\"BlockQuote\"\u003e \u003cp\u003eThe relative prevalence of each clinical feature between the clinical pregnancy and non-pregnancy groups is illustrated in Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e. Conditions such as luteinized unruptured follicle syndrome (LUFS), secondary infertility, and pelvic adhesion were highly prevalent in both successful and unsuccessful outcomes, indicating their common presence among patients with PCOS who were undergoing IVF. Features more frequent in the clinical pregnancy failure group included bilateral tubal obstruction (39.3% in failure vs. 31.2% in success) and maternal age of \u0026gt;\u0026thinsp;35 years (18.4% in failure vs. 17.7% in success). 
Conversely, some conditions, such as undiagnosed adnexal mass, hyperprolactinemia, and hypertension, demonstrated very low overall prevalence, limiting their discriminatory value. These findings suggest that specific clinical factors, particularly tubal obstruction and advanced maternal age, may serve as meaningful indicators of IVF prognosis, highlighting their relevance in clinical assessment and pre-treatment counseling.\u003c/p\u003e \u003c/div\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec11\" class=\"Section3\"\u003e \u003ch2\u003e3.1.2 Correlation of Features with Pregnancy Outcome\u003c/h2\u003e \u003cp\u003e \u003cdiv class=\"BlockQuote\"\u003e \u003cp\u003ePearson correlation analysis was conducted to quantify relationships between individual clinical features and IVF pregnancy outcomes, as shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003e:\u003c/p\u003e \u003c/div\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e \u003col\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eBilateral tubal obstruction showed the strongest negative correlation (\u0026minus;\u0026thinsp;0.064), aligning with the clinical understanding of impaired fertility due to tubal pathology.\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003ePelvic adhesion (0.048) and undiagnosed adnexal mass (0.045) demonstrated small positive correlations with pregnancy success, an unexpected finding, possibly influenced by confounding clinical interventions.\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eHabitual abortion (\u0026minus;\u0026thinsp;0.033) and secondary infertility (\u0026minus;\u0026thinsp;0.024) were weakly negatively correlated with pregnancy success, consistent with their role as adverse prognostic indicators.\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e 
\u003cp\u003eOther factors, including BMI, insulin resistance, hypertension, and maternal age, exhibited negligible or minimal correlations, suggesting limited predictive value when evaluated individually in this cohort.\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003c/ol\u003e \u003cdiv class=\"BlockQuote\"\u003e \u003cp\u003eOverall, the correlation analysis indicated that no single factor exerted a strong effect on IVF outcomes in patients with PCOS. Data mining methods may therefore be needed to uncover interactive, multifactorial patterns that better explain pregnancy outcomes.\u003c/p\u003e \u003cp\u003eARM Outcomes\u003c/p\u003e \u003cp\u003eARM was employed to identify clinical diagnoses and patient characteristics significantly associated with an increased risk of IVF clinical pregnancy failure in patients with PCOS.\u003c/p\u003e \u003c/div\u003e \u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv id=\"Sec12\" class=\"Section2\"\u003e \u003ch2\u003e3.2 Overall Rule Summary\u003c/h2\u003e \u003cp\u003e \u003cdiv class=\"BlockQuote\"\u003e \u003cp\u003eARM analysis was specifically directed toward uncovering associations where the consequent was fixed as clinical pregnancy failure, thereby enabling the identification of clinical factors and conditions associated with a higher risk of unsuccessful IVF outcomes. After applying the thresholds (minimum support\u0026thinsp;\u0026ge;\u0026thinsp;0.05, confidence\u0026thinsp;\u0026ge;\u0026thinsp;0.60, and lift\u0026thinsp;\u0026gt;\u0026thinsp;1.0) and statistical validation via the chi-square (χ\u003csup\u003e2\u003c/sup\u003e) test, a total of 26 significant rules were retained, as shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003e. 
These rules frequently involved antecedents including ovarian dysfunction factors (e.g., LUFS), structural uterine anomalies, tubal factors (e.g., bilateral tubal obstruction), and advanced maternal age.\u003c/p\u003e \u003c/div\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eThe frequency and strength of associations of individual clinical features (single itemsets) with IVF clinical pregnancy failure in patients with PCOS are summarized in Table\u0026nbsp;\u003cspan refid=\"Tab3\" class=\"InternalRef\"\u003e3\u003c/span\u003e:\u003c/p\u003e \u003cp\u003e \u003col\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eThe most frequent clinical features associated with IVF failure were LUFS and secondary infertility, each identified in 13 instances.\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eLUFS had a high support rate (79.67%), indicating its high prevalence, although it demonstrated a relatively weak lift (1.0017), suggesting limited discriminative power when considered alone.\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eBilateral tubal obstruction exhibited the strongest association with clinical pregnancy failure, showing the highest lift value (1.0422) and a confidence of 85.25%, emphasizing its significance as a risk factor.\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eOther important features, such as infertility duration of \u0026gt;\u0026thinsp;5 years and BMI\u0026thinsp;\u0026gt;\u0026thinsp;24, showed moderate frequencies and lifts (1.0104 and 1.0196, respectively), suggesting modest but relevant individual contributions.\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003ePelvic adhesion lacked complete metrics in this analysis, limiting assessment of its standalone predictive impact.\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003c/ol\u003e 
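The support, confidence, and lift values summarized for single itemsets follow the definitions given in Section 2.4. As a minimal sketch, and not the authors' code, the snippet below computes the three metrics for one rule of the form antecedent ⇒ clinical pregnancy failure using only the Python standard library. The ten toy records and the item labels "tubal" and "fail" are hypothetical placeholders rather than study data, and support is taken here as the fraction of records containing both antecedent and consequent, which is assumed to match the convention behind the reported values.

```python
# Illustrative sketch only: toy records and item labels are hypothetical,
# not taken from the study dataset.

def rule_metrics(records, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent => consequent.

    records: list of sets of item labels, one set per IVF cycle.
    Assumes the antecedent and the consequent each occur at least once.
    """
    n = len(records)
    n_a = sum(1 for r in records if antecedent <= r)   # cycles with antecedent
    n_c = sum(1 for r in records if consequent <= r)   # cycles with consequent
    n_ac = sum(1 for r in records if (antecedent | consequent) <= r)  # both
    support = n_ac / n            # joint frequency of antecedent and consequent
    confidence = n_ac / n_a       # P(consequent | antecedent)
    lift = confidence / (n_c / n) # confidence relative to baseline P(consequent)
    return support, confidence, lift


# Ten hypothetical cycles: 5 with tubal obstruction (4 of them failed),
# 5 without (3 of them failed).
records = (
    [{"tubal", "fail"}] * 4
    + [{"tubal"}] * 1
    + [{"fail"}] * 3
    + [set()] * 2
)

support, confidence, lift = rule_metrics(records, {"tubal"}, {"fail"})
print(f"support={support:.3f} confidence={confidence:.3f} lift={lift:.3f}")
```

Read the same way, the values reported for bilateral tubal obstruction (confidence 0.8525, lift 1.0422) imply a baseline failure probability of about 0.8525 / 1.0422 ≈ 0.82, which is in line with the overall clinical pregnancy rate of approximately 18.1%.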
\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab3\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 3\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eFrequency and association metrics of individual clinical features\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"5\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eItemset\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eFrequency\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eSupport\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eConfidence\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003eLift\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eLuteinized Unruptured Follicle Syndrome (LUFS)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e13\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e0.7967\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e0.8194\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e1.0017\u003c/p\u003e 
\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eSecondary Infertility\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e13\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e0.2505\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e0.8251\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e1.0086\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003ePelvic Adhesion\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e7\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e-\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e-\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e-\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eYears of Infertility\u0026thinsp;\u0026gt;\u0026thinsp;5\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e7\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e0.1751\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e0.8266\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e1.0104\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eBilateral Tubal Obstruction\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e6\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e0.3122\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e 
\u003cp\u003e0.8525\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e1.0422\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eBMI\u0026thinsp;\u0026gt;\u0026thinsp;24\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e4\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e0.1711\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e0.8341\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e1.0196\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eAge\u0026thinsp;\u0026gt;\u0026thinsp;35 years\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e4\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e0.1502\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e0.8234\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e1.0065\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003ctfoot\u003e \u003ctr\u003e\u003ctd colspan=\"5\"\u003eBMI: body mass index\u003c/td\u003e\u003c/tr\u003e \u003c/tfoot\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003e \u003cdiv class=\"BlockQuote\"\u003e \u003cp\u003eIn summary, the statistical results indicated that most individual features had limited predictive power for clinical pregnancy outcomes in patients with PCOS who were undergoing IVF. 
This underscores that no single factor sufficiently explains treatment success or failure, emphasizing the need to examine multifactorial risk combinations for more accurate prediction.\u003c/p\u003e \u003c/div\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec13\" class=\"Section2\"\u003e \u003ch2\u003e3.3 Top-ranked Association Rules\u003c/h2\u003e \u003cp\u003eThe top 10 association rules, ranked by lift and confidence, are summarized in Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e7\u003c/span\u003e. All rules share a common consequent of clinical pregnancy failure and were identified using the Apriori algorithm, applying thresholds of support\u0026thinsp;\u0026ge;\u0026thinsp;0.05, confidence\u0026thinsp;\u0026ge;\u0026thinsp;0.60, and lift\u0026thinsp;\u0026gt;\u0026thinsp;1.0.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"BlockQuote\"\u003e \u003cp\u003eThese top-ranked rules demonstrate that combinations of anatomical, ovarian, and demographic risk factors significantly increase the likelihood of IVF failure:\u003c/p\u003e \u003c/div\u003e \u003c/p\u003e \u003cp\u003e \u003col\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eBilateral tubal obstruction appears in 7 of the top 10 rules, highlighting its central role as a structural barrier to successful implantation or embryo transport.\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eLUFS commonly co-occurs with both tubal factors and metabolic risks (e.g., BMI\u0026thinsp;\u0026gt;\u0026thinsp;24), suggesting that ovarian dysfunction and metabolic dysregulation synergistically compromise reproductive outcomes.\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eSecondary infertility and prolonged infertility (\u0026gt;\u0026thinsp;5 years) frequently appear in multifactorial rule sets, reflecting the added difficulty of achieving pregnancy in patients with prior conception 
failure.\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003c/ol\u003e \u003c/p\u003e \u003cp\u003eThese insights underscore the clinical utility of multi-feature pattern recognition, which enables a more refined risk stratification than evaluating single risk factors in isolation. They also lay the groundwork for developing AI-driven decision support tools to predict IVF failure and guide personalized treatment planning.\u003c/p\u003e \u003c/div\u003e"},{"header":"4. Discussion","content":"\u003cdiv id=\"Sec15\" class=\"Section2\"\u003e \u003ch2\u003e4.1 Comparative Analysis\u003c/h2\u003e \u003cdiv id=\"Sec16\" class=\"Section3\"\u003e \u003ch2\u003e4.1.1 Comparison with Previous Literature and Known Clinical Evidence\u003c/h2\u003e \u003cp\u003e \u003cdiv class=\"BlockQuote\"\u003e \u003cp\u003eThe findings of this research provide a multidimensional perspective on IVF outcomes in patients with PCOS that complements prior knowledge derived from more reductionist analyses. Several of our key associations, such as the detrimental effects of hydrosalpinx (bilateral tubal obstruction) and prolonged infertility, are strongly supported by the literature. For example, Ou et al. [\u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e] reported that hydrosalpinx reduces implantation rates and increases early pregnancy loss in IVF due to embryotoxic fluid and impaired endometrial receptivity. This aligns with our observation that patients with PCOS with untreated bilateral tubal blockage (likely hydrosalpinx) rarely achieve conception through IVF unless the tubal pathology is corrected. 
The appropriate clinical intervention is salpingectomy or proximal tubal occlusion prior to IVF, an approach supported by multiple studies showing improved IVF outcomes following hydrosalpinx treatment [\u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e, \u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eThe strong influence of infertility duration on IVF outcome in this study reinforces a consistent theme: the sooner, the better. An early meta-analysis [\u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e] identified a negative association between infertility duration and IVF success, a finding supported by more recent analyses showing that beyond approximately 3\u0026ndash;5 years of trying, each additional year further reduces pregnancy odds [\u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e, \u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e30\u003c/span\u003e]. Our study\u0026rsquo;s PCOS-specific data indicate that even in a relatively young cohort, patients with infertility lasting\u0026thinsp;\u0026ge;\u0026thinsp;5 years experienced significantly lower success rates. While this may partly reflect associations with older age or other confounders, infertility duration remained an independent risk factor in several multi-feature rules, even when age was accounted for. One interpretation is that prolonged infertility reflects underlying intractable conditions, such as poor egg/embryo quality or endometrial dysfunction, that persist despite IVF. It might also reflect that these patients have undergone multiple prior treatments or IVF cycles without success, hinting at recurrent implantation failure scenarios. Clinically, this emphasizes that practitioners might consider escalating treatment or exploring adjunct therapies (immunological work-ups, use of donor gametes, etc.) 
when faced with a patient who has had many years of unexplained infertility.\u003c/p\u003e \u003cp\u003eThis research reaffirms the critical impact of female age, which remains the single strongest determinant of IVF success across all populations. Advanced age (\u0026ge;\u0026thinsp;35, especially\u0026thinsp;\u0026ge;\u0026thinsp;40) dramatically elevated failure risk in our PCOS cohort, consistent with general IVF outcomes. Likewise, obesity and metabolic factors are well documented to impair fertility treatment outcomes [\u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e]. A 2024 systematic review noted that, in women with PCOS, high BMI independently lowers clinical pregnancy and live birth rates and raises miscarriage risk. In this study, obesity featured in some rules, though not the top rule, implying that while obesity is indeed harmful, other factors, such as tubal status or infertility duration, were even more dominant in our dataset. It is possible that, because most of our patients with PCOS had a BMI below 24 kg/m\u0026sup2;, BMI did not differentiate outcomes as sharply, reflecting a type of range restriction effect. However, we observed that lean patients with PCOS had slightly better success rates than obese patients with PCOS, aligning with the consensus that weight management can improve IVF outcomes in PCOS.\u003c/p\u003e \u003c/div\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec17\" class=\"Section3\"\u003e \u003ch2\u003e4.1.2 Unexpected Findings in the Current Analysis\u003c/h2\u003e \u003cp\u003e \u003cdiv class=\"BlockQuote\"\u003e \u003cp\u003eThe data-driven discovery of the LUFS plus tubal factor combination as a high-failure profile appears to be a novel insight with limited direct precedent in the literature. 
LUFS is a subtle form of ovulatory dysfunction, and while it is known to cause infertility [\u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e], it is not commonly discussed in the context of IVF outcomes because ovarian stimulation with a human chorionic gonadotropin trigger is expected to circumvent follicle rupture problems. However, the results suggest that some patients with PCOS may still experience issues analogous to LUFS even in IVF (e.g., follicles that luteinize without yielding an egg). A recent study by Li et al. [\u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e33\u003c/span\u003e] noted that LUF cycles negatively affected pregnancy outcomes in natural-cycle frozen embryo transfer, highlighting that luteinization without ovulation can disrupt timing and endometrial preparation. In stimulated IVF cycles, an argument can be made that if a patient tends to have LUFS, careful monitoring and trigger timing are crucial, or alternatives such as a dual trigger (human chorionic gonadotropin plus gonadotropin-releasing hormone agonist) might be beneficial to ensure oocyte release. The combination with the tubal factor is likely a proxy for the overall severity of infertility: these patients effectively have two strikes against them, and indeed, our analysis shows that they fare poorly. While not previously reported as a combined risk in the literature, this finding is intuitive and underscores the importance of addressing all known factors in a multidisciplinary management approach to give the best chance of success.\u003c/p\u003e \u003cp\u003eAnother novel discovery is that our approach identified interactions that traditional multivariable models might miss or not emphasize. For instance, using logistic regression, Liu et al. 
[\u003cspan citationid=\"CR34\" class=\"CitationRef\"\u003e34\u003c/span\u003e] found that PCOS per se was not an independent predictor of live birth after adjusting for confounders, meaning that if age, BMI, etc., were controlled for, patients with PCOS performed as well as others. However, that same analysis showed that within the PCOS group, factors such as younger age, shorter infertility duration, and good embryo quality were associated with higher live birth rates. The results of our study complement this by explicitly highlighting combinations (e.g., older age plus fewer good embryos) that are associated with failure. Essentially, ARM provides a clinically interpretable set of rules that align with what an experienced clinician might surmise through years of practice. The benefit is that ARM can systematically scan through dozens of features to flag combinations that merit clinical attention, possibly revealing less obvious patterns.\u003c/p\u003e \u003cp\u003eMoreover, no individual predictor exhibited a robust, statistically significant effect, although a small subset of variables, such as bilateral tubal obstruction, displayed modest associations (lift\u0026thinsp;=\u0026thinsp;1.042).\u003c/p\u003e \u003c/div\u003e \u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv id=\"Sec18\" class=\"Section2\"\u003e \u003ch2\u003e4.2 Strengths and Innovation of this Study\u003c/h2\u003e \u003cp\u003e \u003cdiv class=\"BlockQuote\"\u003e \u003cp\u003eA strength of this study is the demonstration of how data mining techniques such as Apriori can be applied in reproductive medicine. This approach can handle many variables and uncover associations without the need to pre-specify an outcome model. The rules generated are intuitively understandable (\u0026ldquo;IF X and Y, THEN Z\u0026rdquo;), which could aid clinical decision-making more directly than a complex predictive model. 
To our knowledge, this is the first study to report association rules in the context of IVF outcomes for PCOS. It may therefore encourage the use of similar methods on larger IVF databases to discover phenotype-specific patterns (for example, whether a combination of certain hormone levels predicts OHSS risk in PCOS).\u003c/p\u003e \u003cp\u003eAdditionally, prior literature largely focuses on single risk factors or additive models. For example, large reviews emphasize the need to individualize IVF protocols for patients with PCOS and note common comorbidities such as obesity and metabolic syndrome [\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e], but they do not emphasize which specific factor pairs are the most dangerous. Likewise, regression studies typically identify age, BMI (obesity), hormone levels, etc., as independent predictors [\u003cspan citationid=\"CR35\" class=\"CitationRef\"\u003e35\u003c/span\u003e], without assessing their combined impact. In sharp contrast, our ARM approach identified novel synergistic rules. To our knowledge, no previous study has described the combined effect of LUFS and tubal obstruction on IVF failure. Hydrosalpinx, a classic example of tubal pathology, is known to halve IVF pregnancy rates, but its interaction with ovulatory dysfunction (such as LUFS) has not been explored [\u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e36\u003c/span\u003e]. By pointing out this hidden synergy, we fill a gap in the literature: existing meta-analyses and reviews simply document the problems (age/BMI, OHSS risk, tubal factors) separately, whereas our findings show that certain combinations (e.g., LUFS\u0026thinsp;+\u0026thinsp;tubal factor) confer especially high risk. 
This underscores ARM\u0026rsquo;s strength: revealing clinically meaningful patterns beyond standard models.\u003c/p\u003e \u003c/div\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec19\" class=\"Section2\"\u003e \u003ch2\u003e4.3 Limitations of this Study\u003c/h2\u003e \u003cp\u003e \u003cdiv class=\"BlockQuote\"\u003e \u003cp\u003eDespite its insights, this research has certain limitations. First, the study was retrospective and observational; association does not imply causation. The rules we found do not prove that LUFS causes IVF failure, but only show that they occur together frequently. There could be underlying confounders (for example, perhaps women with LUFS also had poor ovarian reserve, which was the real driver of failure). We attempted to mitigate spurious findings by requiring relatively high support and by conducting statistical tests, but some associations might still be coincidental or attributed to biases in the dataset. Second, the dataset size is moderate; a larger sample would allow the detection of associations with lower support (rarer but potentially important scenarios). Our choice of a 10% support threshold meant we likely missed associations involving very rare conditions (e.g., uncommon genetic factors or severe male factor cases); these might still be clinically significant for individual patients but were not detectable in our analysis. Third, our feature encoding, while comprehensive, was limited to what was recorded. We did not include some potentially relevant variables such as Anti-M\u0026uuml;llerian hormone levels, insulin resistance indices, or detailed embryo morphology scores. The inclusion of such data might yield additional associations (for instance, a combination of low Anti-M\u0026uuml;llerian hormone plus PCOS could predict poor ovarian response). 
Fourth, the analysis was confined to patients with PCOS at a single center; thus, the associations reflect that specific population and practice (e.g., the stimulation protocols used and the prevalence of certain issues in that clinic). Caution is needed in generalizing the results to all patients with PCOS or other IVF centers. What holds true in our data (e.g., the proportion of patients with hydrosalpinx) may differ in other populations. Validating these findings in external datasets would strengthen their generalizability and clinical reliability.\u003c/p\u003e \u003cp\u003eSome results were inconsistent with established conclusions; for example, the influence of BMI on IVF clinical outcomes was not observed, despite being widely recognized by researchers [\u003cspan citationid=\"CR37\" class=\"CitationRef\"\u003e37\u003c/span\u003e, \u003cspan citationid=\"CR38\" class=\"CitationRef\"\u003e38\u003c/span\u003e]. This discrepancy may stem from over 80% of patients having a BMI below the 24 kg/m\u0026sup2; threshold (World Health Organization Asian standard), reducing its discriminatory power. Future multi-center datasets with broader BMI distributions may better clarify obesity-related effects.\u003c/p\u003e \u003cp\u003eFinally, as in other data mining studies, there is a risk of overfitting or detecting patterns lacking biological plausibility. We focused on interpreting only rules that were clinically meaningful and supported by external evidence. It is reassuring that our top findings align with known mechanisms (e.g., tubal fluid impairing implantation, prolonged infertility indicating more difficult cases). Nonetheless, caution is warranted to avoid overinterpreting combinations that might be artifacts.\u003c/p\u003e \u003cp\u003eIn conclusion, ARM was applied to identify key combinations of factors associated with IVF failure in patients with PCOS. 
The results highlight that it is often the convergence of multiple adverse factors, such as ovulatory dysfunction, tubal pathology, and long-term infertility, that dramatically lowers the chances of pregnancy in this high-risk group. These findings are largely consistent with the existing literature on individual risk factors while also providing a novel, integrated view of their interactions. For clinicians, the insights underscore the importance of comprehensive infertility work-ups in PCOS: patients with both PCOS and an additional infertility factor (e.g., tubal obstruction) should be counseled about the reduced probability of success and the potential need to address remediable factors before IVF. Similarly, earlier and more aggressive management may be warranted for those with prolonged infertility, rather than continuing extended attempts with less intensive treatments.\u003c/p\u003e \u003c/div\u003e \u003c/p\u003e \u003c/div\u003e"},{"header":"5. Conclusions","content":"\u003cp\u003e \u003cdiv class=\"BlockQuote\"\u003e \u003cp\u003eThis study demonstrates the utility of data-driven approaches in reproductive medicine. By uncovering patterns that might be overlooked by traditional analyses, ARM can generate hypotheses for further research (e.g., investigating the mechanistic link between LUFS and IVF outcomes in PCOS) and potentially inform clinical decision support systems. Future studies should validate these rules in larger, multi-center cohorts and assess their predictive value prospectively. It would also be worthwhile to extend this analysis to identify IVF success profiles (patients who succeed) and to compare PCOS with non-PCOS infertile populations to determine whether different rules apply. Ultimately, translating these findings into practice, for example by developing a risk score or checklist based on the presence of multiple factors, could help personalize IVF counseling and treatment for patients with PCOS. 
In summary, women with PCOS are a heterogeneous group, and our study highlights that the cumulative burden of their reproductive challenges determines IVF outcomes. Recognizing and addressing each element of that burden holds promise for improving fertility success in this prevalent and challenging condition.\u003c/p\u003e \u003c/div\u003e \u003c/p\u003e"},{"header":"Abbreviations","content":"\u003cp\u003eAI: artificial intelligence, ARM: association rule mining, BMI: body mass index, IVF: in vitro fertilization, LUFS: luteinized unruptured follicle syndrome, OHSS: ovarian hyperstimulation syndrome, PCOS: polycystic ovary syndrome\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eEthics approval and consent to participate\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eNot Applicable\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e\u0026nbsp;Consent for publication\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eNot Applicable\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAvailability of data and materials\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eCompeting interests\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe authors declare that they have no competing interests.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eFunding\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis research was funded by the National Natural Science Foundation of China, [grant number: 82460309], and the APC was funded by the Reproductive Hospital of Guangxi.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAuthors\u0026apos; contributions\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eXuehong Zhu performed data curation, formal analysis, investigation, and wrote the original draft. 
Guanghui Dong contributed equally to project administration and also participated in writing the original draft. Zhong Lin provided supervision and secured funding. Lina Ge was involved in the investigation and formal analysis. Feng Han conceived the study, constructed the framework, and contributed to writing, reviewing, and editing the manuscript. All authors read and approved the final manuscript.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAcknowledgement\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eWe thank the Reproductive Hospital of Guangxi for data management.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eAzziz R, Carmina E, Chen Z, Dunaif A, Laven JS, Legro RS, et al. Polycystic ovary syndrome. Nat Rev Dis Primer. 2016;2:1\u0026ndash;18.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKotlyar AM, Seifer DB. Women with PCOS who undergo IVF: a comprehensive review of therapeutic strategies for successful outcomes. Reprod Biol Endocrinol. 2023;21:70. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1186/s12958-023-01120-7\u003c/span\u003e\u003cspan address=\"10.1186/s12958-023-01120-7\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTan X, Wen Y, Chen H, Zhang L, Wang B, Wen H, et al. Follicular output rate tends to improve clinical pregnancy outcomes in patients with polycystic ovary syndrome undergoing in vitro fertilization-embryo transfer treatment. J Int Med Res. 2019;47:5146\u0026ndash;54. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1177/0300060519860680\u003c/span\u003e\u003cspan address=\"10.1177/0300060519860680\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKicińska AM, Maksym RB, Zabielska-Kaczorowska MA, Stachowska A, Babińska A. 
Immunological and metabolic causes of infertility in polycystic ovary syndrome. Biomedicines. 2023;11:1567. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.3390/biomedicines11061567\u003c/span\u003e\u003cspan address=\"10.3390/biomedicines11061567\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eFu K, Li Y, Lv H, Wu W, Song J, Xu J. Development of a model predicting the outcome of in vitro fertilization cycles by a robust decision tree method. Front Endocrinol. 2022;13. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.3389/fendo.2022.877518\u003c/span\u003e\u003cspan address=\"10.3389/fendo.2022.877518\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eVeroneze R, Cruz Tfaile Corbi S, Roque da Silva B, de Rocha S, Maurer-Morelli C, Perez Orrico CV. Using association rule mining to jointly detect clinical features and differentially expressed genes related to chronic inflammatory diseases. PLoS ONE. 2020;15:e0240269. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1371/journal.pone.0240269\u003c/span\u003e\u003cspan address=\"10.1371/journal.pone.0240269\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eStilou S, Bamidis PD, Maglaveras N, Pappas C. Mining association rules from clinical databases: an intelligent diagnostic process in healthcare. Stud Health Technol Inf. 2001;84:1399\u0026ndash;403.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSamuels J. In One-hot encoding and two-hot encoding: an introduction; 2024.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWorld Health Organization. 
Standards for maternal and neonatal care; 2007.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWorld Health Organization. WHO Recommendations on antenatal care for a positive pregnancy experience; 2016.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eOmbelet W. WHO fact sheet on infertility gives hope to millions of infertile couples worldwide. Facts Views Vis Obgyn. 2020;12:249.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWorld Health Organization. Infertility prevalence estimates, 1990\u0026ndash;2021. World Health Organization. 2023. ISBN 92-4-006831-7.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCoussa A, Hasan HA, Barber TM. Impact of contraception and IVF hormones on metabolic, endocrine, and inflammatory status. J Assist Reprod Genet. 2020;37:1267\u0026ndash;72. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1007/s10815-020-01756-z\u003c/span\u003e\u003cspan address=\"10.1007/s10815-020-01756-z\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHerman T, Csehely S, Orosz M, Bhattoa HP, Deli T, Torok P, et al. Impact of endocrine disorders on IVF outcomes: results from a large, single-centre, prospective study. Reprod Sci. 2023;30:1878\u0026ndash;90. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1007/s43032-022-01137-0\u003c/span\u003e\u003cspan address=\"10.1007/s43032-022-01137-0\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eVannuccini S, Clifton VL, Fraser IS, Taylor HS, Critchley H, Giudice LC, Petraglia F. Infertility and reproductive disorders: impact of hormonal and inflammatory mechanisms on pregnancy outcome. Hum Reprod Update. 
2016;22:104\u0026ndash;15.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWang L, Yu X, Xiong D, Leng M, Liang M, Li R, et al. Hormonal and metabolic influences on outcomes in PCOS undergoing assisted reproduction: the role of BMI in fresh embryo transfers. BMC Pregnancy Childbirth. 2025;25:368. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1186/s12884-025-07480-9\u003c/span\u003e\u003cspan address=\"10.1186/s12884-025-07480-9\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHarrison RF, Bonnar J, Thompson W. Diagnosis and management of tubo-uterine factors in infertility. Volume 4. Springer Science \u0026amp; Business Media; 2012.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eOzgur K, Bulut H, Berkkanoglu M, Coetzee K, Kaya G. ICSI pregnancy outcomes following hysteroscopic placement of Essure devices for hydrosalpinx in laparoscopic contraindicated patients. Reprod Biomed Online. 2014;29:113\u0026ndash;8.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eQiu J, Du T, Chen C, Lyu Q, Mol BW, Zhao M, Kuang Y. Impact of uterine malformations on pregnancy and neonatal outcomes of IVF/ICSI\u0026ndash;frozen embryo transfer. Hum Reprod. 2022;37:428\u0026ndash;46.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTournaye H. Male factor infertility and ART. Asian J Androl. 2011;14:103.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKavakiotis I, Tsave O, Salifoglou A, Maglaveras N, Vlahavas I, Chouvarda I. Machine learning and data mining methods in diabetes research. Comput Struct Biotechnol J. 2017;15:104\u0026ndash;16.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGenga L, Allodi L, Zannone N. Association rule mining meets regression analysis: an automated approach to unveil systematic biases in decision-making processes. J Cybersecur Priv. 
2022;2:191\u0026ndash;219. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.3390/jcp2010011\u003c/span\u003e\u003cspan address=\"10.3390/jcp2010011\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePradhan GN, Prabhakaran B. Association rule mining in multiple, multi-dimensional time series medical data. J Healthc Inf Res. 2017;1:92\u0026ndash;118. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1007/s41666-017-0001-x\u003c/span\u003e\u003cspan address=\"10.1007/s41666-017-0001-x\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMelo AS, Ferriani RA, Navarro PA. Treatment of infertility in women with polycystic ovary syndrome: approach to clinical practice. Clinics. 2015;70:765\u0026ndash;9. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.6061/clinics/2015(11)09\u003c/span\u003e\u003cspan address=\"10.6061/clinics/2015(11)09\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eOu H, Sun J, Lin L, Ma X. Ovarian response, pregnancy outcomes, and complications between salpingectomy and proximal tubal occlusion in hydrosalpinx patients before in vitro fertilization: a meta-analysis. Front Surg. 2022;9:830612. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.3389/fsurg.2022.830612\u003c/span\u003e\u003cspan address=\"10.3389/fsurg.2022.830612\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCapmas P, Suarthana E, Tulandi T. Management of hydrosalpinx in the era of assisted reproductive technology: a systematic review and meta-analysis. J Minim Invasive Gynecol. 2021;28:418\u0026ndash;41. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1016/j.jmig.2020.08.017\u003c/span\u003e\u003cspan address=\"10.1016/j.jmig.2020.08.017\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eXu B, Zhang Q, Zhao J, Wang Y, Xu D, Li Y. Pregnancy outcome of in vitro fertilization after Essure and laparoscopic management of hydrosalpinx: a systematic review and meta-analysis. Fertil Steril. 2017;108:84\u0026ndash;95.e5. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1016/j.fertnstert.2017.05.005\u003c/span\u003e\u003cspan address=\"10.1016/j.fertnstert.2017.05.005\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZhang L, Cai H, Li W, Tian L, Shi J. Duration of infertility and assisted reproductive outcomes in non-male factor infertility: can use of ICSI turn the tide? BMC Womens Health. 2022;22:480. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1186/s12905-022-02062-9\u003c/span\u003e\u003cspan address=\"10.1186/s12905-022-02062-9\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHuang C, Shi Q, Xing J, Yan Y, Shen X, Shan H, Sun H, Mei J. The relationship between duration of infertility and clinical outcomes of intrauterine insemination for younger women: a retrospective clinical study. BMC Pregnancy Childbirth. 2024;24:199. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1186/s12884-024-06398-y\u003c/span\u003e\u003cspan address=\"10.1186/s12884-024-06398-y\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWang X, Tian P, Zhao Y, Lu J, Dong C, Zhang C. 
The association between female age and pregnancy outcomes in patients receiving first elective single embryo transfer cycle: a retrospective cohort study. Sci Rep. 2024;14:19216. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1038/s41598-024-70249-1\u003c/span\u003e\u003cspan address=\"10.1038/s41598-024-70249-1\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAlenezi SA, Khan R, Amer S. The impact of high BMI on pregnancy outcomes and complications in women with PCOS undergoing IVF\u0026mdash;a systematic review and meta-analysis. J Clin Med. 2024;13:1578. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.3390/jcm13061578\u003c/span\u003e\u003cspan address=\"10.3390/jcm13061578\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAzmoodeh A, Pejman Manesh M, Akbari Asbagh F, Ghaseminejad A, Hamzehgardeshi Z. Effects of letrozole-HMG and clomiphene-HMG on incidence of luteinized unruptured follicle syndrome in infertile women undergoing induction ovulation and intrauterine insemination: a randomised trial. Glob J Health Sci. 2015;8:244. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.5539/gjhs.v8n4p244\u003c/span\u003e\u003cspan address=\"10.5539/gjhs.v8n4p244\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLi S, Liu L, Meng T, Miao B, Sun M, Zhou C, Xu Y. Impact of luteinized unruptured follicles on clinical outcomes of natural cycles for frozen/thawed blastocyst transfer. Front Endocrinol. 2021;12:738005.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLiu S, Mo M, Xiao S, Li L, Hu X, Hong L, Wang L, Lian R, Huang C, Zeng Y, et al. 
Pregnancy outcomes of women with polycystic ovary syndrome for the first in vitro fertilization treatment: a retrospective cohort study with 7,678 patients. Front Endocrinol. 2020;11:575337. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.3389/fendo.2020.575337\u003c/span\u003e\u003cspan address=\"10.3389/fendo.2020.575337\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eJiang X, Liu R, Liao T, He Y, Li C, Guo P, Zhou P, Cao Y, Wei Z. A predictive model of live birth based on obesity and metabolic parameters in patients with PCOS undergoing frozen-thawed embryo transfer. Front Endocrinol. 2021;12:799871. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.3389/fendo.2021.799871\u003c/span\u003e\u003cspan address=\"10.3389/fendo.2021.799871\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePalagiano A, Cozzolino M, Ubaldi FM, Palagiano C, Coccia ME. Effects of hydrosalpinx on endometrial implantation failures: evaluating salpingectomy in women undergoing in vitro fertilization. RBGO Gynecol Obstet. 2021;43:304\u0026ndash;10. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1055/s-0040-1722155\u003c/span\u003e\u003cspan address=\"10.1055/s-0040-1722155\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eV\u0026aring;genes H, Pranić SM. Analysis of the quality, accuracy, and readability of patient information on polycystic ovarian syndrome (PCOS) on the Internet available in English: a cross-sectional study. Reprod Biol Endocrinol. 2023;21:44. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1186/s12958-023-01100-x\u003c/span\u003e\u003cspan address=\"10.1186/s12958-023-01100-x\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eShang J, Wang S, Wang A, Li F, Zhang J, Wang J, Lv R, Chen H, Mu X, Zhang K, et al. Intra-ovarian inflammatory states and their associations with embryo quality in normal-BMI PCOS patients undergoing IVF treatment. Reprod Biol Endocrinol. 2024;22:11. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1186/s12958-023-01183-6\u003c/span\u003e\u003cspan address=\"10.1186/s12958-023-01183-6\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":true,"hideJournal":true,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"Association Rule Mining, In Vitro Fertilization, Polycystic Ovary Syndrome, Medical Data Analyst, Risk Factor Combinations","lastPublishedDoi":"10.21203/rs.3.rs-7020094/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-7020094/v1","license":{"name":"CC BY 
4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003ch2\u003eBackground\u003c/h2\u003e \u003cp\u003ePolycystic ovary syndrome (PCOS) is a common indication for in vitro fertilization (IVF). Previous studies have often assessed risk factors independently, overlooking the multifactorial interactions that frequently influence clinical pregnancy outcomes. We retrospectively analyzed electronic medical record (EMR) data from PCOS patients who were undergoing IVF.\u003c/p\u003e\u003ch2\u003eMethods\u003c/h2\u003e \u003cp\u003eKey clinical variables (age, body mass index [BMI], duration of infertility, hormonal or metabolic disorders, tubal or uterine abnormalities, ovarian conditions including luteinized unruptured follicle syndrome [LUFS], and treatment details) were one-hot-encoded. Apriori association rule mining (ARM) was applied to identify patterns linked to clinical pregnancy failure, using thresholds of support\u0026thinsp;\u0026ge;\u0026thinsp;0.05, confidence\u0026thinsp;\u0026ge;\u0026thinsp;0.60, and lift\u0026thinsp;\u0026gt;\u0026thinsp;1. This approach revealed that multifactorial risk associations were not evident in traditional single-variable analyses.\u003c/p\u003e\u003ch2\u003eResults\u003c/h2\u003e \u003cp\u003eThe overall clinical pregnancy success rate in the cohort was ~\u0026thinsp;40%. ARM revealed several clinically meaningful patterns; notably, maternal age\u0026thinsp;\u0026gt;\u0026thinsp;35 years frequently appeared in high-risk combinations, often with metabolic or anatomical abnormalities. For instance, the combination of LUFS and tubal obstruction was strongly associated with failure, suggesting a synergistic negative effect. 
Many of these multifactorial associations may be overlooked using a traditional single-variable analysis approach.\u003c/p\u003e\u003ch2\u003eConclusions\u003c/h2\u003e \u003cp\u003eApriori rule mining effectively identified complex combinations of risk factors for in vitro fertilization failure in polycystic ovary syndrome, informing individualized strategies. Clinically, identifying advanced-age patients with specific reproductive or metabolic abnormalities may support targeted interventions. This study shows the broader potential of combining ARM with electronic medical record data to reveal hidden patterns for personalized clinical decision-making.\u003c/p\u003e","manuscriptTitle":"Are Combined Risk Factors Linked to In Vitro Fertilization Failure in Polycystic Ovary Syndrome? An Association Rule Mining Approach","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2025-07-04 05:55:33","doi":"10.21203/rs.3.rs-7020094/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"d2df1d1d-71c6-4d2d-9e62-5fae99e181be","owner":[],"postedDate":"July 4th, 2025","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[],"tags":[],"updatedAt":"2025-07-31T15:38:17+00:00","versionOfRecord":[],"versionCreatedAt":"2025-07-04 
05:55:33","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-7020094","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-7020094","identity":"rs-7020094","version":["v1"]},"buildId":"XKTyCvWXoU3ODBz1xrDgd","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}

