Critical factors influencing live birth rates in fresh embryo transfer for IVF: insights from cluster ensemble algorithms.

OA: gold CC-BY-NC-ND-4.0

Ranking

We extended the analysis to compare the efficacy of the NMF, AMU-NMF, GDLC, and NMFE algorithms. After each feature group in the original data was masked with three different sets of random numbers, NMFE achieved higher accuracy and purity than the other algorithms in the majority of cases (Table 2).

Table 2 Accuracy (ACC) and purity values of the algorithms after masking each feature group with three different sets of random numbers (R1 to R3). NMFE achieved higher accuracy and purity than the other algorithms in the majority of cases. * The maximum value across algorithms for the same group, metric, and random set, marking the best-performing algorithm.

| Group | Metric | NMF R1 | NMF R2 | NMF R3 | AMU-NMF R1 | AMU-NMF R2 | AMU-NMF R3 | GDLC R1 | GDLC R2 | GDLC R3 | NMFE R1 | NMFE R2 | NMFE R3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Female Basic Information | ACC | 0.496 | 0.504 | 0.500 | 0.574 | 0.567 | 0.538 | 0.627 | 0.573 | 0.579 | 0.722* | 0.643* | 0.717* |
| Female Basic Information | Purity | 0.510 | 0.515 | 0.512 | 0.616 | 0.600 | 0.583 | 0.660 | 0.604 | 0.609 | 0.846* | 0.738* | 0.885* |
| Male Basic Information | ACC | 0.643 | 0.570 | 0.566 | 0.617 | 0.512 | 0.506 | 0.648 | 0.557 | 0.566 | 0.752* | 0.742* | 0.697* |
| Male Basic Information | Purity | 0.564 | 0.638 | 0.631 | 0.580 | 0.543 | 0.540 | 0.606 | 0.595 | 0.608 | 0.638* | 0.882* | 0.784* |
| Menstrual History | ACC | 0.498 | 0.488 | 0.465 | 0.558 | 0.552 | 0.469 | 0.612 | 0.584 | 0.574* | 0.640* | 0.603* | 0.449 |
| Menstrual History | Purity | 0.533 | 0.509 | 0.484 | 0.616 | 0.621 | 0.469 | 0.651 | 0.629 | 0.604* | 0.756* | 0.705* | 0.471 |
| Obstetric History | ACC | 0.538 | 0.475 | 0.468 | 0.415 | 0.443 | 0.541 | 0.515 | 0.465 | 0.462 | 0.691* | 0.642* | 0.631* |
| Obstetric History | Purity | 0.607 | 0.492 | 0.483 | 0.431 | 0.518 | 0.569 | 0.544 | 0.490 | 0.500 | 0.699* | 0.802* | 0.712* |
| Therapeutic Interventions | ACC | 0.557 | 0.465 | 0.845* | 0.566 | 0.587* | 0.548 | 0.612* | 0.577 | 0.589 | 0.451 | 0.448 | 0.566 |
| Therapeutic Interventions | Purity | 0.605 | 0.509 | 0.922* | 0.627 | 0.646* | 0.575 | 0.651* | 0.619 | 0.633 | 0.467 | 0.475 | 0.662 |
| Factors Associated with Embryo Quality | ACC | 0.482 | 0.568 | 0.549 | 0.560 | 0.606* | 0.620 | 0.494 | 0.451 | 0.452 | 0.583* | 0.446 | 0.666* |
| Factors Associated with Embryo Quality | Purity | 0.512 | 0.630* | 0.607 | 0.594 | 0.617 | 0.642 | 0.520 | 0.496 | 0.500 | 0.650* | 0.477 | 0.806* |
| Female Diagnosis | ACC | 0.475 | 0.466 | 0.466 | 0.577 | 0.572 | 0.569 | 0.525 | 0.480 | 0.480 | 0.667* | 0.673* | 0.688* |
| Female Diagnosis | Purity | 0.490 | 0.475 | 0.472 | 0.627 | 0.626 | 0.622 | 0.560 | 0.505 | 0.505 | 0.792* | 0.802* | 0.834* |
| Embryo Transfer-Related Indicators | ACC | 0.512 | 0.547 | 0.536 | 0.568 | 0.553* | 0.538 | 0.591 | 0.547 | 0.545* | 0.670* | 0.475 | 0.537 |
| Embryo Transfer-Related Indicators | Purity | 0.563 | 0.620* | 0.613* | 0.572 | 0.589 | 0.570 | 0.635 | 0.586 | 0.573 | 0.772* | 0.484 | 0.603 |
| Hormone Levels after Transplantation | ACC | 0.649* | 0.671* | 0.790* | 0.440 | 0.421 | 0.399 | 0.449 | 0.456 | 0.442 | 0.615 | 0.571 | 0.649 |
| Hormone Levels after Transplantation | Purity | 0.687* | 0.724* | 0.878* | 0.499 | 0.469 | 0.433 | 0.472 | 0.491 | 0.457 | 0.637 | 0.722 | 0.687 |
| Embryo Transfer Outcomes | ACC | 0.534 | 0.462 | 0.486 | 0.583* | 0.589* | 0.565 | 0.530 | 0.453 | 0.467 | 0.534 | 0.536 | 0.583* |
| Embryo Transfer Outcomes | Purity | 0.588 | 0.473 | 0.498 | 0.639* | 0.661* | 0.605 | 0.574 | 0.494 | 0.500 | 0.602 | 0.583 | 0.618* |
| Complications During Pregnancy | ACC | 0.517 | 0.509 | 0.478 | 0.589 | 0.598 | 0.549 | 0.602 | 0.555 | 0.561* | 0.498 | 0.724* | 0.465 |
| Complications During Pregnancy | Purity | 0.542 | 0.539 | 0.498 | 0.652 | 0.663 | 0.604 | 0.645 | 0.593 | 0.600* | 0.528 | 0.891* | 0.503 |
| Previous History of ART | ACC | 0.537 | 0.528 | 0.512 | 0.597 | 0.570* | 0.838* | 0.624 | 0.572 | 0.573 | 0.723* | 0.498 | 0.580 |
| Previous History of ART | Purity | 0.592 | 0.572 | 0.536 | 0.650 | 0.635* | 0.838* | 0.657 | 0.603 | 0.604 | 0.867* | 0.513 | 0.638 |
| Ovarian Response Assessment Indicators | ACC | 0.462 | 0.583* | 0.533 | 0.371 | 0.382 | 0.361 | 0.521* | 0.481 | 0.502 | 0.498 | 0.576 | 0.593* |
| Ovarian Response Assessment Indicators | Purity | 0.469 | 0.670* | 0.588 | 0.444 | 0.439 | 0.380 | 0.557* | 0.514 | 0.524 | 0.519 | 0.647 | 0.679* |

To further investigate the impact of specific feature groups on the IVF-ET model, we masked the data for groups such as Therapeutic Interventions, Embryo Transfer Outcomes, and Ovarian Response Assessment Indicators with random numbers. This masking produced a marked decrease in the overall accuracy of the IVF-ET model, suggesting that these feature groups exert a substantial influence on the model's performance. To quantify the influence of each feature group on the IVF-ET outcome, we computed ACC-GAP and PUR-GAP as the sums of the accuracy and purity values, respectively, obtained after masking the group with the three sets of random numbers. A smaller value indicates a stronger influence of the feature group on the overall model and a greater effect on the IVF-ET outcome (Table 3; Fig. 2). Among the feature groups, Therapeutic Interventions had the smallest ACC-GAP and PUR-GAP values, suggesting that it contributes the most to the model.

Table 3 ACC-GAP and PUR-GAP values for each feature group, calculated using NMFE. The Therapeutic Interventions group had the lowest ACC-GAP and PUR-GAP values. Conversely, the Male Basic Information group had the highest ACC-GAP value and the Female Basic Information group had the highest PUR-GAP value. * The maximum value in the column, suggesting a relatively small contribution of the group to the model; # the minimum value in the column, suggesting a relatively large contribution.

| Feature group | ACC-GAP | PUR-GAP |
|---|---|---|
| Therapeutic Interventions | 1.4640# | 1.6034# |
| Embryo Transfer Outcomes | 1.6532 | 1.8024 |
| Ovarian Response Assessment Indicators | 1.6678 | 1.8453 |
| Embryo Transfer-Related Indicators | 1.6826 | 1.8583 |
| Complications During Pregnancy | 1.6867 | 1.9213 |
| Menstrual History | 1.6920 | 1.9321 |
| Factors Associated with Embryo Quality | 1.6951 | 1.9324 |
| Previous History of ART | 1.8010 | 2.0175 |
| Hormone Levels after Transplantation | 1.8347 | 2.0455 |
| Obstetric History | 1.9639 | 2.2138 |
| Female Diagnosis | 2.0277 | 2.4283 |
| Female Basic Information | 2.0818 | 2.4685* |
| Male Basic Information | 2.1913* | 2.3049 |

Fig. 2 Ranking of the feature groups according to their ACC-GAP and PUR-GAP values. The five groups with the greatest influence on the IVF-ET result are Therapeutic Interventions, Embryo Transfer Outcomes, Ovarian Response Assessment Indicators, Embryo Transfer-Related Indicators, and Complications During Pregnancy.

Based on this influence analysis, the five groups with the greatest influence on the IVF-ET result are Therapeutic Interventions, Embryo Transfer Outcomes, Ovarian Response Assessment Indicators, Embryo Transfer-Related Indicators, and Complications During Pregnancy. Within the Therapeutic Interventions group, the ovarian stimulation protocol, the ovulation stimulation drugs, and pre-cycle and intra-cycle acupuncture were found to be particularly influential. To better understand the impact of each treatment on the IVF-ET result, we analyzed each intervention factor separately (Table 4; Fig. 3).

Table 4 ACC-GAP and PUR-GAP values with the intervention factors analyzed separately, calculated using NMFE. Ovulation stimulation drugs, ovarian stimulation protocols, pre-cycle acupuncture, and intra-cycle acupuncture ranked 7th, 8th, 12th, and 14th by ACC-GAP, and 4th, 6th, 13th, and 14th by PUR-GAP. * The maximum value in the column, suggesting a relatively small contribution of the group to the model; # the minimum value in the column, suggesting a relatively large contribution.

| Feature group | ACC-GAP | PUR-GAP |
|---|---|---|
| Embryo Transfer Outcomes | 1.6532# | 1.8024# |
| Ovarian Response Assessment Indicators | 1.6678 | 1.8453 |
| Embryo Transfer-Related Indicators | 1.6826 | 1.8583 |
| Complications During Pregnancy | 1.6867 | 1.9213 |
| Menstrual History | 1.6920 | 1.9321 |
| Factors Associated with Embryo Quality | 1.6951 | 1.9324 |
| Ovulation Stimulation Drugs | 1.7207 | 1.9106 |
| Ovarian Stimulation Protocols | 1.7335 | 1.9249 |
| Previous History of ART | 1.8010 | 2.0175 |
| Hormone Levels after Transplantation | 1.8347 | 2.0455 |
| Obstetric History | 1.9639 | 2.2138 |
| Pre-cycle Acupuncture | 2.0175 | 2.3139 |
| Female Diagnosis | 2.0277 | 2.4283 |
| Intra-cycle Acupuncture | 2.0447 | 2.3258 |
| Female Basic Information | 2.0818 | 2.4685* |
| Male Basic Information | 2.1913* | 2.3049 |

Fig. 3 Rank of the individual intervention factors in the IVF-ET model. Ovarian stimulation protocols, ovulation stimulation drugs, and pre- and intra-cycle acupuncture ranked considerably lower when analyzed separately.

When the clinical features within the Therapeutic Interventions group were analyzed separately, the rank of the influential factors shifted. Specifically, ovulation stimulation drugs dropped to seventh place, the ovarian stimulation protocol dropped to eighth place, and the acupuncture treatments ranked lower still. Whether this shift indicates a synergistic effect among multiple therapies requires further validation.
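The GAP computation described above can be checked by hand. The following is a minimal sketch, not the authors' code: it copies the NMFE accuracy and purity values for four feature groups from Table 2, sums the three masked runs, and ranks the groups by ACC-GAP. The names `nmfe_masked`, `gap_scores`, and `ranking` are hypothetical, chosen only for illustration.

```python
# NMFE accuracy and purity after masking each group with three random sets,
# copied from Table 2 (four groups only, to keep the example short).
nmfe_masked = {
    "Therapeutic Interventions": {"acc": [0.451, 0.448, 0.566], "pur": [0.467, 0.475, 0.662]},
    "Embryo Transfer Outcomes": {"acc": [0.534, 0.536, 0.583], "pur": [0.602, 0.583, 0.618]},
    "Female Basic Information": {"acc": [0.722, 0.643, 0.717], "pur": [0.846, 0.738, 0.885]},
    "Male Basic Information": {"acc": [0.752, 0.742, 0.697], "pur": [0.638, 0.882, 0.784]},
}


def gap_scores(masked):
    """Return {group: (ACC-GAP, PUR-GAP)}, each the sum over the three masked runs."""
    return {group: (sum(v["acc"]), sum(v["pur"])) for group, v in masked.items()}


gaps = gap_scores(nmfe_masked)

# A smaller ACC-GAP means the model degrades more when the group is masked,
# i.e. the group is more influential, so sort in ascending order.
ranking = sorted(gaps, key=lambda group: gaps[group][0])

for group in ranking:
    acc_gap, pur_gap = gaps[group]
    print(f"{group}: ACC-GAP={acc_gap:.3f}, PUR-GAP={pur_gap:.3f}")
```

The sums agree with Table 3 up to the three-decimal rounding of the Table 2 entries, and Therapeutic Interventions comes first in the ascending order.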

Results

We compared NMFE with several well-established algorithms: NMF [17], AMU-NMF [18], GDLC [19], MCLA [22], and DREC [23]. Accuracy (ACC) and purity (PUR) served as the metrics for assessing the performance of the algorithms [34]. A higher accuracy indicates that a greater proportion of samples is assigned to the correct class, whereas a higher purity indicates that each cluster is more strongly dominated by a single class. NMFE achieved an accuracy of 0.7912 and a purity of 0.8605, surpassing all other algorithms, which indicates that NMFE is the most effective of the methods compared (Table 1; Fig. 1).

Table 1 Accuracy and purity on the IVF dataset for the different clustering algorithms. * The maximum value in the column, marking the best-performing algorithm.

| Algorithm | Accuracy | Purity |
|---|---|---|
| NMF | 0.6473 | 0.7327 |
| AMU-NMF | 0.7640 | 0.8096 |
| GDLC | 0.7541 | 0.8181 |
| MCLA | 0.6468 | 0.7327 |
| DREC | 0.7439 | 0.7573 |
| NMFE | 0.7912* | 0.8605* |

Fig. 1 Area diagram integrating ACC and purity on the IVF-ET dataset. A larger area indicates a more effective algorithm. NMFE has the largest area (Area: 3404.13), indicating that it is the most effective of the algorithms compared.
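This section does not spell out the formulas behind ACC and purity, so the following is a minimal sketch that assumes the standard clustering definitions: purity counts the majority class of each cluster, and accuracy uses the best one-to-one mapping between cluster ids and class labels. The function names and the toy labels are hypothetical, and the brute-force search over mappings is only practical for a small number of clusters.

```python
from collections import Counter
from itertools import permutations


def purity(labels_true, labels_pred):
    """Fraction of samples that belong to the majority true class of their cluster."""
    clusters = {}
    for t, p in zip(labels_true, labels_pred):
        clusters.setdefault(p, Counter())[t] += 1
    majority_total = sum(max(counts.values()) for counts in clusters.values())
    return majority_total / len(labels_true)


def clustering_accuracy(labels_true, labels_pred):
    """Accuracy under the best one-to-one mapping from cluster ids to class labels."""
    classes = sorted(set(labels_true))
    clusters = sorted(set(labels_pred))
    # Pad with None so that surplus clusters can stay unmatched.
    padded = classes + [None] * max(0, len(clusters) - len(classes))
    best_hits = 0
    for assignment in permutations(padded, len(clusters)):
        mapping = dict(zip(clusters, assignment))
        hits = sum(mapping[p] == t for t, p in zip(labels_true, labels_pred))
        best_hits = max(best_hits, hits)
    return best_hits / len(labels_true)


# Toy example: class 0 is split across two clusters.
y_true = [0, 0, 0, 0, 0, 0, 1, 2]
y_pred = [0, 0, 0, 1, 1, 1, 2, 2]

print(purity(y_true, y_pred))               # 0.875
print(clustering_accuracy(y_true, y_pred))  # 0.5
```

In the toy example both clusters 0 and 1 are pure, so purity is high, but only one of them can be mapped to class 0, so accuracy is lower. Under these definitions purity is never below accuracy, which is consistent with every row of Table 1.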

Proposed

In this paper, let $X=\{x_1,x_2,\ldots,x_N\}\in\mathbb{R}^{M\times N}$ denote the IVF-ET dataset, where $x_n$ is the $n$-th sample. $M$ is the feature dimension of each sample, with each feature describing one attribute of the sample, and $N$ is the total number of samples used in the modeling. We use non-negative matrix factorization (NMF) [60] and two of its variants to construct an ensemble model. NMF approximates a high-dimensional target matrix by the product of two low-dimensional matrices. We obtain low-dimensional representations from several variants of the NMF algorithm and then construct the ensemble model by fusing the low-dimensional feature matrices learned by the individual models. The objective function of NMF is shown in Eq. (1).

$$\mathop{\arg\min}\limits_{U,V} J_1(U,V)=\left\|X-UV^{\top}\right\|_F^2,\quad \text{s.t. } U\geqslant 0,\ V\geqslant 0. \tag{1}$$

Here $U\in\mathbb{R}^{M\times K}$ is the weight matrix and $V\in\mathbb{R}^{N\times K}$ is the feature matrix. $K$ is the dimension of the low-dimensional matrices, with $K\ll\min\{M,N\}$, and $\|\cdot\|_F$ is the Frobenius norm. The matrices $U$ and $V$ that approximate the original matrix $X$ are usually obtained with multiplicative update rules. To accelerate the NMF updates and improve the effectiveness of the algorithm, Gillis et al. [26] proposed the accelerated algorithm AMU-NMF, which improves efficiency while ensuring convergence. To further improve the representation ability and convergence speed, Wang et al. proposed GDLC, a deep matrix factorization representation learning algorithm based on element-wise updates. Its objective function is shown in Eq. (2).

$$\begin{aligned}
\mathop{\arg\min}\limits_{U,V} J_1(U,V) &= \left\|X-UV^{\top}\right\|_F^2+\alpha\left\|U\right\|_F^2+\beta\left\|V\right\|_F^2 \\
&= \sum_{m=1}^{M}\sum_{n=1}^{N}\left(x_{m,n}-\sum_{k=1}^{K}u_{m,k}v_{n,k}\right)^2+\alpha\sum_{m=1}^{M}\sum_{k=1}^{K}\left(u_{m,k}\right)^2+\beta\sum_{n=1}^{N}\sum_{k=1}^{K}\left(v_{n,k}\right)^2,\quad \text{s.t. } U\geqslant 0,\ V\geqslant 0.
\end{aligned} \tag{2}$$

This objective is minimized with stochastic gradient descent (SGD) [61, 62] and an alternating iterative update strategy [63]. To improve effectiveness further, we fused the feature matrices learned by the three algorithms NMF, AMU-NMF, and GDLC to construct an NMF-based ensemble algorithm (NMFE). Because the feature matrices are all non-negative, the fusion must preserve non-negativity while remaining effective, and we propose a deep fusion-based method for this purpose. Its objective function is shown in Eq. (3), and the algorithmic framework of NMFE is shown in Fig. 6.

Fig. 6 Algorithmic framework for NMFE. It is constructed by fusing the feature matrices obtained from the NMF, AMU-NMF, and GDLC algorithms.

$$\begin{aligned}
\mathop{\arg\min}\limits_{E} J_1(E) &= \sum_{i=1}^{I}\left\|V^{(i)}-E\right\|_F^2+\alpha_1\left\|E\right\|_F^2 \\
&= \sum_{i=1}^{I}\sum_{n=1}^{N}\sum_{k=1}^{K}\left(v_{n,k}^{(i)}-e_{n,k}\right)^2+\alpha_1\sum_{n=1}^{N}\sum_{k=1}^{K}\left(e_{n,k}\right)^2,\quad \text{s.t. } E\geqslant 0.
\end{aligned} \tag{3}$$

Here $I$ denotes the number of models used to construct the ensemble, $i\in\{1,2,3\}$, and $V^{(1)}$, $V^{(2)}$, $V^{(3)}$ denote the feature matrices obtained by NMF, AMU-NMF, and GDLC, respectively. Writing the objective in element-wise form and optimizing it with SGD gives, for the variable $e_{n,k}$,

$$J_2\left(e_{n,k}\right)=\left(v_{n,k}^{(i)}-e_{n,k}\right)^2+\alpha_1\left(e_{n,k}\right)^2 \tag{4}$$

The corresponding SGD update rule is

$$e_{n,k}^{(t+1)}\leftarrow e_{n,k}^{(t)}-\eta\left(\left(v_{n,k}^{(i)}-e_{n,k}^{(t)}\right)\cdot(-1)+\alpha_1 e_{n,k}^{(t)}\right) \tag{5}$$

Eq. (5) contains a subtraction, so the updated value is not guaranteed to be non-negative. For this reason, we constrain Eq. (4) with an activation function whose range is non-negative, rewriting it as

$$J_2\left(e_{n,k}\right)=\left(v_{n,k}^{(i)}-f\left(\hat{e}_{n,k}\right)\right)^2+\alpha_1\left(f\left(\hat{e}_{n,k}\right)\right)^2 \tag{6}$$

where $f(\cdot)=\operatorname{sigmoid}(\cdot)$. Based on Eq. (6), the SGD gradient of each element is used as a weight in a deep network that performs the update, which gives the following update rule.

$$\left\{\begin{aligned}
\left(\hat{e}_{n,k}\right)^{T} &= \left(\hat{e}_{n,k}\right)^{1}+\eta\left(\Delta e_{n,k}\right)^{r}, && r<R \\
\left(\hat{e}_{n,k}\right)^{1} &\mathop{\leftarrow}\limits^{r+1} f\left(\left(\hat{e}_{n,k}\right)^{T}\right), && r<R \\
e_{n,k} &= \left(\hat{e}_{n,k}\right)^{1}+\eta\left(\Delta e_{n,k}\right)^{r}, && r=R \\
\left(\Delta e_{n,k}\right)^{r} &= \sum_{t=1}^{T}\left(\left(v_{n,k}^{(i)}-f\left(\hat{e}_{n,k}\right)^{t}\right)\cdot(-1)+\alpha_1\cdot f\left(\hat{e}_{n,k}\right)^{t}\right)\cdot f\left(\hat{e}_{n,k}\right)^{t}\cdot\left(1-f\left(\hat{e}_{n,k}\right)^{t}\right)
\end{aligned}\right. \tag{7}$$

Here $R$ denotes the total number of training rounds, and $T$ denotes the number of times the element $\hat{e}_{n,k}$ is updated within a round. $\eta\left(\Delta e_{n,k}\right)^{r}$ denotes the cumulative gradient value used to update the element $\hat{e}_{n,k}$ in the $r$-th round. With the update rule of Eq. (7), we learn the matrix $E$, which is then clustered with the $k$-means algorithm to obtain the clustering results of the NMFE model.
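The fusion step can be illustrated with a small pure-Python sketch. It is not the authors' implementation: instead of the round-based scheme of Eq. (7), it runs plain batch gradient descent on the sigmoid-constrained form of the objective in Eq. (3) and Eq. (6), with the constant factor 2 of the gradient absorbed into the learning rate. The names `fuse`, `alpha`, `lr`, and `steps` are hypothetical, and the input feature matrices are assumed to be scaled to [0, 1], since the sigmoid keeps every fused entry inside (0, 1).

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fuse(feature_matrices, alpha=0.1, lr=0.5, steps=2000):
    """Minimise sum_i ||V_i - E||_F^2 + alpha * ||E||_F^2 with E = sigmoid(E_hat).

    feature_matrices: list of equally shaped N x K nested lists with entries in [0, 1].
    Returns the fused N x K matrix E, whose entries are strictly positive.
    """
    n_rows = len(feature_matrices[0])
    n_cols = len(feature_matrices[0][0])
    e_hat = [[0.0] * n_cols for _ in range(n_rows)]  # sigmoid(0) = 0.5 as the start
    for _ in range(steps):
        for n in range(n_rows):
            for k in range(n_cols):
                s = sigmoid(e_hat[n][k])
                # Derivative of the objective with respect to s (factor 2 dropped).
                grad_s = sum(-(V[n][k] - s) for V in feature_matrices) + alpha * s
                # Chain rule through the sigmoid: ds/de_hat = s * (1 - s).
                e_hat[n][k] -= lr * grad_s * s * (1.0 - s)
    return [[sigmoid(z) for z in row] for row in e_hat]


# Three toy 1 x 2 "feature matrices" standing in for V(1), V(2), V(3).
V1 = [[0.2, 0.9]]
V2 = [[0.4, 0.7]]
V3 = [[0.6, 0.8]]
E = fuse([V1, V2, V3], alpha=0.1)
print(E)
```

Setting the derivative to zero shows that each fused entry converges to the sum of the corresponding inputs divided by (I + alpha), here 1.2 / 3.1 and 2.4 / 3.1, so the regulariser shrinks the element-wise mean slightly toward zero. The fused matrix would then be passed to k-means, which is not shown here.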

Materials

From January 2022 to December 2022, a total of 9539 patients underwent IVF at Sichuan Jinxin Xi'nan Women's and Children's Hospital, of whom 3695 received fresh embryo transfer and 2238 had observed pregnancy outcomes (Fig. 4). This study was approved by the Medical Ethics Management Committee of Sichuan Jinxin Xi'nan Women's and Children's Hospital (Ethics number: No.2023-043) and was conducted according to all relevant guidelines and regulations. Because the collected data are anonymous, the committee waived the requirement for informed consent.

Fig. 4 Flowchart of patient inclusion.

The dataset used in this study contains clinical features recorded before and after IVF-ET. It consists of 85 independent features: 69 clinical features before fresh embryo transfer and 16 clinical features after transfer. Based on the correlation between features, we grouped them into 13 categories: Female Basic Information (3 items), Male Basic Information (5 items), Menstrual History (3 items), Obstetric History (12 items), Previous History of Assisted Reproduction (3 items), Ovarian Response Assessment Indicators (10 items), Therapeutic Interventions (4 items), Factors Associated with Embryo Quality (13 items), Female Diagnosis (10 items), Embryo Transfer-related Indicators (6 items), Hormone Levels After Transplantation (2 items), Embryo Transfer Outcomes (9 items), and Complications During Pregnancy (5 items) (Table 5; Fig. 5). The dataset contains one dependent feature, referred to as "Result", with the categories no pregnancy, miscarriage, and live birth. Of the cases in the dataset, 1660 were not pregnant, 298 were miscarriages (including ectopic pregnancy, biochemical pregnancy, and abortion), and 280 were live births. A comparative analysis of statistical differences in clinical features between groups with and without live births after fresh embryo transfer is shown in Supplementary Tables 1 and 2.

Table 5 Features in the IVF-ET dataset. The dataset contains 85 independent clinical features, grouped into 13 categories based on the correlation between features.

| Stage | Group | Features |
|---|---|---|
| Pre-fresh embryo transfer | Female Basic Information | Female_age, Female_Ethnicity, Female_Education, Female_Occupation |
| Pre-fresh embryo transfer | Menstrual History | Menarche, Menstrual_Blood_Volume, Dysmenorrhoea |
| Pre-fresh embryo transfer | Obstetric History | Conceived before, Gravity, Parity, Artificial_Abortion, Spontaneous_Abortion, Medical_Abortion, Induced_Labour, Premature_Labor, Left_Ectopic_Pregnancy, Right_Ectopic_Pregnancy, Living_Children, Infertility_Duration |
| Pre-fresh embryo transfer | Previous History of ART | Previous_ART_Cycle, Previous_ART_Pregnancy_Cycle, Previous_ART_Cycle_Without_Pregnancy |
| Pre-fresh embryo transfer | Ovarian Response Assessment Indicators | Body Mass Index, Baseline_FSH, Baseline_E2, Baseline_P, PRL, Baseline_LH, T, AMH, Antral_Follicle_Count |
| Pre-fresh embryo transfer | Therapeutic Interventions | Pre-cycle_Acupuncture, Intra-cycle_Acupuncture, Ovulation_Stimulation_Drugs, Ovarian_Stimulation_Protocol |
| Pre-fresh embryo transfer | Factors Associated with Embryo Quality | Total_Number_Of_Ovum, MII, Retrieved_Ovum, Mature_Ovum, 0PN, 1PN, 2PN, 3PN, D3_Embryo, D5_Embryo, D6_Embryo, Cleavage_Stage_Good_Quality_Embryos, Blastocyst_Good_Quality_Embryo |
| Pre-fresh embryo transfer | Embryo Transfer-Related Indicators | Endometrial thickness on HCG day, Embryo_transfer_count, Endothelial_Thickness_on_ET, Transferred_Good_Quality_Embryos_Count, Transferred_D3_Good_Quality_Embryos_Count, Transferred_D5_Good_Quality_Embryos_Count |
| Pre-fresh embryo transfer | Female Diagnosis | Fallopian_Disorder, Ovulatory_Dysfunction, Immunological_Infertility, Endometriosis, Female_Chromosomal_Abnormality, Female_Monogenic_Diseases, Female_Genetic_Factors, Oocyte_Development_and_Quality_Abnormalities, Ovarian_Insufficiency, Female_Other factors |
| Pre-fresh embryo transfer | Male Basic Information | Male_Age, Male_Ethnicity, Male_Education, Male_Occupation, Semen_Quality |
| Post-fresh embryo transfer | Embryo Transfer Outcomes | Ectopic_Pregnancy, Miscarriage, Reason_of_Miscarriage, Gestation_Period, Premature_Delivery, Delivery_Way, Boy_Baby, Girl_Baby, Number_of_Babies |
| Post-fresh embryo transfer | Hormone Levels after Transplantation | P_14days_after_ET, E2_14days_after_ET |
| Post-fresh embryo transfer | Complications During Pregnancy | Gestational_Diabetes, Gestational_Hypertension, Intrahepatic_Cholestasis_of_Pregnancy, Fetofetal_Transfusion, Premature_Rupture_of_Membranes |

FSH: follicle-stimulating hormone; LH: luteinizing hormone; IVF: in-vitro fertilization; E2: estradiol; P: progesterone; T: testosterone; AMH: anti-Müllerian hormone; GnRH-a: gonadotropin-releasing hormone agonist; GnRH-A: gonadotropin-releasing hormone antagonist; D3: cleavage embryo; D5: blastocyst embryo; EMT: endometrium thickness; ET: embryo transfer.

Fig. 5 The clinical features in the dataset. All clinical features were grouped into 13 categories, and the fraction of features in each group relative to the total number of features was calculated: Female Basic Information (5 items, 6%), Male Basic Information (5 items, 6%), Menstrual History (3 items, 4%), Obstetric History (12 items, 14%), Previous History of Assisted Reproduction (3 items, 4%), Ovarian Response Assessment Indicators (8 items, 9%), Therapeutic Interventions (4 items, 5%), Factors Associated with Embryo Quality (13 items, 15%), Female Diagnosis (10 items, 12%), Embryo Transfer-related Indicators (6 items, 7%), Hormone Levels After Transplantation (2 items, 2%), Embryo Transfer Outcomes (9 items, 11%), and Complications During Pregnancy (5 items, 6%).
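The feature and outcome counts given in this section can be cross-checked with a few lines of arithmetic. The sketch below uses the per-group counts stated in the running text and the pre/post-transfer split of Table 5; the variable names are hypothetical.

```python
# Per-group feature counts as stated in the running text of this section.
pre_transfer = {
    "Female Basic Information": 3,
    "Male Basic Information": 5,
    "Menstrual History": 3,
    "Obstetric History": 12,
    "Previous History of Assisted Reproduction": 3,
    "Ovarian Response Assessment Indicators": 10,
    "Therapeutic Interventions": 4,
    "Factors Associated with Embryo Quality": 13,
    "Female Diagnosis": 10,
    "Embryo Transfer-related Indicators": 6,
}
post_transfer = {
    "Hormone Levels After Transplantation": 2,
    "Embryo Transfer Outcomes": 9,
    "Complications During Pregnancy": 5,
}
# Categories of the dependent feature "Result".
outcomes = {"no pregnancy": 1660, "miscarriage": 298, "live birth": 280}

n_pre = sum(pre_transfer.values())
n_post = sum(post_transfer.values())
n_cases = sum(outcomes.values())
live_birth_share = outcomes["live birth"] / n_cases

print(n_pre, n_post, n_pre + n_post)  # 69 16 85
print(n_cases)                        # 2238
print(f"{live_birth_share:.1%}")      # 12.5%
```

The totals reproduce the 69 pre-transfer and 16 post-transfer features and the 2238 patients with observed outcomes. The live-birth class makes up roughly one eighth of the cases, so the outcome classes are clearly imbalanced.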

Discussion

In this study, we proposed an ensemble clustering model to assess the influence of clinical characteristics on live birth after fresh embryo transfer in IVF-ET. The algorithm surpassed the other algorithms in accuracy and purity, demonstrating its robustness and reliability in handling the IVF-ET dataset. The results revealed that the five feature groups with the most substantial impact on live births in IVF-ET are Therapeutic Interventions, Embryo Transfer Outcomes, Ovarian Response Assessment Indicators, Embryo Transfer-related Indicators, and Complications During Pregnancy. Conversely, basic male and female information, female diagnosis, and obstetric history had a relatively minor influence.

Therapeutic Interventions was the most influential group. It encompasses several aspects of the treatment plan: the ovulation stimulation drugs (recombinant human follicle-stimulating hormone (rFSH) and human menopausal gonadotropin (hMG)), the ovarian stimulation protocol, and the use of acupuncture before and during the IVF cycle. The European Society of Human Reproduction and Embryology (ESHRE) guideline on ovarian stimulation in IVF/ICSI recommends both rFSH and hMG as viable options 35. However, the initial gonadotropin dose is pivotal in determining the outcome of controlled ovarian stimulation (COS) and subsequent IVF outcomes 36. It is therefore crucial to consider an individual's ovarian potential before initiating stimulation, as a standardized prescription may adversely affect women's outcomes 37. For example, low doses may result in insufficient follicular development in women with normal or high ovarian reserve, while excessive doses could lead to ovarian hyperstimulation syndrome (OHSS) 38. Once the follicles have reached a certain size, gonadotropin-releasing hormone (GnRH) agonists can be used to stimulate maturation and increase the ovum count.
On the other hand, recombinant GnRH antagonists can be employed to inhibit the release of natural luteinizing hormone, thereby preserving eggs for further development. The choice of ovarian stimulation protocol correlates closely with OHSS occurrence and clinical pregnancy rate 39. In the general IVF population, GnRH antagonists were associated with a lower ongoing pregnancy rate after fresh embryo transfer than long-protocol agonists, but also with lower OHSS rates. This underscores the challenge of selecting the most suitable protocol for individual patients. Individualizing treatment in IVF aims to maximize pregnancy chances while minimizing ovarian stimulation risks 38. Thus, the selection of ovarian stimulation drugs and protocols is a crucial factor for IVF-ET outcomes, and treatment should be individualized based on ovarian response 35. Our model highlights ovarian response as a key factor, and antral follicle count (AFC) or AMH is recommended for predicting high or poor ovarian response 26. Since age and BMI correlate inversely with AMH, they are also important considerations when personalizing treatment plans 40–43.

Acupuncture, as a traditional adjuvant therapy, is increasingly chosen by subfertile couples to improve the success rate of IVF-ET 44, 45. In the United States, 44% of infertile women undergoing IVF-ET use acupuncture 46. However, whether acupuncture can enhance the live birth rate of IVF-ET remains debated 47, 48. Recent clinical studies have indicated several positive effects of acupuncture. It has been found to reduce anxiety during embryo transfer 49, improve oocyte quality 50, and enhance endometrial blood flow and receptivity 51, ultimately leading to improved outcomes in IVF-assisted pregnancy.
Additionally, when we examined the impact of the ovulation stimulation drugs, ovarian stimulation protocols, and acupuncture (pre-cycle and intra-cycle) separately, we observed a significant decrease in their influence on the IVF-ET model, with acupuncture showing the least effect. To investigate this further, we conducted additional analysis and data mining. We found that the majority of patients in our dataset did not receive acupuncture treatment: only 198 patients received intra-cycle acupuncture and 144 received pre-cycle acupuncture. The efficacy of acupuncture is also closely related to the number of sessions 52, 53. The limited use of acupuncture in our dataset may therefore not reflect its true potential in enhancing IVF-ET outcomes, and concluding that acupuncture is ineffective based solely on our findings would be premature. When the intervention factors are considered together, their combined influence remains significant, hinting at potential synergistic effects among multiple therapies. However, further validation is required to substantiate this observation.

Our results indicated that multiple clinical features after embryo transfer significantly impact the IVF-ET model. Specifically, we considered the Embryo Transfer Outcomes group and the Complications During Pregnancy group. The Embryo Transfer Outcomes group encompassed conditions such as ectopic pregnancy, miscarriage, and premature delivery, while the Complications During Pregnancy group included gestational hypertension, gestational diabetes, intrahepatic cholestasis of pregnancy, fetofetal transfusion, and premature rupture of membranes. These findings aligned with established clinical patterns 54–57, suggesting good validity for our model in analyzing the IVF-ET dataset. Additionally, the Embryo Transfer-related Indicators group comprised factors such as endometrial thickness, transferred embryo count, and transferred good-quality embryo count.
These factors are widely acknowledged as critical determinants of live birth outcomes in IVF-ET 58, 59.

Our findings also put several commonly cited factors into perspective. Obstetric history, which includes past pregnancies and deliveries, is generally considered relevant to IVF success, yet our data mining model does not assign it significant importance compared with other features. A history of successful pregnancies may suggest fertility capability, while previous failed pregnancies or miscarriages could indicate underlying fertility issues. Similarly, cesarean sections or uterine surgeries may affect the shape and integrity of the uterus, potentially impacting embryo implantation. However, patients seeking ART often face significant fertility challenges and may have compromised natural conception abilities, so although past reproductive history may influence future pregnancies, it is not decisive in determining IVF-ET success. Our model also indicates that the cause of a woman's infertility does not play a significant role in IVF-ET outcomes.

Additionally, we considered the ethnicity of both partners in our analysis, given China's multi-ethnic nature. Our dataset included patients from 30 ethnic groups, with the largest representation being Han (n = 2006), followed by Yi (n = 138) and Tibetan (n = 52). Other ethnicities, such as Hui, Tujia, Qiang, and Miao, were less prevalent. Interestingly, ethnicity had minimal impact on the model. The educational background and occupation of both partners likewise had minimal influence, indicating that these factors may not significantly affect IVF-ET success.
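The accuracy and purity values used to compare the algorithms can be illustrated with a short Python sketch. It uses the standard textbook definitions of the two metrics, which are assumed here and may differ from the exact formulas the authors used; the function names and the toy labels are hypothetical.

```python
from collections import Counter
from itertools import permutations


def purity(pred, truth):
    """Fraction of samples that fall in the majority true class of their cluster."""
    clusters = {}
    for p, t in zip(pred, truth):
        clusters.setdefault(p, []).append(t)
    majority = sum(Counter(m).most_common(1)[0][1] for m in clusters.values())
    return majority / len(truth)


def clustering_accuracy(pred, truth):
    """Best agreement with the true labels over one-to-one cluster relabelings.

    Assumes there are no more clusters than true classes; brute force over
    permutations is only practical for a handful of clusters.
    """
    pred_ids = sorted(set(pred))
    true_ids = sorted(set(truth))
    best = 0
    for perm in permutations(true_ids, len(pred_ids)):
        mapping = dict(zip(pred_ids, perm))
        hits = sum(mapping[p] == t for p, t in zip(pred, truth))
        best = max(best, hits)
    return best / len(truth)


# Toy example (not study data): 1 = live birth, 0 = no live birth.
truth = [1, 1, 1, 1, 1, 0]
pred = [0, 0, 0, 1, 1, 1]
print(purity(pred, truth), clustering_accuracy(pred, truth))
```

In this toy case both clusters are dominated by the same true class, so purity (5/6) exceeds accuracy (4/6): purity lets several clusters map to one class, whereas accuracy requires a one-to-one assignment.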

Conclusions

Our data mining results indicate that therapeutic intervention, ovarian function, and embryo quality are the primary factors influencing pregnancy outcomes in fresh embryo transfer. Conversely, ethnic background, occupational status, educational level, cause of female infertility, and previous pregnancy history do not significantly impact pregnancy outcomes. Using NMFE, we evaluated and ranked the influence of various factors on patients undergoing fresh embryo transfer.

Several limitations point to avenues for future research. Firstly, we did not explore in detail how specific characteristics impact IVF-ET outcomes. For instance, we did not determine optimal ovarian stimulation protocols tailored to individual patients. Similarly, we did not investigate the efficacy of acupuncture administered before and during the IVF cycle, nor did we establish the ideal number of acupuncture sessions. Furthermore, our model did not establish optimal medication dosages or provide guidance on combining clinical interventions to achieve the best results. Our next steps therefore involve enriching the dataset and conducting an in-depth analysis of these issues. We also plan to develop an artificial intelligence-driven personalized IVF support model to assist clinicians in selecting better treatment plans. Finally, insights from this study will be used to investigate frozen embryo transfer, with the ultimate goal of reducing economic costs for patients seeking assisted reproduction.
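One way to picture a masking-based ranking of feature groups is sketched below in Python. It rests on assumptions that this section does not confirm: that a group's influence is judged by how far clustering accuracy falls when its columns are replaced with random numbers, and that the result is averaged over three random seeds. The clustering step is a toy stand-in rather than NMFE, and every name and value in the example is hypothetical.

```python
import random


def mask_group(rows, cols, seed):
    """Return a copy of rows with the given columns replaced by random numbers."""
    rng = random.Random(seed)
    masked = [list(r) for r in rows]
    for r in masked:
        for c in cols:
            r[c] = rng.random()
    return masked


def threshold_cluster(rows):
    """Toy stand-in for a clustering algorithm (NOT NMFE): split on the mean row sum."""
    sums = [sum(r) for r in rows]
    cut = sum(sums) / len(sums)
    return [1 if s > cut else 0 for s in sums]


def two_cluster_accuracy(pred, truth):
    """Accuracy for two clusters, allowing for swapped cluster labels."""
    hits = sum(p == t for p, t in zip(pred, truth))
    return max(hits, len(truth) - hits) / len(truth)


def rank_groups(rows, truth, groups, seeds=(0, 1, 2), cluster_fn=threshold_cluster):
    """Rank groups by mean accuracy after masking; lowest accuracy (largest loss) first."""
    scores = {}
    for name, cols in groups.items():
        accs = [
            two_cluster_accuracy(cluster_fn(mask_group(rows, cols, s)), truth)
            for s in seeds
        ]
        scores[name] = sum(accs) / len(accs)
    return sorted(scores.items(), key=lambda kv: kv[1])


# Toy data (not study data): column 0 carries the signal, column 1 is noise.
rows = [[0, 0.1], [0, 0.2], [0, 0.3], [10, 0.1], [10, 0.2], [10, 0.3]]
truth = [0, 0, 0, 1, 1, 1]
groups = {"informative": [0], "noise": [1]}
ranking = rank_groups(rows, truth, groups)
print(ranking)
```

Under this reading, masking an influential group degrades the clustering most, so that group appears first in the ranking, while masking an uninformative group leaves accuracy unchanged.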

Introduction

Infertility affects approximately one in six couples worldwide 1. Assisted reproductive technology (ART) is recommended for couples with unresolved infertility. However, achieving a satisfactory pregnancy rate remains challenging. In the United States in 2013, for example, the live birth rate (LBR) per initiated cycle was 40.1% for women under 35 and 4.5% for women over 42 2. Previous research 3 has highlighted key factors, including weight, ovarian function, and comorbidity, that significantly impact the success of assisted reproduction programs 4. More recent research has highlighted ethnic origin 5, male age 6, and embryo cryopreservation duration 7 as potential variables. However, identifying the key influencing factors remains challenging.

Machine learning techniques offer a promising solution. By extracting insights from historical data, machine learning allows for comprehensive analysis and ranking of the factors influencing ART outcomes. This discipline leverages complex big data to acquire valuable knowledge efficiently 8 and has found extensive applications across various fields, including healthcare 9. For instance, it has been applied to dynamic systems design and control in robotics, autonomous vehicles, and industrial process plants 10. In the medical domain, machine learning has demonstrated its utility in tasks such as COVID-19 diagnosis and epidemic forecasting 11, medical image analysis 12, cancer diagnosis and treatment selection 13, and electronic health record management 14. This technology enables pattern recognition and prediction of disease risk, treatment responses, and patient outcomes 15. Within ART, machine learning has been used to assess embryo quality 16, analyze sperm characteristics 17, and explore predictive models for ovarian reserve function (based on, for example, anti-Müllerian hormone (AMH) level, follicle-stimulating hormone (FSH) level, and age) 18.
However, despite these advancements, the relative importance of different influencing factors in the in vitro fertilization and embryo transfer (IVF-ET) process has not been thoroughly studied. To address this gap, we propose a clustering ensemble approach to analyze the significance of each feature in the IVF-ET algorithm model. Cluster analysis, an unsupervised machine learning technique, is particularly suited to extracting insights from unlabeled data 19. Clustering algorithms are widely applied across various fields, including Vehicular Ad hoc Networks (VANETs) 20 and other contexts where search efficiency and coverage of critical scenarios are key considerations 21. Ensemble classifiers distinguish themselves in reducing false positives in high-risk scenarios 22, thereby enhancing clustering accuracy. Their adaptability to various datasets 23 and their robustness against data noise, bolstered by integrating multiple deep networks, further underscore their superiority 24. Effective clustering algorithms have been developed, including non-negative matrix factorization (NMF) 25, accelerated multiplicative updates for non-negative matrix factorization (AMU-NMF) 26, the generalized deep learning clustering (GDLC) algorithm based on NMF 27, the multi-view clustering (MVC) algorithm based on deep semi-NMF 28, a generalized deep learning algorithm based on NMF for multi-view clustering 29, the Meta-CLustering Algorithm (MCLA) 30, and the dense representation based ensemble clustering (DREC) algorithm 31. These algorithms have been applied to identify signature genes associated with recurrent implantation failure (RIF) 32 and gene co-clusters in two species 33, demonstrating their potential in complex biological datasets. Given the existing gaps in the literature and the promise of machine learning techniques, we conducted a retrospective study to assess the significance of various influencing factors in the IVF-ET process.
Data comprising the clinical characteristics and live birth outcomes of IVF-ET patients at Sichuan Jinxin Xi’nan Women’s and Children’s Hospital between January 2022 and December 2022 were collected and analyzed using a self-developed ensemble algorithm, the NMF-based ensemble algorithm (NMFE). This algorithm combines the strengths of NMF, AMU-NMF, and GDLC, with the aim of improving the efficiency of data clustering and providing insights that could help enhance the success rate of IVF-ET.
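The idea of combining several base clusterings into one ensemble result can be sketched in Python as follows. This section does not state how NMFE merges the outputs of NMF, AMU-NMF, and GDLC, so the co-association (consensus) approach below is an assumption chosen for illustration, not the authors' method. The three label lists stand in for hypothetical outputs of the base algorithms.

```python
def co_association(partitions):
    """Fraction of base partitions in which each pair of samples shares a cluster."""
    n = len(partitions[0])
    votes = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    votes[i][j] += 1
    return [[v / len(partitions) for v in row] for row in votes]


def consensus_clusters(partitions, threshold=0.5):
    """Link samples co-clustered in more than `threshold` of the partitions,
    then label the connected components of the resulting graph."""
    matrix = co_association(partitions)
    n = len(matrix)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and matrix[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Hypothetical base clusterings of six samples (cluster ids are arbitrary per run).
base_a = [0, 0, 0, 1, 1, 1]
base_b = [1, 1, 0, 0, 0, 0]  # disagrees with the others about the third sample
base_c = [0, 0, 0, 1, 1, 1]
consensus = consensus_clusters([base_a, base_b, base_c])
print(consensus)
```

Because the consensus depends only on which samples are grouped together, it is unaffected by each base algorithm numbering its clusters differently, and a single dissenting run is outvoted by the other two.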

Supplementary Material

Below is the link to the electronic supplementary material. Supplementary Material 1 Supplementary Material 2
