Comparison of the Computational time of An Improved Quasi Equally Informative Subsets in Feature Selection

doi:10.21203/rs.3.rs-5727824/v1

2025 · doi:10.21203/rs.3.rs-5727824/v1

preprint OA: closed


Research Article

Abubakar I. Safyan, Zaharaddeen Sani (corresponding author), Mukhtar Abubakar
Federal University Dutsin-ma

This is a preprint; it has not been peer reviewed by a journal.
This work is licensed under a CC BY 4.0 License.
Status: Posted. Version 1 posted.

Abstract

This paper presents an enhanced feature selection method, the Improved Quasi Equally Informative Subsets in Feature Selection (QEISS), which addresses the computational inefficiencies of traditional approaches. By integrating Extreme Learning Machines (ELM) with the Non-dominated Sorting Genetic Algorithm II (NSGA-II), the improved method achieves faster execution times without sacrificing accuracy.
The study introduces both a filter-based and a wrapper-based QEISS variant, demonstrating consistent reductions in computational time across various high-dimensional datasets from the UCI repository. The method provides a balance between feature relevance, redundancy, subset size, and classification accuracy, marking a significant advancement in multi-objective feature selection.

Keywords: Feature selection, QEISS, NSGA-II, Extreme Learning Machines, multi-objective optimization

Figure 1: A Pareto Front Generations
Figure 2: Result of the IQEISS over 61 subset id

I Introduction

Feature selection has emerged as a crucial preprocessing step in machine learning and data mining, particularly as datasets grow increasingly high-dimensional (Farghaly and El-Hafeez 2023). Among the various feature selection approaches, the Quasi Equally Informative Subsets (QEIS) method has gained attention for its ability to identify redundant features while preserving informative characteristics (Kuzudisli et al. 2023). However, the computational efficiency of traditional QEIS implementations remains a significant challenge, especially when dealing with large-scale datasets (Fan et al. 2024).

This study presents an improved QEIS algorithm that addresses the computational time constraints of conventional approaches while maintaining selection accuracy. Our enhanced method incorporates novel optimization techniques and parallel processing strategies to reduce the computational complexity inherent in the feature subset evaluation process. By introducing an adaptive threshold mechanism and efficient subset formation strategies, we significantly decrease the time required for feature selection without compromising the quality of selected features.

The computational advantages of our improved QEIS method are particularly relevant in modern machine learning applications, where rapid feature selection is essential for real-time data processing and analysis. Previous studies such as the work of Bermúdez-Pérez et al. 2024 and that of Jiménez et al.
2023 have demonstrated that traditional QEIS methods, while effective in identifying informative feature subsets, often suffer from computational bottlenecks when processing high-dimensional data. Our approach addresses these limitations through algorithmic innovations that optimize the subset formation and evaluation processes.

This work thoroughly examines the computing time requirements of our enhanced QEIS method in comparison to conventional methods. We evaluate the algorithm's performance across various datasets of different sizes and dimensionalities, demonstrating consistent improvements in execution time while maintaining feature selection quality. According to our experimental findings, the proposed approach maintains the core ideas of QEIS feature selection while achieving notable speedup factors. This work's primary contributions include:

1. Development of an optimized QEIS algorithm with reduced computational complexity.
2. Implementation of parallel processing strategies for accelerated feature subset evaluation.

II Literature Review

Kundu and Mallipeddi (2022) present the Hybrid Filter Multi-Objective Evolutionary Algorithm (HFMOEA) for multi-objective feature selection, demonstrating superior performance on 18 UCI datasets and 2 image datasets compared to the NSGA-II algorithm, with results indicating optimal accuracy and reduced feature subsets. However, the study does not explore the scalability of HFMOEA to larger and more complex datasets or its applicability in real-time systems.

Bermúdez-Pérez et al. (2024) present a novel approach to multi-objective feature selection using a Permutational-based Differential Evolution algorithm, demonstrating improved efficiency and accuracy in selecting feature subsets across various datasets.
The results indicate that the proposed method achieves a better balance between computational cost and performance than traditional techniques, yet the study highlights the need for further exploration of parameter adaptation and population diversity mechanisms to enhance algorithm robustness.

Jiménez et al. (2023) present a multi-surrogate assisted multi-objective evolutionary algorithm (MOEA) that outperforms conventional wrapper methods in feature selection for regression and classification problems, particularly in air quality forecasting, demonstrating superior predictive performance across various metrics. However, the study does not explore the scalability of the method to larger datasets or its applicability to domains beyond air quality and indoor temperature.

Sun et al. (2020) introduce a novel multi-criteria fusion feature selection algorithm (MCFFSA) designed to improve fault diagnosis in helicopter planetary gear trains. It incorporates four specific criteria and uses a multi-objective evolutionary algorithm (MOEA/D) in conjunction with a sparse Bayesian extreme learning machine (SBELM). The results demonstrate that MCFFSA outperforms existing algorithms in diagnostic performance across six fault recognition datasets, highlighting its effectiveness in selecting optimal feature subsets. The paper does not examine the algorithm's adaptation to mechanical systems other than helicopter gear trains, which could broaden its applicability.

Wang et al. (2021) present a novel fault diagnosis method for planetary gearboxes that uses MOEA/D for feature selection and combines heterogeneous ensemble learning with diverse quasi-optimal fault feature fusion. The method shows improved diagnostic performance and robustness against noise, and the findings show that it greatly improves fault classification accuracy compared with conventional methods.
Chaudhuri and Sahu (2022) present a novel hybrid feature selection method, QOMOJaya, which combines the PROMETHEE method for feature ranking with a quasi-oppositional multi-objective Jaya algorithm, demonstrating superior performance in reducing classification error rates and the number of selected genes across various datasets. The results indicate that QOMOJaya outperforms existing methods, achieving statistically significant improvements in execution time and accuracy.

Got et al. (2020) present the Guided Population Archive Whale Optimization Algorithm, which enhances multiobjective optimization by using Pareto dominance and an external archive to maintain diversity and guide the search process, demonstrating superior performance on various benchmark functions compared to existing algorithms.

Nematollahi et al. (2019) present the MO-LAPO algorithm, which demonstrates superior performance in multi-objective optimization across various test functions, achieving lower Generational Distance (GD) and Inverted Generational Distance (IGD) than established methods such as NSGA-II and MOPSO, indicating its effectiveness in approximating the true Pareto Optimal Front (POF).

Afshari et al. (2019) present a comparative analysis of various multi-objective optimization algorithms applied to the design of reinforced concrete (RC) beams, demonstrating that the BiMADS algorithm outperforms the others in accuracy in 22 out of 25 test variations. Despite the promising results, the study highlights a gap in the literature regarding the direct comparison of heuristic and deterministic methods in RC structure optimization, suggesting a need for further exploration of deterministic multi-objective approaches.
Liagkouras and Metaxiotis (2023) present the EEMPOS algorithm, which addresses the cardinality constrained portfolio optimization problem (CCPOP) by employing a novel encoding scheme and advanced mutation and recombination operators, demonstrating superior performance in terms of hypervolume and inverted generational distance metrics compared to NSGA-II and MOEA/D across seven stock markets. However, the study does not explore the algorithm's adaptability to different market conditions or its long-term performance in dynamic environments.

Zhou et al. (2021) presented a flexible cut-point PSO (FCPSO) technique for selecting features for classification using multi-objective optimization. The objectives are to maximize distance, minimize error, and minimize the number of features. For discretization, the approach permits several cut-points per feature. Three algorithms, SPEA-FCPSO, NSGA-FCPSO, and MOEA/D-FCPSO, were examined using ten gene datasets. MOEA/D-FCPSO chose the smallest feature subsets with competitive accuracy, while NSGA-FCPSO and SPEA-FCPSO attained higher accuracy than single-objective approaches.
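Several of the methods reviewed above, like the QEISS procedures in the Methodology, maintain an archive of non-dominated solutions. A minimal Python sketch of Pareto dominance and an archive update is given here, assuming every objective is minimized. The names `dominates` and `update_archive` and the sample points are hypothetical; they are not taken from any of the cited papers.

```python
from typing import List, Sequence, Tuple

Objectives = Tuple[float, ...]  # assumption: every objective is minimized


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b in every objective and strictly better in one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def update_archive(archive: List[Objectives], candidate: Objectives) -> List[Objectives]:
    """Return the archive after offering it one candidate (hypothetical helper)."""
    # Reject the candidate if an archived point dominates or duplicates it.
    if any(dominates(kept, candidate) or kept == candidate for kept in archive):
        return archive
    # Otherwise drop every archived point the candidate dominates, then add it.
    survivors = [kept for kept in archive if not dominates(candidate, kept)]
    return survivors + [candidate]


# Illustrative points: (classification error, number of features).
archive: List[Objectives] = []
for point in [(0.30, 5), (0.25, 7), (0.35, 6), (0.25, 4)]:
    archive = update_archive(archive, point)
print(archive)
```

In this toy run the last point has both the lowest error and the fewest features, so it displaces the two points archived before it. Objectives that should be maximized, such as accuracy or relevance, can be negated before being passed in.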
III Methodology

Algorithm 1: IW-QEISS

Procedure W-QEISS(D, δ)
    // D: input dataset; δ: delta parameter for quasi-equally informative subsets
    D_train, D_test = SplitDataset(D)
    P = InitializePopulation()
    // Initialize an archive to store Pareto-efficient subsets
    A = []
    // Iterate until the termination condition is met
    while not TerminationCondition():
        for S in P:
            // Evaluate fitness functions for subset S
            f1 = CalculateRelevance(S, D_train)   // Relevance measure
            f2 = CalculateRedundancy(S)           // Redundancy measure
            f3 = len(S)                           // Number of features
            f4 = EvaluateAccuracy(S, D_test)      // Classification accuracy
            // Update Pareto-efficient subsets in A
            A = UpdateParetoArchive(A, S, f1, f2, f3, f4)
        // Update population for the next iteration
        P = UpdatePopulation(P, A)
    // Find the best accuracy value in the archive
    f4_star = max(subset.f4 for subset in A)
    // Initialize the pseudo archive
    PA = A.copy()
    // Find δ-quasi equally informative subsets
    while len(PA) >= 1:
        S = PA.pop()
        if S.f4 < (1 - δ) * f4_star:
            A.remove(S)
        else:
            // Eliminate inferior subsets
            A = EliminateInferiorSubsets(A, S)
    return A

Algorithm 2: IF-QEISS

Procedure F-QEISS(D, δ)
    // D: input dataset; δ: delta parameter for quasi-equally informative subsets
    D_train, D_test = SplitDataset(D)
    P = InitializePopulation()
    // Initialize an archive to store Pareto-efficient subsets without accuracy
    A_without_acc = []
    // Iterate until the termination condition is met
    while not TerminationCondition():
        for S in P:
            // Evaluate fitness functions for subset S
            f1 = CalculateRelevance(S, D_train)   // Relevance measure
            f2 = CalculateRedundancy(S)           // Redundancy measure
            f3 = len(S)                           // Number of features
            // Update Pareto-efficient subsets in A_without_acc
            A_without_acc = UpdateParetoArchive(A_without_acc, S, f1, f2, f3)
        // Update population for the next iteration
        P = UpdatePopulation(P, A_without_acc)
    // Evaluate the accuracy of subsets in the final Pareto-efficient set
    A_with_acc = []
    for S in A_without_acc:
        f4 = EvaluateAccuracy(S, D_test)          // Classification accuracy
        // Use the objective values stored with S, not those of the last subset evaluated
        A_with_acc.append((S, S.f1, S.f2, S.f3, f4))
    // Find the best accuracy value
    f4_star = max(subset[4] for subset in A_with_acc)
    A = []
    for subset in A_with_acc:
        S, f1, f2, f3, f4 = subset
        if f4 >= (1 - δ) * f4_star:
            A.append(S)
            // Eliminate inferior subsets
            A = EliminateInferiorSubsets(A, S)
    return A

IV Result

Comparing the Computational Time

Table 4.1: Computational Time (in seconds) for Each Algorithm

Dataset    Algorithm   Relevance and Redundancy Calculation   Subset Generation   Elimination   Total
Heart      IW-QEISS    1.0329                                 129.4242            0.0000        130.4571
Heart      IF-QEISS    1.0329                                 18.9469             0.0000        19.9798
Heart      W-MOSS      -                                      59.5857             -             59.5857
Heart      mRMR        -                                      0.0928              -             -
Concrete   IW-QEISS    0.6500                                 190.8059            0.0000        191.4559
Concrete   IF-QEISS    0.6500                                 32.4041             0.0000        33.0541
Concrete   W-MOSS      -                                      70.5066             -             70.5066
Concrete   mRMR        -                                      0.12070             -             0.12070

Table 4.1 compares the average computing time (in seconds) over ten runs for each feature selection method on each dataset, broken down into three stages: relevance and redundancy calculation, generation of Pareto-efficient subsets (including classifier accuracy evaluation), and final subset selection based on the δ value. W-MOSS is faster than IW-QEISS because it considers only subset size and classification accuracy and skips the relevance and redundancy measures. The mRMR method is efficient since it calculates classifier accuracy during runtime. In contrast, the IW-QEISS and IF-QEISS algorithms include an elimination stage, and IW-QEISS is the slower of the two because it manages a larger pool of subsets in addition to calculating the relevance and redundancy metrics. IW-QEISS generally produces more solutions for a four-objective optimization problem, while IF-QEISS, being a filter algorithm, is typically faster than W-MOSS, except on datasets with many features, where computing redundancy and relevance may take longer. In Table 4.1, IF-QEISS is roughly 6.5 times faster than IW-QEISS on Heart (19.9798 s versus 130.4571 s) and roughly 5.8 times faster on Concrete (33.0541 s versus 191.4559 s). Most of the computing cost of IW-QEISS and IF-QEISS is attributed to subset formation and evaluation, along with the feature relevance and redundancy calculations.
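The final selection stage based on δ can be illustrated with a short Python sketch. It keeps every subset whose accuracy is at least (1 - δ) times the best accuracy, which mirrors the threshold test in the pseudocode. The function name `delta_quasi_equal`, the subset labels, the accuracy values, and δ = 0.05 are hypothetical and chosen only for illustration. The sketch does not reproduce the elimination of inferior subsets.

```python
from typing import Dict, List


def delta_quasi_equal(accuracy: Dict[str, float], delta: float) -> List[str]:
    """Return the labels of subsets with accuracy >= (1 - delta) * best accuracy."""
    if not accuracy:
        return []
    best = max(accuracy.values())        # plays the role of f4_star
    threshold = (1.0 - delta) * best     # lowest accuracy still accepted
    return [label for label, acc in accuracy.items() if acc >= threshold]


# Hypothetical subset accuracies (as fractions), not results from the paper.
subsets = {"S1": 0.90, "S2": 0.88, "S3": 0.80, "S4": 0.60}
kept = delta_quasi_equal(subsets, delta=0.05)
print(kept)
```

With these made-up values the threshold is 0.95 × 0.90 = 0.855, so S1 and S2 are kept and S3 and S4 are discarded. A larger δ admits more subsets, and δ = 0 keeps only those that tie for the best accuracy.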
Adjustments, such as increasing the maximum number of features or fine-tuning parameters, are essential for optimizing performance, especially for large datasets.

Fig. 1 illustrates the optimization process using NSGA-II, a multi-objective optimization technique, with a control panel displaying the current generation and 100 evaluated solutions. The total time of 74.3104 seconds demonstrates the method's efficiency, with 82 solutions on the first Pareto front out of the three fronts identified so far. The average evaluation time is 0.7421 seconds, indicating strong computational performance. A scatter plot of Generation 1 shows the distribution of solutions, with each point representing a standardized objective value. The plot reveals a trade-off curve, highlighting the algorithm's ability to find and maintain diverse non-dominated solutions. Over later generations, these points are expected to converge toward the true Pareto front, improving the optimization results.

Fig. 2 details the feature selection procedure using the Improved Quasi Equally Informative Subsets in Feature Selection (QEISS) technique. The columns correspond to individual features, such as age, sex, and type of chest discomfort, while the rows correspond to subset IDs, each of which reflects a unique combination of selected features. The intensity of a cell indicates whether a feature is included in a subset; darker hues indicate higher inclusion or importance. The percentages shown above the heatmap give the classification accuracy attained by each subset. Subsets with accuracies of 91.80% and 83.61%, for example, demonstrate excellent predictive performance and strike the ideal balance between redundancy and relevance. Subsets with lower accuracies, such as 21.31% and 36.07%, on the other hand, might represent combinations with more redundancy or features that are not sufficiently relevant. Important characteristics like "thal," "num. maj.
vessels," and "slope of peak" are frequently found in subsets with excellent performance, highlighting their importance in the prediction process. However, in some combinations on the Heart dataset, less commonly used variables, such as "rest blood press," might have a lower predictive value or add to redundancy. This outcome illustrates how QEISS can minimize redundancy while identifying several subsets with similar predictive potential. The distribution of accuracies and the inclusion patterns of features demonstrate the algorithm's capacity to methodically examine the trade-offs between subset size, feature relevance, and classification performance. These outcomes confirm that QEISS is a reliable method for obtaining compact, interpretable, and highly effective feature subsets.

V Conclusion

In this study we showed that, in contrast to conventional techniques, the enhanced Quasi Equally Informative Subsets in Feature Selection (QEISS) approach, when combined with Extreme Learning Machines (ELM) and optimized using the Non-dominated Sorting Genetic Algorithm II (NSGA-II), greatly reduces computational time. The introduction of the filter-based variant, F-QEISS, particularly enhances the efficiency of feature subset selection without compromising classification accuracy. This makes it an appealing option for practitioners working with high-dimensional datasets who need to balance accuracy with processing time. The results across UCI datasets, such as Heart and Concrete, confirm that IW-QEISS not only delivers superior classification accuracy but also improves computational efficiency, making it a valuable contribution to machine learning feature selection.

Declarations

Author Contribution

Abubakar I. Sufyan performed model design and implementation, Dr. Zaharaddeen Sani performed results analysis, and Mukhtar Abubakar wrote the manuscript text.

References

Kuzudisli, C., Bakir-Gungor, B., Bulut, N., Qaqish, B., & Yousef, M. (2023).
Review of feature selection approaches based on grouping of features. PeerJ, 11, e15666. https://doi.org/10.7717/peerj.15666

Farghaly, H. M., & El-Hafeez, T. A. (2023). A high-quality feature selection method based on frequent and correlated items for text classification. Soft Computing, 27(16), 11259–11274. https://doi.org/10.1007/s00500-023-08587-x

Fan, Y., Li, Y., Zhong, Y., Hong, L., Li, L., & Li, Y. (2024). Learning meaningful representation of single-neuron morphology via large-scale pre-training. Bioinformatics, 40(Supplement_2), ii128–ii136. https://doi.org/10.1093/bioinformatics/btae395

Kundu, R., & Mallipeddi, R. (2022). HFMOEA: A hybrid framework for multi-objective feature selection. Journal of Computational Design and Engineering, 9(3), 949–965. https://doi.org/10.1016/j.jcde.2022.01.001

Sun, C., Wang, Y., & Zhang, Y. (2020). A multi-criteria fusion feature selection algorithm for fault diagnosis of helicopter planetary gear train. Chinese Journal of Aeronautics, 33(5), 1549–1561. https://doi.org/10.1016/j.cja.2019.07.014

Wang, Z., Huang, H., & Wang, Y. (2021). A novel fault diagnosis method for planetary gearbox based on diverse quasi-optimal fault feature fusion and heterogeneous ensemble learning. Measurement, 173, 108654. https://doi.org/10.1016/j.measurement.2021.108654

Got, A., Moussaoui, A., & Zouache, D. (2020). A guided population archive whale optimization algorithm for solving multiobjective optimization problems. Expert Systems with Applications, 141, 112972. https://doi.org/10.1016/j.eswa.2019.112972

Foroughi Nematollahi, A., Rahiminejad, A., & Vahidi, B. (2019). A novel multi-objective optimization algorithm based on the lightning attachment procedure. Applied Soft Computing Journal, 75, 404–427. https://doi.org/10.1016/j.asoc.2018.12.014

Afshari, H., Hare, W., & Tesfamariam, S. (2019). Constrained multi-objective optimization algorithms: Review and comparison with application in reinforced concrete structures.
Applied Soft Computing Journal, 83, 105631. https://doi.org/10.1016/j.asoc.2019.105631

Liagkouras, K., & Metaxiotis, K. (2023). EEMPOS: An efficient evolutionary multi-objective algorithm for cardinality constrained portfolio optimization. Annals of Operations Research, 1–39. https://doi.org/10.1007/s10479-023-04800-0

Additional Declarations

No competing interests reported.
I.","lastName":"Safyan","suffix":""},{"id":395920763,"identity":"37580b55-38a6-4519-accf-b1ee0672612b","order_by":1,"name":"Zaharaddeen Sani","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAABM0lEQVRIie2QMWuDQBTHnwi6nGRVrO1XuCLYQkP6VU6EZLGQLl0yRAg0i9i1fgu7OHU4ONDFD+CWFMEp0EIh2JKhZ5tN25KtlPsN/3tw9+O9ewACwd9EaUMKKM+XIS/4Sb9uyO+KdD/mhXKIIqO9Aj8pJ3esXr89jqwoX2TVkDArWS2f2LU/goHqY3n63lEwzc5Ow9qz4yKb2D5hdpIhzOLUAyPcYDmOuooUODqispuUvmNeNczlCjAtlQGXvIsWdgdbqFtjR+dusnremuekVdQ1V+Zw+Y0CGXJMRPnLEikmfCqAucIA61xBTXewAt2YRzTnfxk7Rkgmdpz5rZIjvainTAt6NpanxobO+MZYrTfkwopYXr1q6ex4sPQeKrTrXXQ/qA0q3R6g7Dmki0AgEPxTPgDkFXfDyTwE1AAAAABJRU5ErkJggg==","orcid":"","institution":"Federal University Dutsin-ma","correspondingAuthor":true,"prefix":"","firstName":"Zaharaddeen","middleName":"","lastName":"Sani","suffix":""},{"id":395920764,"identity":"495571e7-943e-48ab-a318-7e3daba76980","order_by":2,"name":"Mukhtar Abubakar","email":"","orcid":"","institution":"Federal University Dutsin-ma","correspondingAuthor":false,"prefix":"","firstName":"Mukhtar","middleName":"","lastName":"Abubakar","suffix":""}],"badges":[],"createdAt":"2024-12-28 22:23:05","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-5727824/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-5727824/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":72750912,"identity":"35d04cc0-37e1-4944-9ce0-9cda5d3cbc8d","added_by":"auto","created_at":"2025-01-01 14:51:23","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":56249,"visible":true,"origin":"","legend":"\u003cp\u003eA Pareto Front Generations\u003c/p\u003e","description":"","filename":"1.png","url":"https://assets-eu.researchsquare.com/files/rs-5727824/v1/12a8c9fa128d576f9c73e4b5.png"},{"id":72750911,"identity":"744db485-116a-45d4-b6c4-ca23d54d4c34","added_by":"auto","created_at":"2025-01-01 
14:51:23","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":139650,"visible":true,"origin":"","legend":"\u003cp\u003eResult of the IQEISS over 61 subset id\u003c/p\u003e","description":"","filename":"2.png","url":"https://assets-eu.researchsquare.com/files/rs-5727824/v1/aabfc41cd3fabeb1d8280247.png"},{"id":73678570,"identity":"6c720cf4-03b7-4e6f-8ee3-28c27192790b","added_by":"auto","created_at":"2025-01-13 13:31:59","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":615674,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-5727824/v1/10b4cfbe-4c5f-49ec-8cf7-49d6de34660c.pdf"}],"financialInterests":"No competing interests reported.","formattedTitle":"Comparison of the Computational time of An Improved Quasi Equally Informative Subsets in Feature Selection","fulltext":[{"header":"I Introduction","content":"\u003cp\u003eFeature selection has emerged as a crucial preprocessing step in machine learning and data mining, particularly as datasets grow increasingly high-dimensional (Farghaly and El-Hafeez \u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2023\u003c/span\u003e). Among the various feature selection approaches, the Quasi Equally Informative Subsets (QEIS) method has gained attention for its ability to identify redundant features while preserving informative characteristics (Kuzudisli et al. \u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e2023\u003c/span\u003e). However, the computational efficiency of traditional QEIS implementations remains a significant challenge, especially when dealing with large-scale datasets (Fan et al. 
\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e2024\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eThis study presents an improved QEIS algorithm that addresses the computational time constraints of conventional approaches while maintaining selection accuracy. Our enhanced method incorporates novel optimization techniques and parallel processing strategies to reduce the computational complexity inherent in the feature subset evaluation process. By introducing an adaptive threshold mechanism and efficient subset formation strategies, we significantly decrease the time required for feature selection without compromising the quality of selected features.\u003c/p\u003e \u003cp\u003eThe computational advantages of our improved QEIS method are particularly relevant in modern machine learning applications, where rapid feature selection is essential for real-time data processing and analysis. Previous studies such as the work of Berm\u0026uacute;dez-P\u0026eacute;rez et al. 2024 and that of Jim\u0026eacute;nez et al. 2023 have demonstrated that traditional QEIS methods, while effective in identifying informative feature subsets, often suffer from computational bottlenecks when processing high-dimensional data. Our approach addresses these limitations through algorithmic innovations that optimize the subset formation and evaluation processes.\u003c/p\u003e \u003cp\u003eThe computing time requirements for our enhanced QEIS method in comparison to conventional methods are thoroughly examined in this work. We evaluate the algorithm's performance across various datasets of different sizes and dimensionalities, demonstrating consistent improvements in execution time while maintaining feature selection quality. According to our experimental findings, the suggested approach maintains the core ideas of QEIS feature selection while achieving notable speedup factors. 
This work's primary contributions include:\u003c/p\u003e \u003cp\u003e \u003col\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eDevelopment of an optimized QEIS algorithm with reduced computational complexity.\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eImplementation of parallel processing strategies for accelerated feature subset evaluation.\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003c/ol\u003e \u003c/p\u003e"},{"header":"II Literature Review","content":"\u003cp\u003eThe work of in Kundu and Mallipeddi \u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e2022\u003c/span\u003e presents the Hybrid Filter Multi-Objective Evolutionary Algorithm for effective multi-objective feature selection, demonstrating superior performance on 18 UCI datasets and 2 image datasets compared to the NSGA-II algorithm, with results indicating optimal accuracy and reduced feature subsets. However, the study does not explore the scalability of the HFMOEA across larger and more complex datasets, as well as its applicability in real-time systems. Bermúdez-Pérez et al. 2024 presents a novel approach to multi-objective feature selection using a Permutational-based Differential Evolution algorithm, demonstrating improved efficiency and accuracy in selecting feature subsets across various datasets. The results indicate that the proposed method achieves a better balance between computational cost and performance compared to traditional techniques, yet it highlights the need for further exploration of parameter adaptation and population diversity mechanisms to enhance algorithm robustness. Jiménez et al. 2023 presents a multi-surrogate assisted multi-objective evolutionary algorithm (MOEA) that outperforms conventional wrapper methods in feature selection for regression and classification problems, particularly in air quality forecasting, demonstrating superior predictive performance across various metrics. 
However, it leaves a gap in exploring the scalability of the proposed method to larger datasets and its applicability to other domains beyond air quality and indoor temperature. By incorporating four specific criteria and using a multi-objective evolutionary algorithm (MOEA/D) in conjunction with a sparse Bayesian extreme learning machine (SBELM), Sun et al. \u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e2020\u003c/span\u003e introduce a novel multi-criteria fusion feature selection algorithm (MCFFSA) designed to improve fault diagnosis in helicopter planetary gear trains. The results demonstrate that MCFFSA outperforms existing algorithms in diagnostic performance across six fault recognition datasets, highlighting its effectiveness in selecting optimal feature subsets. The algorithm's adaptation to mechanical systems other than helicopter gear trains, which potentially increase its application, is not examined in the paper. Using a multi-objective evolutionary algorithm (MOEA/D) for feature selection, Wang et al. (\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e2021\u003c/span\u003e) present a novel fault diagnosis method for planetary gearboxes that combines heterogeneous ensemble learning and diverse quasi-optimal fault feature fusion. This method shows improved diagnostic performance and robustness against noise. The findings show that, in comparison to conventional methods, the suggested strategy greatly improves fault classification accuracy. Moreover, the work of Chaudhuri and Sahu 2022 presents a novel hybrid feature selection method, QOMOJaya, which effectively combines the PROMETHEE method for feature ranking with a quasi-oppositional multi-objective Jaya algorithm, demonstrating superior performance in reducing classification error rates and the number of selected genes across various datasets. 
The results indicate that QOMOJaya outperforms existing methods, achieving statistically significant improvements in execution time and accuracy. Got et al. (\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e2020\u003c/span\u003e) present the Guided Population Archive Whale Optimization Algorithm, which uses Pareto dominance and an external archive to maintain diversity and guide the search process, and which outperforms existing algorithms on various benchmark functions. Nematollahi et al. (2019) present the MO-LAPO algorithm, which achieves lower Generational Distance (GD) and Inverted Generational Distance (IGD) than established methods such as NSGA-II and MOPSO across various test functions, indicating that it approximates the true Pareto Optimal Front (POF) effectively. Afshari et al. (\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e2019\u003c/span\u003e) compare several multi-objective optimization algorithms applied to the design of reinforced concrete (RC) beams and find that the BiMADS algorithm is the most accurate in 22 of 25 test variations. Despite these promising results, the study highlights a gap in the literature regarding the direct comparison of heuristic and deterministic methods in RC structure optimization and calls for further exploration of deterministic multi-objective approaches. 
Liagkouras and Metaxiotis (\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e2023\u003c/span\u003e) present the EEMPOS algorithm, which addresses the cardinality constrained portfolio optimization problem (CCPOP) with a novel encoding scheme and advanced mutation and recombination operators, and which outperforms NSGA-II and MOEA/D on the hypervolume and inverted generational distance metrics across seven stock markets. However, the study does not explore the algorithm's adaptability to different market conditions or its long-term performance in dynamic environments. Zhou et al. (2021) present a flexible cut-point PSO (FCPSO) technique that selects features for classification using multi-objective optimization. The objectives are to maximize distance, minimize error, and minimize the number of features, and the discretization step permits several cut-points per feature. Three algorithms, SPEA-FCPSO, NSGA-FCPSO, and MOEA/D-FCPSO, were examined on ten gene datasets. MOEA/D-FCPSO chose the smallest feature subsets while keeping accuracy competitive. 
Higher accuracy was attained by NSGA-FCPSO and SPEA-FCPSO compared to single-objective approaches.\u003c/p\u003e \u003cdiv id=\"Sec3\" class=\"Section2\"\u003e \u003cdiv id=\"Sec4\" class=\"Section3\"\u003e \u003c/div\u003e \u003c/div\u003e"},{"header":"III Methodology","content":"\u003cp\u003e\u003cstrong\u003e\u003cem\u003eAlgorithm 1: IW-QEISS\u003c/em\u003e\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eProcedure W-QEISS(D, δ)\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; // D: input dataset, δ: delta parameter for quasi-equally informative subsets\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; D_train, D_test = SplitDataset(D)\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; P = InitializePopulation()\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; // Initialize an archive to store Pareto-efficient subsets\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; A = []\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; // Iterate until termination condition is met\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; while not TerminationCondition():\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; for S in P:\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; // Evaluate fitness functions for subset S\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; f1 = CalculateRelevance(S, D_train) \u0026nbsp;// Relevance measure\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; f2 = CalculateRedundancy(S) \u0026nbsp;// Redundancy measure\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; f3 = len(S) \u0026nbsp;// Number of features\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; f4 = EvaluateAccuracy(S, D_test) \u0026nbsp;// Classification accuracy\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; 
\u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; // Update Pareto-efficient subsets in A\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; A = UpdateParetoArchive(A, S, f1, f2, f3, f4)\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; // Update population for the next iteration\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; P = UpdatePopulation(P, A)\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; f4_star = max(subset.f4 for subset in A)\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; // Initialize the pseudo archive\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; PA = A.copy()\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; // Find δ-quasi equally informative subsets\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; while len(PA) \u0026gt;= 1:\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; S = PA.pop()\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; if S.f4 \u0026lt; (1 - δ) * f4_star:\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; A.remove(S)\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; else:\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; // Eliminate inferior subsets\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; A = EliminateInferiorSubsets(A, S)\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; return A\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e\u003cem\u003eAlgorithm 2: IF-QEISS\u003c/em\u003e\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eProcedure F-QEISS(D, δ)\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; // D: input dataset, δ: delta parameter for quasi-equally informative subsets\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; D_train, D_test = 
SplitDataset(D)\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; P = InitializePopulation()\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; // Initialize an archive to store Pareto-efficient subsets without accuracy\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; A_without_acc = []\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; // Iterate until termination condition is met\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; while not TerminationCondition():\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; for S in P:\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; // Evaluate fitness functions for subset S\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; f1 = CalculateRelevance(S, D_train) \u0026nbsp;// Relevance measure\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; f2 = CalculateRedundancy(S) \u0026nbsp;// Redundancy measure\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; f3 = len(S) \u0026nbsp;// Number of features\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; // Update Pareto-efficient subsets in A_without_acc\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; A_without_acc = UpdateParetoArchive(A_without_acc, S, f1, f2, f3)\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; // Update population for the next iteration\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; P = UpdatePopulation(P, A_without_acc)\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; // Evaluate the accuracy of subsets in the final Pareto-efficient set\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; A_with_acc = []\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; for S in 
A_without_acc:\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; f4 = EvaluateAccuracy(S, D_test) \u0026nbsp;// Classification accuracy\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; A_with_acc.append((S, f1, f2, f3, f4))\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; // Find the best accuracy value\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; f4_star = max(subset[4] for subset in A_with_acc)\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; A = []\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; for subset in A_with_acc:\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; S, f1, f2, f3, f4 = subset\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; if f4 \u0026gt;= (1 - δ) * f4_star:\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; A.append(S)\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; // Eliminate inferior subsets\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; A = EliminateInferiorSubsets(A, S)\u003c/p\u003e\n\u003cp\u003e\u0026nbsp; \u0026nbsp; return A\u0026nbsp;\u003c/p\u003e"},{"header":"IV RESULT","content":"\u003cp\u003e\u003cstrong\u003eComparing the Computational Time\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eTable 4.1: Computational Time (In Secs) for Each Algorithm\u003c/p\u003e\n\u003ctable border=\"1\" cellspacing=\"0\" cellpadding=\"0\" width=\"49%\"\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd rowspan=\"2\" valign=\"top\" style=\"width: 14px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eDataset\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd rowspan=\"2\" valign=\"top\" style=\"width: 16px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eAlgorithm\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"4\" valign=\"top\" style=\"width: 
69px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eComputational Time\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 18px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eRelevance and Redundancy Calculation\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 17px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eSubset Generation\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 18px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eElimination\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 14px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eTotal\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 14px;\"\u003e\n \u003cp\u003eHeart\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 16px;\"\u003e\n \u003cp\u003eIW-QEISS\u003c/p\u003e\n \u003cp\u003eIF-QEISS\u003c/p\u003e\n \u003cp\u003eW-MOSS\u003c/p\u003e\n \u003cp\u003emRMR\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 18px;\"\u003e\n \u003cp\u003e1.0329\u003c/p\u003e\n \u003cp\u003e1.0329\u003c/p\u003e\n \u003cp\u003e\u003cstrong\u003e-\u003c/strong\u003e\u003c/p\u003e\n \u003cp\u003e\u003cstrong\u003e-\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 17px;\"\u003e\n \u003cp\u003e129.4242\u003c/p\u003e\n \u003cp\u003e18.9469\u003c/p\u003e\n \u003cp\u003e59.5857\u003c/p\u003e\n \u003cp\u003e0.0928\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 18px;\"\u003e\n \u003cp\u003e0.0000\u003c/p\u003e\n \u003cp\u003e0.0000\u003c/p\u003e\n \u003cp\u003e\u003cstrong\u003e-\u003c/strong\u003e\u003c/p\u003e\n \u003cp\u003e\u003cstrong\u003e-\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 14px;\"\u003e\n 
\u003cp\u003e130.4571\u003c/p\u003e\n \u003cp\u003e19.9798\u003c/p\u003e\n \u003cp\u003e59.5857\u003c/p\u003e\n \u003cp\u003e\u003cstrong\u003e-\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 14px;\"\u003e\n \u003cp\u003eConcrete\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 16px;\"\u003e\n \u003cp\u003eIW-QEISS\u003c/p\u003e\n \u003cp\u003eIF-QEISS\u003c/p\u003e\n \u003cp\u003eW-MOSS\u003c/p\u003e\n \u003cp\u003emRMR\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 18px;\"\u003e\n \u003cp\u003e0.6500\u003c/p\u003e\n \u003cp\u003e0.6500\u003c/p\u003e\n \u003cp\u003e\u003cstrong\u003e-\u003c/strong\u003e\u003c/p\u003e\n \u003cp\u003e\u003cstrong\u003e-\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 17px;\"\u003e\n \u003cp\u003e190.8059\u003c/p\u003e\n \u003cp\u003e32.4041\u003c/p\u003e\n \u003cp\u003e70.5066\u003c/p\u003e\n \u003cp\u003e0.1207\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 18px;\"\u003e\n \u003cp\u003e0.0000\u003c/p\u003e\n \u003cp\u003e0.0000\u003c/p\u003e\n \u003cp\u003e\u003cstrong\u003e-\u003c/strong\u003e\u003c/p\u003e\n \u003cp\u003e\u003cstrong\u003e-\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 14px;\"\u003e\n \u003cp\u003e191.4559\u003c/p\u003e\n \u003cp\u003e33.0541\u003c/p\u003e\n \u003cp\u003e70.5066\u003c/p\u003e\n \u003cp\u003e0.1207\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n\u003c/table\u003e\n\u003cp\u003eTable 4.1 reports the average computing time (in seconds) over ten runs for each feature selection method on the Heart and Concrete datasets, broken down into three stages: relevance and redundancy assessment, Pareto-efficient subset generation (including classifier accuracy), and final subset selection based on the \u0026delta; value. 
W-MOSS is faster than IW-QEISS because it considers only subset size and classification accuracy and skips the relevance and redundancy measures. The mRMR method is efficient since it calculates classifier accuracy during runtime. In contrast, the IW-QEISS and IF-QEISS algorithms include an elimination stage, and IW-QEISS is slower because it manages a larger pool of subsets in addition to calculating the relevance and redundancy metrics. IW-QEISS generally produces more solutions because it solves a four-objective optimization problem, while IF-QEISS, being a filter algorithm, is typically faster than W-MOSS, except on datasets with many features, where computing redundancy and relevance may take longer. Most of the computing cost for IW-QEISS and IF-QEISS comes from subset generation and evaluation, along with the feature relevance and redundancy calculations. Adjustments such as increasing the maximum number of features or fine-tuning parameters are essential for optimizing performance, especially on large datasets. Fig. 1 illustrates the optimization process using NSGA-II, a multi-objective optimization technique; its control panel displays the current generation and 100 evaluated solutions.\u003c/p\u003e\n\u003cp\u003eThe total time of 74.3104 seconds demonstrates the method\u0026apos;s efficiency, with 82 solutions on the first of the three Pareto fronts identified so far. The average evaluation time is 0.7421 seconds, indicating strong computational performance. A scatter plot of Generation 1 shows the distribution of solutions, with each point representing a standardized objective value. The plot reveals a trade-off curve, highlighting the algorithm\u0026apos;s ability to find and maintain diverse non-dominated solutions. 
Over subsequent generations, these points are expected to converge toward the true Pareto front, improving the optimization results.\u003c/p\u003e\n\u003cp\u003eFig. 2 details the feature selection results obtained with the Improved Quasi Equally Informative Subsets in Feature Selection (QEISS) technique. The columns correspond to individual features such as age, sex, and chest pain type, while the rows correspond to subset IDs, each a unique combination of selected features. Cell intensity indicates whether a feature is included in a subset; darker hues indicate higher inclusion or importance.\u003c/p\u003e\n\u003cp\u003eThe percentages shown above the heatmap give the classification accuracy attained by each subset. Subsets with accuracies of 91.80% and 83.61%, for example, show excellent predictive performance and a good balance between redundancy and relevance. Subsets with lower accuracies, such as 21.31% and 36.07%, may instead contain more redundant features or features that are not sufficiently relevant.\u003c/p\u003e\n\u003cp\u003eFeatures such as \u0026quot;thal,\u0026quot; \u0026quot;num. maj. vessels,\u0026quot; and \u0026quot;slope of peak\u0026quot; appear frequently in the best-performing subsets, highlighting their importance in the prediction process. By contrast, in some combinations on the Heart dataset, less frequently selected variables such as \u0026quot;rest blood press\u0026quot; may have lower predictive value or add redundancy.\u003c/p\u003e\n\u003cp\u003eThis outcome illustrates how QEISS can minimize redundancy while identifying several subsets with similar predictive potential. The distribution of accuracies and the inclusion patterns of features demonstrate the algorithm\u0026apos;s capacity to systematically examine the trade-offs between subset size, feature relevance, and classification performance. 
These results support QEISS as a reliable method for obtaining compact, interpretable, and highly effective feature subsets.\u003c/p\u003e\n\u003cp\u003e\u003cbr\u003e\u003c/p\u003e"},{"header":"V Conclusion","content":"\u003cp\u003eIn this study we showed that the enhanced Quasi Equally Informative Subsets in Feature Selection (QEISS) approach, when combined with Extreme Learning Machines (ELM) and optimized using the Non-dominated Sorting Genetic Algorithm II (NSGA-II), greatly reduces computational time compared with conventional techniques. The filter-based variant, F-QEISS, in particular makes feature subset selection more efficient without compromising classification accuracy. This makes it an appealing option for practitioners working with high-dimensional datasets who need to balance accuracy with processing time. The results on UCI datasets such as Heart and Concrete confirm that IW-QEISS not only delivers superior classification accuracy but also improves computational efficiency, making it a valuable contribution to machine learning feature selection.\u003c/p\u003e"},{"header":"Declarations","content":"\u003ch2\u003eAuthor Contribution\u003c/h2\u003e\u003cp\u003eAbubakar I. Sufyan performed model design and implementation, Dr. Zaharaddeen Sani performed results analysis, and Mukhtar Abubakar wrote the manuscript text.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n\u003cli\u003eKuzudisli, C., Bakir-Gungor, B., Bulut, N., Qaqish, B., \u0026amp; Yousef, M. (2023). Review of feature selection approaches based on grouping of features. \u003cem\u003ePeerJ\u003c/em\u003e, \u003cem\u003e11\u003c/em\u003e, e15666. https://doi.org/10.7717/peerj.15666\u003c/li\u003e\n\u003cli\u003eFarghaly, H. M., \u0026amp; El-Hafeez, T. A. (2023). A high-quality feature selection method based on frequent and correlated items for text classification. 
\u003cem\u003eSoft Computing\u003c/em\u003e, \u003cem\u003e27\u003c/em\u003e(16), 11259\u0026ndash;11274. https://doi.org/10.1007/s00500-023-08587-x\u003c/li\u003e\n\u003cli\u003eFan, Y., Li, Y., Zhong, Y., Hong, L., Li, L., \u0026amp; Li, Y. (2024). Learning meaningful representation of single-neuron morphology via large-scale pre-training. \u003cem\u003eBioinformatics\u003c/em\u003e, \u003cem\u003e40\u003c/em\u003e(Supplement_2), ii128\u0026ndash;ii136. https://doi.org/10.1093/bioinformatics/btae395\u003c/li\u003e\n\u003cli\u003eKundu, R., \u0026amp; Mallipeddi, R. (2022). HFMOEA: A hybrid framework for multi-objective feature selection. Journal of Computational Design and Engineering, 9(3), 949\u0026ndash;965. https://doi.org/10.1016/j.jcde.2022.01.001\u003c/li\u003e\n\u003cli\u003eSun, C., Wang, Y., \u0026amp; Zhang, Y. (2020). A multi-criteria fusion feature selection algorithm for fault diagnosis of helicopter planetary gear train. Chinese Journal of Aeronautics, 33(5), 1549\u0026ndash;1561. https://doi.org/10.1016/j.cja.2019.07.014\u003c/li\u003e\n\u003cli\u003eWang, Z., Huang, H., \u0026amp; Wang, Y. (2021). A novel fault diagnosis method for planetary gearbox based on diverse quasi-optimal fault feature fusion and heterogeneous ensemble learning. Measurement, 173, 108654. https://doi.org/10.1016/j.measurement.2021.108654\u003c/li\u003e\n\u003cli\u003eGot, A., Moussaoui, A., \u0026amp; Zouache, D. (2020). A guided population archive whale optimization algorithm for solving multiobjective optimization problems. Expert Systems with Applications, 141, 112972. https://doi.org/10.1016/j.eswa.2019.112972\u003c/li\u003e\n\u003cli\u003eForoughi Nematollahi, A., Rahiminejad, A., \u0026amp; Vahidi, B. (2019). A novel multi-objective optimization algorithm based on the lightning attachment procedure. Applied Soft Computing Journal, 75, 404\u0026ndash;427. https://doi.org/10.1016/j.asoc.2018.12.014\u003c/li\u003e\n\u003cli\u003eAfshari, H., Hare, W., \u0026amp; Tesfamariam, S. 
(2019). Constrained multi-objective optimization algorithms: Review and comparison with application in reinforced concrete structures. Applied Soft Computing Journal, 83, 105631. https://doi.org/10.1016/j.asoc.2019.105631\u003c/li\u003e\n\u003cli\u003eLiagkouras, K., \u0026amp; Metaxiotis, K. (2023). EEMPOS: An efficient evolutionary multi-objective algorithm for cardinality constrained portfolio optimization. Annals of Operations Research, 1-39. https://doi.org/10.1007/s10479-023-04800-0\u003c/li\u003e\n\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":true,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"Feature selection, QEISS, NSGA-II, Extreme Learning Machines, multi-objective optimization","lastPublishedDoi":"10.21203/rs.3.rs-5727824/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-5727824/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eThis paper presents an enhanced feature selection method, the Improved Quasi Equally Informative Subsets in Feature Selection (QEISS), which addresses the computational inefficiencies of traditional approaches. 
By integrating Extreme Learning Machines (ELM) with the Non-dominated Sorting Genetic Algorithm II (NSGA-II), the improved method achieves faster execution times without sacrificing accuracy. The study introduces both a filter-based and wrapper-based QEISS variant, demonstrating consistent reductions in computational time across various high-dimensional datasets from the UCI repository. The method provides a balance between feature relevance, redundancy, subset size, and classification accuracy, marking a significant advancement in multi-objective feature selection.\u003c/p\u003e","manuscriptTitle":"Comparison of the Computational time of An Improved Quasi Equally Informative Subsets in Feature Selection","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2025-01-01 14:51:19","doi":"10.21203/rs.3.rs-5727824/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"210537b2-d1bb-432f-b0c7-88ddacf85f5c","owner":[],"postedDate":"January 1st, 2025","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[],"tags":[],"updatedAt":"2025-01-13T13:23:51+00:00","versionOfRecord":[],"versionCreatedAt":"2025-01-01 
14:51:19","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-5727824","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-5727824","identity":"rs-5727824","version":["v1"]},"buildId":"8U1c8b4HqxoKbykW_rLl7","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}
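
The δ-filtering step that closes Algorithm 2 (IF-QEISS) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Candidate` container, the `delta_filter` name, and the accuracy values are made up for the example, and the `EliminateInferiorSubsets` step is omitted because the text does not define it.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Candidate:
    """Hypothetical container: a feature subset and its measured accuracy (f4)."""
    features: FrozenSet[str]
    accuracy: float


def delta_filter(archive: List[Candidate], delta: float) -> List[Candidate]:
    """Keep subsets whose accuracy is at least (1 - delta) times the best accuracy."""
    if not archive:
        return []
    f4_star = max(c.accuracy for c in archive)   # best accuracy in the archive
    threshold = (1.0 - delta) * f4_star          # quasi-equally informative cutoff
    return [c for c in archive if c.accuracy >= threshold]


# Illustrative archive; feature names echo the Heart dataset, values are invented.
archive = [
    Candidate(frozenset({"thal", "slope"}), 0.90),
    Candidate(frozenset({"age", "sex"}), 0.60),
    Candidate(frozenset({"thal", "age"}), 0.87),
]
kept = delta_filter(archive, delta=0.05)
print([sorted(c.features) for c in kept])
```

With δ = 0.05 the cutoff is 95% of the best accuracy, so the two subsets scoring 0.90 and 0.87 are kept and the one scoring 0.60 is dropped. A larger δ admits more subsets at the cost of a wider accuracy spread.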

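
As a quick check of the totals in Table 4.1, the ratios between methods can be computed directly. The sketch below only copies the reported totals; the `totals` dictionary and the `ratio` helper are hypothetical names introduced for illustration, and mRMR is left out because its Heart total is not reported.

```python
# Total computational times in seconds, copied from Table 4.1.
totals = {
    "Heart": {"IW-QEISS": 130.4571, "IF-QEISS": 19.9798, "W-MOSS": 59.5857},
    "Concrete": {"IW-QEISS": 191.4559, "IF-QEISS": 33.0541, "W-MOSS": 70.5066},
}


def ratio(dataset: str, slower: str, faster: str) -> float:
    """Return how many times longer `slower` takes than `faster` on a dataset."""
    return totals[dataset][slower] / totals[dataset][faster]


for dataset in totals:
    iw_vs_if = ratio(dataset, "IW-QEISS", "IF-QEISS")
    wmoss_vs_if = ratio(dataset, "W-MOSS", "IF-QEISS")
    print(f"{dataset}: IW/IF = {iw_vs_if:.2f}, W-MOSS/IF = {wmoss_vs_if:.2f}")
```

On these figures IW-QEISS takes roughly 6.5 times as long as IF-QEISS on Heart and roughly 5.8 times as long on Concrete, while W-MOSS takes between two and three times as long as IF-QEISS on both datasets.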