Research Article
Leveraging Feature Sensitivity and Relevance: A Hybrid Feature Selection Approach for Improved Model Performance in Supervised Classification
G Saranya, Rakesh Rajendran, Subash Chandra Bose Jaganathan, V Pandimurugan
This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-4470015/v1
This work is licensed under a CC BY 4.0 License
Status: Posted (Version 1)
Additional Declarations: No competing interests reported.

Abstract
Many feature selection algorithms focus primarily on identifying relevant features and eliminating redundant ones. This hybrid method determines the significant features from the estimated individual feature sensitivities and the degree of relevance between each feature and the target outcome. Most existing work uses mutual information (MI) to quantify the information shared between two variables. Symmetrical Uncertainty (SU) scales MI to the range [0, 1] and can be viewed as normalized MI. In the proposed work, Symmetrical Uncertainty-Relevance (SU-R) measures the relevance between each feature and the target outcome, and Per Feature Sensitivity (PFS) analysis measures the sensitivity of the target outcome to each individual feature. Features are ranked by the sum of their ranks under SU-R and PFS, each computed separately. Less significant features are then eliminated iteratively, starting with the lowest-ranked feature under the combined SU-R and PFS ranking.

To evaluate how well the proposed method identifies important features, we assess the influence of each feature on the model's performance using metrics such as F1 score and accuracy. The evaluation is conducted on two diverse public datasets from the UCI Machine Learning repository, which allows us to assess the method's robustness across different data types. The hybrid method identified the 450 most significant features out of 754 in the Parkinson’s disease dataset and the top 150 features out of 562 in the smartphone dataset. An SVM classifier trained on the features selected by the proposed hybrid PFS and SU-R technique outperforms an SVM trained on features chosen by existing feature selection methods.

Keywords: Mutual information, Symmetrical Uncertainty-Relevance, Per feature sensitivity, feature selection, SVM, Parkinson’s disease, Smartphone activity
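The ranking scheme described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: SU is computed with the standard definition SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)) on already-discretized features, the abstract does not define how PFS scores are computed, so the sensitivity scores are simply taken as an input array, and all function names and the tie-breaking rule (by feature index) are hypothetical.

```python
import numpy as np

def entropy(x):
    """Shannon entropy (in bits) of a discrete 1-D array."""
    _, counts = np.unique(x, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)); lies in [0, 1]."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0.0:
        return 0.0  # both variables are constant, nothing to share
    # Joint entropy H(X, Y) from the paired values.
    pairs = np.stack([x, y], axis=1)
    _, joint_counts = np.unique(pairs, axis=0, return_counts=True)
    p = joint_counts / joint_counts.sum()
    hxy = float(-(p * np.log2(p)).sum())
    mi = hx + hy - hxy  # I(X; Y) = H(X) + H(Y) - H(X, Y)
    return 2.0 * mi / (hx + hy)

def rank_descending(scores):
    """Rank 1 goes to the highest score; ties are broken by feature index
    (an assumption, the abstract does not say how ties are handled)."""
    order = np.argsort(-np.asarray(scores, dtype=float), kind="stable")
    ranks = np.empty(len(order), dtype=int)
    ranks[order] = np.arange(1, len(order) + 1)
    return ranks

def combined_rank_sum(su_scores, sensitivity_scores):
    """Sum of the two per-feature ranks; a smaller sum means more significant."""
    return rank_descending(su_scores) + rank_descending(sensitivity_scores)

def elimination_order(rank_sum):
    """Feature indices in the order they would be discarded (worst first)."""
    return [int(i) for i in np.argsort(-np.asarray(rank_sum), kind="stable")]

# Toy example: the relevance and sensitivity scores below are made up.
su_scores = [0.9, 0.1, 0.5]
sensitivity_scores = [0.2, 0.1, 0.8]  # placeholder for PFS scores
rank_sum = combined_rank_sum(su_scores, sensitivity_scores)
order = elimination_order(rank_sum)
```

In this toy run, feature 1 has the worst rank under both criteria and would be discarded first. In a full pipeline, a classifier would be retrained after each elimination step and the feature count with the best F1 score or accuracy retained.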
Author affiliations: G Saranya (SRM Institute of Science and Technology); Rakesh Rajendran (Regenesys Institute of Management, corresponding author); Subash Chandra Bose Jaganathan (VIT Bhopal University); V Pandimurugan (SRM Institute of Science and Technology).
Posted on Research Square (ISSN 2693-5015, online) on June 6th, 2024.