Realization of Indoor Positioning Estimation in RF Communication Using Hybrid CNN and Transformer Models

Research Article

Nihat DALDAL, Bahadır ARABACI

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-6878394/v1
This work is licensed under a CC BY 4.0 License
Status: Posted, Version 1

Abstract

Indoor localization is a critical technology for numerous applications, yet it remains a challenge due to the unreliability of GPS signals within buildings. Wi-Fi Received Signal Strength Indicator (RSSI) fingerprinting offers a cost-effective alternative, but its accuracy is often compromised by signal fluctuations inherent in complex indoor environments. This paper addresses these challenges by proposing and evaluating two novel hybrid deep learning architectures for robust floor-level identification.
The first proposed architecture (Model A) integrates Convolutional Neural Networks (CNNs) for spatial feature extraction with parallel Bidirectional Long Short-Term Memory (BiLSTM) and Multi-Head Self-Attention (MHA) pathways to capture temporal dynamics and contextual signal relationships, achieving an accuracy of 0.9892 on the UJIIndoorLoc dataset. Further investigating alternative hybrid configurations, we introduce a second proposed architecture, the Hybrid CNN-Transformer-Dilated Network (HCTD-Net, Model B). The HCTD-Net combines a CNN frontend with a Transformer encoder for global context modeling and a parallel dilated convolution pathway designed to enhance the temporal receptive field without resolution loss. This HCTD-Net demonstrated strong performance on a relevant indoor localization dataset, achieving an overall accuracy of 0.9754, a macro F1-score of 0.9753, and a macro average specificity of 0.9938. Both proposed models significantly outperform baseline methods, indicating that these distinct hybrid deep learning strategies effectively mitigate RSSI variability and provide highly accurate and reliable solutions for floor determination in multi-building, multi-floor indoor settings. The HCTD-Net, in particular, presents a novel synergistic combination of Transformer and dilated convolution mechanisms for advanced temporal feature learning in this domain.

Keywords: Indoor Localization, Wi-Fi Fingerprinting, RSSI, Hybrid Deep Learning, Transformer Network, Dilated Convolutions, Bidirectional LSTM (BiLSTM), Multi-Head Attention (MHA), Gated Recurrent Unit (GRU)

Additional Declarations

No competing interests reported.
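The abstract's claim that a dilated convolution pathway widens the temporal receptive field without resolution loss can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names `dilated_conv1d_same` and `receptive_field`, the kernel size of 3, and the dilation rates 1, 2, 4 are all assumptions chosen for illustration, since the abstract does not state the HCTD-Net hyperparameters.

```python
def dilated_conv1d_same(signal, kernel, dilation):
    """1D dilated convolution (cross-correlation form) with zero 'same' padding.

    The output has the same length as the input, so no temporal
    resolution is lost. Hypothetical helper, not from the paper.
    """
    k = len(kernel)
    if k % 2 == 0:
        raise ValueError("this sketch assumes an odd kernel size")
    half_span = dilation * (k - 1) // 2  # reach of the kernel on each side
    n = len(signal)
    out = []
    for t in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = t - half_span + j * dilation  # taps are spaced `dilation` apart
            if 0 <= idx < n:                    # positions outside the signal act as zeros
                acc += w * signal[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilations):
    """Receptive field of stacked stride-1 dilated convolutions."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


# Assumed, illustrative settings (not taken from the paper).
dilations = (1, 2, 4)
kernel = [1.0, 1.0, 1.0]

# Push a unit impulse through the stack and see how far it spreads.
signal = [0.0] * 21
signal[10] = 1.0
out = signal
for d in dilations:
    out = dilated_conv1d_same(out, kernel, d)

touched = [i for i, v in enumerate(out) if v != 0.0]
print(len(out), touched[0], touched[-1], receptive_field(3, dilations))
```

Under these assumed settings the impulse spreads over 15 positions while the sequence length stays at 21, which agrees with the formula 1 + (k - 1) * (d1 + d2 + d3). Three undilated layers with the same kernel would cover only 7 positions.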
Author affiliations: Nihat DALDAL (corresponding author), Bolu Abant Izzet Baysal University; Bahadır ARABACI, Ostim Technical University.

Posted: July 30th, 2025