UMobile-Net: A Dual-Model Deep Learning Framework for Pill Image Detection

Amr M. Nagy (Benha University, corresponding author), Hossam Fares (Benha University), Fady Maher (Benha University), László Czúni (University of Pannonia)

Research Article | Research Square, ISSN 2693-5015 (online)
This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-7357332/v1
This work is licensed under a CC BY 4.0 License.
Status: Posted. Version 1, posted August 28th, 2025.

Abstract

Pill image detection is essential in healthcare applications such as hospital workflow optimization, assistance for visually impaired individuals, and elderly care. However, high inter-class similarity and varying imaging conditions make accurate recognition challenging. This paper presents UMobile-Net, a dual-model deep learning framework integrating a segmentation model (U-Net or W 2) with a MobileNet-based classifier in a two-stage pipeline. In stage one, segmentation masks derived from dataset annotations are refined to extract pill regions, followed by image enhancement. In stage two, the processed images are classified using a customized MobileNet architecture. Evaluations on the CURE and OGYEI-v2 datasets show that UMobile-Net consistently surpasses existing methods. The U-Net–based variant achieved accuracies of 97.78% (CURE) and 97.62% (OGYEI-v2), while the W 2–based variant achieved 95.85% and 97.61%, respectively. The results confirm the robustness and accuracy of the proposed approach for real-world pill recognition.

Keywords: Detection, Deep Learning, U-Net Segmentation, MobileNet, Few shot

Additional Declarations: No competing interests reported.
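The two-stage flow described in the abstract (refine a segmentation mask, extract the pill region, enhance it, then classify) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the 0.5 threshold, and the min-max contrast stretch are assumptions chosen for illustration, and the classifier is any callable that returns per-class scores, standing in for the customized MobileNet. No U-Net or MobileNet is implemented here.

```python
import numpy as np


def refine_mask(soft_mask: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Binarize a soft mask (hypothetical refinement; threshold is an assumption)."""
    return (soft_mask >= threshold).astype(np.uint8)


def extract_pill_region(image: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Zero out the background and crop to the bounding box of the mask."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        raise ValueError("mask contains no foreground pixels")
    y0, y1 = ys.min(), ys.max() + 1
    x0, x1 = xs.min(), xs.max() + 1
    masked = image * mask[..., None]  # broadcast the mask over colour channels
    return masked[y0:y1, x0:x1]


def enhance(image: np.ndarray) -> np.ndarray:
    """Min-max contrast stretch to [0, 1]; a stand-in for the paper's enhancement."""
    img = image.astype(np.float32)
    lo, hi = img.min(), img.max()
    if hi == lo:
        return np.zeros_like(img)  # flat image: nothing to stretch
    return (img - lo) / (hi - lo)


def two_stage_predict(image, soft_mask, classifier) -> int:
    """Stage one: mask, crop, enhance. Stage two: classify the processed region."""
    mask = refine_mask(soft_mask)
    region = enhance(extract_pill_region(image, mask))
    scores = classifier(region)  # any callable returning per-class scores
    return int(np.argmax(scores))


# Toy usage with synthetic data and a dummy classifier (both hypothetical).
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
soft_mask = np.zeros((64, 64), dtype=np.float32)
soft_mask[16:48, 20:40] = 0.9  # pretend the segmenter found a pill here
predicted_class = two_stage_predict(
    image, soft_mask, lambda region: np.array([0.1, 0.7, 0.2])
)
```

Keeping the classifier as a plain callable means either segmentation variant mentioned in the abstract could, in principle, feed the same second stage; in a real system the callable would also resize the cropped region to the classifier's expected input size.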