D3F-Det: Progressive Fine-Grained Feature Refinement and Reuse for Robust Small Object Detection in Aerial Imagery

Research Article

LingLing Li, JiaQing Liu, XueZhuan Zhao, XiaoYan Shao, MengMeng Tang, and YaXuan Xing

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-9197165/v1
This work is licensed under a CC BY 4.0 License.
Status: Under Review. Version 1 posted May 12th, 2026.

Abstract

The detection of small objects in unmanned aerial vehicle (UAV) imagery remains a formidable challenge due to the severe degradation and underutilization of fine-grained information across the detection pipeline. Conventional methods often suffer from irreversible loss of spatial details during downsampling, a lack of effective same-scale spatial-semantic alignment, and the progressive dilution of shallow features in multi-scale fusion. To address these interconnected issues, we propose a unified real-time detection framework termed D3F-Det, which is designed to preserve, enhance, and progressively reuse fine-grained information for robust small object detection in aerial imagery. Our architecture comprises three novel components: a dual-branch detail-preserving downsampling module (DBDown) that reduces resolution while maintaining critical shallow details; a dual-feature multi-scale collaborative module (DFMS) embedded within same-scale layers to align and enhance spatial and semantic representations; and a three-stage reuse neck (3S-RN) that iteratively reintegrates high-resolution shallow features throughout the feature pyramid. We demonstrate that our method significantly improves performance, achieving a mean Average Precision (mAP@0.5:0.95) of 30.3% on the challenging VisDrone2019 dataset, a 6.2 percentage point gain over the YOLOv12-s baseline, while concurrently reducing the parameter count from 9.3M to 3.7M. Extensive experiments on the AI-TOD and TinyPerson datasets further validate its robustness in extreme small-object and densely populated scenarios. By systematically preserving and exploiting fine-grained features, our D3F-Det framework offers a more effective and generalisable solution for robust object detection in complex aerial environments. The source code is available at https://github.com/LJQA1/D3F-Det (DOI: 10.5281/zenodo.19162854).

Keywords: Small Object Detection, UAV Imagery, Fine-Grained Feature Refinement, Progressive Feature Reuse, Robust Detection

Additional Declarations: No competing interests reported.
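The headline numbers in the abstract imply a few quantities that are not stated directly. The short Python sketch below derives them from the reported figures only; the function and constant names are illustrative and do not come from the authors' code, and the derived values are simple arithmetic rather than results reported in the paper.

```python
# Figures quoted in the abstract for VisDrone2019.
D3F_DET_MAP = 30.3         # reported mean Average Precision, in percent
GAIN_OVER_BASELINE = 6.2   # reported gain over YOLOv12-s, in percentage points
BASELINE_PARAMS_M = 9.3    # reported YOLOv12-s parameter count, in millions
D3F_DET_PARAMS_M = 3.7     # reported D3F-Det parameter count, in millions


def implied_baseline_map(model_map: float, gain_pp: float) -> float:
    """Baseline score implied by a model score and its percentage-point gain."""
    return round(model_map - gain_pp, 1)


def relative_reduction(before: float, after: float) -> float:
    """Relative reduction from `before` to `after`, in percent."""
    return round(100.0 * (before - after) / before, 1)


if __name__ == "__main__":
    baseline = implied_baseline_map(D3F_DET_MAP, GAIN_OVER_BASELINE)
    saved = relative_reduction(BASELINE_PARAMS_M, D3F_DET_PARAMS_M)
    print(f"Implied YOLOv12-s mAP: {baseline}%")
    print(f"Parameter reduction: {saved}%")
```

Under these figures the implied baseline score is 24.1% and the parameter count drops by roughly 60%, which is the sense in which the abstract pairs an accuracy gain with a smaller model.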
Editorial history (submitted to The Visual Computer):
Reviewers invited by journal: 04 May, 2026
Editor assigned by journal: 24 Mar, 2026
Submission checks completed at journal: 24 Mar, 2026
First submitted to journal: 23 Mar, 2026

Author affiliation (all authors): Zhengzhou University of Aeronautics. Corresponding author: JiaQing Liu.

Posted on Research Square, ISSN 2693-5015 (online).