Efficient and Accurate Stereo Matching via Guided Deformable Aggregation

Research Article

Jie Li, Xinjia Li, Mingyuan Chang, Lin Wang, Shuangli Du, Jie Zhou, Yiguang Liu

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-5376948/v1
This work is licensed under a CC BY 4.0 License.
Status: Posted (Version 1)
Additional Declarations: No competing interests reported.

Abstract

In stereo vision, depth estimation depends on the accuracy of dense registration between binocular stereo images, and real-time performance is also important in many automation applications. Balancing efficiency and accuracy remains a challenge. Motivated by this problem, we propose a lightweight 2D guided deformable aggregation (GDA) module. It uses color prior information to learn aggregation sampling points that fit irregular windows, which enables fast recovery of the high-frequency detail lost in a coarse cost volume. Furthermore, we propose a guided deformable aggregation based stereo matching network (GDANet) to balance efficiency and accuracy. It builds a fast 3D network to obtain the cost volume for low-frequency, non-detail regions, and then uses the lightweight 2D GDA module to recover high-frequency detail regions. Experiments show that GDANet achieves better results than current high-efficiency methods on the SceneFlow and KITTI datasets. In particular, our method shows better qualitative and quantitative results in edge regions and thin structures.

Keywords: Computer vision for automation, deep learning for visual perception, guided deformable aggregation
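The aggregation idea in the abstract (sampling a cost volume at irregular, guidance-dependent positions instead of a fixed square window) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `bilinear` and `aggregate_pixel` are hypothetical, and the sampling offsets and weights, which the abstract says are learned from color information, are simply passed in by hand here.

```python
import math

def bilinear(plane, y, x):
    """Sample a 2D list at fractional (y, x), clamping to the border."""
    h, w = len(plane), len(plane[0])
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    fy, fx = y - y0, x - x0
    top = plane[y0][x0] * (1 - fx) + plane[y0][x1] * fx
    bottom = plane[y1][x0] * (1 - fx) + plane[y1][x1] * fx
    return top * (1 - fy) + bottom * fy

def aggregate_pixel(cost_volume, y, x, offsets, weights):
    """Aggregate each disparity slice at pixel (y, x).

    cost_volume: list of D planes, each H x W (one plane per disparity).
    offsets: K (dy, dx) sampling offsets. Assumption: a learned module
             would predict these from the color image; here they are inputs.
    weights: K non-negative weights, normalized inside this function.
    """
    total = sum(weights)
    out = []
    for plane in cost_volume:
        acc = 0.0
        for (dy, dx), wgt in zip(offsets, weights):
            acc += wgt * bilinear(plane, y + dy, x + dx)
        out.append(acc / total)
    return out

# Toy cost volume with a vertical edge between column 0 and columns 1-2.
cost_volume = [
    [[0.0, 1.0, 1.0], [0.0, 1.0, 1.0], [0.0, 1.0, 1.0]],  # disparity 0
    [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 0.0, 0.0]],  # disparity 1
]
weights = [1.0, 1.0, 1.0]

# A regular horizontal window straddles the edge and blurs the costs.
regular = aggregate_pixel(cost_volume, 1, 1, [(0, -1), (0, 0), (0, 1)], weights)

# Offsets bent away from the edge sample only the pixel's own region.
guided = aggregate_pixel(cost_volume, 1, 1, [(0, 0), (0, 1), (-1, 0.5)], weights)

print(regular, guided)
```

In this toy case the regular window mixes costs from both sides of the edge, while the bent offsets keep the aggregated costs at the centre pixel's own values, which is the kind of edge-preserving behaviour the abstract attributes to fitting an irregular window.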
Author affiliations
Jie Li (corresponding author), Xinjia Li, Mingyuan Chang, Lin Wang: Shanxi University of Finance and Economics
Shuangli Du: Xi'an University of Technology
Jie Zhou, Yiguang Liu: Sichuan University

Posted: November 14th, 2024 (Version 1)
Research Square | ISSN 2693-5015 (online)