RFNet-4D++: Joint Object Reconstruction and Flow Estimation from 4D Point Clouds with Cross-Attention Spatio-Temporal Features

Tuan-Anh Vu (The Hong Kong University of Science and Technology, corresponding author), Duc Thanh Nguyen (Deakin University), Binh-Son Hua (Trinity College Dublin), Quang-Hieu Pham (Woven by Toyota), Sai-Kit Yeung (The Hong Kong University of Science and Technology)

Research Square preprint. This is a preprint; it has not been peer reviewed by a journal.
DOI: https://doi.org/10.21203/rs.3.rs-4390361/v1
License: CC BY 4.0
Status: Posted. Version 1, posted May 15th, 2024.

Abstract

Object reconstruction from 3D point clouds has been a long-standing research topic in computer vision and computer graphics, and has achieved impressive progress. However, reconstruction from time-varying point clouds (a.k.a. 4D point clouds) remains overlooked. In this paper, we propose a new network architecture, namely RFNet-4D++, that jointly reconstructs 3D objects and their motion flows from 4D point clouds. The key insight is that performing both tasks simultaneously allows each to benefit from the other, leading to improved overall performance. To achieve this, we design a compositional encoder that learns spatio-temporal representations of 4D point clouds based on a dual cross-attention mechanism. In addition, we devise a joint-learning scheme for unsupervised learning of temporal vector fields and supervised learning of occupancy fields. This multi-task learning is achieved by jointly optimising loss functions and sharing spatio-temporal features.

Experiments and analyses on benchmark datasets validate the effectiveness and efficiency of our method. As shown in the experimental results, our method achieves state-of-the-art performance on both flow estimation and object reconstruction while performing much faster than existing methods in both training and inference. Our code and data are available at https://github.com/hkust-vgd/RFNet-4D.

Subject areas: Physical sciences/Mathematics and computing/Computer science; Physical sciences/Mathematics and computing/Applied mathematics
Keywords: dynamic point clouds, 4D reconstruction, flow estimation

Additional Declarations: There is NO Competing Interest.
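The abstract mentions an encoder built on a dual cross-attention mechanism but does not spell out its layers. As a rough illustration only, the sketch below shows generic scaled dot-product cross-attention applied in both directions between a set of spatial features and a set of temporal features. The function names (`cross_attention`, `dual_cross_attention`), the residual connections, the absence of learned projections, and the feature sizes are all assumptions made for this example; they are not taken from the paper or its repository.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(queries, context):
    # Scaled dot-product attention: each query row attends over all
    # context rows. Learned Q/K/V projections are omitted (assumption).
    d = queries.shape[-1]
    scores = queries @ context.T / np.sqrt(d)   # (n_queries, n_context)
    weights = softmax(scores, axis=-1)          # each row sums to 1
    return weights @ context                    # (n_queries, d)


def dual_cross_attention(spatial, temporal):
    # Hypothetical "dual" form: spatial features attend to temporal ones
    # and vice versa, each with a residual connection (assumption).
    spatial_out = spatial + cross_attention(spatial, temporal)
    temporal_out = temporal + cross_attention(temporal, spatial)
    return spatial_out, temporal_out


rng = np.random.default_rng(0)
spatial = rng.normal(size=(5, 8))    # e.g. 5 spatial tokens, 8-dim features
temporal = rng.normal(size=(3, 8))   # e.g. 3 temporal tokens, 8-dim features
s_out, t_out = dual_cross_attention(spatial, temporal)
print(s_out.shape, t_out.shape)      # (5, 8) (3, 8)
```

Each output keeps the shape of its own input, so the two fused feature sets could be passed on to separate reconstruction and flow branches; how RFNet-4D++ actually does this is described in the full manuscript, not here.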