StarryGazer: Leveraging Monocular Depth Estimation Models for Domain-Agnostic Single Depth Image Completion

Research Article

Sangmin Hong, Suyoung Lee, Kyoung Mu Lee

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-8469320/v1
This work is licensed under a CC BY 4.0 License.
Status: Under Review. Version 1.

Abstract

Depth completion is the problem of predicting a dense depth image from a single sparse depth map and an RGB image. Unsupervised depth completion methods have been proposed for datasets where ground-truth depth is unavailable and supervised methods cannot be applied. However, these methods require auxiliary data to estimate depth values, a requirement that is far from real scenarios.

Monocular depth estimation (MDE) models can produce a plausible relative depth map from a single image, but no existing work properly combines the sparse depth map with MDE for depth completion. Simply applying an affine transformation to the relative depth map yields high error, because MDE models are inaccurate at estimating depth differences between objects. We introduce StarryGazer, a domain-agnostic framework that predicts dense depth images from a single sparse depth image and an RGB image, without relying on ground-truth depth, by leveraging the power of large MDE models. First, we employ a pre-trained MDE model to produce relative depth images. These images are segmented and randomly rescaled to form synthetic pairs of dense pseudo-ground truth and corresponding sparse depths. A refinement network is then trained on the synthetic pairs, taking the relative depth maps and RGB images as additional inputs to improve accuracy and robustness. StarryGazer outperforms existing unsupervised methods and affine-transformed MDE results on various datasets, demonstrating that the framework exploits the power of MDE models while correcting their errors with sparse depth information.

Keywords: Depth completion, Monocular depth estimation, Self-supervised learning, Synthetic dataset

Additional Declarations: No competing interests reported.
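Two ideas from the abstract can be illustrated with a small, self-contained sketch: the global affine (scale and shift) alignment that the abstract describes as an insufficient baseline, and the construction of a synthetic training pair by rescaling segments of a relative depth map. This is not the authors' implementation. The function names, the flattened toy "image", the two-segment layout, the scale range, and the uniform sparse sampling are all assumptions made for illustration; the abstract does not specify how segmentation or rescaling is actually done.

```python
import random
from statistics import fmean


def fit_affine(rel, metric):
    """Closed-form least-squares (scale, shift) with scale * rel + shift ~ metric."""
    mean_r, mean_m = fmean(rel), fmean(metric)
    var_r = sum((r - mean_r) ** 2 for r in rel)
    cov = sum((r - mean_r) * (m - mean_m) for r, m in zip(rel, metric))
    scale = cov / var_r
    return scale, mean_m - scale * mean_r


def make_synthetic_pair(rel, segments, rng, n_sparse, scale_range=(0.5, 2.0)):
    """Rescale each segment by its own random factor, then keep a few pixels.

    Returns (pseudo_gt, sparse), where sparse maps pixel index -> depth.
    The scale range and uniform sampling are assumptions, not the paper's values.
    """
    factor = {label: rng.uniform(*scale_range) for label in sorted(set(segments))}
    pseudo_gt = [r * factor[s] for r, s in zip(rel, segments)]
    keep = rng.sample(range(len(rel)), n_sparse)
    sparse = {i: pseudo_gt[i] for i in keep}
    return pseudo_gt, sparse


rng = random.Random(0)
rel = [rng.uniform(0.1, 1.0) for _ in range(400)]     # flattened toy relative depth
segments = [0 if i < 200 else 1 for i in range(400)]  # two toy "objects"
pseudo_gt, sparse = make_synthetic_pair(rel, segments, rng, n_sparse=40)

# Baseline: a single global affine map fitted on the sparse samples only.
idx = sorted(sparse)
scale, shift = fit_affine([rel[i] for i in idx], [sparse[i] for i in idx])
baseline = [scale * r + shift for r in rel]
mean_abs_error = fmean(abs(b - g) for b, g in zip(baseline, pseudo_gt))
```

In this toy setup the two segments are rescaled by different factors, so no single scale and shift can reproduce the pseudo-ground truth and `mean_abs_error` stays above zero. That is the kind of between-object error the abstract says a refinement network is trained to correct.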
Supplementary Files: StarryGazerSIVPsupp.pdf

Editorial timeline
Editorial decision: Revision requested, 29 Mar, 2026
Reviews received at journal, 28 Mar, 2026
Reviewers agreed at journal, 08 Feb, 2026
Reviews received at journal, 01 Feb, 2026
Reviews received at journal, 19 Jan, 2026
Reviewers agreed at journal, 12 Jan, 2026
Reviewers agreed at journal, 12 Jan, 2026
Reviewers invited by journal, 12 Jan, 2026
Editor assigned by journal, 30 Dec, 2025
Submission checks completed at journal, 30 Dec, 2025
First submitted to journal, 29 Dec, 2025
Affiliation (all authors): Seoul National University. Corresponding author: Kyoung Mu Lee.
Under review at: Signal, Image and Video Processing (http://link.springer.com/journal/11760).
Version 1 posted: January 14th, 2026.
