Segmenting Objects with Imbalanced Sizes via Smooth and Sparse Dual Optimal Transport
Research Article
Mengqi Ding, Gangxuan Zhou, XueCheng Tai, Li Cui, Jun Liu
This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-8438177/v1
This work is licensed under a CC BY 4.0 License
Additional Declarations: No competing interests reported.
Status: Under Review
Version 1 posted 14 Jan, 2026
Editorial decision: Revision requested 25 Mar, 2026
Reviews received at journal 24 Mar, 2026
Reviews received at journal 13 Mar, 2026
Reviewers agreed at journal 14 Jan, 2026
Reviewers agreed at journal 12 Jan, 2026
Reviewers invited by journal 12 Jan, 2026
Editor assigned by journal 04 Jan, 2026
Submission checks completed at journal 27 Dec, 2025
First submitted to journal 23 Dec, 2025

Abstract
In image segmentation, object sizes (volumes) are often highly imbalanced, which adversely affects the performance of data-driven methods. To address this, we formulate the imbalance problem through optimal transport theory, providing a geometric interpretation of segmentation via Laguerre cell decomposition. By interpreting the dual variable of the volume constraint as a learnable network bias and solving the smooth semi-dual formulation iteratively while incorporating the spatial information of pixels, we propose an iterative network-embedding layer, VP-Sparsemax, which enables end-to-end integration of volume priors into convolutional neural networks. Furthermore, we demonstrate theoretically and experimentally the critical role of sparsity: compared with traditional softmax or modified softmax, VP-Sparsemax better preserves volume in the segmentation results after argmax. We validate this on a toy example and on four datasets covering medical, autonomous driving, and remote sensing images, using three segmentation network baselines, and achieve superior segmentation outcomes, particularly for small targets that are easily overlooked.

Keywords: Image segmentation, Imbalance, Volume prior, Optimal transport, Deep learning
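The abstract contrasts softmax with a sparse alternative. As a rough illustration of what "sparsity" means here, the sketch below implements the standard sparsemax projection (Euclidean projection of a score vector onto the probability simplex) next to softmax. It is not the authors' VP-Sparsemax layer: the iterative volume-constrained solve and the spatial terms are not reproduced, and the per-class `bias` vector is a hand-picked, hypothetical stand-in for the learnable dual variable mentioned in the abstract.

```python
import numpy as np

def softmax(z):
    """Standard softmax: every class receives strictly positive probability."""
    e = np.exp(z - z.max())  # subtract max for numerical stability
    return e / e.sum()

def sparsemax(z):
    """Standard sparsemax: Euclidean projection of z onto the probability simplex.

    Unlike softmax, low-scoring classes can receive exactly zero probability.
    """
    z_sorted = np.sort(z)[::-1]            # scores in descending order
    cumsum = np.cumsum(z_sorted)
    k = np.arange(1, z.size + 1)
    support = 1 + k * z_sorted > cumsum    # true for the first k_max entries
    k_max = k[support][-1]                 # size of the support
    tau = (cumsum[k_max - 1] - 1) / k_max  # threshold subtracted from all scores
    return np.maximum(z - tau, 0.0)

# Scores for three classes at a single pixel (illustrative values only).
scores = np.array([2.0, 1.0, -1.0])
# Hypothetical per-class bias; an assumption for illustration, not a value from the paper.
bias = np.array([0.0, 0.5, 0.0])

print(softmax(scores))           # all entries strictly positive
print(sparsemax(scores))         # [1. 0. 0.]
print(sparsemax(scores + bias))  # [0.75 0.25 0.  ]
```

In this toy case the bias moves probability mass toward the second class while the third class stays at exactly zero, which is the kind of hard assignment behaviour that softmax cannot produce.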
Author affiliations: Mengqi Ding, Gangxuan Zhou, Li Cui (corresponding author), and Jun Liu, Beijing Normal University; XueCheng Tai, Norwegian Research Center.
Submitted to: Journal of Mathematical Imaging and Vision