{"paper_id":"21c61cec-b67b-40e0-91eb-88b3cdbd519d","body_text":"Sublinear Algorithms for Several Geometric Optimization (With Outliers) Problems | Research Square window.SnipcartSettings = { analytics: { enabled: false } }; (function() { var accessVector = localStorage.getItem('access_vector') || ''; window.dataLayer = window.dataLayer || []; if (accessVector) { window.dataLayer.push({ user: { profile: { profileInfo: { snid: accessVector } } } }); } })(); (function(w,d,s,l,i){w[l]=w[l]||[];w[l].push({'gtm.start':new Date().getTime(),event:'gtm.js'});var f=d.getElementsByTagName(s)[0],j=d.createElement(s),dl=l!='dataLayer'?'&l='+l:'';j.async=true;j.src='https://www.googletagmanager.com/gtm.js?id='+i+dl;f.parentNode.insertBefore(j,f);})(window,document,'script','dataLayer','GTM-K279D39R'); Browse Preprints In Review Journals COVID-19 Preprints AJE Video Bytes Research Tools Research Promotion AJE Professional Editing AJE Rubriq About Preprint Platform In Review Editorial Policies Our Team Advisory Board Help Center Sign In Submit a Preprint Cite Share Download PDF Research Article Sublinear Algorithms for Several Geometric Optimization (With Outliers) Problems HU DING This is a preprint; it has not been peer reviewed by a journal. https://doi.org/ 10.21203/rs.3.rs-3804644/v1 This work is licensed under a CC BY 4.0 License Status: Posted Version 1 posted You are reading this latest preprint version Abstract In this paper, we study several important geometric optimization problems arising in machine learning. First, we revisit the Minimum Enclosing Ball (MEB) problem in Rd. The problem has been extensively studied before, but real-world machine learning tasks often need to handle large-scale datasets so that we cannot even afford linear time algorithms. We introduce the notion of stability for MEB, which is natural and easy to understand. Roughly speaking, an instance of MEB is stable, if the radius of the resulting ball cannot be significantly reduced by removing a small fraction of the input points. Under the stability assumption, we present two sampling algorithms for computing radius-approximate MEB with sample complexities independent of the number of input points n. In particular, the second algorithm has the sample complexity even independent of the dimensionality d. We also consider the general case without the stability assumption. We present a hybrid algorithm that can output either a radius-approximate MEB or a covering-approximate MEB. Our algorithm improves the running time and the number of passes for the previous sublinear MEB algorithms. Our method relies on two novel techniques, the Uniform-Adaptive Sampling method and Sandwich Lemma. Furthermore, we observe that these two techniques can be generalized to design sublinear time algorithms for a range of geometric optimization problems with outliers in high dimensions, including MEB with outliers, one-class and two-class linear SVMs with outliers, k-center clustering with outliers, and flat fitting with outliers. minimum enclosing ball outliers sublinear coreset machine learning Full Text Additional Declarations No competing interests reported. Cite Share Download PDF Status: Posted Version 1 posted You are reading this latest preprint version Research Square lets you share your work early, gain feedback from the community, and start making changes to your manuscript prior to peer review in a journal. As a division of Research Square Company, we’re committed to making research communication faster, fairer, and more useful. We do this by developing innovative software and high quality services for the global research community. Our growing team is made up of researchers and industry professionals working together to solve the most critical problems facing scientific publishing. Also discoverable on Platform About Our Team In Review Editorial Policies Advisory Board Help Center Resources Author Services Accessibility API Access RSS feed Manage Cookie Preferences © Research Square 2026 | ISSN 2693-5015 (online) Privacy Policy Terms of Service Do Not Sell My Personal Information {\"props\":{\"pageProps\":{\"initialData\":{\"identity\":\"rs-3804644\",\"acceptedTermsAndConditions\":true,\"allowDirectSubmit\":true,\"archivedVersions\":[],\"articleType\":\"Research Article\",\"associatedPublications\":[],\"authors\":[{\"id\":264380614,\"identity\":\"e163db70-969f-4367-ae8e-32291fcd579d\",\"order_by\":0,\"name\":\"HU DING\",\"email\":\"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAAAyElEQVRIiWNgGAWjYBACAyA6wMBgA+WyEa8lDaqaWC1A6jAJWswlkjceLvh1PnH+/OYHDB/KDjPwz27Ar8VyRlrB4Zl9txM3HGMzYJxx7jCDxJ0DBBx2I8fgMG8PUAsbDwMzb9thBgOJBKK0nEuc3wbU8pdoLTw/DiQ2HANqYSRKy5lnBYd5G5KNNxxLMzjYcy6dR+IGIS3Hkzd/5vljJzu/+fDDBz/KrOX4ZxDQAgaMbRD6ABDzEKEeBP4QqW4UjIJRMApGJgAAL8dHmjmzl6EAAAAASUVORK5CYII=\",\"orcid\":\"\",\"institution\":\"University of Science and Technology of China\",\"correspondingAuthor\":true,\"prefix\":\"\",\"firstName\":\"HU\",\"middleName\":\"\",\"lastName\":\"DING\",\"suffix\":\"\"}],\"badges\":[],\"createdAt\":\"2023-12-25 13:29:07\",\"currentVersionCode\":1,\"declarations\":\"\",\"doi\":\"10.21203/rs.3.rs-3804644/v1\",\"doiUrl\":\"https://doi.org/10.21203/rs.3.rs-3804644/v1\",\"draftVersion\":[],\"editorialEvents\":[],\"editorialNote\":\"\",\"failedWorkflow\":false,\"files\":[{\"id\":63700872,\"identity\":\"1af0b95e-1279-4ed9-a946-184d2c19af3f\",\"added_by\":\"auto\",\"created_at\":\"2024-08-31 15:27:00\",\"extension\":\"pdf\",\"order_by\":1,\"title\":\"\",\"display\":\"\",\"copyAsset\":false,\"role\":\"manuscript-pdf\",\"size\":921814,\"visible\":true,\"origin\":\"\",\"legend\":\"\",\"description\":\"\",\"filename\":\"Archive.pdf\",\"url\":\"https://assets-eu.researchsquare.com/files/rs-3804644/v1_covered_8892f5f6-4305-4141-9e7e-b3d3beefa9d1.pdf\"}],\"financialInterests\":\"No competing interests reported.\",\"formattedTitle\":\"Sublinear Algorithms for Several Geometric Optimization (With Outliers) Problems\",\"fulltext\":[],\"fulltextSource\":\"\",\"fullText\":\"\",\"funders\":[],\"hasAdminPriorityOnWorkflow\":false,\"hasManuscriptDocX\":false,\"hasOptedInToPreprint\":true,\"hasPassedJournalQc\":\"\",\"hasAnyPriority\":false,\"hideJournal\":true,\"highlight\":\"\",\"institution\":\"\",\"isAcceptedByJournal\":false,\"isAuthorSuppliedPdf\":true,\"isDeskRejected\":\"\",\"isHiddenFromSearch\":false,\"isInQc\":false,\"isInWorkflow\":false,\"isPdf\":true,\"isPdfUpToDate\":true,\"isWithdrawnOrRetracted\":false,\"journal\":{\"display\":true,\"email\":\"info@researchsquare.com\",\"identity\":\"researchsquare\",\"isNatureJournal\":false,\"hasQc\":true,\"allowDirectSubmit\":true,\"externalIdentity\":\"\",\"sideBox\":\"\",\"snPcode\":\"\",\"submissionUrl\":\"/submission\",\"title\":\"Research Square\",\"twitterHandle\":\"researchsquare\",\"acdcEnabled\":true,\"dfaEnabled\":false,\"editorialSystem\":\"\",\"reportingPortfolio\":\"\",\"inReviewEnabled\":false,\"inReviewRevisionsEnabled\":true},\"keywords\":\"minimum enclosing ball, outliers, sublinear, coreset, machine learning\",\"lastPublishedDoi\":\"10.21203/rs.3.rs-3804644/v1\",\"lastPublishedDoiUrl\":\"https://doi.org/10.21203/rs.3.rs-3804644/v1\",\"license\":{\"name\":\"CC BY 4.0\",\"url\":\"https://creativecommons.org/licenses/by/4.0/\"},\"manuscriptAbstract\":\"In this paper, we study several important geometric optimization problems arising in machine learning. First, we revisit the Minimum Enclosing Ball (MEB) problem in Rd. The problem has been extensively studied before, but real-world machine learning tasks often need to handle large-scale datasets so that we cannot even afford linear time algorithms. We introduce the notion of stability for MEB, which is natural and easy to understand. Roughly speaking, an instance of MEB is stable, if the radius of the resulting ball cannot be significantly reduced by removing a small fraction of the input points. Under the stability assumption, we present two sampling algorithms for computing radius-approximate MEB with sample complexities independent of the number of input points n. In particular, the second algorithm has the sample complexity even independent of the dimensionality d. We also consider the general case without the stability assumption. We present a hybrid algorithm that can output either a radius-approximate MEB or a covering-approximate MEB. Our algorithm improves the running time and the number of passes for the previous sublinear MEB algorithms. Our method relies on two novel techniques, the Uniform-Adaptive Sampling method and Sandwich Lemma. Furthermore, we observe that these two techniques can be generalized to design sublinear time algorithms for a range of geometric optimization problems with outliers in high dimensions, including MEB with outliers, one-class and two-class linear SVMs with outliers, k-center clustering with outliers, and flat fitting with outliers. \",\"manuscriptTitle\":\"Sublinear Algorithms for Several Geometric Optimization (With Outliers) Problems\",\"msid\":\"\",\"msnumber\":\"\",\"nonDraftVersions\":[{\"code\":1,\"date\":\"2024-01-01 01:48:36\",\"doi\":\"10.21203/rs.3.rs-3804644/v1\",\"editorialEvents\":[{\"type\":\"communityComments\",\"content\":0}],\"status\":\"published\",\"journal\":{\"display\":true,\"email\":\"info@researchsquare.com\",\"identity\":\"researchsquare\",\"isNatureJournal\":false,\"hasQc\":true,\"allowDirectSubmit\":true,\"externalIdentity\":\"\",\"sideBox\":\"\",\"snPcode\":\"\",\"submissionUrl\":\"/submission\",\"title\":\"Research Square\",\"twitterHandle\":\"researchsquare\",\"acdcEnabled\":true,\"dfaEnabled\":false,\"editorialSystem\":\"\",\"reportingPortfolio\":\"\",\"inReviewEnabled\":false,\"inReviewRevisionsEnabled\":true}}],\"origin\":\"\",\"ownerIdentity\":\"55bf41c6-7a60-4437-b304-ad2a01542656\",\"owner\":[],\"postedDate\":\"January 1st, 2024\",\"published\":true,\"recentEditorialEvents\":[],\"rejectedJournal\":[],\"revision\":\"\",\"amendment\":\"\",\"status\":\"posted\",\"subjectAreas\":[],\"tags\":[],\"updatedAt\":\"2024-08-31T15:18:50+00:00\",\"versionOfRecord\":[],\"versionCreatedAt\":\"2024-01-01 01:48:36\",\"video\":\"\",\"vorDoi\":\"\",\"vorDoiUrl\":\"\",\"workflowStages\":[]},\"version\":\"v1\",\"identity\":\"rs-3804644\",\"journalConfig\":\"researchsquare\"},\"__N_SSP\":true},\"page\":\"/article/[identity]/[[...version]]\",\"query\":{\"redirect\":\"/article/rs-3804644\",\"identity\":\"rs-3804644\",\"version\":[\"v1\"]},\"buildId\":\"qtupq5eGEP_6zYnWcrvyt\",\"isFallback\":false,\"isExperimentalCompile\":false,\"dynamicIds\":[84888],\"gssp\":true,\"scriptLoader\":[]}","source_license":"CC-BY-4.0","license_restricted":false}