OncoSML: A Self-Maintaining Machine Learning Framework for Personalized Cancer Vaccine Research

Research Article

Aditya Roy Bardhan

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-9267477/v1
This work is licensed under a CC BY 4.0 License.
Status: Posted. Version 1 posted.

Abstract

Cancer kills nearly 10 million people annually and costs the global economy over $200 billion (€184 billion / ₹16.7 lakh Crore) in direct medical expenditure. Conventional treatment modalities (surgery, chemotherapy, radiotherapy, targeted therapy, and immune checkpoint inhibitors), while clinically valuable, share fundamental limitations: non-specificity, treatment-induced resistance, fixed model parameters, inability to adapt to tumor evolution, and substantial toxicity burdens that significantly degrade patient quality of life.
The economic consequences are equally severe: treatments such as checkpoint inhibitors cost $150,000 (€138,000 / ₹1.25 Crore) per patient per year, rendering them financially catastrophic for the majority of the global population.

This paper presents OncoSML (Oncology Self-Maintaining Learning System), an open-source, research-grade, end-to-end machine learning pipeline developed at the Department of Data Science and Artificial Intelligence, Indian Institute of Technology Guwahati, by Aditya Roy Bardhan. OncoSML uniquely integrates the complete personalized cancer vaccine development workflow within a single, self-improving software ecosystem. The workflow runs from raw genomic input (FASTQ/BAM/VCF) through somatic variant identification, multi-parameter neoantigen scoring, mRNA vaccine construct synthesis, multi-gate biological safety validation, and clinical genomics stack orchestration. A continuous learning loop governs the system and updates model parameters as new genomic data and validated outcomes are incorporated.
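The abstract does not say which model or update rule drives the continuous learning loop. As a rough illustration only, the idea of "updating model parameters as validated outcomes are incorporated" can be sketched with a plain online logistic regression, which is a stand-in technique chosen here and not the author's method. The class name `OnlineBindingScorer`, the two-element feature vector, and the learning rate are all hypothetical.

```python
import math


class OnlineBindingScorer:
    """Toy logistic scorer updated one validated outcome at a time.

    Hypothetical stand-in: the abstract does not describe OncoSML's model.
    """

    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * n_features  # feature weights
        self.b = 0.0                 # bias term
        self.lr = lr                 # learning rate (assumed value)

    def predict(self, x):
        # Probability that a candidate peptide is a true binder.
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, x, y):
        # One stochastic-gradient step on the log-loss for a single
        # newly validated outcome y (1 = confirmed binder, 0 = not).
        err = self.predict(x) - y
        self.w = [wi - self.lr * err * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * err


# Hypothetical features for one candidate (values are made up).
candidate = [1.0, 0.5]

scorer = OnlineBindingScorer(n_features=2)
before = scorer.predict(candidate)

# A validated positive outcome arrives and is folded into the model.
scorer.update(candidate, 1)
after = scorer.predict(candidate)

print(f"score before update: {before:.3f}, after update: {after:.3f}")
```

Each incoming validated outcome nudges the parameters, so the score for similar candidates shifts without retraining from scratch. A real system would also need held-out evaluation and safeguards against drift, which this sketch omits.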
This comprehensive edition makes six primary contributions: (1) a detailed technical description of OncoSML's modular architecture, algorithmic design, and self-maintaining learning loop; (2) a comprehensive genome-to-vaccine biological walkthrough supported by six original scientific diagrams explaining DNA structure, somatic mutation types, mRNA vaccine construction, the complete immune response cascade, and patient recovery; (3) an evidence-based argument that personalized mRNA vaccines are the optimal cancer treatment modality on medical, biological, economic, and patient-wellbeing grounds; (4) a detailed patient recovery analysis comparing quality of life, physical function, and multi-domain wellbeing; (5) a comparative analysis against six existing neoantigen prediction tools demonstrating OncoSML is the only system supporting all 10 key pipeline capabilities simultaneously; and (6) a tri-currency economic analysis (USD / EUR / INR) demonstrating a 36% total per-patient cost reduction.

Key findings across 24 original visualizations: OncoSML achieves a binding affinity AUC of 0.921 after 52 weeks of self-maintaining operation versus 0.841 for static tools; mRNA vaccine therapy produces <5% Grade 3–4 adverse events versus 62% for chemotherapy (15× safer); projected relative 5-year survival improvements of 50–136% over conventional treatment across major cancer types; $66,000 (€60,720 / ₹55.1 lakh) lifecycle cost savings per patient; and a 2030 India target of ₹2 lakh ($2,395 / €2,204), described as the only advanced cancer therapy affordable for India's middle class of 300 million people.
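The figures quoted in the abstract can be cross-checked with simple arithmetic. The sketch below only reuses numbers stated above; the variable names are illustrative, and reading "<5%" as an upper bound of 0.05 is an assumption made for the comparison.

```python
LAKH = 1e5
CRORE = 1e7

# (USD, EUR, INR) triples exactly as quoted in the abstract.
quoted = {
    "global direct medical expenditure": (200e9, 184e9, 16.7 * LAKH * CRORE),
    "checkpoint inhibitor cost per patient-year": (150_000, 138_000, 1.25 * CRORE),
    "lifecycle savings per patient": (66_000, 60_720, 55.1 * LAKH),
    "2030 India target price": (2_395, 2_204, 2 * LAKH),
}

# Exchange rates implied by each quoted triple.
implied_rates = {
    name: (eur / usd, inr / usd) for name, (usd, eur, inr) in quoted.items()
}
for name, (eur_per_usd, inr_per_usd) in implied_rates.items():
    print(f"{name}: {eur_per_usd:.3f} EUR/USD, {inr_per_usd:.1f} INR/USD")

# Grade 3-4 adverse events: 62% for chemotherapy versus "<5%" for the vaccine.
chemo_rate = 0.62
vaccine_rate_upper = 0.05  # assumption: treat "<5%" as an upper bound
min_ratio = chemo_rate / vaccine_rate_upper  # ratio if the rate were exactly 5%
rate_for_15x = chemo_rate / 15               # vaccine rate that would give 15x

print(f"ratio at the 5% bound: {min_ratio:.1f}x")
print(f"vaccine rate implied by 15x: {rate_for_15x:.2%}")

# Binding-affinity AUC gain after 52 weeks versus static tools.
auc_gain = 0.921 - 0.841
print(f"absolute AUC gain: {auc_gain:.3f} ({auc_gain / 0.841:.1%} relative)")
```

The tri-currency figures imply roughly 0.92 EUR and about 83.3 to 83.5 INR per USD throughout, so they are internally consistent. The "15× safer" figure corresponds to a vaccine adverse-event rate of about 4.1%, which is compatible with the stated "<5%"; at exactly 5% the ratio would be 12.4×.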
Subject areas: Immunology; Cancer Biology; Artificial Intelligence and Machine Learning; Medical Genetics

Keywords: OncoSML, personalized cancer vaccine, neoantigen discovery, mRNA vaccine design, self-maintaining AI, precision immunotherapy, IIT Guwahati, cost-effectiveness, HLA binding affinity, somatic mutation analysis, pVACseq comparison, cancer treatment cost USD EUR INR, patient recovery analysis, genome biology, immune response cascade.

Additional Declarations

The authors declare no competing interests.
Posted: March 31st, 2026 (Version 1)
Corresponding author: Aditya Roy Bardhan, Indian Institute of Technology Guwahati (ORCID: https://orcid.org/0009-0004-9833-2034)
© Research Square 2026 | ISSN 2693-5015 (online)
