Real-Time Mobile Music Note and Instrument Recognition: A Unified Deep Learning vs. Classical ML Benchmark on MusicNet and NSynth

preprint OA: closed
Research Article · Research Square
Tarek Ammar, Aya Alaya, Tarek Barhoum

This is a preprint; it has not been peer reviewed by a journal.
DOI: https://doi.org/10.21203/rs.3.rs-8457560/v1
License: This work is licensed under a CC BY 4.0 License.
Status: Posted (Version 1)
Subject area: Artificial Intelligence and Machine Learning
Keywords: Real-Time Music Recognition, Mobile Audio Analysis, Musical Note Recognition, Musical Instrument Recognition, Deep Learning, Classical Machine Learning, Feature Engineering, LSTM, Convolutional Neural Networks, Logistic Regression, MusicNet, NSynth
Additional Declarations: The authors declare no competing interests.

Abstract

The rapid growth of artificial intelligence and mobile computing has enabled real-time music analysis; however, accurate musical note and instrument recognition on mobile devices remains challenging due to limited computational resources, noisy audio, and strict latency constraints. This paper presents Instrumaster, a unified mobile framework for real-time musical note and instrument recognition that integrates robust audio preprocessing, feature engineering, and efficient inference.

Musical note recognition is evaluated using LSTM, CNN, Feedforward Neural Network (FNN), and Logistic Regression models, while instrument recognition is performed using a Multi-Layer Perceptron (MLP). Experiments conducted on the MusicNet and NSynth datasets demonstrate that sequential models effectively capture temporal dependencies, while classical machine learning approaches can achieve competitive performance with significantly lower computational complexity. Notably, Logistic Regression achieves strong accuracy under limited data conditions, highlighting the importance of informed model selection for mobile deployment.

Overall, the results provide practical insights into accuracy–efficiency trade-offs and establish a reference framework for designing reliable and real-time mobile music recognition systems.
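To make the note-recognition task in the abstract concrete, the following is a minimal sketch of what a note label is and how a very simple non-learned baseline could assign one to an audio frame. It is not the Instrumaster pipeline: the abstract does not describe the features or label encoding used, so the MIDI note numbering, the 48 to 84 search range, the Goertzel-based energy measure, and all function names here are illustrative assumptions.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def midi_to_freq(midi: int) -> float:
    """Equal-temperament frequency in Hz, with A4 (MIDI 69) tuned to 440 Hz."""
    return 440.0 * 2.0 ** ((midi - 69) / 12.0)


def freq_to_midi(freq_hz: float) -> int:
    """Nearest MIDI note number for a frequency in Hz."""
    return round(69 + 12 * math.log2(freq_hz / 440.0))


def midi_to_name(midi: int) -> str:
    """Scientific pitch name, e.g. 69 -> 'A4', 60 -> 'C4'."""
    return f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"


def goertzel_power(frame, sample_rate, freq_hz):
    """Squared spectral magnitude of `frame` at one frequency (Goertzel recurrence)."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq_hz / sample_rate)
    s_prev, s_prev2 = 0.0, 0.0
    for x in frame:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2


def detect_note(frame, sample_rate, low_midi=48, high_midi=84):
    """Return the MIDI note in [low_midi, high_midi] with the most energy.

    Hypothetical baseline: it assumes a single, clean, monophonic tone and
    ignores harmonics, noise, and polyphony, which real recordings contain.
    """
    return max(
        range(low_midi, high_midi + 1),
        key=lambda m: goertzel_power(frame, sample_rate, midi_to_freq(m)),
    )


if __name__ == "__main__":
    sample_rate = 8000
    # Synthetic 440 Hz sine, about a quarter of a second long.
    frame = [math.sin(2 * math.pi * 440.0 * n / sample_rate) for n in range(2048)]
    print(midi_to_name(detect_note(frame, sample_rate)))
```

A baseline of this kind only handles an isolated pure tone. The learned models compared in the paper (LSTM, CNN, FNN, Logistic Regression) are aimed at the harder setting the abstract describes, with noisy audio and tight latency limits on a mobile device.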
Affiliations: Tarek Ammar (corresponding author), Aya Alaya, and Tarek Barhoum (corresponding author, ORCID: https://orcid.org/0000-0002-9132-6940), all at Arab International University.
Posted: January 7th, 2026
© Research Square 2026 | ISSN 2693-5015 (online)


Source provenance

europepmc
last seen: 2026-05-20T01:45:00.602351+00:00