Row-by-row convolutional neural networks for Analog-AI

preprint OA: closed CC-BY-4.0
Authors
Pritish Narayanan (IBM Research - Almaden; corresponding author)
Stefano Ambrogio (IBM Research)
Charles Mackin (IBM Research; ORCID https://orcid.org/0000-0001-8413-5583)
Masatoshi Ishii (IBM Research; ORCID https://orcid.org/0000-0003-0794-7232)
Benjamin Killeen (IBM Research - Almaden)
Atsuya Okazaki (IBM Research; ORCID https://orcid.org/0000-0002-5275-5224)
Kohji Hosokawa (IBM Research)
Jose Luquin (IBM Research; ORCID https://orcid.org/0009-0005-9539-1386)
An Chen (IBM Research; ORCID https://orcid.org/0000-0001-8022-4431)
Alexander Friz (IBM Research)
Maritha Wang (Stanford University; ORCID https://orcid.org/0009-0002-6160-5030)
Sourjya Roy (IBM Research - Almaden)
Shubham Jain (IBM Research; ORCID https://orcid.org/0000-0002-2291-7712)
HsinYu Tsai (IBM Almaden Research Center; ORCID https://orcid.org/0000-0002-3971-097X)
Geoffrey Burr (IBM Research - Almaden; ORCID https://orcid.org/0000-0001-5717-2549)

Status: Posted on Research Square (ISSN 2693-5015, online) on October 30th, 2025 as Version 1, the latest preprint version. This is a preprint; it has not been peer reviewed by a journal.
DOI: https://doi.org/10.21203/rs.3.rs-7853227/v1
License: This work is licensed under a CC BY 4.0 License (https://creativecommons.org/licenses/by/4.0/)

Abstract
Analog AI implements the multiply-accumulate operations that dominate deep learning at the location of the weight data, offering orders of magnitude performance and energy improvements over conventional digital systems. However, translating these benefits to the Convolutional Neural Networks (CNNs) widely used in applications such as image and speech processing is non-trivial. Significant reuse of both weights and activations, together with the need to extensively rearrange activations between layers, requires micro-architectural solutions that span weight mapping strategy, activation positioning, pipelining between stages, and data transport. In this paper, we describe a weight-stationary Analog AI micro-architecture for CNNs, called Row-By-Row (RBR) CNNs, and its associated circuits and pipelines. We show a hardware demonstration of RBR-CNNs on a 14 nm Analog AI inference chip with Phase Change Memory (PCM), achieving software-equivalent accuracy on the MLPerf benchmark “Key Word Spotting” task using 4 RBR-CNN layers. We show that RBR-CNNs using Analog AI can achieve 7x–15x latency improvements vs. high-performance chips reported on MLPerf, while offering extremely high energy efficiency, at least two orders of magnitude better than published low-power edge chip results, indicating strong applicability in embedded and mobile settings.

Subject areas
Physical sciences/Engineering/Electrical and electronic engineering
Physical sciences/Nanoscience and technology

Additional Declarations
There is NO Competing Interest.
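As a rough illustration of the "weight-stationary" idea mentioned in the abstract, the sketch below computes a small 2D convolution in which the kernel weights are flattened once and never move, while input activations are streamed past them one output row at a time. This is a generic software analogy only, not the authors' RBR-CNN micro-architecture, circuits, or pipeline. The function names (`mac`, `conv2d_row_by_row`), the single-channel input, the square kernel, and the stride-1, no-padding setup are all assumptions made for illustration.

```python
from typing import List

Matrix = List[List[float]]


def mac(weights: List[float], activations: List[float]) -> float:
    """Multiply-accumulate: dot product of fixed weights with one activation patch."""
    return sum(w * a for w, a in zip(weights, activations))


def conv2d_row_by_row(image: Matrix, kernel: Matrix) -> Matrix:
    """Stride-1, no-padding 2D convolution (cross-correlation form).

    Hypothetical helper: assumes a square kernel and a single channel.
    The kernel is flattened once and stays fixed ("weight-stationary");
    each outer-loop iteration emits one complete output row.
    """
    k = len(kernel)
    flat_weights = [w for row in kernel for w in row]  # weights stay in place
    out_rows = len(image) - k + 1
    out_cols = len(image[0]) - k + 1

    output: Matrix = []
    for r in range(out_rows):
        window = image[r:r + k]  # the k input rows this output row depends on
        out_row = []
        for c in range(out_cols):
            # Gather the k x k activation patch in the same order as flat_weights.
            patch = [window[i][c + j] for i in range(k) for j in range(k)]
            out_row.append(mac(flat_weights, patch))
        output.append(out_row)
    return output


if __name__ == "__main__":
    image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    kernel = [[1, 0], [0, 1]]
    print(conv2d_row_by_row(image, kernel))
```

Note how neighbouring patches overlap, so each activation is read several times, and the same weights are used for every patch. This is the reuse of weights and activations that the abstract identifies as the difficulty in mapping CNNs onto hardware where the weights are fixed in place.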
