FPGA-Accelerated Real-Time DCGANs via Xilinx DPUs and Vitis AI

Research Article

Amirhossein Sadr, Shayan Haghighat, Aida Pakniyat, Dara Rahmati, and Saeid Gorgin

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-7263274/v1
This work is licensed under a CC BY 4.0 License.
Status: Under Review
Version 1 posted 21

Abstract

Generative Adversarial Networks (GANs) produce high-quality images but are computationally intensive, especially due to transposed convolution operations, limiting their real-time performance on traditional hardware. To address this, we propose an optimized FPGA-based acceleration framework leveraging Xilinx Deep Learning Processing Units (DPUs) and the Vitis AI toolchain to enable real-time inference of Deep Convolutional GANs (DCGANs) for image reconstruction.
The proposed approach applies a two-stage quantization method that profiles layer-wise dynamic ranges and fine-tunes scale factors via host-side retraining. This enables quantization of both generator and discriminator from 32-bit floating-point to INT8 precision with minimal accuracy degradation. Additionally, structured pruning through the Vitis AI Optimizer removes redundant weights and filters, producing a compact model that fits entirely in on-chip memory and maximizes DPU efficiency. The architecture uses a multi-threaded ARM processor to manage preprocessing and DMA operations, while a lightweight scheduler in programmable logic sequences the execution of convolution kernels across multiple DPU cores. Double buffering is employed to overlap data movement with computation. Experimental results on a Zynq UltraScale+ MPSoC ZCU104 show over 105 FPS throughput, achieving up to 3.5× better performance and 7.3× better energy efficiency than GPU/CPU baselines, with Fréchet Inception Distance (FID) scores within 5% of floating-point models.

Keywords: Generative Adversarial Networks (GANs), FPGA Acceleration, Real-Time Systems, Image Reconstruction, Xilinx DPU, Vitis AI

Additional Declarations

No competing interests reported.
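The two-stage quantization summarized in the abstract (profile each layer's dynamic range, then adjust its scale factor) can be illustrated with a small, self-contained sketch. This is not the authors' implementation and does not call the Vitis AI quantizer. It assumes symmetric signed INT8 with power-of-two scale factors, and it replaces the host-side retraining step with a simple search over neighbouring scales that minimizes reconstruction error. All function names below are hypothetical.

```python
import math


def profile_max_abs(batches):
    """Stage 1: largest absolute value seen across calibration batches."""
    return max(abs(x) for batch in batches for x in batch)


def frac_bits_for_range(max_abs, bits=8):
    """Largest fractional-bit count such that max_abs still fits in a
    signed `bits`-bit integer (assumed power-of-two scale = 2**frac_bits)."""
    qmax = 2 ** (bits - 1) - 1
    if max_abs == 0:
        return bits - 1
    return math.floor(math.log2(qmax / max_abs))


def quantize(values, frac_bits, bits=8):
    """Scale, round, and saturate to the signed integer range."""
    scale = 2.0 ** frac_bits
    qmin, qmax = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return [min(qmax, max(qmin, round(v * scale))) for v in values]


def dequantize(q_values, frac_bits):
    """Map integers back to real values with the same scale."""
    scale = 2.0 ** frac_bits
    return [q / scale for q in q_values]


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def quant_error(values, frac_bits, bits=8):
    """Reconstruction error after a quantize/dequantize round trip."""
    return mse(values, dequantize(quantize(values, frac_bits, bits), frac_bits))


def refine_frac_bits(values, start, bits=8, window=1):
    """Stage 2 stand-in (assumption): instead of retraining, try the
    neighbouring scales and keep the one with the lowest error."""
    candidates = range(start - window, start + window + 1)
    return min(candidates, key=lambda f: quant_error(values, f, bits))


if __name__ == "__main__":
    # Hypothetical calibration activations for a single layer.
    calibration = [[0.5, -1.2, 0.03], [2.9, -0.7, 0.25]]
    flat = [x for batch in calibration for x in batch]
    initial = frac_bits_for_range(profile_max_abs(calibration))
    refined = refine_frac_bits(flat, initial)
    print("initial frac bits:", initial, "refined frac bits:", refined)
```

A power-of-two scale is chosen here because it lets rescaling be done with bit shifts in hardware; whether the paper's flow uses exactly this scheme is not stated in the abstract.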
Supplementary Files

AuthorBiography.docx

Editorial decision: Revision requested: 23 Sep, 2025
Reviews received at journal: 17 Sep, 2025
Reviews received at journal: 15 Sep, 2025
Reviews received at journal: 10 Sep, 2025
Reviews received at journal: 09 Sep, 2025
Reviews received at journal: 02 Sep, 2025
Reviewers agreed at journal: 22 Aug, 2025
Reviews received at journal: 22 Aug, 2025
Reviewers agreed at journal: 20 Aug, 2025
Reviewers agreed at journal: 20 Aug, 2025
Reviewers agreed at journal: 19 Aug, 2025
Reviewers agreed at journal: 18 Aug, 2025
Reviewers agreed at journal: 17 Aug, 2025
Reviewers agreed at journal: 17 Aug, 2025
Reviewers agreed at journal: 17 Aug, 2025
Reviewers agreed at journal: 15 Aug, 2025
Reviewers agreed at journal: 14 Aug, 2025
Reviewers invited by journal: 12 Aug, 2025
Editor assigned by journal: 12 Aug, 2025
Submission checks completed at journal: 04 Aug, 2025
First submitted to journal: 31 Jul, 2025
Authors and affiliations

Amirhossein Sadr, Shahid Beheshti University
Shayan Haghighat, Shahid Beheshti University
Aida Pakniyat, Institute for Research in Fundamental Sciences
Dara Rahmati (corresponding author), Shahid Beheshti University
Saeid Gorgin, Sungkyunkwan University

Submitted to: The Journal of Supercomputing (https://www.springer.com/journal/11227)
Posted: August 20th, 2025
License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)