Toward accurate quality assessment of machine-generated infrared video using Fréchet Video Distance

Research Article | Research Square

Huaizheng Lu, Shiwei Wang, Bin Huang, Erkang Chen, Yunfeng Sui

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-3853148/v1
This work is licensed under a CC BY 4.0 License.
Status: Posted (version 1)
Additional Declarations: No competing interests reported.

Abstract

Video generation methods have important implications for the fields of visual control and decision-making. Current research often uses the Fréchet Video Distance (FVD) as an evaluation metric for machine-generated video. However, FVD has not been thoroughly verified on non-visible light sources, especially the widely used infrared light. Therefore, there is an urgent need to use real infrared video data to test the reliability and generalization ability of FVD. Toward that goal, we first collected mainstream infrared video datasets and added various types of noise to synthesize infrared videos of different quality levels. Experiments on this synthetic dataset demonstrate the feasibility of using FVD to assess the quality of infrared video. Next, we trained the Pix2PixGAN network on a dataset of aligned visible and infrared image pairs. The trained model can generate videos of different quality levels in the infrared light domain. With the generated infrared videos, our experiments show that FVD is able to distinguish quality differences between infrared videos. In particular, we found that the lack of labeled infrared datasets and the relatively small size of infrared video datasets have a negative impact on calculating credible FVD values, because extracting effective infrared video features remains a difficult problem. Our experimental results suggest that infrared video features can be extracted using I3D models pre-trained on large-scale visible light video, and that the resulting FVD values are even better than those obtained with I3D models pre-trained directly on infrared video. Our study provides a basis for using FVD to evaluate the quality of machine-generated videos under multispectral conditions.

Keywords: Machine-generated infrared video, Fréchet Video Distance, I3D model, Correlation analysis, GAN
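To make the metric in the abstract concrete, the following is a minimal sketch of the Fréchet distance between two Gaussians fitted to feature sets, which is the quantity FVD computes on I3D video embeddings. It is not the authors' implementation. The names `frechet_distance` and `demo_noise_levels` are hypothetical, and the random vectors in the demo are only stand-ins for I3D features of real and noise-degraded infrared clips.

```python
import numpy as np


def frechet_distance(feats_a: np.ndarray, feats_b: np.ndarray) -> float:
    """Squared Frechet distance between Gaussians fitted to two feature sets.

    feats_a, feats_b: arrays of shape (num_videos, feature_dim), feature_dim >= 2.
    """
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)

    # Tr((cov_a cov_b)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_a @ cov_b. For positive semi-definite covariances
    # these are real and non-negative; clip small negative round-off.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * trace_sqrt)


def demo_noise_levels(sigmas=(0.5, 1.0, 2.0), seed=0):
    """Assumed toy setup: perturb placeholder features with growing noise."""
    rng = np.random.default_rng(seed)
    real = rng.normal(size=(500, 8))  # placeholder for real-video features
    distances = []
    for sigma in sigmas:
        degraded = real + sigma * rng.normal(size=real.shape)
        distances.append(frechet_distance(real, degraded))
    return distances


if __name__ == "__main__":
    for sigma, dist in zip((0.5, 1.0, 2.0), demo_noise_levels()):
        print(f"sigma={sigma}: distance={dist:.3f}")
```

In this toy setup the distance grows as the added noise grows, which mirrors the kind of monotonic behavior the synthetic-noise experiments look for. In practice the feature arrays would come from a pre-trained I3D network rather than from a random generator.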
Author affiliations: Huaizheng Lu, Shiwei Wang, Bin Huang (corresponding author), and Erkang Chen, Jimei University; Yunfeng Sui, Civil Aviation Administration of China.
Posted: January 16th, 2024 (version 1). The manuscript PDF was supplied by the authors.