Machine Learning Modeling for Multi-order Human Visual Motion Processing

Shin'ya Nishida, Zitang Sun, Yen-Ju Chen, Yung-hao Yang, Yuan Li

Affiliations: Shin'ya Nishida (corresponding author), Yen-Ju Chen, Yung-hao Yang, and Yuan Li: Graduate School of Informatics, Kyoto University. Zitang Sun: Kyoto University.
ORCID: Zitang Sun, https://orcid.org/0000-0003-2267-421X; Yen-Ju Chen, https://orcid.org/0000-0002-2038-1440

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-5631041/v1
This work is licensed under a CC BY 4.0 License.

Status: Published. Version 1 posted January 28th, 2025. Journal publication published 15 Jul, 2025 in Nature Machine Intelligence: https://doi.org/10.1038/s42256-025-01068-w

Abstract

Our research aims to develop machines that learn to perceive visual motion as do humans. While recent advances in computer vision (CV) have enabled DNN-based models to accurately estimate optical flow in naturalistic images, a significant disparity remains between CV models and the biological visual system in both architecture and behavior. This disparity includes humans' ability to perceive the motion of higher-order image features (second-order motion), which many CV models fail to capture because of their reliance on the intensity conservation law. Our model architecture mimics the cortical V1-MT motion processing pathway, utilizing a trainable motion energy sensor bank and a recurrent graph network. Supervised learning employing diverse naturalistic videos allows the model to replicate psychophysical and physiological findings about first-order (luminance-based) motion perception. For second-order motion, inspired by neuroscientific findings, the model includes an additional sensing pathway with nonlinear preprocessing before motion energy sensing, implemented using a simple multilayer 3D CNN block. When exploring how the brain acquired the ability to perceive second-order motion in natural environments, in which pure second-order signals are rare, we hypothesized that second-order mechanisms were critical when estimating robust object motion amidst optical fluctuations, such as highlights on glossy surfaces. We trained our dual-pathway model on novel motion datasets with varying material properties of moving objects. We found that training to estimate object motion from non-Lambertian materials naturally endowed the model with the capacity to perceive second-order motion, as can humans. The resulting model effectively aligns with biological systems while generalizing to both first- and second-order motion phenomena in natural scenes.

Subject areas: Biological sciences/Computational biology and bioinformatics/Machine learning; Biological sciences/Computational biology and bioinformatics/Computational models; Biological sciences/Computational biology and bioinformatics/Image processing

Keywords: Visual motion perception, Optical flow, Second-order motion, Graph neural network, Motion segmentation

Additional Declarations: There is NO Competing Interest.
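
The abstract describes a first-order pathway built on motion energy sensing and a second-order pathway that adds nonlinear preprocessing before motion energy sensing. The following is a minimal NumPy sketch of that idea, not the authors' model. It assumes a single hand-set quadrature pair of space-time Gabor filters in place of the trainable sensor bank, and full-wave rectification (a classic filter-rectify-filter simplification) in place of the multilayer 3D CNN block. All names, sizes, and frequencies (`motion_energy`, `opponent_index`, `N_X`, `FX`, and so on) are illustrative assumptions.

```python
import numpy as np

# Illustrative sizes and tuning; none of these values come from the paper.
N_X, N_T = 128, 64           # spatial samples and temporal frames
FX, FT = 1 / 16, 1 / 16      # preferred frequency (cycles per sample / per frame)

# Centered coordinates so left- and right-tuned sensors are mirror images in time.
x = np.arange(N_X) - (N_X - 1) / 2
t = np.arange(N_T) - (N_T - 1) / 2
X, T = np.meshgrid(x, t, indexing="ij")   # both have shape (N_X, N_T)
WINDOW = np.exp(-X**2 / (2 * 16.0**2) - T**2 / (2 * 8.0**2))  # Gaussian envelope


def motion_energy(stim, direction):
    """Energy of one quadrature pair of space-time Gabor filters.

    direction=+1 prefers rightward (+x) drift, -1 prefers leftward drift.
    """
    phase = 2 * np.pi * (FX * X - direction * FT * T)
    even = np.sum(stim * WINDOW * np.cos(phase))
    odd = np.sum(stim * WINDOW * np.sin(phase))
    return even**2 + odd**2


def opponent_index(stims):
    """(R - L) / (R + L) pooled over a list of stimuli; ranges from -1 to +1."""
    right = sum(motion_energy(s, +1) for s in stims)
    left = sum(motion_energy(s, -1) for s in stims)
    return (right - left) / (right + left)


def drifting_grating(direction=+1):
    """First-order stimulus: a luminance sinusoid drifting in the given direction."""
    return np.sin(2 * np.pi * (FX * X - direction * FT * T))


def contrast_modulated_noise(rng, depth=0.8):
    """Second-order stimulus: static binary noise with a rightward-drifting contrast envelope."""
    carrier = rng.choice([-1.0, 1.0], size=(N_X, 1))  # one value per position, constant in time
    envelope = 1.0 + depth * drifting_grating(+1)
    return carrier * envelope


def rectify(stim):
    """Hand-set nonlinear preprocessing: full-wave rectification, then mean removal."""
    r = np.abs(stim)
    return r - r.mean()


rng = np.random.default_rng(0)
second_order = [contrast_modulated_noise(rng) for _ in range(200)]

first_order_index = opponent_index([drifting_grating(+1)])
linear_index = opponent_index(second_order)
rectified_index = opponent_index([rectify(s) for s in second_order])

print(f"luminance grating, linear path:          {first_order_index:+.2f}")
print(f"contrast-modulated noise, linear path:   {linear_index:+.2f}")
print(f"contrast-modulated noise, rectified path: {rectified_index:+.2f}")
```

Under these assumptions, the linear path should report an index near +1 for the luminance grating but near 0 for the contrast-modulated noise, because the drifting envelope carries no consistent luminance-defined energy in either direction. Rectifying before motion energy sensing turns the envelope into a luminance-like signal, so the index should return to near +1. Energy is pooled over 200 random carriers because a single noise sample gives a noisy direction estimate.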
