Improved PMGAT for Human-Object Interaction Detection through Graph Sampling-based Dynamic Edge Strategy (GraphSADES)

Research Square | Article

Jiali Zhang, Zuriahati Mohd Yunos, Habibollah Haron

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-4365163/v1
This work is licensed under a CC BY 4.0 License.
Status: Posted (Version 1)

Abstract

One of the challenges in training graph neural networks (GNNs) applied to human-object interaction (HOI) is the computational complexity associated with updating and aggregating the information of all connected nodes in dense graph data, which results in a long training time and poor convergence efficiency.
In particular, the parallel multi-head graph attention network (PMGAT), a GNN model, has achieved promising results in HOI detection by capturing the interactive associations between keypoints through local feature modules and multi-head graph attention mechanisms. To address the computational complexity of PMGAT, this study proposes a graph sampling-based dynamic edge strategy called GraphSADES. GraphSADES reduces computational complexity by dynamically sampling a subset of edges during training while maintaining the precision of the original model. First, an object-centered complete graph is constructed, node updates are performed to obtain the initial attention coefficients, and importance coefficients are computed. Then, a dynamic edge sampling strategy randomly selects a subset of edges for updating and aggregating information in each training step. In comparative experiments, with models trained using ResNet-50 and ViT-B/16 as backbone networks, GraphSADES-PMGAT maintains the precision of the PMGAT model. On the HICO-DET dataset, floating-point operations (FLOPs) decrease by 40.12% and 39.89% and training time decreases by 14.20% and 12.02% for the ResNet-50 and ViT-B/16 backbones, respectively, and the model converges earliest, after 180 epochs. On the V-COCO dataset, with the same backbone networks, FLOPs decrease by 39.81% and 39.56% and training time decreases by 10.26% and 16.91%, respectively, and the model converges earliest, after 165 epochs. Overall, GraphSADES-PMGAT maintains comparable precision while reducing FLOPs, resulting in a shorter training time and improved convergence efficiency compared with the PMGAT model. This work opens up new possibilities for efficient human-object interaction detection.
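The per-step edge sampling described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses a generic complete graph rather than the object-centered construction, plain single-head dot-product attention rather than PMGAT's multi-head modules, and uniform sampling without the importance coefficients. The function names and the `keep_ratio` parameter are hypothetical, and its value of 0.5 is arbitrary rather than taken from the paper.

```python
import math
import random


def complete_graph_edges(num_nodes):
    """All directed edges (src, dst) of a complete graph, without self-loops."""
    return [(s, d) for s in range(num_nodes) for d in range(num_nodes) if s != d]


def sample_edges(edges, keep_ratio, rng):
    """Uniformly sample a subset of edges; keep_ratio is a hypothetical knob."""
    k = max(1, round(keep_ratio * len(edges)))
    return rng.sample(edges, k)


def attention_aggregate(features, edges):
    """One dot-product attention aggregation step over the given edges only."""
    dim = len(features[0])
    incoming = {}
    for s, d in edges:
        incoming.setdefault(d, []).append(s)
    # Nodes with no sampled incoming edge keep their current feature.
    updated = [list(f) for f in features]
    for d, sources in incoming.items():
        scores = [sum(a * b for a, b in zip(features[s], features[d])) for s in sources]
        m = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(x - m) for x in scores]
        total = sum(weights)
        updated[d] = [
            sum(w / total * features[s][i] for w, s in zip(weights, sources))
            for i in range(dim)
        ]
    return updated


rng = random.Random(0)
features = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(6)]
all_edges = complete_graph_edges(len(features))

# A fresh edge subset is drawn at every (simulated) training step.
for step in range(3):
    sampled = sample_edges(all_edges, keep_ratio=0.5, rng=rng)
    features = attention_aggregate(features, sampled)
    print(step, len(sampled), "of", len(all_edges), "edges used")
```

In this sketch the aggregation work per step grows with the number of edges visited, so sampling half of the 30 directed edges means 15 attention scores are computed per step instead of 30. How that translates into the FLOPs figures reported above depends on details of the model that the abstract does not give.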
Subject areas: Physical sciences/Engineering; Physical sciences/Engineering/Electrical and electronic engineering
Keywords: Human–object interaction, Graph attention network, Computational complexity

Additional Declarations
No competing interests reported.

Affiliation (all authors): Universiti Teknologi Malaysia. Corresponding author: Jiali Zhang.
Posted: May 23rd, 2024
Research Square | ISSN 2693-5015 (online)