HSOD: Hybrid Strategy Object Detection in Piping & Instrumentation Diagrams and Process Flow Diagrams | Research Square
Research Article
Feiyang Xu, Heng Zhang, Jianyu Han, Gaoming Zhang, Defu Lian, Le Wu, and Xin Li
This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-9252092/v1
This work is licensed under a CC BY 4.0 License
Status: Under Review
Version 1 posted 23 Apr, 2026
Abstract
Piping and Instrumentation Diagrams (P&IDs) and Process Flow Diagrams (PFDs) are core technical documents for the process industry, and they pose severe challenges for automated object detection: small industrial symbols (e.g., arrows, stream numbers, valves) occupy minimal image areas, leading to insufficient feature extraction; limited labeled samples result in poor model generalization under few-shot conditions; and dense symbol distribution with complex background interference causes high false detection and missed detection rates. To tackle these issues, this study constructs a dedicated multi-industry dataset consisting of 182 annotated P&IDs/PFDs, covering 308 core object categories and 118,723 annotation samples from the petrochemical and coal chemical industries. Furthermore, we propose a Hybrid Strategy Object Detection (HSOD) workflow that integrates parallel adaptive multi-scale detection and hybrid post-processing. Based on YOLOv12, the workflow employs three-level adaptive patch Slicing Aided Hyper Inference (SAHI) to enhance small object feature extraction, and designs a hybrid filtering strategy combining position-based confidence calibration, adaptive Non-Maximum Suppression (NMS), and quartile-based confidence thresholding to eliminate duplicate detections and false positives. Extensive experiments show that our HSOD workflow outperforms mainstream baselines (RT-DETR, YOLOv12, Faster R-CNN) on all metrics. Compared with the RT-DETR baseline, our method improves mAP@0.5 by 23.27%, mAP@0.5:0.95 by 33.09%, and the recall of small objects by 85.86%. Qualitative comparison with Gemini 3.1 Pro further validates the superiority of specialized detection models.
This work provides a reliable solution for the intelligent digital parsing of industrial drawings, supporting the intelligent operation, maintenance and digital twin construction of process industries.
Keywords: Industrial Drawing Analysis, Small Object Detection, YOLOv12, SAHI, P&ID, PFD
Additional Declarations
No competing interests reported.
Reviewers agreed at journal: 18 May, 2026
Reviewers invited by journal: 16 Apr, 2026
Editor assigned by journal: 29 Mar, 2026
Submission checks completed at journal: 29 Mar, 2026
First submitted to journal: 28 Mar, 2026
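The abstract names three stages: multi-scale sliced inference, NMS-based merging, and quartile-based confidence thresholding. The sketch below illustrates that general pipeline shape only; it is not the authors' implementation. The patch sizes, the overlap ratio, the fixed-threshold greedy NMS (used here in place of the paper's adaptive NMS, whose details the abstract does not give), and the Tukey lower-fence rule for the quartile filter are all assumptions made for illustration. The detector call itself is omitted; detections are assumed to arrive as dictionaries with hypothetical `box`, `score`, and `label` keys.

```python
from statistics import quantiles

# Hypothetical patch sizes for the three slicing levels; the abstract does not state them.
PATCH_SIZES = (256, 512, 1024)


def slice_windows(width, height, patch, overlap=0.2):
    """Return overlapping (x0, y0, x1, y1) windows that cover the whole image."""
    step = max(1, int(patch * (1 - overlap)))
    xs = list(range(0, max(width - patch, 0) + 1, step))
    ys = list(range(0, max(height - patch, 0) + 1, step))
    # Add a final window so the right and bottom edges are always covered.
    if xs[-1] + patch < width:
        xs.append(width - patch)
    if ys[-1] + patch < height:
        ys.append(height - patch)
    return [(x, y, min(x + patch, width), min(y + patch, height))
            for y in ys for x in xs]


def to_global(box, window):
    """Shift a patch-local box into full-image coordinates."""
    x0, y0 = window[0], window[1]
    return (box[0] + x0, box[1] + y0, box[2] + x0, box[3] + y0)


def iou(a, b):
    """Intersection over union of two (x0, y0, x1, y1) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(dets, iou_thr=0.5):
    """Plain greedy per-class NMS with a fixed threshold (not the paper's adaptive variant)."""
    kept = []
    for d in sorted(dets, key=lambda d: d["score"], reverse=True):
        if all(d["label"] != k["label"] or iou(d["box"], k["box"]) < iou_thr
               for k in kept):
            kept.append(d)
    return kept


def quartile_filter(dets):
    """Drop low-confidence outliers below Q1 - 1.5 * IQR (an assumed rule)."""
    scores = [d["score"] for d in dets]
    if len(scores) < 2:
        return list(dets)
    q1, _, q3 = quantiles(scores, n=4)
    fence = q1 - 1.5 * (q3 - q1)
    return [d for d in dets if d["score"] >= fence]
```

In use, one would generate windows for each entry of `PATCH_SIZES`, run the detector on every window, map the boxes back with `to_global`, and then pass the pooled detections through `nms` and `quartile_filter`. A data-driven fence is used rather than a fixed cutoff so that the threshold follows the score distribution of each drawing.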
Authors and affiliations
Feiyang Xu, Hefei University of Technology
Heng Zhang, University of Science and Technology of China
Jianyu Han, iFLYTEK Co., Ltd.
Gaoming Zhang, University of Science and Technology of China
Defu Lian, University of Science and Technology of China
Le Wu, Hefei University of Technology
Xin Li (corresponding author), University of Science and Technology of China
Submitted to journal: Journal on Image and Video Processing
© Research Square 2026 | ISSN 2693-5015 (online)