Research on causation analysis and risk severity prediction methods for road traffic accidents

Zhenfei Zhan (corresponding author), Qing Mao, Zhenxing Yi
Chongqing Jiaotong University

Preprint posted on Research Square (ISSN 2693-5015, online). This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-6956397/v1
This work is licensed under a CC BY 4.0 License.
Status: Posted. Version 1 posted July 3rd, 2025.

Abstract

Traffic accidents pose a serious threat to people's lives and property. Because accidents have many contributing causes, it is difficult to identify the most important ones, which in turn makes prevention difficult. This study therefore analyzes the UK Department for Transport's 2019 road traffic accident datasets.

To characterize how accident information is distributed across the person, vehicle, road, environment, and accident-form dimensions, the paper first preprocesses the data and fills in missing values using multiple imputation based on chained random forests. It then investigates the causes of vehicle accidents, strengthening the quantitative basis of scenario clustering by combining Bayesian-optimized random forest models, the Cramér's V correlation test, K-Modes clustering, and frequency statistics. This step identifies dangerous scenarios involving non-operating vehicles and passenger vehicles.

Next, the Apriori algorithm, constrained by the attribute values of specified constraint items, is used to mine associations across the person, vehicle, road, environment, accident-form, and time dimensions, with the aim of uncovering possible links between these dimensions and accident severity. Finally, an accident risk level prediction model is built with the LightGBM algorithm tuned by Bayesian optimization. The model was externally validated and interpreted using the 2022 UK traffic accident datasets, confirming that it generalizes effectively.

Subject areas: Physical sciences/Engineering; Physical sciences/Engineering/Mechanical engineering
Keywords: Traffic Safety, Data Mining, Accident Scenario Clustering, Association Rules, Risk Prediction Model

Additional Declarations
No competing interests reported.
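
As a small illustration of the Cramér's V correlation test named in the abstract, the sketch below computes the statistic for two categorical columns using only the Python standard library. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `cramers_v`, the column names, and the toy values are hypothetical and are not taken from the UK datasets, and no bias correction is applied.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Cramér's V between two equal-length sequences of categorical labels.

    V = sqrt(chi2 / (n * (k - 1))), where chi2 is Pearson's chi-squared
    statistic of the contingency table and k is the smaller of the number
    of distinct row and column categories. No bias correction is applied.
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n == 0:
        raise ValueError("need at least one observation")

    joint = Counter(zip(xs, ys))   # observed cell counts
    row_totals = Counter(xs)       # marginal counts of the first variable
    col_totals = Counter(ys)       # marginal counts of the second variable

    k = min(len(row_totals), len(col_totals))
    if k < 2:
        return 0.0  # a constant column carries no association

    chi2 = 0.0
    for r, r_count in row_totals.items():
        for c, c_count in col_totals.items():
            expected = r_count * c_count / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected

    return sqrt(chi2 / (n * (k - 1)))


# Hypothetical toy records (not drawn from the UK datasets).
light_conditions = ["dark", "dark", "daylight", "daylight", "dark", "daylight"]
severity = ["serious", "serious", "slight", "slight", "slight", "slight"]

if __name__ == "__main__":
    print(f"Cramér's V = {cramers_v(light_conditions, severity):.3f}")
```

Values near 0 indicate little association between the two categorical variables and values near 1 indicate strong association, so a statistic of this kind can help decide which attributes are informative enough to carry into clustering.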