Risky lane-changing behavior recognition based on Stacking ensemble learning on snowy and icy surfaces

doi:10.21203/rs.3.rs-4491572/v1

2024 · doi:10.21203/rs.3.rs-4491572/v1

preprint OA: closed

Xuejing DU, Wei Zhao

This is a preprint; it has not been peer reviewed by a journal.

https://doi.org/10.21203/rs.3.rs-4491572/v1

This work is licensed under a CC BY 4.0 License.

Status: Published. Journal publication published 20 Aug, 2024 in Scientific Reports. Preprint version 1.

Abstract

Risky lane-changing (LC) behavior adversely affects traffic safety, especially on snowy and icy surfaces. However, due to the particularity of the snowy and icy surfaces and the scarcity of data, research on risky lane-changing behavior (RLCB) under extreme scenarios is insufficient.
Therefore, this study presents a novel research framework for selecting key risk characterisation indicators (RCIs) and identifying RLCB on highways using driving simulation data on snowy and icy surfaces. A highway LC scenario was established on snowy and icy surfaces using a driving simulator, and 1200 sets of LC sample data were extracted. From the perspectives of parameter importance and correlation, 12 key RCIs with high importance and low inter-correlation were selected using the C4.5 decision tree algorithm and Pearson correlation analysis. The RLCB recognition model was developed using the Stacking ensemble learning method and then compared with traditional recognition algorithms. The results show that the recognition model based on Stacking ensemble learning is significantly more accurate than the traditional algorithms, with a recognition accuracy of 98.33%. This finding can provide the basis for developing LC warning systems for intelligent connected vehicles on snowy and icy surfaces.

Subject areas: Physical sciences/Engineering/Mechanical engineering; Physical sciences/Engineering/Civil engineering

Keywords: Traffic safety; snowy and icy surfaces; risky lane-changing behavior; risk characterisation indicators; ensemble learning

1. Introduction

Lane-changing (LC) and lane-keeping (LK) are fundamental driving behaviors [1]. However, LC is more complex than LK and has a greater impact on the normal driving of adjacent vehicles, potentially leading to traffic congestion and accidents [2]. Studies have shown that approximately 5% of traffic accidents and 7% of crash fatalities in the United States are related to LC behavior annually, causing at least 60,000 injuries [3].
In China, almost 14,500 traffic accidents were related to LC, contributing to 5.89% of all accidents, resulting in 2,600 deaths and economic losses exceeding 60 million yuan [ 4 ] . Improper LC behavior poses a serious threat to traffic safety [ 5 ] . In snowy and icy conditions, the accumulation of snow and ice significantly reduces the adhesion coefficient of the road surface, greatly increasing the risk of losing control and colliding during lane changes [ 6 ] . Therefore, accurate recognition of RLCB is crucial to mitigate accidents, improve driving efficiency, and ensure road traffic safety and stability. Extensive research has been conducted by scholars on identifying RLCB under normal driving conditions, employing various research methods, and achieving satisfactory results. Simplified physical models are used by some researchers to describe the movements of traffic participants and select specific indicators to characterize risks [ 7 ] . If the calculation results of these indicators exceed a certain threshold, risks are considered to be present. These indicators primarily consist of time indicators (Time to Collision (TTC) [8] , Time Headway (THW) [ 9 ] , Post Encroachment Time (PET) [ 10 ] , Modified Time-to-Collision(MTTC) [ 11 ] ), acceleration indicators (Deceleration Rate to Avoid a Crash (DRAC) [ 12 ] , Brake-Threat-Number (BTN) [13] , Steer-Threat Number (STN) [ 14 ] ), and distance indicators (Minimum Safety Distance (MSD) [ 15 ] , Stopping Distance Index (SDI) [16] ). Due to the gradual improvement of input parameters and the continuous updates of recognition models, the accuracy of recognition has been enhanced. Despite the high computational efficiency and recognition accuracy of these simplified models, they overlook the uncertainty of vehicle motion, thereby restricting the application scenarios and recognition accuracy of these evaluation methods [ 17 ] . 
The advancement of technology and the development of hardware devices have enabled the collection of an increasing number of feature indicators. In some studies, vehicle dynamic parameters and driving behavior parameters are collected through natural driving experiments or driving simulation tests to discriminate the vehicle’s operating status. Common vehicle motion control parameters include vehicle speed, acceleration, steering wheel angle, brake pedal, and accelerator pedal opening and closing angles. Driver behavior parameters include eye movement indicators (gaze points, blink frequency), electroencephalogram signals, and electrocardiogram signals. Under the premise of ensuring sufficient data, the key to identifying risky behaviors lies in the establishment of accurate and efficient recognition models. Machine learning algorithms, such as Support Vector Machines [18] , Random Forest Models [ 19 ] , Dynamic Bayesian Network Models [ 20 ] , Neural Networks [ 21 ] , XGBoost [ 22 ] , and K-Means clustering [ 23 ] , are extensively employed for identifying, predicting, and assessing RLCB owing to their capacity to handle large-scale datasets. Compared with simplified physical models, machine learning recognition models that consider vehicle motion parameters and driving behavior parameters can better conform to actual situations, thereby improving the accuracy and timeliness of recognition results. However, the effectiveness of these models in snowy and icy environments remains uncertain, with individual characteristics and shortcomings potentially causing variations in recognition results. Therefore, how to overcome the deficiencies of individual models and synthesize the advantages of multiple models to enhance the recognition accuracy of the overall model on snowy and icy surfaces is a pressing issue at present. In summary, research on identifying RLCB in normal weather conditions has been relatively comprehensive and mature. 
However, given the particularity of the snowy and icy surfaces and the scarcity of data, research on the risk of LC under such extreme scenarios has not received sufficient attention, and there is a research gap. Therefore, this study establishes a highway LC scenario on snowy and icy surfaces, conducts driving simulation experiments based on a driving simulator, and collects vehicle motion control parameters, driving behavior parameters, and vehicle interaction data during the LC process. Based on this, statistical and machine learning methods are employed to select key RCIs, and a framework based on Stacking ensemble learning is proposed to identify RLCB on snowy and icy surfaces.

2. Methodology

2.1 Driving simulation experiment

2.1.1 Experimental device

This experiment relies on a motion-based driving simulation test platform, which consists of a driving cabin, a 210° display screen, a 6-DOF motion platform, industrial computers, and a control platform, as shown in Fig. 1. The driving cabin can simulate translational or rotational motion in the x/y/z directions according to the real-time driving status of the vehicle, thereby reflecting the driving experience on snowy and icy surfaces and facilitating dynamic adjustments by drivers. The simulated experiment avoids the safety hazards and testing errors encountered in real vehicle tests on snowy and icy surfaces while gathering authentic and dependable test data.

2.1.2 Experiment scenario

The experimental scenario design is based on the SILAB 6.0 driving simulation software, which can set the road environment, weather, traffic flow status, and vehicle dynamics in detail. It achieves flexible control of the scenario through scripts, thus simulating highly realistic experimental scenes.
For this experiment, we selected the segment from Shuangcheng to Shijia on the Jingha Expressway. The segment has a total length of 33 kilometers and two lanes in each direction, each lane being 3.75 meters wide, with emergency lanes on both sides. The road mainly comprises straight sections and large-radius curves, all with radii greater than 1500 meters. The morphology of the snowy and icy road surface was chosen to replicate the snow-covered icy film commonly found on expressways, with a friction coefficient set to 0.45. Environment vehicles are configured using the Car and TruckMotorway components of the SILAB 6.0 software, with a traffic flow density of approximately 25 vehicles per kilometer, a moderate density level.

2.1.3 Experiment process

A total of 50 participants were recruited for this experiment, including 38 males and 12 females. All participants were aged between 25 and 55 years, had more than 5 years of driving experience, and held valid driver's licenses. Participants were required to be in good physical and mental condition, abstain from alcohol, and have no physiological or psychological issues affecting driving behavior during the experiment. Before the formal experiment, participants received operating instructions and completed 10 to 20 minutes of adaptive driving. The formal experiment began only after confirming that participants had no adverse reactions. Participants freely chose lanes based on their driving habits during the experiment. Each experiment lasted approximately 30 minutes and generated over a hundred thousand data records in real time.

2.1.4 Data acquisition

The driving simulator can collect various data in real time during experiments, including vehicle operating status data, driving behavior data, vehicle interaction data, and other information. Over 100 dynamic parameters were output at a recording frequency of 100 Hz.
After selection, this experiment chose 27 LC behavior parameters for the study, covering driver behavior (5 parameters), vehicle operating status (13 parameters), and vehicle interaction (9 parameters), as shown in Table 1. During the data acquisition process, noise interference may occur due to equipment or system issues. Therefore, a sliding average algorithm was used to filter the data to ensure data quality and accuracy.

Table 1 LC behavior parameters

| Number | Classification | Parameter | Description (Unit) |
|---|---|---|---|
| X1 | Driver behavior information | ${\delta }_{s}$ | Steering wheel angle; left turn is negative, right turn is positive (°) |
| X2 | Driver behavior information | ${\omega }_{s}$ | Steering wheel angle rate (rad/s) |
| X3 | Driver behavior information | ${f}_{s}$ | Steering wheel torque (N·m) |
| X4 | Driver behavior information | ${a}_{p}$ | Accelerator pedal position: unpressed, 0; fully depressed, 1 |
| X5 | Driver behavior information | ${b}_{p}$ | Brake pedal position: unpressed, 0; fully depressed, 1 |
| X6 | Vehicle operating status information | $v$ | Velocity (m/s) |
| X7 | Vehicle operating status information | ${v}_{x}$ | Lateral velocity (m/s) |
| X8 | Vehicle operating status information | ${a}_{x}$ | Lateral acceleration (m/s²) |
| X9 | Vehicle operating status information | ${v}_{y}$ | Longitudinal velocity (m/s) |
| X10 | Vehicle operating status information | ${a}_{y}$ | Longitudinal acceleration (m/s²) |
| X11 | Vehicle operating status information | ${d}_{lat}$ | Vehicle lateral position (m) |
| X12 | Vehicle operating status information | ${d}_{lon}$ | Vehicle longitudinal position (m) |
| X13 | Vehicle operating status information | ${\theta }_{roll}$ | Side roll angle (rad) |
| X14 | Vehicle operating status information | ${\beta }_{s}$ | Side slip angle (rad) |
| X15 | Vehicle operating status information | ${r}_{yaw}$ | Yaw rate (rad/s) |
| X16 | Vehicle operating status information | ${a}_{yaw}$ | Yaw angular acceleration (rad/s²) |
| X17 | Vehicle operating status information | ${r}_{roll}$ | Roll rate (rad/s) |
| X18 | Vehicle operating status information | ${a}_{roll}$ | Roll angular acceleration (rad/s²) |
| X19 | Vehicle interaction information | ${d}_{x}^{Sub\&Pre2}$ | Relative distance between $Sub$ and $Pre2$ (m) |
| X20 | Vehicle interaction information | ${v}_{y}^{Sub\&Pre2}$ | Velocity difference between $Sub$ and $Pre2$ (m/s) |
| X21 | Vehicle interaction information | ${d}_{x}^{Sub\&Fol2}$ | Relative distance between $Sub$ and $Fol2$ (m) |
| X22 | Vehicle interaction information | ${v}_{y}^{Sub\&Fol2}$ | Velocity difference between $Sub$ and $Fol2$ (m/s) |
| X23 | Vehicle interaction information | ${d}_{y}^{Sub\&Pre1}$ | Relative distance between $Sub$ and $Pre1$ (m) |
| X24 | Vehicle interaction information | ${v}_{y}^{Sub\&Pre1}$ | Velocity difference between $Sub$ and $Pre1$ (m/s) |
| X25 | Vehicle interaction information | ${d}_{y}^{Sub\&Fol1}$ | Relative distance between $Sub$ and $Fol1$ (m) |
| X26 | Vehicle interaction information | ${v}_{y}^{Sub\&Fol1}$ | Velocity difference between $Sub$ and $Fol1$ (m/s) |
| X27 | Vehicle interaction information | ${L}_{n}$ | Lane ID |

2.1.5 Data processing

(1) Definition of lane-changing

The lane-changing process, as defined in the literature, involves a vehicle moving from its original lane to the target lane, entailing a unidirectional continuous change in the vehicle's lateral position [24]. This study specifically focuses on LC scenarios involving five cars, as depicted in Fig. 2. Figures 2(a) and (b) depict LC occurring from left to right and from right to left, respectively. During this process, the LC vehicle (Sub) moves from its original lane to the target lane. Fol1 and Pre1 represent the following and preceding cars of Sub in the original lane. Fol2 and Pre2 represent the following and preceding cars of Sub in the target lane.

(2) LC sample extraction

The extraction of trajectory data during LC processes is closely related to determining the start and end points of LC. To obtain effective LC trajectory data, we formulated the following extraction rules. First, whether a vehicle has changed lanes is determined from the lane IDs: if the lane ID of the vehicle at time $t$ differs from that at the previous time step, the vehicle is inferred to have changed lanes. Second, intervals of monotonic lateral displacement are identified, and the start and end points of the lane change are marked. Finally, to exclude vehicles that do not intend to change lanes or have weak LC intentions, we introduce a constraint: at the start and end points of the LC, the distance from the vehicle's lateral coordinate to the lane lines on both sides should be greater than or equal to half of the vehicle width. Based on the above rules and constraints, a method to identify the start and end times of LC was proposed, as shown in Equations (1) and (2).
$$t_s=\begin{cases}\max\left\{t \mid t-t_l<0,\ \tfrac{1}{2}W_V\le x_t-x_l\le W_L-\tfrac{1}{2}W_V,\ x_{t-1}\ge x_t\right\}, & \text{Left LC}\\ \max\left\{t \mid t-t_l<0,\ \tfrac{1}{2}W_V\le x_l-x_t\le W_L-\tfrac{1}{2}W_V,\ x_{t-1}\le x_t\right\}, & \text{Right LC}\end{cases} \tag{1}$$

$$t_e=\begin{cases}\min\left\{t \mid t-t_l>0,\ \tfrac{1}{2}W_V\le x_l-x_t\le W_L-\tfrac{1}{2}W_V,\ x_t\ge x_{t+1}\right\}, & \text{Left LC}\\ \min\left\{t \mid t-t_l>0,\ \tfrac{1}{2}W_V\le x_t-x_l\le W_L-\tfrac{1}{2}W_V,\ x_t\le x_{t+1}\right\}, & \text{Right LC}\end{cases} \tag{2}$$

Where ${t}_{s}$ : the time of the LC start point; ${t}_{e}$ : the time of the LC end point; ${t}_{l}$ : the time when the vehicle crosses the lane line; ${x}_{t}$ : the lateral coordinate of the vehicle at time $t$ ; ${x}_{l}$ : the lateral coordinate of the vehicle at time ${t}_{l}$ ; ${W}_{L}$ : the width of the lane; ${W}_{V}$ : the width of the vehicle.

Based on the aforementioned extraction method, a total of 1200 effective LC samples were collected on snowy and icy surfaces. Among these samples, 754 were left LC samples and 446 were right LC samples.

2.2 Key RCIs selection

2.2.1 Definition of risky LC behavior

RLCB is an adverse event or potential hazard that may occur when making a lane change during the driving process [25]. The first type of RLCB is Lane-Changing Clearance Insufficient (LCCI), which means that the distance between the subject vehicle and surrounding vehicles falls below the safety threshold. In this situation, because the vehicles are close to each other and the adhesion coefficient of snowy and icy road surfaces is low, any unexpected braking or acceleration by surrounding vehicles may cause a collision. The second type of RLCB is Vehicle Lateral Instability (VLI). When the vehicle performs a lane change maneuver, the risk of lateral instability arises from an excessive steering wheel angle rate or high driving speed, coupled with the low adhesion coefficient of the snowy and icy road surface.
This risk solely concerns the vehicle's own operating status and does not involve conflict with surrounding vehicles.

2.2.2 Key RCIs selection

Due to the complexity of LC operations on snowy and icy surfaces, there exist numerous RCIs for LC. To reduce the computational complexity of the model, it is necessary to select indicators that accurately depict the characteristics of RLCB. The key RCIs were selected in this study using decision tree principles. The specific process is shown in Fig. 3.

(1) Statistical features of RCIs

The LC behavior parameters in Table 1 were discretized using a statistical value construction method to generate corresponding statistical indicators. Thirteen statistical features were extracted from the time series of each feature indicator: the mean, maximum, minimum, mode, 0.25 quantile, median, 0.75 quantile, variance, standard deviation, range, coefficient of variation, skewness, and kurtosis. The lane ID (X27), not being a time series parameter, was excluded. The remaining 26 LC behavior parameters were therefore taken as RCIs, giving 26 × 13 = 338 statistical features in total. The importance of each RCI was assessed by analyzing the importance of its statistical features.

(2) Importance analysis of RCIs

The importance of feature indicators was quantified using the information gain ratio of the C4.5 algorithm. A greater information gain ratio indicates a higher degree of orderliness of the resulting sets after splitting, leading to better classification and a higher importance of the feature indicator. The specific process of analyzing the importance of RCIs using the C4.5 algorithm is as follows. Firstly, the statistical features of the 1200 LC samples were taken as sample set $S$ . The uncertainty of the sample set $S$ was described by the information entropy $Ent\left(S\right)$ .
$$Ent(S)=-\sum_{i=1}^{n}p_i\log_2 p_i \tag{3}$$

Where ${p}_{i}$ is the probability of occurrence of the $i$ th class in the dataset $S$ . Then the sample set $S$ was partitioned into $n$ subsets based on feature $A$ . At this stage, the information entropy of the sample set was denoted by $Ent(S, A)$ .

$$Ent(S,A)=\sum_{j=1}^{n}\frac{|S_j|}{|S|}\times Ent(S_j) \tag{4}$$

Where ${S}_{j}$ is the set of samples that take value $j$ under feature $A$ . The information gain obtained by dividing sample set $S$ according to feature $A$ was $Gain(S,A)$ ; the higher the information gain, the greater the decrease in uncertainty of set $S$ .

$$Gain(S,A)=Ent(S)-Ent(S,A) \tag{5}$$

Because information gain is biased toward features with many distinct values, which can lead to overfitting, the information gain ratio $GainRatio(S,A)$ was introduced as a normalized form of information gain.

$$GainRatio(S,A)=\frac{Gain(S,A)}{SplitInf(A)} \tag{6}$$

$$SplitInf(A)=-\sum_{i=1}^{n}\frac{|S_i|}{|S|}\log_2\frac{|S_i|}{|S|} \tag{7}$$

Where $SplitInf\left(A\right)$ denotes the complexity of splitting the dataset by feature $A$ .

To obtain RCIs with higher importance, the information gain ratio rankings of all statistical features for each RCI were plotted as a box plot. Each box in the box plot represents the rankings of the information gain ratios of all statistical features corresponding to one RCI. RCIs with information gain ratio rankings exceeding 50% are relatively less important and are eliminated. The statistical features of the selected RCIs were used as split nodes of the decision tree to split the original data.
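The gain-ratio calculation in Equations (3) to (7) can be illustrated with a minimal Python sketch. It assumes that each statistical feature has already been discretized into a small number of bins (the binning itself is not specified in the text, so the bin labels below are hypothetical), and the function names `entropy` and `gain_ratio` are illustrative rather than taken from the study.

```python
import math
from collections import Counter


def entropy(labels):
    """Ent(S) = -sum_i p_i * log2(p_i), as in Equation (3)."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gain_ratio(feature_values, labels):
    """GainRatio(S, A) for a discrete feature A, following Equations (4)-(7).

    feature_values: one (already discretized) value of feature A per sample.
    labels: the class label of each sample (e.g. "LCN", "VLI", "LCCI").
    """
    n = len(labels)
    # Group the labels by the value the feature takes (the subsets S_j).
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)

    # Equation (4): weighted entropy after splitting on A.
    cond_entropy = sum(len(g) / n * entropy(g) for g in groups.values())
    # Equation (5): information gain.
    gain = entropy(labels) - cond_entropy
    # Equation (7): split information of A.
    split_info = -sum(
        len(g) / n * math.log2(len(g) / n) for g in groups.values()
    )
    # Equation (6): gain ratio; a feature with a single value carries no split.
    return gain / split_info if split_info > 0 else 0.0
```

On a toy sample, a feature whose bins separate the classes perfectly, such as `gain_ratio(["low", "low", "high", "high"], ["LCN", "LCN", "VLI", "VLI"])`, yields a ratio of 1, while a feature unrelated to the labels yields 0. Ranking all 338 statistical features by this value would give the ordering that the box plot summarizes.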
If the information gain ratio of the dataset after splitting is high and the level of confusion decreases significantly, the selected RCIs classify RLCB well and meet the requirements of the LC RCIs.

(3) Correlation analysis of RCIs

To reduce the computational complexity and improve the recognition efficiency, the input feature indicators of the model should be as independent as possible. In this study, the Pearson correlation coefficient was used to analyze the correlation of the screened RCIs, and feature parameters with strong correlation were excluded to avoid redundant information and multicollinearity problems. Hohlfelder [26] pointed out that a Pearson correlation coefficient greater than 0.6 indicates a strong correlation between two feature parameters, one of which should be excluded to reduce feature redundancy.

2.3 Framework of the Stacking ensemble learning

Ensemble learning is a machine learning method that improves classification and recognition performance by constructing multiple models and combining them [27]. The three common frameworks of ensemble learning are Bagging for parallel computation, Boosting for sequential computation, and Stacking for hierarchical integration. Compared with Bagging and Boosting, the Stacking algorithm can train multiple different classification models on the same dataset, thus taking advantage of the strengths of different models and improving the recognition accuracy [28].

2.3.1 Stacking ensemble learning model

Figure 4 illustrates the RLCB recognition model based on the Stacking ensemble algorithm. In this study, three base classifiers (SVM, RF, and Bi-LSTM) and a meta-classifier (LSTM) were used. The LC behaviors were classified into three types: normal lane-changing (LCN), VLI, and LCCI. The main steps of the model training process are as follows.
Step 1: The dataset ( $S$ ) is randomly divided in a 7:3 ratio, where 70% of the data is used as the training set ( $T$ ) and 30% as the testing set ( $R$ ). To prevent model overfitting, 5-fold cross-validation is performed within each model. Thus, the training set ( $T$ ) is randomly divided into 5 subsets ${T}_{1}$ , ${T}_{2}$ , ${T}_{3}$ , ${T}_{4}$ , and ${T}_{5}$ . When ${T}_{i}$ (i = 1, 2, 3, 4, 5) is used as the validation subset, ${T}_{i+}=T-{T}_{i}$ is used as the training subset.

Step 2: The base classifier SVM is trained on each training subset ${T}_{i+}$ , giving 5 trained SVM classifiers. Each validation subset ${T}_{i}$ is then input into the corresponding trained SVM, and 5 recognition results are obtained. The meta-classifier training set (A1) is obtained by combining the 5 recognition results.

Step 3: The test set ( $R$ ) is recognized using each of the 5 trained SVM base classifiers obtained in Step 2, giving 5 recognition results for the test set. The meta-classifier test set (B1) is then obtained as the arithmetic average of the 5 recognition results.

Step 4: Steps 1–3 are sequentially performed on the two base classifiers of RF and Bi-LSTM to obtain the meta-classifier training set (A2, A3, A4) and test set (B2, B3, B4).

Step 5: The meta-classifier is trained using the training set (A1, A2, A3, A4). Then, the test set (B1, B2, B3, B4) is used for recognition, and the final recognition results are output.

Three different types of base classifiers were constructed using a Stacking ensemble learning framework to generate preliminary recognition results. These preliminary recognition results were then fed into the meta-classifier to obtain the final recognition results. If one of the base classifiers makes errors when learning specific regions of the feature space, the meta-classifier can correct these errors by integrating the learning results of other base classifiers.
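The out-of-fold procedure in Steps 1 to 3 can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the base learner is a hypothetical nearest-centroid stand-in (`make_centroid_fit`) rather than the SVM, RF, or Bi-LSTM used in the study, each learner is assumed to output class probabilities, and the helper names `k_fold_indices`, `stack_features`, and `concat_meta` are invented for the example.

```python
import math
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k validation folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def stack_features(fit, X_train, y_train, X_test, n_classes, k=5, seed=0):
    """Build meta-features for one base learner.

    fit(X, y) must return a function predict_proba(x) -> list of class
    probabilities. Returns (A, B): A holds the out-of-fold predictions for
    the training set (Step 2), B the fold-averaged predictions for the test
    set (Step 3).
    """
    A = [None] * len(X_train)
    B = [[0.0] * n_classes for _ in X_test]
    for val_idx in k_fold_indices(len(X_train), k, seed):
        held_out = set(val_idx)
        tr_idx = [i for i in range(len(X_train)) if i not in held_out]
        predict_proba = fit([X_train[i] for i in tr_idx],
                            [y_train[i] for i in tr_idx])
        for i in val_idx:                      # predict the held-out fold
            A[i] = predict_proba(X_train[i])
        for row, x in zip(B, X_test):          # accumulate the test average
            for c, p in enumerate(predict_proba(x)):
                row[c] += p / k
    return A, B


def concat_meta(*blocks):
    """Join per-learner meta-features row by row, e.g. [A1 | A2 | A3]."""
    return [sum(rows, []) for rows in zip(*blocks)]


def make_centroid_fit(n_classes):
    """Hypothetical stand-in base learner: softmax over negative squared
    distances to the class centroids (assumes every class is present)."""
    def fit(X, y):
        dim = len(X[0])
        sums = [[0.0] * dim for _ in range(n_classes)]
        counts = [0] * n_classes
        for x, c in zip(X, y):
            counts[c] += 1
            for j, v in enumerate(x):
                sums[c][j] += v
        centroids = [[s / max(n, 1) for s in row]
                     for row, n in zip(sums, counts)]

        def predict_proba(x):
            scores = [-sum((a - b) ** 2 for a, b in zip(x, c))
                      for c in centroids]
            top = max(scores)
            exps = [math.exp(s - top) for s in scores]
            total = sum(exps)
            return [e / total for e in exps]
        return predict_proba
    return fit
```

Calling `stack_features` once per base learner and passing the resulting blocks to `concat_meta` gives the inputs on which a meta-classifier would be trained and evaluated (Step 5). Because every training sample is predicted by a model that never saw it, the meta-classifier learns from unbiased base-learner outputs.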
This approach effectively exploits the advantages of each base classifier and compensates for their limitations in different regions, thereby improving the overall model performance and generalization ability.

2.3.2 Base and meta classifier of ensemble learning

(1) SVM-Base Classifier: SVM is a powerful machine learning algorithm that is widely used to solve classification problems [29]. Due to its high reliability and computational efficiency, it is employed as one of the base classifiers. The two hyperparameters of the SVM algorithm are the penalty coefficient ( C ) and the kernel function parameter ( g ). These parameters control the complexity of the model and the curvature of the decision boundary, respectively. The values of C and g were set to 5.4 and 1.7, respectively.

(2) RF-Base Classifier: RF [30] is a bagging method that uses the CART decision tree as a weak learner for training. It employs random data sampling and feature selection, which helps to limit overfitting. Therefore, RF is selected as one of the base classifiers in this study. The hyperparameters of RF include n_estimators, max_depth, max_features, min_samples_split, min_samples_leaf, and max_leaf_nodes. These parameters were set to 105, 13, 4, 4, 2, and None, respectively.

(3) Bi-LSTM-Base Classifier: Bi-LSTM [31] is a deep learning model used for sequence data processing. It consists of two Long Short-Term Memory (LSTM) networks that process input sequences in forward and backward directions, respectively. RCIs are typically time-varying data, and RLCB can be better identified by analyzing the time-series variations of these indicators. Therefore, Bi-LSTM is selected as one of the base classifiers for ensemble learning. Reasonable setting of hyperparameters such as hidden_size, learning_rate, and dropout_rate is crucial to improve the performance and training effect of the Bi-LSTM model.
Thus, the values of hidden_size, learning_rate, and dropout_rate were set to 130, 0.05, and 0.4, respectively.

(4) LSTM-Meta Classifier: LSTM [32] is a type of recurrent neural network that is widely used for processing sequence data. Compared to traditional recurrent neural networks (RNNs), LSTM controls the input, forgetting, and output of information through gate mechanisms, addressing the issue of long-term dependencies. In addition, LSTM has strong robustness and generalization capabilities. Given these advantages, this study employs LSTM as the meta-classifier and sets the values of hidden_size, learning_rate, and dropout_rate to 120, 0.01, and 0.005, respectively.

2.3.3 Model evaluation

(1) Accuracy and error rates

Accuracy ( $Acc$ ) is the simplest model evaluation metric; it is the proportion of samples that a classifier classifies correctly. The formula is as follows:

$$Acc=\frac{N_{correct}}{N_{total}} \tag{8}$$

Where ${N}_{correct}$ is the number of correctly classified samples and ${N}_{total}$ is the total number of samples.

(2) Precision and recall

Precision measures the proportion of correctly predicted positive cases (True Positives) among all cases predicted as positive, while recall measures the proportion of correctly predicted positive cases among all actual positive cases [33]. The confusion matrix presents the results of binary classification intuitively, as shown in Table 2.

Table 2 Confusion matrix

| Actual classification | Predicted positive | Predicted negative |
|---|---|---|
| Positive | True Positive (TP) | False Negative (FN) |
| Negative | False Positive (FP) | True Negative (TN) |

Based on the confusion matrix, the calculation formulas for precision ( $P$ ) and recall ( $R$ ) are as follows:

$$P=\frac{TP}{TP+FP} \tag{9}$$

$$R=\frac{TP}{TP+FN} \tag{10}$$

(3) F1-score

The F1-score is the harmonic mean of the precision ( $P$ ) and recall ( $R$ ) of a binary classification model [34]. It combines the precision and recall capabilities of the classifier, making it a commonly used indicator to evaluate the performance of classification models. Generally, a higher F1-score indicates a better performance of the model.
The formula for F1-score is as follows:

$$F1=\frac{2\times P\times R}{P+R} \tag{11}$$

(4) ROC and AUC

The ROC curve depicts the trade-off between the True Positive Rate (TPR) and the False Positive Rate (FPR). The horizontal axis of the ROC curve represents the FPR, and the vertical axis represents the TPR, which is also known as recall [35]. The more the ROC curve bows toward the upper-left corner, the better the model performance. AUC (Area Under Curve) is the area under the ROC curve, which quantifies the model's performance. The AUC value is between 0 and 1, and a higher AUC value indicates better model performance. An AUC value greater than 0.7 is taken to indicate that the model performs well.

3. Results

3.1 Key RCIs

Based on 1200 sets of LC sample data, the information gain ratio of 338 statistical features derived from 26 RCIs was calculated. Then, the rankings of the information gain ratios of the 13 statistical features for each RCI were plotted as a box plot, as shown in Fig. 5. In the box plot, each box represents the rankings of the information gain ratios of the 13 statistical features corresponding to each RCI. The smaller the vertical coordinate value of the box, the lower the importance of the feature parameter. RCIs with information gain ratio rankings exceeding 50% were eliminated. Therefore, the RCIs selected after the importance assessment are as follows: X1( ${\delta }_{s}$ ), X2( ${\omega }_{s}$ ), X6( $v$ ), X7( ${v}_{x}$ ), X8( ${a}_{x}$ ), X9( ${v}_{y}$ ), X10( ${a}_{y}$ ), X13( ${\theta }_{roll}$ ), X14( ${\beta }_{s}$ ), X15( ${r}_{yaw}$ ), X19( ${d}_{x}^{Sub\&Pre2}$ ), X20( ${v}_{y}^{Sub\&Pre2}$ ), X21( ${d}_{x}^{Sub\&Fol2}$ ), X22( ${v}_{y}^{Sub\&Fol2}$ ), X23( ${d}_{y}^{Sub\&Pre1}$ ). The statistical features of the selected RCIs were used as the splitting nodes of the decision tree to split the original data.
The information gain ratio of the split data set was significantly increased, and the level of confusion was noticeably reduced. This indicates that the 15 RCIs have a good classification effect in identifying RLCB, meeting the requirements of LC RCIs. To identify key RCIs with high importance and low intercorrelation, Pearson correlation analysis was conducted on the 15 RCIs, as shown in Fig. 6. At a significance level ( $P$ ) below 0.05, the RCIs X1( ${\delta }_{s}$ ), X6( $v$ ), and X13( ${\theta }_{roll}$ ), which had correlation coefficients greater than 0.6, were eliminated. Therefore, by integrating the importance of RCIs and Pearson correlation analysis, ${\omega }_{s}$ , ${v}_{x}$ , ${a}_{x}$ , ${v}_{y}$ , ${a}_{y}$ , ${\beta }_{s}$ , ${r}_{yaw}$ , ${d}_{x}^{Sub\&Pre2}$ , ${v}_{y}^{Sub\&Pre2}$ , ${d}_{x}^{Sub\&Fol2}$ , ${v}_{y}^{Sub\&Fol2}$ , ${d}_{y}^{Sub\&Pre1}$ were identified as the 12 key RCIs, providing input parameters for the subsequent training of the RLCB recognition model.

3.2 Optimal time window length

The length of the LC time window represents the time series length of the model parameter input, and selecting an appropriate LC time window is crucial for accurately identifying RLCB. If the time window is too large, the sequence data may contain multiple behavioral features, reducing the model's accuracy and efficiency. Conversely, if the time window is too small, critical feature parameters may be excluded, severely reducing the model's recognition accuracy. Based on the LC dataset, the duration of LC on snowy and icy surfaces was determined by statistical methods to be approximately 4.2 s to 11.6 s. In this study, the optimal time window length was identified by comparing the RLCB recognition accuracy of neural networks with different time window lengths. As shown in Fig. 7, the highest recognition accuracy was achieved when the time window length was 7.6 s.
Therefore, 7.6 s was adopted as the optimal time window length for LC on snowy and icy surfaces.

3.3 Identification of RLCB and model evaluation

Manual screening of the dataset of 1200 LC samples yielded 643 LCN samples, 323 LCCI samples, and 234 VLI samples. All data samples were randomly divided in a 7:3 ratio, with 70% used as the training set and 30% as the test set. Using 5-fold cross-validation, the training dataset was randomly divided into 5 subsets. To ensure better recognition, the original time series data were processed to maintain a time series length of 7.6 s for model input. A Stacking ensemble learning model was developed for RLCB recognition on snowy and icy surfaces. The model input parameters are ${\omega }_{s}$ , ${v}_{x}$ , ${a}_{x}$ , ${v}_{y}$ , ${a}_{y}$ , ${\beta }_{s}$ , ${r}_{yaw}$ , ${d}_{x}^{Sub\&Pre2}$ , ${v}_{y}^{Sub\&Pre2}$ , ${d}_{x}^{Sub\&Fol2}$ , ${v}_{y}^{Sub\&Fol2}$ , and ${d}_{y}^{Sub\&Pre1}$ . Accuracy ( $Acc$ ), precision ( $P$ ), recall ( $R$ ), F1-score ( $F1$ ), $ROC$ , and the confusion matrix were used to evaluate the recognition performance of the model, and the results were compared with those of the traditional SVM, RF, and Bi-LSTM models. The evaluation results of each model are shown in Table 3 .

Table 3 Evaluation results of each model

| Model | $Acc$ (%) | LCN $P$ (%) | LCN $R$ (%) | LCN $F1$ (%) | VLI $P$ (%) | VLI $R$ (%) | VLI $F1$ (%) | LCCI $P$ (%) | LCCI $R$ (%) | LCCI $F1$ (%) |
|---|---|---|---|---|---|---|---|---|---|---|
| SVM | 92.22 | 96.34 | 93.88 | 95.09 | 88.57 | 91.18 | 89.86 | 88.66 | 89.58 | 89.12 |
| RF | 93.89 | 93.89 | 95.36 | 94.63 | 91.43 | 90.14 | 90.78 | 91.75 | 93.68 | 92.70 |
| Bi-LSTM | 96.67 | 98.45 | 96.94 | 97.69 | 92.86 | 95.59 | 94.21 | 95.88 | 97.89 | 96.87 |
| Stacking | 98.33 | 99.48 | 98.46 | 98.97 | 95.71 | 97.1 | 96.40 | 97.94 | 98.96 | 98.45 |

The results indicate that all four recognition models identify LC behaviors with high accuracy. Among them, the Stacking ensemble learning model achieved the highest accuracy, reaching 98.33%.
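The data flow described above (a 7:3 split followed by 5-fold cross-validation on the training set) is what a Stacking model typically uses to build inputs for its meta-learner. The sketch below illustrates that flow only. The two toy base learners stand in for SVM, RF, and Bi-LSTM, the scalar features are synthetic, and every name (`train_test_split`, `k_folds`, `out_of_fold_predictions`, `fit_majority`, `fit_nearest_mean`) is hypothetical; it is not the authors' implementation.

```python
import random


def train_test_split(n, test_ratio=0.3, seed=0):
    """Shuffle sample indices and split them into (train, test)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = round(n * test_ratio)
    return idx[n_test:], idx[:n_test]


def k_folds(indices, k=5):
    """Partition indices into k folds of near-equal size."""
    return [indices[i::k] for i in range(k)]


def out_of_fold_predictions(X, y, train_idx, base_learners, k=5):
    """Meta-features: each base learner predicts every training sample
    using a model fitted on the other k-1 folds only."""
    folds = k_folds(train_idx, k)
    meta = {i: [None] * len(base_learners) for i in train_idx}
    for held_out in folds:
        fit_idx = [i for f in folds if f is not held_out for i in f]
        for j, fit in enumerate(base_learners):
            predict = fit([X[i] for i in fit_idx], [y[i] for i in fit_idx])
            for i in held_out:
                meta[i][j] = predict(X[i])
    return meta


def fit_majority(X, y):
    """Toy learner: always predicts the most frequent training label."""
    label = max(set(y), key=y.count)
    return lambda x: label


def fit_nearest_mean(X, y):
    """Toy learner: predicts the class whose mean feature value is closest."""
    means = {c: sum(x for x, t in zip(X, y) if t == c) / y.count(c)
             for c in set(y)}
    return lambda x: min(means, key=lambda c: abs(x - means[c]))


# Synthetic one-dimensional data, for illustration only.
X = [float(i % 10) for i in range(1200)]
y = ["risky" if x >= 5 else "normal" for x in X]
train_idx, test_idx = train_test_split(len(X))
meta = out_of_fold_predictions(X, y, train_idx, [fit_majority, fit_nearest_mean])
print(len(train_idx), len(test_idx), len(meta))
```

Each row of `meta` holds one prediction per base learner for a training sample that the learner did not see during fitting. A meta-learner trained on these rows therefore learns how to combine the base models without being misled by their training-set fit.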
Among the three base classifiers, the accuracies of the Bi-LSTM, RF, and SVM models were 96.67%, 93.89%, and 92.22%, respectively. Notably, the SVM model had the lowest recognition accuracy, trailing the Stacking model by 6.11 percentage points. Additionally, for each of the three LC behaviors (LCN, VLI, and LCCI), the proposed model outperformed the three base classifiers in terms of $P$ , $R$ , and $F1$ . This demonstrates that, for driving simulation data on snowy and icy surfaces, the Stacking ensemble learning model has the best overall performance in identifying LC behaviors.

Figure 8 shows the confusion matrices of the SVM, RF, Bi-LSTM, and Stacking models in identifying LC behaviors. These confusion matrices provide details of the recognition performance of the four models. The diagonal values represent the number of correctly predicted samples, and higher values indicate better classification performance. The proposed model identifies all three types of LC behavior with high accuracy and performs best among the four models. The off-diagonal elements represent the number of misclassified samples, and lower values correspond to better model performance. The number of LCN samples misclassified as VLI or LCCI is lower in the proposed model than in the other three models. Similarly, the number of VLI samples misclassified as LCN or LCCI is also lower.

Figure 9 illustrates the ROC curves of the four models. When the false positive rate is 5%, the corresponding true positive rate is nearly 95%. This demonstrates that the four models have high recognition rates when the time series length is 7.6 s, and the appropriate model can be selected according to actual needs. A larger area under the curve (AUC) indicates better model recognition performance. As depicted in Fig. 9 , the AUC values of the four models rank as follows: Stacking > Bi-LSTM > RF > SVM.
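The relationship between the ROC curve and the AUC can be made concrete with a short sketch. It assumes a binary, one-vs-rest view (one LC class treated as positive and the rest as negative) and uses trapezoidal integration. How the ROC curves in Fig. 9 were computed for the three-class problem is not stated, so this is an illustration of the metric and not a reproduction of the reported curves. The function names and the toy scores are hypothetical.

```python
def roc_points(scores, labels):
    """(FPR, TPR) points obtained by sweeping a threshold over the scores.

    labels: 1 for the positive class, 0 otherwise. Both classes must be
    present.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda p: -p[0])
    points, tp, fp = [(0.0, 0.0)], 0, 0
    for i, (score, label) in enumerate(ranked):
        tp += label
        fp += 1 - label
        # Emit a point only after the last sample of a tied score group.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Trapezoidal area under (FPR, TPR) points sorted by FPR."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Toy scores: the second-ranked sample is a negative, so one of the four
# positive-negative pairs is ordered incorrectly.
scores = [0.9, 0.8, 0.7, 0.6]
labels = [1, 0, 1, 0]
print(auc(roc_points(scores, labels)))
```

The AUC equals the fraction of positive-negative pairs that the model ranks in the correct order, which is why a value of 0.5 corresponds to random ranking and a value of 1 to perfect separation.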
This indicates that the proposed model has the best recognition performance in identifying LC behaviors.

4. Discussion and conclusion

In this study, simulated highway driving experiments on snowy and icy surfaces were conducted using a driving simulator, and a large amount of LC data was collected. The key RCIs of LC behavior were extracted, and an RLCB recognition model applicable to snowy and icy surfaces was established. The conclusions are summarized as follows:

Based on the characteristics of LC behavior on snowy and icy surfaces, two risky LC behaviors were defined: vehicle lateral instability (VLI) and insufficient LC clearance (LCCI). A total of 1200 LC samples were extracted from the driving simulation data on snowy and icy surfaces, including 643 LCN samples, 323 LCCI samples, and 234 VLI samples. The optimal LC time window length on snowy and icy surfaces was determined to be 7.6 s.

The importance of 26 RCIs during the LC process on snowy and icy surfaces was assessed using the C4.5 algorithm. Subsequently, Pearson correlation analysis was applied to eliminate highly correlated RCIs. Finally, 12 key RCIs, ${\omega }_{s}$ , ${v}_{x}$ , ${a}_{x}$ , ${v}_{y}$ , ${a}_{y}$ , ${\beta }_{s}$ , ${r}_{yaw}$ , ${d}_{x}^{Sub\&Pre2}$ , ${v}_{y}^{Sub\&Pre2}$ , ${d}_{x}^{Sub\&Fol2}$ , ${v}_{y}^{Sub\&Fol2}$ , and ${d}_{y}^{Sub\&Pre1}$ , were extracted and used as input variables for identifying RLCB on snowy and icy surfaces.

An RLCB recognition model was developed based on Stacking ensemble learning, integrating three algorithms: SVM, RF, and Bi-LSTM. The results indicate that the Stacking ensemble learning model has the highest recognition rate for RLCB, with an overall recognition accuracy of 98.33%. Moreover, the model demonstrates superior performance in terms of $P$ , $R$ , and $F1$ values.
This indicates that the Stacking ensemble learning algorithm is better suited to identifying RLCB from LC trajectory data extracted from driving simulators.

It is crucial to accurately identify RLCB on snowy and icy surfaces. For micro-traffic systems, identifying RLCB on snowy and icy surfaces can effectively prevent traffic accidents and enhance overall transportation safety. For macro-traffic systems, an in-depth understanding of the vehicle's operating status can help traffic management departments make effective decisions, reducing traffic congestion and accident rates. For ADAS systems, accurately identifying RLCB on snowy and icy surfaces can assist drivers in taking timely safety measures, thereby enhancing driving safety under adverse road conditions. For V2X systems, sharing and transmitting the recognition results of RLCB among vehicles can help surrounding vehicles make driving decisions in advance, further improving the safety performance of autonomous driving.

This study investigates RLCB recognition using driving simulation data. In future studies, we will conduct real vehicle experiments at professional testing sites to improve data reliability and model accuracy. Furthermore, in the context of intelligent connected vehicles, the Stacking ensemble learning algorithm and the optimization of its parameters should be developed further. Additional factors, such as vehicle dynamic parameters, vehicle interaction data, and traffic environment data, should be considered to achieve more accurate recognition of RLCB.

Declarations

Acknowledgments

This research was supported by the Fundamental Research Funds for the Central Universities (No. 2572023AW66), the National Science Fund for Young Scholars (No. 51108068), and the Key Research and Development Guiding Projects of Heilongjiang Province (No. GZ20220027).

Author contributions

X.D. and W.Z. provided the study conception and design; W.Z. conducted software simulation, data collection and processing; X.D. analyzed the results; X.D. and W.Z. wrote the manuscript; W.Z. typeset the manuscript. All authors reviewed the manuscript and provided funding support for this study.

Competing interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and publication of this article.

Data Availability

The data that support the findings of this study are available from the corresponding author upon reasonable request.

References

Xue, Q.W., Wang, K., Lu, J.J., Xing, Y.Y., Gu, X. & Zhang, M. An improved risk estimation model of lane change using naturalistic vehicle trajectories. Journal of Transportation Safety & Security. 15(10), 963–986 (2023).
Yang, M., Wang, X. & Quddus, M. Examining lane change gap acceptance, duration and impact using naturalistic driving data. Transportation Research Part C: Emerging Technologies. 104, 317–331 (2019).
Fitch, G., Lee, S., Klauer, S., Hankey, J., Sudweeks, J. & Dingus, T. Analysis of lane-change crashes and near-crashes. Report No. DOT HS 811 147; National Highway Traffic Safety Administration: Washington, DC, USA, 2009.
Traffic Administration Bureau of the Ministry of Public Security of the People’s Republic of China. Annual Report on Road Traffic Accident Statistics of the People’s Republic of China, Jiangsu Wuxi, China, 2020.
Fan, P.C., Guo, J.Q., Wang, Y.B. & Wijnands Jasper, S. A hybrid deep learning approach for driver anomalous lane changing identification. Accident Analysis and Prevention. 171, 106661 (2022).
Wang, Z.Y., Tan, D., Ge, G., et al. Optimal trajectory planning and control for automatic lane change of in wheel motor driving vehicles on snow and ice roads. Automatic Control and Computer Sciences. 54, 432–445 (2020).
Li, Z.N., Huang, X.H., Mu, T. & Wang, J. Attention-based lane change and crash risk prediction model in highways. IEEE Transactions on Intelligent Transportation Systems. 23(12), 22909–22922 (2022).
Cohen, S. Application of relaxation procedure for lane changing in microscopic simulation models. Transportation Research Record. 1883(1), 50–58 (2004).
Qi, W., Wang, W., Shen, B. & Wu, J. A modified post encroachment time model of urban road merging area based on lane-change characteristics. IEEE Access. 8, 72835–72846 (2020).
Yang, J., Lee, J., Mao, S. & Hu, J. Dynamic safety estimation of airport pick-up area based on video trajectory data. IEEE Transactions on Intelligent Transportation Systems. 25(2), 1774–1786 (2024).
Fu, C.Y. & Sayed, T. Comparison of threshold determination methods for the deceleration rate to avoid a crash (DRAC)-based crash estimation. Accident Analysis & Prevention. 153, 106051 (2021).
Nilsson, J., Ödblom, A.C.E. & Fredriksson, J. Worst-case analysis of automotive collision avoidance systems. IEEE Transactions on Vehicular Technology. 65(4), 1899–1911 (2016).
Tyagi, I. Threat assessment for avoiding collisions with perpendicular vehicles at intersections. Proceedings of the 2021 IEEE International Conference on Electro Information Technology (EIT), May 14–15, 2021 Mt. Pleasant, MI, USA. Piscataway NJ: IEEE, c2021: 184–187.
Winkler, S., Werneke, J. & Vollrath, M. Timing of early warning stages in a multi stage collision warning system: drivers’ evaluation depending on situational influences. Transportation Research Part F: Traffic Psychology and Behavior. 36, 57–68 (2016).
Park, H.J., Oh, C., Moon, J. & Kim, S. Development of a lane change risk index using vehicle trajectory data. Accident Analysis and Prevention. 110, 1–8 (2018).
Tao, L. et al. Collision risk assessment service for connected vehicles: leveraging vehicular state and motion uncertainties. IEEE Internet of Things Journal. 8(14), 11548–11560 (2021).
Feng, Y.Y. & Yan, X.L. Support vector machine based lane-changing behavior recognition and lateral trajectory prediction. Computational Intelligence and Neuroscience. 2022, 1–9 (2022).
Sun, Q.Y. et al. Lane change strategy analysis and recognition for intelligent driving systems based on random forest. Expert Systems with Applications. 186, 115781 (2021).
Zhu, J., Ma, Y. & Lou, Y. Multi-vehicle interaction safety of connected automated vehicles in merging area: a real-time risk assessment approach. Accident Analysis & Prevention. 166, 106546 (2022).
Peng, J.S. & Shao, Y.M. Intelligent method for identifying driving risk based on V2V multisource big data. Complexity. 2018, 1801273 (2018).
Gu, X.P., Han, Y.P. & Yu, J.F. A novel lane-changing decision model for autonomous vehicles based on deep autoencoder network and XGBoost. IEEE Access. 8, 9846–9863 (2020).
Chen, T.Y., Shi, X.P. & Wong, Y.D. A lane-changing risk profile analysis method based on time-series clustering. Physica A. 565, 125567 (2021).
Prajwal, C., Venkatesan, K. & Gowri, A. Understanding the mechanism of lane changing process and dynamics using microscopic traffic data. Physica A: Statistical Mechanics and its Applications. 593, 126981 (2022).
Wu, J.B., Chen, X.H., Bie, Y.M. & Zhou, W. A co-evolutionary lane-changing trajectory planning method for automated vehicles based on the instantaneous risk identification. Accident Analysis & Prevention. 180, 106907 (2023).
Hohlfelder, B. et al. Prospective evaluation of a bivalirudin to warfarin transition nomogram. Journal of Thrombosis Thrombolysis. 43, 498–504 (2017).
Ding, W.M. & Wu, S.L. A cross-entropy based stacking method in ensemble learning. Journal of Intelligent & Fuzzy Systems. 39(3), 4677–4688 (2020).
Agarwal, S. & Chowdary, C.R. A-stacking and a-bagging: adaptive versions of ensemble learning algorithms for spoof fingerprint detection. Expert Systems with Applications. 146, 113160 (2019).
Cortes, C. & Vapnik, V. Support-vector networks. Machine Learning. 20, 273–297 (1995).
Schonlau, M. & Zou, R.Y. The random forest algorithm for statistical learning. The Stata Journal: Promoting communications on statistics and Stata. 20(1), 3–29 (2020).
Graves, A., Fernández, S. & Schmidhuber, J. Bidirectional LSTM networks for improved phoneme classification and recognition. Artificial Neural Networks: Formal Models and Their Applications. 3697, 799–804 (2005).
Hochreiter, S. & Schmidhuber, J. Long short-term memory. Neural Computation. 9(8), 1735–1780 (1997).
Pasi, F. & Radu, M.I. Soft precision and recall. Pattern Recognition Letters. 167, 115–121 (2023).
Pinto, L., Gopalan, S. & Balasubramaniam, P. Quantification on the generalization performance of deep neural network with tychonoff separation axioms. Information Sciences. 608, 262–285 (2022).
Jonathan, A.C. ROC curves and nonrandom data. Pattern Recognition Letters. 85, 35–41 (2017).

Additional Declarations

No competing interests reported.

Editorial decision: Revision requested, 12 Jul, 2024
Reviews received at journal, 25 Jun, 2024
Reviewers agreed at journal, 14 Jun, 2024
Reviewers agreed at journal, 14 Jun, 2024
Reviewers invited by journal, 02 Jun, 2024
Editor assigned by journal, 02 Jun, 2024
Editor invited by journal, 31 May, 2024
Submission checks completed at journal, 30 May, 2024
First submitted to journal, 28 May, 2024
1\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eX6\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\" morerows=\"12\" rowspan=\"13\"\u003e \u003cp\u003eVehicle operating\u003c/p\u003e \u003cp\u003estatus information\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$v\$\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eVelocity (m/s)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eX7\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${v}_{x}\$\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eLateral velocity (m/s)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eX8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${a}_{x}\$\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eLateral acceleration (m/s\u003csup\u003e2\u003c/sup\u003e)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eX9\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${v}_{y}\$\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eLongitudinal velocity 
(m/s)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eX10\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${a}_{y}\$\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eLongitudinal acceleration (m/s\u003csup\u003e2\u003c/sup\u003e)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eX11\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${d}_{lat}\$\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eVehicle lateral position (m)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eX12\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${d}_{lon}\$\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eVehicle longitudinal position (m)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eX13\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${\\theta }_{roll}\$\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eSide Roll Angle (rad)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eX14\u003c/p\u003e 
\u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${\\beta }_{s}\$\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eSide Slip Angle (rad)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eX15\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${r}_{yaw}\$\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eYaw Rate (rad/s)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eX16\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${a}_{yaw}\$\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eYaw angular acceleration (rad/s\u003csup\u003e2\u003c/sup\u003e)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eX17\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${r}_{roll}\$\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eRoll Rate (rad/s)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eX18\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan 
class=\"mathinline\"\u003e\${a}_{roll}\$\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eRoll angular acceleration (rad/s\u003csup\u003e2\u003c/sup\u003e)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eX19\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\" morerows=\"8\" rowspan=\"9\"\u003e \u003cp\u003eVehicle interaction information\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${d}_{x}^{Sub\\\u0026amp;Pre2}\$\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eThe relative distance between \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$Sub\$\u003c/span\u003e\u003c/span\u003e and \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$Pre2\$\u003c/span\u003e\u003c/span\u003e (m)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eX20\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${v}_{y}^{Sub\\\u0026amp;Pre2}\$\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eThe velocity difference between \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$Sub\$\u003c/span\u003e\u003c/span\u003e and \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$Pre2\$\u003c/span\u003e\u003c/span\u003e (m/s)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eX21\u003c/p\u003e \u003c/td\u003e \u003ctd 
align=\"left\" colname=\"c3\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${d}_{x}^{Sub\\\u0026amp;Fol2}\$\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eThe relative distance between \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$Sub\$\u003c/span\u003e\u003c/span\u003e and \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$Fol2\$\u003c/span\u003e\u003c/span\u003e (m)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eX22\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${v}_{y}^{Sub\\\u0026amp;Fol2}\$\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eThe velocity difference between \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$Sub\$\u003c/span\u003e\u003c/span\u003e and \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$Fol2\$\u003c/span\u003e\u003c/span\u003e (m/s)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eX23\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${d}_{y}^{Sub\\\u0026amp;Pre1}\$\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eThe relative distance between \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$Sub\$\u003c/span\u003e\u003c/span\u003e and \u003cspan class=\"InlineEquation\"\u003e\u003cspan 
class=\"mathinline\"\u003e\$Pre1\$\u003c/span\u003e\u003c/span\u003e (m)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eX24\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${v}_{y}^{Sub\\\u0026amp;Pre1}\$\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eThe velocity difference between \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$Sub\$\u003c/span\u003e\u003c/span\u003e and \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$Pre1\$\u003c/span\u003e\u003c/span\u003e (m/s)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eX25\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${d}_{y}^{Sub\\\u0026amp;Fol1}\$\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eThe relative distance between \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$Sub\$\u003c/span\u003e\u003c/span\u003e and \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$Fol1\$\u003c/span\u003e\u003c/span\u003e (m)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eX26\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${v}_{y}^{Sub\\\u0026amp;Fol1}\$\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eThe velocity difference between 
\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$Sub\$\u003c/span\u003e\u003c/span\u003e and \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$Fol1\$\u003c/span\u003e\u003c/span\u003e (m/s)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eX27\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${L}_{n}\$\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eLane ID\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec8\" class=\"Section3\"\u003e \u003ch2\u003e2.1.5 Data processing\u003c/h2\u003e \u003cp\u003e(1) Definition of Lane-Changing\u003c/p\u003e \u003cp\u003eThe lane-changing process, as defined in the literature, involves a vehicle moving from its original lane to the target lane, entailing a unidirectional continuous change in the vehicle's lateral position\u003csup\u003e[\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e]\u003c/sup\u003e. This study focuses on LC scenarios involving five cars, as depicted in Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e. Figures\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e(a) and (b) depict LC occurring from left to right and from right to left, respectively. During this process, the LC vehicle (\u003cem\u003esub\u003c/em\u003e) moves from its original lane to the target lane. \u003cem\u003eFol1\u003c/em\u003e and \u003cem\u003ePre1\u003c/em\u003e are the following and preceding cars of \u003cem\u003esub\u003c/em\u003e in the original lane, respectively. 
\u003cem\u003eFol2\u003c/em\u003e and \u003cem\u003ePre2\u003c/em\u003e are the following and preceding cars of \u003cem\u003esub\u003c/em\u003e in the target lane, respectively.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e(2) LC sample extraction\u003c/p\u003e \u003cp\u003eThe extraction of trajectory data during LC processes depends on determining the start and end points of each LC. To obtain effective LC trajectory data, we formulated the following extraction rules. First, whether a vehicle has changed lanes is determined from the lane IDs: if the lane ID of the vehicle at time \u0026#119905; differs from that at the previous time step, we infer that the vehicle has undergone an LC behavior. Second, we identify the interval of monotonic lateral displacement around that moment and mark its boundaries as the start and end points of the lane change. Finally, to exclude vehicles that do not intend to change lanes or have weak LC intentions, we introduce a constraint: at both the start and end points of the LC, the lateral distance from the vehicle to the lane lines on either side must be greater than or equal to half of the vehicle width. 
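\u003c/p\u003e \u003cp\u003eThe first two rules can be illustrated with a minimal sketch, under stated assumptions. The function name find_lane_change_intervals, its arguments, and the use of strict monotonicity are illustrative choices that do not come from this study, and the half-vehicle-width constraint is deliberately left out.\u003c/p\u003e

```python
def _sign(value):
    """Return -1, 0 or +1 according to the sign of value."""
    return (value > 0) - (value < 0)


def find_lane_change_intervals(lane_ids, lateral_pos):
    """Return (start, crossing, end) sample indices for each lane change.

    lane_ids    : lane ID recorded at each sample (hypothetical input layout)
    lateral_pos : lateral coordinate at each sample, same length as lane_ids
    """
    intervals = []
    for c in range(1, len(lane_ids)):
        # Rule 1: the lane ID differs from the previous sample.
        if lane_ids[c] == lane_ids[c - 1]:
            continue
        direction = _sign(lateral_pos[c] - lateral_pos[c - 1])
        if direction == 0:
            continue  # no lateral motion at the crossing, skip
        # Rule 2: extend backwards while the lateral position keeps
        # moving in the same direction (strictly monotonic, an assumption).
        start = c - 1
        while start > 0 and _sign(lateral_pos[start] - lateral_pos[start - 1]) == direction:
            start -= 1
        # ...and forwards in the same way.
        end = c
        while end < len(lateral_pos) - 1 and _sign(lateral_pos[end + 1] - lateral_pos[end]) == direction:
            end += 1
        intervals.append((start, c, end))
    return intervals
```

\u003cp\u003e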
Based on the above rules and constraints, a method to identify the start and end times of LC was proposed, as shown in Equations (\u003cspan refid=\"Equ1\" class=\"InternalRef\"\u003e1\u003c/span\u003e) and (\u003cspan refid=\"Equ2\" class=\"InternalRef\"\u003e2\u003c/span\u003e).\u003cdiv id=\"Equ1\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equ1\" name=\"EquationSource\"\u003e\n$${t}_{s}=\\left\\{\\begin{array}{l}\\max\\left\\{t\\mid t-{t}_{l}\u0026lt;0\\text{ and }\\frac{1}{2}{W}_{V}\\le {x}_{t}-{x}_{l}\\le {W}_{L}-\\frac{1}{2}{W}_{V}\\text{ and }{x}_{t-1}\\ge {x}_{t}\\right\\},\\quad \\text{Left LC}\\\\ \\max\\left\\{t\\mid t-{t}_{l}\u0026lt;0\\text{ and }\\frac{1}{2}{W}_{V}\\le {x}_{l}-{x}_{t}\\le {W}_{L}-\\frac{1}{2}{W}_{V}\\text{ and }{x}_{t-1}\\le {x}_{t}\\right\\},\\quad \\text{Right LC}\\end{array}\\right.$$\u003c/div\u003e\u003cdiv class=\"EquationNumber\"\u003e1\u003c/div\u003e\u003c/div\u003e\u003cdiv id=\"Equ2\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equ2\" name=\"EquationSource\"\u003e\n$${t}_{e}=\\left\\{\\begin{array}{l}\\min\\left\\{t\\mid t-{t}_{l}\u0026gt;0\\text{ and }\\frac{1}{2}{W}_{V}\\le {x}_{l}-{x}_{t}\\le {W}_{L}-\\frac{1}{2}{W}_{V}\\text{ and }{x}_{t}\\ge {x}_{t+1}\\right\\},\\quad \\text{Left LC}\\\\ \\min\\left\\{t\\mid t-{t}_{l}\u0026gt;0\\text{ and }\\frac{1}{2}{W}_{V}\\le {x}_{t}-{x}_{l}\\le {W}_{L}-\\frac{1}{2}{W}_{V}\\text{ and }{x}_{t}\\le {x}_{t+1}\\right\\},\\quad \\text{Right LC}\\end{array}\\right.$$\u003c/div\u003e\u003cdiv class=\"EquationNumber\"\u003e2\u003c/div\u003e\u003c/div\u003e\u003c/p\u003e \u003cp\u003eWhere \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${t}_{s}\$\u003c/span\u003e\u003c/span\u003e: the time of the LC start point; \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${t}_{e}\$\u003c/span\u003e\u003c/span\u003e: the time of the LC end point; \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${t}_{l}\$\u003c/span\u003e\u003c/span\u003e: the time when the vehicle crosses the lane 
line; \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${x}_{t}\$\u003c/span\u003e\u003c/span\u003e: the lateral coordinate of the vehicle at time \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$t\$\u003c/span\u003e\u003c/span\u003e; \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${x}_{l}\$\u003c/span\u003e\u003c/span\u003e: the lateral coordinate of the vehicle at time \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${t}_{l}\$\u003c/span\u003e\u003c/span\u003e; \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${W}_{L}\$\u003c/span\u003e\u003c/span\u003e: the width of the lane; \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${W}_{V}\$\u003c/span\u003e\u003c/span\u003e: the width of the vehicle.\u003c/p\u003e \u003cp\u003eUsing this extraction method, a total of 1200 effective LC samples were collected on snowy and icy surfaces. Of these, 754 were left LC samples and 446 were right LC samples.\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv id=\"Sec9\" class=\"Section2\"\u003e \u003ch2\u003e2.2 Key RCIs selection\u003c/h2\u003e \u003cdiv id=\"Sec10\" class=\"Section3\"\u003e \u003ch2\u003e2.2.1 Definition of risky LC behavior\u003c/h2\u003e \u003cp\u003eRLCB refers to an adverse event or potential hazard that may arise when a vehicle changes lanes while driving\u003csup\u003e[25]\u003c/sup\u003e. The first type of RLCB is Lane-Changing Clearance Insufficient (LCCI), which means that the distance between the subject vehicle and surrounding vehicles falls below the safety threshold. 
In this situation, because the vehicles are close to each other and the adhesion coefficient of snowy and icy road surfaces is low, any unexpected braking or acceleration by surrounding vehicles may cause a collision.\u003c/p\u003e \u003cp\u003eThe second type of RLCB is Vehicle Lateral Instability (VLI). When the vehicle performs a lane change maneuver, lateral instability can occur because of an excessive steering wheel angle rate or high driving speed, combined with the low adhesion coefficient of the snowy and icy road surface. This risk concerns only the vehicle's own operating status and does not involve conflicts with surrounding vehicles.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec11\" class=\"Section3\"\u003e \u003ch2\u003e2.2.2 Key RCIs selection\u003c/h2\u003e \u003cp\u003eDue to the complexity of LC operations on snowy and icy surfaces, there exist numerous RCIs for LC. To reduce the computational complexity of the model, it is necessary to select indicators that accurately depict the characteristics of RLCB. The key RCIs were selected in this study using decision tree principles. The specific process is shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e(1) Statistical features of RCIs\u003c/p\u003e \u003cp\u003eThe LC behavior parameters in Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e were discretized using a statistical value construction method to generate corresponding statistical indicators. Thirteen statistical features were extracted from the time series of each feature indicator: the mean, maximum, minimum, mode, 0.25 quantile, median, 0.75 quantile, variance, standard deviation, range, coefficient of variation, skewness, and kurtosis. The lane ID (X27), not being a time series parameter, was excluded. 
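\u003c/p\u003e \u003cp\u003eAs an illustration of the thirteen statistics listed above, the following minimal sketch computes them for one time series with the Python standard library. The function name statistical_features and the dictionary keys are hypothetical, and several definitions are assumptions because the text does not specify them: population (not sample) variance, inclusive quantile interpolation, moment-based skewness, and excess kurtosis.\u003c/p\u003e

```python
import statistics


def statistical_features(series):
    """Compute 13 summary statistics for one indicator's time series.

    Requires at least two samples (a limit of statistics.quantiles).
    """
    x = [float(v) for v in series]
    n = len(x)
    mean = statistics.fmean(x)
    variance = statistics.pvariance(x, mu=mean)  # population variance (assumption)
    std = variance ** 0.5
    q25, median, q75 = statistics.quantiles(x, n=4, method="inclusive")
    # Central moments for skewness and excess kurtosis.
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return {
        "mean": mean,
        "max": max(x),
        "min": min(x),
        "mode": statistics.mode(x),  # first mode if several values tie
        "q25": q25,
        "median": median,
        "q75": q75,
        "variance": variance,
        "std": std,
        "range": max(x) - min(x),
        # Coefficient of variation is undefined for a zero mean.
        "cv": std / mean if mean != 0 else float("nan"),
        # Both shape statistics are set to 0 for a constant series.
        "skewness": m3 / std ** 3 if std > 0 else 0.0,
        "kurtosis": m4 / std ** 4 - 3.0 if std > 0 else 0.0,
    }
```

\u003cp\u003e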
Finally, the remaining 26 LC behavior parameters were selected as RCIs, giving 338 statistical features in total (26 parameters with 13 features each). The importance of each RCI was assessed objectively by analyzing the importance of its statistical features.\u003c/p\u003e \u003cp\u003e(2) Importance analysis of RCIs\u003c/p\u003e \u003cp\u003eThe importance of feature indicators was quantified using the information gain ratio of the C4.5 algorithm. A larger information gain ratio indicates that the subsets obtained after splitting are more ordered, which means a better classification effect and a more important feature indicator. The specific process of analyzing the importance of RCIs using the C4.5 algorithm is as follows.\u003c/p\u003e \u003cp\u003eFirst, the statistical features of the 1200 LC samples were taken as the sample set \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$S\$\u003c/span\u003e\u003c/span\u003e. The uncertainty of the sample set \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$S\$\u003c/span\u003e\u003c/span\u003e was described by the information entropy \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$Ent\\left(S\\right)\$\u003c/span\u003e\u003c/span\u003e.\u003cdiv id=\"Equ3\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equ3\" name=\"EquationSource\"\u003e\n$$Ent\\left(S\\right)=-\\sum _{i=1}^{n}{p}_{i}{\\text{log}}_{2}{p}_{i}$$\u003c/div\u003e\u003cdiv class=\"EquationNumber\"\u003e3\u003c/div\u003e\u003c/div\u003e\u003c/p\u003e \u003cp\u003eWhere \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${p}_{i}\$\u003c/span\u003e\u003c/span\u003e is the probability of occurrence of the \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$i\$\u003c/span\u003e\u003c/span\u003eth class of data in the dataset \u003cspan class=\"InlineEquation\"\u003e\u003cspan 
class=\"mathinline\"\u003e\$S\$\u003c/span\u003e\u003c/span\u003e.\u003c/p\u003e \u003cp\u003eThen the sample set \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$S\$\u003c/span\u003e\u003c/span\u003e was partitioned into \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$n\$\u003c/span\u003e\u003c/span\u003e subsets according to the values of feature \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$A\$\u003c/span\u003e\u003c/span\u003e. The information entropy of the partitioned sample set is denoted by \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$Ent(S, A)\$\u003c/span\u003e\u003c/span\u003e.\u003cdiv id=\"Equ4\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equ4\" name=\"EquationSource\"\u003e\n$$Ent\\left(S,A\\right)=\\sum _{j=1}^{n}\\frac{\\left|{S}_{j}\\right|}{\\left|S\\right|}\\times Ent\\left({S}_{j}\\right)$$\u003c/div\u003e\u003cdiv class=\"EquationNumber\"\u003e4\u003c/div\u003e\u003c/div\u003e\u003c/p\u003e \u003cp\u003eWhere \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${S}_{j}\$\u003c/span\u003e\u003c/span\u003e is the set of samples that take value \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$j\$\u003c/span\u003e\u003c/span\u003e under feature \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$A\$\u003c/span\u003e\u003c/span\u003e.\u003c/p\u003e \u003cp\u003eThe information gain obtained by dividing sample set \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$S\$\u003c/span\u003e\u003c/span\u003e according to feature \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$A\$\u003c/span\u003e\u003c/span\u003e was \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$Gain(S,A)\$\u003c/span\u003e\u003c/span\u003e, and the higher the information 
gain, the greater the decrease in uncertainty of set \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$S\$\u003c/span\u003e\u003c/span\u003e.\u003cdiv id=\"Equ5\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equ5\" name=\"EquationSource\"\u003e\n$$Gain\\left(S,A\\right)=Ent\\left(S\\right)-Ent(S,A)$$\u003c/div\u003e\u003cdiv class=\"EquationNumber\"\u003e5\u003c/div\u003e\u003c/div\u003e\u003c/p\u003e \u003cp\u003eBecause information gain is biased toward features that take many distinct values, which can lead to overfitting, the information gain ratio \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$GainRatio(S,A)\$\u003c/span\u003e\u003c/span\u003e was introduced to normalize the information gain.\u003cdiv id=\"Equ6\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equ6\" name=\"EquationSource\"\u003e\n$$GainRatio\\left(S,A\\right)=\\frac{Gain(S,A)}{SplitInf\\left(A\\right)}$$\u003c/div\u003e\u003cdiv class=\"EquationNumber\"\u003e6\u003c/div\u003e\u003c/div\u003e\u003cdiv id=\"Equ7\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equ7\" name=\"EquationSource\"\u003e\n$$SplitInf\\left(A\\right)=-\\sum _{i=1}^{n}\\frac{\\left|{S}_{i}\\right|}{\\left|S\\right|}{\\text{log}}_{2}\\frac{\\left|{S}_{i}\\right|}{\\left|S\\right|}$$\u003c/div\u003e\u003cdiv class=\"EquationNumber\"\u003e7\u003c/div\u003e\u003c/div\u003e\u003c/p\u003e \u003cp\u003eWhere \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$SplitInf\\left(A\\right)\$\u003c/span\u003e\u003c/span\u003e denotes the complexity of splitting the dataset due to feature \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$A\$\u003c/span\u003e\u003c/span\u003e.\u003c/p\u003e \u003cp\u003eTo obtain RCIs with higher importance, the information gain ratio ranking of all statistical features for each RCI was plotted as a box 
plot. Each box in the box plot represents the ranking of the information gain ratio of all statistical features corresponding to the RCI. RCIs with information gain ratio rankings exceeding 50% are relatively less important and are eliminated. The statistical features of the selected RCIs were used as split nodes of the decision tree to split the original data. If the information gain ratio of the dataset after splitting is high and the level of confusion decreases significantly, the selected RCIs classify RLCB well and meet the requirements for LC RCIs.\u003c/p\u003e \u003cp\u003e(3) Correlation Analysis of RCIs\u003c/p\u003e \u003cp\u003eTo reduce the computational complexity and improve the recognition efficiency, the feature indicators input to the model should be as independent of each other as possible. In this study, the Pearson correlation coefficient was used to analyze the correlation of the screened RCIs, and the feature parameters with strong correlation were excluded to avoid redundant information and multicollinearity problems. Hohlfelder\u003csup\u003e[\u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e]\u003c/sup\u003e pointed out that a Pearson correlation coefficient greater than 0.6 indicates a strong correlation between two feature parameters, and that such parameters should be excluded to reduce feature redundancy.\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv id=\"Sec12\" class=\"Section2\"\u003e \u003ch2\u003e2.3 Framework of the Stacking ensemble learning\u003c/h2\u003e \u003cp\u003eEnsemble learning is a machine learning method that improves classification and recognition performance by constructing multiple models and combining them\u003csup\u003e[\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e]\u003c/sup\u003e. 
The three common frameworks of ensemble learning are Bagging for parallel computation, Boosting for sequential computation, and Stacking for hierarchical integration. Compared with Bagging and Boosting, the Stacking algorithm can train multiple different classification models on the same dataset, thus taking advantage of the strengths of different models and improving the recognition accuracy\u003csup\u003e[\u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e]\u003c/sup\u003e.\u003c/p\u003e \u003cdiv id=\"Sec13\" class=\"Section3\"\u003e \u003ch2\u003e2.3.1 Stacking ensemble learning model\u003c/h2\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eFigure \u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e illustrates the RLCB recognition model based on the Stacking ensemble algorithm. In this study, three basic classifiers (SVM, RF, and Bi-LSTM) and a meta-classifier (LSTM) were used. The LC risks were classified into three types: Normal lane-changing (LCN), VLI, and LCCI. The main steps of the model training process are as follows.\u003c/p\u003e \u003cp\u003eStep 1: The dataset (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$S\$\u003c/span\u003e\u003c/span\u003e) is randomly divided in a 7:3 ratio, where 70% of the data is used as the training set (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$T\$\u003c/span\u003e\u003c/span\u003e) and 30% as the testing set (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$R\$\u003c/span\u003e\u003c/span\u003e). To prevent model overfitting, the 5-fold cross-validation was performed within each model. 
Thus, the training set (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$T\$\u003c/span\u003e\u003c/span\u003e) is randomly divided into 5 subsets \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${T}_{1}\$\u003c/span\u003e\u003c/span\u003e, \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${T}_{2}\$\u003c/span\u003e\u003c/span\u003e, \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${T}_{3}\$\u003c/span\u003e\u003c/span\u003e, \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${T}_{4}\$\u003c/span\u003e\u003c/span\u003e, and \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${T}_{5}\$\u003c/span\u003e\u003c/span\u003e. When \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${T}_{i}\$\u003c/span\u003e\u003c/span\u003e (i\u0026thinsp;=\u0026thinsp;1,2,3,4,5) is used as the validation subset, \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${T}_{i+}=T-{T}_{i}\$\u003c/span\u003e\u003c/span\u003e is used as the training subset.\u003c/p\u003e \u003cp\u003eStep 2: The base classifier SVM is trained on each training subset \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${T}_{i+}\$\u003c/span\u003e\u003c/span\u003e, yielding 5 trained SVM classifiers. Each validation subset \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${T}_{i}\$\u003c/span\u003e\u003c/span\u003e is then input into the SVM trained on the corresponding \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${T}_{i+}\$\u003c/span\u003e\u003c/span\u003e, and 5 recognition results are obtained. The meta-classifier training set (A1) is obtained by combining the 5 recognition results.\u003c/p\u003e \u003cp\u003eStep 3: The test set (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$R\$\u003c/span\u003e\u003c/span\u003e) is recognized by each of the 5 trained SVM classifiers obtained in Step 2, and 5 recognition results of the test set are obtained. 
Then, the meta-classifier test set (B1) is obtained by arithmetically averaging the 5 recognition results.\u003c/p\u003e \u003cp\u003eStep 4: Steps 1\u0026ndash;3 are sequentially performed on the two base classifiers of RF and Bi-LSTM to obtain the meta-classifier training set (A2, A3, A4) and test set (B2, B3, B4).\u003c/p\u003e \u003cp\u003eStep 5: The meta-classifier is trained using the training set (A1, A2, A3, A4). Then, the test set (B1, B2, B3, B4) is used for recognition, and the final recognition results are output.\u003c/p\u003e \u003cp\u003eThree different types of base classifiers were constructed within the Stacking ensemble learning framework to generate preliminary recognition results. These preliminary recognition results were then fed into the meta-classifier to obtain the final recognition results. If one of the base classifiers makes errors when learning specific regions of the feature space, the meta-classifier can correct these errors by integrating the learning results of the other base classifiers. This approach exploits the strengths of each base classifier and compensates for their limitations in different regions, thereby improving the overall model performance and generalization ability.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec14\" class=\"Section3\"\u003e \u003ch2\u003e2.3.2 Base and meta classifier of ensemble learning\u003c/h2\u003e \u003cp\u003e(1) SVM-Base Classifier: SVM is a powerful machine learning algorithm that is widely used to solve classification problems\u003csup\u003e[\u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e]\u003c/sup\u003e. Due to its high reliability and computational efficiency, it is employed as one of the base classifiers. The two hyperparameters tuned for the SVM are the penalty coefficient (\u003cem\u003eC\u003c/em\u003e) and the kernel parameter (\u003cem\u003eg\u003c/em\u003e). 
These parameters control the complexity of the model and the curvature of the decision boundary, respectively. The values of \u003cem\u003eC\u003c/em\u003e and \u003cem\u003eg\u003c/em\u003e were set to 5.4 and 1.7, respectively.\u003c/p\u003e \u003cp\u003e(2) RF-Base Classifier: RF\u003csup\u003e[\u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e30\u003c/span\u003e]\u003c/sup\u003e is a bagging method that uses the CART decision tree as a weak learner for training. It employs random data sampling and random feature selection, which mitigates overfitting. Therefore, RF is selected as one of the base classifiers in this study. The hyper-parameters of RF include n_estimators, max_depth, max_features, min_samples_split, min_samples_leaf, and max_leaf_nodes. These parameters were set to 105, 13, 4, 4, 2, and None, respectively.\u003c/p\u003e \u003cp\u003e(3) Bi-LSTM-Base Classifier: Bi-LSTM\u003csup\u003e[\u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e]\u003c/sup\u003e is a deep learning model used for sequence data processing. It consists of two Long Short-Term Memory (LSTM) networks that process input sequences in forward and backward directions, respectively. RCIs are typically time-varying data, and RLCB can be better identified by analyzing the time-series variations of these indicators. Therefore, Bi-LSTM is selected as one of the base classifiers for ensemble learning. Appropriate settings of hyper-parameters such as hidden_size, learning_rate, and dropout_rate are crucial to the performance and training effect of the Bi-LSTM model. Thus, the values of hidden_size, learning_rate, and dropout_rate were set to 130, 0.05, and 0.4, respectively.\u003c/p\u003e \u003cp\u003e(4) LSTM-Meta Classifier: LSTM\u003csup\u003e[32]\u003c/sup\u003e is a type of recurrent neural network that is widely used for processing sequence data. 
Compared to traditional recurrent neural networks (RNNs), LSTM can control the input, forgetting, and output of information through gate mechanisms, addressing the issue of long-term dependencies. In addition, LSTM has strong robustness and generalization capabilities. Given these advantages, this study employs LSTM as the meta classifier and sets the values of hidden_size, learning_rate, and dropout_rate to 120, 0.01, and 0.005, respectively.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec15\" class=\"Section3\"\u003e \u003ch2\u003e2.3.3 Model evaluation\u003c/h2\u003e \u003cp\u003e(1) Accuracy and error rates\u003c/p\u003e \u003cp\u003eAccuracy (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$Acc\$\u003c/span\u003e\u003c/span\u003e) is the simplest model evaluation metric: the proportion of correctly classified samples among all samples. The formula is as follows:\u003cdiv id=\"Equ8\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equ8\" name=\"EquationSource\"\u003e\n$$Accuracy=\\frac{{N}_{correct}}{{N}_{total}}$$\u003c/div\u003e\u003cdiv class=\"EquationNumber\"\u003e8\u003c/div\u003e\u003c/div\u003e\u003c/p\u003e \u003cp\u003e(2) Precision and recall\u003c/p\u003e \u003cp\u003ePrecision measures the proportion of true positives among all samples predicted as positive, while recall measures the proportion of true positives among all samples that are actually positive\u003csup\u003e[\u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e33\u003c/span\u003e]\u003c/sup\u003e. 
The confusion matrix presents the results of binary classification intuitively, as shown in Table\u0026nbsp;\u003cspan refid=\"Tab2\" class=\"InternalRef\"\u003e2\u003c/span\u003e.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab2\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 2\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eConfusion matrix\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"3\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\" morerows=\"1\" rowspan=\"2\"\u003e \u003cp\u003eActual Classification\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colspan=\"2\" nameend=\"c3\" namest=\"c2\"\u003e \u003cp\u003ePredicted Classification\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003ePositive\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eNegative\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003ePositive\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eTrue Positive (TP)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eFalse Negative (FN)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eNegative\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eFalse Positive (FP)\u003c/p\u003e \u003c/td\u003e \u003ctd 
align=\"left\" colname=\"c3\"\u003e \u003cp\u003eTrue Negative (TN)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003ctfoot\u003e \u003ctr\u003e\u003ctd colspan=\"3\"\u003eBased on the confusion matrix, the calculation formulas for precision (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$P\$\u003c/span\u003e\u003c/span\u003e) and recall (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$R\$\u003c/span\u003e\u003c/span\u003e) are as follows:\u003c/td\u003e\u003c/tr\u003e \u003c/tfoot\u003e \u003c/table\u003e\u003c/div\u003e \u003cdiv id=\"Equ9\" class=\"Equation\"\u003e \u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equ9\" name=\"EquationSource\"\u003e\n$$P=\\frac{TP}{TP+FP}$$\u003c/div\u003e \u003cdiv class=\"EquationNumber\"\u003e9\u003c/div\u003e\u003c/div\u003e \u003cdiv id=\"Equ10\" class=\"Equation\"\u003e \u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equ10\" name=\"EquationSource\"\u003e\n$$R=\\frac{TP}{TP+FN}$$\u003c/div\u003e \u003cdiv class=\"EquationNumber\"\u003e10\u003c/div\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003e(3) F1-score\u003c/p\u003e \u003cp\u003eThe F1-score is the harmonic mean of the precision (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$P\$\u003c/span\u003e\u003c/span\u003e) and recall (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$R\$\u003c/span\u003e\u003c/span\u003e) of a binary classification model\u003csup\u003e[34]\u003c/sup\u003e. It combines the precision and recall capabilities of the classifier, making it a commonly used indicator to evaluate the performance of classification models. Generally, a higher F1-score indicates better model performance. 
The formula for F1-score is as follows:\u003cdiv id=\"Equ11\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equ11\" name=\"EquationSource\"\u003e\n$$F1=\\frac{2\\times P\\times R}{P+R}$$\u003c/div\u003e\u003cdiv class=\"EquationNumber\"\u003e11\u003c/div\u003e\u003c/div\u003e\u003c/p\u003e \u003cp\u003e(4) ROC and AUC\u003c/p\u003e \u003cp\u003eThe ROC curve depicts the trade-off between the True Positive Rate (TPR) and the False Positive Rate (FPR). The horizontal axis of the ROC curve represents the FPR, and the vertical axis represents the TPR, which is also known as recall\u003csup\u003e[\u003cspan citationid=\"CR34\" class=\"CitationRef\"\u003e35\u003c/span\u003e]\u003c/sup\u003e.\u003c/p\u003e \u003cp\u003eThe closer the ROC curve lies to the upper-left corner, the better the model performance. AUC (Area Under Curve) is the area under the ROC curve, which quantifies the model's performance in a single value. The AUC value is between 0 and 1, and a higher AUC value indicates better model performance. An AUC value greater than 0.7 is taken to indicate that the model meets the expected performance.\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e"},{"header":"3. Results","content":"\u003cdiv id=\"Sec17\" class=\"Section2\"\u003e \u003ch2\u003e3.1 Key RCIs\u003c/h2\u003e \u003cp\u003eBased on 1200 sets of LC sample data, the information gain ratio of 338 statistical features derived from 26 RCIs was calculated. Then, the rankings of the information gain ratios of the 13 statistical features for each RCI were plotted as a box plot, as shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003e. In the box plot, each box represents the ranking of the information gain ratio of the 13 statistical features corresponding to each RCI. The smaller the vertical coordinate value of the box, the lower the importance of the feature parameter. 
RCIs with information gain ratio rankings exceeding 50% were eliminated. Therefore, the RCIs selected after the importance assessment are as follows: X1(\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${\\delta }_{s}\$\u003c/span\u003e\u003c/span\u003e), X2(\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${\\omega }_{s}\$\u003c/span\u003e\u003c/span\u003e), X6(\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$v\$\u003c/span\u003e\u003c/span\u003e), X7(\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${v}_{x}\$\u003c/span\u003e\u003c/span\u003e), X8(\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${a}_{x}\$\u003c/span\u003e\u003c/span\u003e), X9(\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${v}_{y}\$\u003c/span\u003e\u003c/span\u003e), X10(\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${a}_{y}\$\u003c/span\u003e\u003c/span\u003e), X13(\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${\\theta }_{roll}\$\u003c/span\u003e\u003c/span\u003e), X14(\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${\\beta }_{s}\$\u003c/span\u003e\u003c/span\u003e), X15(\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${r}_{yaw}\$\u003c/span\u003e\u003c/span\u003e), X19(\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${d}_{x}^{Sub\\\u0026amp;Pre2}\$\u003c/span\u003e\u003c/span\u003e), X20(\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${v}_{y}^{Sub\\\u0026amp;Pre2}\$\u003c/span\u003e\u003c/span\u003e), X21(\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${d}_{x}^{Sub\\\u0026amp;Fol2}\$\u003c/span\u003e\u003c/span\u003e), X22(\u003cspan class=\"InlineEquation\"\u003e\u003cspan 
class=\"mathinline\"\u003e\${v}_{y}^{Sub\\\u0026amp;Fol2}\$\u003c/span\u003e\u003c/span\u003e), X23(\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${d}_{y}^{Sub\\\u0026amp;Pre1}\$\u003c/span\u003e\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eThe statistical features of the selected RCIs were used as the splitting nodes of the decision tree to split the original data. The information gain ratio of the split data set was significantly increased, and the level of confusion was noticeably reduced. This indicates that the 15 RCIs have a good classification effect in identifying RLCB, meeting the requirements of LC RCIs.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eTo identify key RCIs with high importance and low intercorrelation, Pearson correlation analysis was conducted on the 15 RCIs, as shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003e. When the significance level (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$P\$\u003c/span\u003e\u003c/span\u003e) was less than 0.05, the RCIs X1(\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${\\delta }_{s}\$\u003c/span\u003e\u003c/span\u003e), X6(\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$v\$\u003c/span\u003e\u003c/span\u003e), and X13(\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${\\theta }_{roll}\$\u003c/span\u003e\u003c/span\u003e) with correlation coefficients greater than 0.6 were eliminated. 
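The correlation screening described here can be illustrated with a minimal sketch. It assumes a greedy rule (a feature is kept only if its absolute Pearson coefficient with every already-kept feature does not exceed 0.6); the source does not state which member of a correlated pair is dropped, and the feature names `f1`, `f2`, `f3` and their values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def drop_correlated(features, threshold=0.6):
    """Greedy filter (an assumption, not the authors' stated rule):
    keep a feature only if |r| <= threshold against every kept feature."""
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical toy data: f2 is almost a scaled copy of f1, f3 is weakly related.
features = {
    "f1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "f2": [2.0, 4.0, 6.0, 8.0, 10.5],
    "f3": [5.0, 1.0, 4.0, 2.0, 3.0],
}
selected = drop_correlated(features)
print(selected)
```

With these toy values, `f2` is removed because it is strongly correlated with `f1`, while `f3` is retained.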
Therefore, by integrating the importance of RCIs and Pearson correlation analysis, \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${\\omega }_{s}\$\u003c/span\u003e\u003c/span\u003e, \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${v}_{x}\$\u003c/span\u003e\u003c/span\u003e, \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${a}_{x}\$\u003c/span\u003e\u003c/span\u003e, \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${v}_{y}\$\u003c/span\u003e\u003c/span\u003e, \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${a}_{y}\$\u003c/span\u003e\u003c/span\u003e, \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${\\beta }_{s}\$\u003c/span\u003e\u003c/span\u003e, \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${r}_{yaw}\$\u003c/span\u003e\u003c/span\u003e, \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${d}_{x}^{Sub\\\u0026amp;Pre2}\$\u003c/span\u003e\u003c/span\u003e, \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${v}_{y}^{Sub\\\u0026amp;Pre2}\$\u003c/span\u003e\u003c/span\u003e, \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${d}_{x}^{Sub\\\u0026amp;Fol2}\$\u003c/span\u003e\u003c/span\u003e, \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${v}_{y}^{Sub\\\u0026amp;Fol2}\$\u003c/span\u003e\u003c/span\u003e, \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${d}_{y}^{Sub\\\u0026amp;Pre1}\$\u003c/span\u003e\u003c/span\u003e were identified as key RCIs, providing input parameters for the subsequent LC risk behavior recognition model training.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec18\" class=\"Section2\"\u003e \u003ch2\u003e3.2 Optimal time window length\u003c/h2\u003e 
\u003cp\u003eThe length of the LC time window represents the time series length of the model parameter input, and selecting an appropriate LC time window is crucial for accurately identifying RLCB. If the time window is too large, it may result in sequence data containing multiple behavioral features, reducing the model's accuracy and efficiency. Conversely, if the time window is too small, critical feature parameters may be excluded, severely reducing the model's recognition accuracy. Based on the LC dataset, the duration of LC on snowy and icy surfaces was determined to be approximately 4.2 s to 11.6 s using statistical methods. In this study, the optimal time window length was identified by analyzing variations in RLCB recognition accuracy using neural networks with different time window lengths. As shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e7\u003c/span\u003e, the highest recognition accuracy was achieved when the time window length was 7.6 s. Therefore, 7.6 s was adopted as the optimal time window length for LC on snowy and icy surfaces.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec19\" class=\"Section2\"\u003e \u003ch2\u003e3.3 Identification of RLCB and model evaluation\u003c/h2\u003e \u003cp\u003eThrough manual selection, 643 LCN samples, 323 LCCI samples, and 234 VLI samples were identified in the dataset of 1200 LC samples. All data samples were randomly divided according to a 7:3 ratio, with 70% used as the training set and 30% as the test set. Using the 5-fold cross-validation method, the training dataset was randomly divided into 5 subsets. To ensure better recognition, the original time series data were processed to maintain a time series length of 7.6 s for model input.\u003c/p\u003e \u003cp\u003eA Stacking ensemble learning model was developed for RLCB recognition on snowy and icy surfaces. 
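The out-of-fold construction behind such a model (Steps 1 to 5 of Section 2.3.1) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `CentroidClassifier` is a toy nearest-centroid learner standing in for SVM, RF, or Bi-LSTM, the helper `stack_features` and the two-dimensional toy data are hypothetical, and class scores (rather than hard labels) are averaged over the folds.

```python
import random


class CentroidClassifier:
    """Toy nearest-centroid learner; a hypothetical stand-in for a base classifier."""

    def fit(self, X, y):
        groups = {}
        for x, label in zip(X, y):
            groups.setdefault(label, []).append(x)
        # Mean feature vector of each class seen during training.
        self.centroids = {
            label: [sum(col) / len(rows) for col in zip(*rows)]
            for label, rows in groups.items()
        }
        return self

    def scores(self, x, classes):
        # Negative squared distance to each class centroid (higher means closer).
        return [
            -sum((a - b) ** 2 for a, b in zip(x, self.centroids[c]))
            for c in classes
        ]


def stack_features(make_model, X_train, y_train, X_test, classes, k=5, seed=0):
    """Return (A, B): out-of-fold scores for T and fold-averaged scores for R."""
    order = list(range(len(X_train)))
    random.Random(seed).shuffle(order)
    folds = [order[i::k] for i in range(k)]  # T_1 ... T_k
    A = [None] * len(X_train)
    B = [[0.0] * len(classes) for _ in X_test]
    for fold in folds:
        held_out = set(fold)
        train_idx = [j for j in order if j not in held_out]  # T_{i+} = T - T_i
        model = make_model().fit(
            [X_train[j] for j in train_idx], [y_train[j] for j in train_idx]
        )
        for j in fold:  # Step 2: predict the validation subset T_i
            A[j] = model.scores(X_train[j], classes)
        for row, x in zip(B, X_test):  # Step 3: average the k test predictions
            for c, s in enumerate(model.scores(x, classes)):
                row[c] += s / k
    return A, B


# Hypothetical toy data: three well-separated clusters, 10 samples per class.
classes = ["LCN", "VLI", "LCCI"]
centers = {"LCN": (0.0, 0.0), "VLI": (3.0, 0.0), "LCCI": (0.0, 3.0)}
rng = random.Random(1)
X, y = [], []
for label, (cx, cy) in centers.items():
    for _ in range(10):
        X.append([cx + rng.uniform(-0.5, 0.5), cy + rng.uniform(-0.5, 0.5)])
        y.append(label)
X_test = [[0.1, 0.1], [2.9, 0.2]]
A, B = stack_features(CentroidClassifier, X, y, X_test, classes)
```

Repeating `stack_features` for each base classifier and concatenating the resulting columns gives the meta-classifier training set (from `A` and `y`) and test set (from `B`). Because every row of `A` comes from a model that never saw that sample, the meta-classifier is not trained on leaked predictions.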
The model input parameters include \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${\\omega }_{s}\$\u003c/span\u003e\u003c/span\u003e, \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${v}_{x}\$\u003c/span\u003e\u003c/span\u003e, \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${a}_{x}\$\u003c/span\u003e\u003c/span\u003e, \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${v}_{y}\$\u003c/span\u003e\u003c/span\u003e, \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${a}_{y}\$\u003c/span\u003e\u003c/span\u003e, \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${\\beta }_{s}\$\u003c/span\u003e\u003c/span\u003e, \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${r}_{yaw}\$\u003c/span\u003e\u003c/span\u003e, \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${d}_{x}^{Sub\\\u0026amp;Pre2}\$\u003c/span\u003e\u003c/span\u003e, \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${v}_{y}^{Sub\\\u0026amp;Pre2}\$\u003c/span\u003e\u003c/span\u003e, \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${d}_{x}^{Sub\\\u0026amp;Fol2}\$\u003c/span\u003e\u003c/span\u003e, \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${v}_{y}^{Sub\\\u0026amp;Fol2}\$\u003c/span\u003e\u003c/span\u003e, \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${d}_{y}^{Sub\\\u0026amp;Pre1}\$\u003c/span\u003e\u003c/span\u003e. 
Moreover, accuracy (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$Acc\$\u003c/span\u003e\u003c/span\u003e), precision (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$P\$\u003c/span\u003e\u003c/span\u003e), recall (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$R\$\u003c/span\u003e\u003c/span\u003e), F1-score (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$F1\$\u003c/span\u003e\u003c/span\u003e), \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$ROC\$\u003c/span\u003e\u003c/span\u003e, and confusion matrix were used to evaluate the recognition performance of the model and were compared with traditional SVM, RF, and Bi-LSTM models. The evaluation results of each model's performance are shown in Table\u0026nbsp;\u003cspan refid=\"Tab3\" class=\"InternalRef\"\u003e3\u003c/span\u003e.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab3\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 3\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eEvaluation results of each model\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"11\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" 
class=\"colspec\" colname=\"c7\" colnum=\"7\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c8\" colnum=\"8\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c9\" colnum=\"9\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c10\" colnum=\"10\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c11\" colnum=\"11\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\" morerows=\"1\" rowspan=\"2\"\u003e \u003cp\u003eModel\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\" morerows=\"1\" rowspan=\"2\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$Acc\$\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003cp\u003e(%)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colspan=\"3\" nameend=\"c5\" namest=\"c3\"\u003e \u003cp\u003eLCN\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colspan=\"3\" nameend=\"c8\" namest=\"c6\"\u003e \u003cp\u003eVLI\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colspan=\"3\" nameend=\"c11\" namest=\"c9\"\u003e \u003cp\u003eLCCI\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$P\$\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003cp\u003e(%)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$R\$\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003cp\u003e(%)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$F1\$\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e 
\u003cp\u003e(%)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c6\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$P\$\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003cp\u003e(%)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c7\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$R\$\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003cp\u003e(%)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c8\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$F1\$\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003cp\u003e(%)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c9\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$P\$\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003cp\u003e(%)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c10\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$R\$\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003cp\u003e(%)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c11\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$F1\$\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003cp\u003e(%)\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eSVM\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e92.22\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e96.34\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e93.88\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" 
colname=\"c5\"\u003e \u003cp\u003e95.09\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e88.57\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e91.18\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c8\"\u003e \u003cp\u003e89.86\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c9\"\u003e \u003cp\u003e88.66\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c10\"\u003e \u003cp\u003e89.58\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c11\"\u003e \u003cp\u003e89.12\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eRF\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e93.89\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e93.89\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e95.36\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e94.63\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e91.43\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e90.14\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c8\"\u003e \u003cp\u003e90.78\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c9\"\u003e \u003cp\u003e91.75\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c10\"\u003e \u003cp\u003e93.68\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c11\"\u003e \u003cp\u003e92.70\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eBi-LSTM\u003c/p\u003e 
\u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e96.67\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e98.45\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e96.94\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e97.69\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e92.86\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e95.59\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c8\"\u003e \u003cp\u003e94.21\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c9\"\u003e \u003cp\u003e95.88\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c10\"\u003e \u003cp\u003e97.89\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c11\"\u003e \u003cp\u003e96.87\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eStacking\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e98.33\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e99.48\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e98.46\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e98.97\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e95.71\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e97.10\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c8\"\u003e \u003cp\u003e96.40\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c9\"\u003e 
\u003cp\u003e97.94\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c10\"\u003e \u003cp\u003e98.96\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c11\"\u003e \u003cp\u003e98.45\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003eThe results indicate that all four recognition models have high accuracy in identifying LC behaviors. Among them, the Stacking ensemble learning model achieved the highest accuracy, reaching 98.33%. Among the three base classifiers, the accuracies of the Bi-LSTM, RF, and SVM models were 96.67%, 93.89%, and 92.22%, respectively. Notably, the SVM model had the lowest recognition accuracy, trailing the Stacking model by 6.11 percentage points. Additionally, for all three LC behavior classes (LCN, VLI, and LCCI), the proposed model outperformed the three base classifiers in terms of \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$P\$\u003c/span\u003e\u003c/span\u003e, \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$R\$\u003c/span\u003e\u003c/span\u003e, and \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$F1\$\u003c/span\u003e\u003c/span\u003e. This demonstrates that, for driving simulation data on snowy and icy surfaces, the Stacking ensemble learning model has the best overall performance in identifying LC behaviors.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eFigure \u003cspan refid=\"Fig8\" class=\"InternalRef\"\u003e8\u003c/span\u003e shows the confusion matrices for the SVM, RF, Bi-LSTM, and Stacking models in identifying LC behaviors. These confusion matrices provide details of the recognition performance of the four models. The diagonal values of the confusion matrices represent the number of correctly predicted samples. Higher values indicate better classification performance. 
It is evident that the proposed model identifies all three types of LC behavior with high accuracy and performs best among the four models. The off-diagonal elements represent the number of misclassified samples; lower values correspond to better model performance. The number of LCN samples misclassified as VLI and LCCI is lower in the proposed model compared to the other three models. Similarly, the number of VLI samples misclassified as LCN and LCCI is also lower.\u003c/p\u003e \u003cp\u003eFigure \u003cspan refid=\"Fig9\" class=\"InternalRef\"\u003e9\u003c/span\u003e illustrates the ROC curves of the four models. It can be seen that when the false positive rate is 5%, the corresponding true positive rate is nearly 95%. This demonstrates that the four models have high recognition rates when the time series length is 7.6 s, and the appropriate model can be selected according to actual needs. A larger area under the curve (AUC) indicates better model recognition performance. As depicted in Fig.\u0026nbsp;\u003cspan refid=\"Fig9\" class=\"InternalRef\"\u003e9\u003c/span\u003e, the AUC values for the four models rank as follows: Stacking\u0026thinsp;\u0026gt;\u0026thinsp;Bi-LSTM\u0026thinsp;\u0026gt;\u0026thinsp;RF\u0026thinsp;\u0026gt;\u0026thinsp;SVM. This indicates that the proposed model for identifying LC behaviors has the best recognition performance.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e"},{"header":"4. Discussion and conclusion","content":"\u003cp\u003eIn this study, simulated highway driving experiments on snowy and icy surfaces were conducted using a driving simulator, and a large amount of LC data was collected. The key RCIs of LC behavior were extracted, and an RLCB recognition model applicable to snowy and icy surfaces was established. 
The conclusions are summarized as follows:\u003c/p\u003e \u003cp\u003eBased on the characteristics of LC behavior on snowy and icy surfaces, two risky LC behaviors were defined: vehicle lateral instability (VLI) and LC clearance insufficient (LCCI). A total of 1,200 LC samples were extracted from the driving simulation data on snowy and icy surfaces, including 643 LCN samples, 323 LCCI samples, and 234 VLI samples. The optimal LC time window length on snowy and icy surfaces was determined to be 7.6 s.\u003c/p\u003e \u003cp\u003eThe importance of 26 RCIs during the LC process on snowy and icy surfaces was analyzed using the C4.5 algorithm. Subsequently, Pearson correlation analysis was applied to eliminate highly correlated RCIs. Finally, 12 key RCIs, \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${\\omega }_{s}\$\u003c/span\u003e\u003c/span\u003e, \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${v}_{x}\$\u003c/span\u003e\u003c/span\u003e, \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${a}_{x}\$\u003c/span\u003e\u003c/span\u003e, \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${v}_{y}\$\u003c/span\u003e\u003c/span\u003e, \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${a}_{y}\$\u003c/span\u003e\u003c/span\u003e, \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${\\beta }_{s}\$\u003c/span\u003e\u003c/span\u003e, \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${r}_{yaw}\$\u003c/span\u003e\u003c/span\u003e, \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${d}_{x}^{Sub\\\u0026amp;Pre2}\$\u003c/span\u003e\u003c/span\u003e, \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${v}_{y}^{Sub\\\u0026amp;Pre2}\$\u003c/span\u003e\u003c/span\u003e, \u003cspan class=\"InlineEquation\"\u003e\u003cspan 
class=\"mathinline\"\u003e\${d}_{x}^{Sub\\\u0026amp;Fol2}\$\u003c/span\u003e\u003c/span\u003e, \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${v}_{y}^{Sub\\\u0026amp;Fol2}\$\u003c/span\u003e\u003c/span\u003e, \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\${d}_{y}^{Sub\\\u0026amp;Pre1}\$\u003c/span\u003e\u003c/span\u003e were extracted and used as input variables for identifying RLCB on snowy and icy surfaces.\u003c/p\u003e \u003cp\u003eAn RLCB recognition model was developed based on Stacking ensemble learning, integrating three algorithms: SVM, RF, and Bi-LSTM. The research results indicate that the Stacking ensemble learning model has the highest recognition rate for RLCB, with an overall recognition accuracy of 98.33%. Moreover, the model demonstrates superior performance in terms of \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$P\$\u003c/span\u003e\u003c/span\u003e, \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$R\$\u003c/span\u003e\u003c/span\u003e, and \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\$F1\$\u003c/span\u003e\u003c/span\u003e values. This illustrates that the Stacking ensemble learning algorithm is better suited than the individual base classifiers to identifying RLCB from LC trajectory data extracted from driving simulators.\u003c/p\u003e \u003cp\u003eIt is crucial to accurately identify RLCB on snowy and icy surfaces. For micro-traffic systems, identifying RLCB on snowy and icy surfaces can help prevent traffic accidents and enhance overall transportation safety. For macro-traffic systems, an in-depth understanding of the vehicle's operating status can help traffic management departments make effective decisions, reducing traffic congestion and accident rates. 
For ADAS, accurately identifying RLCB on snowy and icy surfaces can assist drivers in taking timely safety measures, thereby enhancing driving safety under adverse road conditions. For V2X systems, sharing and transmitting the recognition results of RLCB among vehicles can help surrounding vehicles make driving decisions in advance, further improving the safety performance of autonomous driving.\u003c/p\u003e \u003cp\u003eThis study is based on driving simulation data to conduct research on RLCB recognition. In future studies, we will conduct real vehicle experiments at professional testing sites to improve data reliability and model accuracy. Furthermore, in the context of intelligent connected vehicles, the Stacking ensemble learning algorithm and the optimization of its parameters should be explored further. Additional factors, such as vehicle dynamic parameters, vehicle interaction data, and traffic environment data, should be considered to achieve more accurate recognition of RLCB.\u003c/p\u003e"},{"header":"Declarations","content":"\u003ch2\u003eAcknowledgments\u003c/h2\u003e\n\u003cp\u003eThis research was supported by the Fundamental Research Funds for the Central Universities (No.2572023AW66), the National Science Fund for Young Scholars (No.51108068), and the Key Research and Development Guiding Projects of Heilongjiang Province (No.GZ20220027).\u003c/p\u003e\n\u003ch2\u003eAuthor contributions\u003c/h2\u003e\n\u003cp\u003eX.D. and W.Z. provided the study conception and design; W.Z. conducted software simulation, data collection and processing; X.D. analyzed the results; X.D. and W.Z. wrote the manuscript; W.Z. typeset the manuscript. 
All authors reviewed the manuscript and provided funding support for this study.\u003c/p\u003e\n\u003ch2\u003eCompeting interests\u003c/h2\u003e\n\u003cp\u003eThe authors declared no potential conflicts of interest with respect to the research, authorship, and publication of this article.\u003c/p\u003e\u003ch2\u003eData Availability\u003c/h2\u003e\u003cp\u003eThe data that support the findings of this study are available from the corresponding author upon reasonable request.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eXue, Q.W., Wang, K., Lu, J.J., Xing, Y.Y., Gu X. \u0026amp; Zhang, M. An improved risk estimation model of lane change using naturalistic vehicle trajectories. Journal of Transportation Safety \u0026amp; Security. 15(10), 963\u0026ndash;986(2023).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eYang, M., Wang, X. \u0026amp; Quddus, M. Examining lane change gap acceptance, duration and impact using naturalistic driving data. Transportation Research Part C: Emerging Technologies. 104, 317\u0026ndash;331(2019).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eFitch, G., Lee, S., Klauer, S., Hankey, J., Sudweeks, J. \u0026amp; Dingus, T. Analysis of lane-change crashes and near-crashes. Report No. DOT HS 811 147; National Highway Traffic Safety Administration: Washington, DC, USA, 2009.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTraffic Administration Bureau of the Ministry of Public Security of the People\u0026rsquo;s Republic of China. Annual Report on Road Traffic Accident Statistics of the People\u0026rsquo;s Republic of China, Jiangsu Wuxi, China, 2020.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eFan, P.C., Guo, J.Q., Wang, Y.B. \u0026amp; Wijnands Jasper, S. A hybrid deep learning approach for driver anomalous lane changing identification. Accident Analysis and Prevention. 
171, 106661(2022).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWang, Z.Y., Tan, D., Ge, G., et al. Optimal trajectory planning and control for automatic lane change of in wheel motor driving vehicles on snow and ice roads. Automatic Control and Computer Sciences. 54, 432\u0026ndash;445 (2020).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLi, Z.N., Huang, X.H., Mu, T. \u0026amp; Wang, J. Attention-based lane change and crash risk prediction model in highways. IEEE Transactions on Intelligent Transportation Systems. 23(12), 22909\u0026ndash;22922(2022).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCohen, S. Application of relaxation procedure for lane changing in microscopic simulation models. Transportation Research Record. 1883(1), 50\u0026ndash;58 (2004).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eQi, W., Wang, W., Shen, B. \u0026amp; Wu, J. A modified post encroachment time model of urban road merging area based on lane-change characteristics. IEEE Access. 8, 72835\u0026ndash;72846 (2020).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eYang, J., Lee, J., Mao, S. \u0026amp; Hu J. Dynamic safety estimation of airport pick-up area based on video trajectory data. IEEE Transactions on Intelligent Transportation Systems. 25 (2), 1774\u0026ndash;1786(2024).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eFu, C.Y. \u0026amp; Sayed, T. Comparison of threshold determination methods for the deceleration rate to avoid a crash (DRAC)-based crash estimation. Accident Analysis \u0026amp; Prevention. 153, 106051(2021).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eNilsson, J., \u0026Ouml;dblom, A.C.E. \u0026amp; Fredriksson, J. Worst-case analysis of automotive collision avoidance systems. IEEE Transactions on Vehicular Technology. 65(4), 1899\u0026ndash;1911(2016).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTyagi, I. 
Threat assessment for avoiding collisions with perpendicular vehicles at intersections. Proceedings of the 2021 IEEE International Conference on Electro Information Technology (EIT), May 14\u0026ndash;15, 2021 Mt. Pleasant, MI, USA. Piscataway NJ: IEEE, c2021: 184\u0026ndash;187.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWinkler, S., Werneke, J. \u0026amp; Vollrath, M. Timing of early warning stages in a multi stage collision warning system: drivers\u0026rsquo; evaluation depending on situational influences. Transportation Research Part F: Traffic Psychology and Behavior. 36, 57\u0026ndash;68(2016).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePark, H.J., Oh, C., Moon, J. \u0026amp; Kim, S. Development of a lane change risk index using vehicle trajectory data. Accident Analysis and Prevention. 110, 1\u0026ndash;8(2018).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTao, L. et al. Collision risk assessment service for connected vehicles: leveraging vehicular state and motion uncertainties. IEEE Internet of Things Journal. 8(14), 11548\u0026ndash;11560(2021).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eFeng, Y.Y. \u0026amp; Yan, X.L. Support vector machine based lane-changing behavior recognition and lateral trajectory prediction. Computational Intelligence and Neuroscience. 2022, 1\u0026ndash;9(2022).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSun, Q.Y. et al. Lane change strategy analysis and recognition for intelligent driving systems based on random forest. Expert Systems with Applications. 186, 115781(2021).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZhu, J., Ma, Y. \u0026amp; Lou, Y. Multi-vehicle interaction safety of connected automated vehicles in merging area: a real-time risk assessment approach. Accident Analysis \u0026amp; Prevention. 166, 106546(2022).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePeng, J.S. \u0026amp; Shao. Y.M. 
Intelligent method for identifying driving risk based on V2V multisource big data. Complexity. 2018, 1801273(2018).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGu, X.P., Han, Y.P. \u0026amp; Yu, J.F. A novel lane-changing decision model for autonomous vehicles based on deep autoencoder network and XGBoost. IEEE Access. 8, 9846\u0026ndash;9863(2020).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eChen, T.Y., Shi, X.P. \u0026amp; Wong, Y.D. A lane-changing risk profile analysis method based on time-series clustering. Physica A. 565, 125567(2021).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePrajwal, C., Venkatesan, K. \u0026amp; Gowri, A. Understanding the mechanism of lane changing process and dynamics using microscopic traffic data. Physica A: Statistical Mechanics and its Applications. 593, 126981(2022).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWu, J.B., Chen, X.H., Bie, Y.M. \u0026amp; Zhou, W. A co-evolutionary lane-changing trajectory planning method for automated vehicles based on the instantaneous risk identification. Accident Analysis \u0026amp; Prevention. 180, 106907(2023).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHohlfelder, B. et al. Prospective evaluation of a bivalirudin to warfarin transition nomogram. Journal of Thrombosis Thrombolysis. 43, 498\u0026ndash;504(2017).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDing, W.M. \u0026amp; Wu, S.L. A cross-entropy based stacking method in ensemble learning. Journal of Intelligent \u0026amp; Fuzzy Systems. 39(3), 4677\u0026ndash;4688(2020).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAgarwal, S. \u0026amp; Chowdary, C.R. A-stacking and a-bagging: adaptive versions of ensemble learning algorithms for spoof fingerprint detection. Expert Systems with Applications. 146, 113160(2019).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCortes, C. \u0026amp; Vapnik, V. 
Support-vector networks. Machine Learning. 20, 273\u0026ndash;297(1995).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSchonlau, M. \u0026amp; Zou, R.Y. The random forest algorithm for statistical learning. The Stata Journal: Promoting communications on statistics and Stata. 20(1), 3\u0026ndash;29(2020).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGraves, A., Fern\u0026aacute;ndez, S. \u0026amp; Schmidhuber, J. Bidirectional LSTM networks for improved phoneme classification and recognition. Artificial Neural Networks: Formal Models and Their Applications. 3697, 799\u0026ndash;804(2005).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHochreiter, S. \u0026amp; Schmidhuber, J. Long short-term memory. Neural Computation. 9(8), 1735\u0026ndash;1780(1997).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePasi, F., Radu, M.I. Soft precision and recall. Pattern Recognition Letters. 167, 115\u0026ndash;121(2023).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePinto L., Gopalan, S. \u0026amp; Balasubramaniam, P. Quantification on the generalization performance of deep neural network with tychonoff separation axioms. Information Sciences. 608, 262\u0026ndash;285(2022).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eJonathan, A.C. ROC curves and nonrandom data. Pattern Recognition Letters. 
85, 35\u0026ndash;41(2017).\u003c/span\u003e\u003c/li\u003e\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":false,"highlight":"","institution":"","isAcceptedByJournal":true,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"scientific-reports","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"scirep","sideBox":"Learn more about [Scientific Reports](http://www.nature.com/srep/)","snPcode":"","submissionUrl":"","title":"Scientific Reports","twitterHandle":"","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"stoa","reportingPortfolio":"Scientific Reports","inReviewEnabled":true,"inReviewRevisionsEnabled":true},"keywords":"Traffic safety, snowy and icy surfaces, risky lane-changing behavior, risk characterisation indicators, ensemble learning","lastPublishedDoi":"10.21203/rs.3.rs-4491572/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-4491572/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eRisky lane-changing (LC) behavior adversely affects traffic safety, especially on snowy and icy surfaces. However, due to the particularity of the snowy and icy surfaces and the scarcity of data, research on risky lane-changing behavior (RLCB) under extreme scenarios is insufficient. Therefore, this study presents a novel research framework aimed at selecting key risk characterisation indicators (RCIs) and identifying RLCB on highways using driving simulation data on snowy and icy surfaces. 
A highway LC scenario was established on snowy and icy surfaces using a driving simulator, and 1200 sets of LC sample data were extracted. From the perspectives of parameter importance and correlation, 12 key RCIs with high importance and low inter-correlation were selected using the C4.5 decision tree algorithm and Pearson correlation analysis method. The RLCB recognition model was developed using the Stacking ensemble learning method and then compared with traditional recognition algorithms. The results show that the accuracy of the recognition model based on the Stacking ensemble learning model is significantly better than that of traditional algorithms, with a recognition accuracy of 98.33%. This finding can provide the basis for developing LC warning systems for intelligent connected vehicles on snowy and icy surfaces.\u003c/p\u003e","manuscriptTitle":"Risky lane-changing behavior recognition based on Stacking ensemble learning on snowy and icy surfaces","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2024-06-12 04:06:19","doi":"10.21203/rs.3.rs-4491572/v1","editorialEvents":[{"type":"communityComments","content":0},{"type":"decision","content":"Revision 
requested","date":"2024-07-12T06:38:15+00:00","index":"","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2024-06-25T04:15:13+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"165997058131304433432899993712108980403","date":"2024-06-14T13:22:23+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"219729531924958775161041520458778269947","date":"2024-06-14T10:24:53+00:00","index":"hide","fulltext":""},{"type":"reviewersInvited","content":"","date":"2024-06-03T02:00:29+00:00","index":"","fulltext":""},{"type":"editorAssigned","content":"","date":"2024-06-03T01:44:24+00:00","index":"","fulltext":""},{"type":"editorInvited","content":"","date":"2024-05-31T11:54:00+00:00","index":"","fulltext":""},{"type":"checksComplete","content":"","date":"2024-05-30T05:15:45+00:00","index":"","fulltext":""},{"type":"submitted","content":"Scientific Reports","date":"2024-05-28T14:20:13+00:00","index":"","fulltext":""}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"scientific-reports","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"scirep","sideBox":"Learn more about [Scientific Reports](http://www.nature.com/srep/)","snPcode":"","submissionUrl":"","title":"Scientific Reports","twitterHandle":"","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"stoa","reportingPortfolio":"Scientific Reports","inReviewEnabled":true,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"e33727bb-1718-45fd-b5d2-cd8c62874474","owner":[],"postedDate":"June 12th, 2024","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"published-in-journal","subjectAreas":[{"id":32978385,"name":"Physical sciences/Engineering/Mechanical engineering"},{"id":32978386,"name":"Physical sciences/Engineering/Civil 
engineering"}],"tags":[],"updatedAt":"2024-08-26T16:00:16+00:00","versionOfRecord":{"articleIdentity":"rs-4491572","link":"https://doi.org/10.1038/s41598-024-69642-7","journal":{"identity":"scientific-reports","isVorOnly":false,"title":"Scientific Reports"},"publishedOn":"2024-08-20 15:57:11","publishedOnDateReadable":"August 20th, 2024"},"versionCreatedAt":"2024-06-12 04:06:19","video":"","vorDoi":"10.1038/s41598-024-69642-7","vorDoiUrl":"https://doi.org/10.1038/s41598-024-69642-7","workflowStages":[]},"version":"v1","identity":"rs-4491572","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-4491572","identity":"rs-4491572","version":["v1"]},"buildId":"qtupq5eGEP_6zYnWcrvyt","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}
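
The precision, recall, and F1 values reported for the Stacking model can be cross-checked with a few lines of arithmetic. The sketch below is not part of the paper: it assumes that the nine values following the accuracy in the Stacking row of the results table form three (P, R, F1) triples, one per LC class, and it uses the standard definition of F1 as the harmonic mean of P and R. The function and variable names are illustrative only.

```python
# Values copied from the Stacking row of the results table (all in percent).
# Assumption: after the accuracy, the row holds three (P, R, F1) triples,
# one per LC class; the class order is not needed for this check.
STACKING_ACCURACY = 98.33
SVM_ACCURACY = 92.22
STACKING_TRIPLES = [
    (99.48, 98.46, 98.97),
    (95.71, 97.10, 96.40),
    (97.94, 98.96, 98.45),
]


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def check_triples(triples, tolerance=0.01):
    """Return (recomputed F1, reported F1, agrees?) for each triple."""
    results = []
    for precision, recall, reported_f1 in triples:
        recomputed = f1_score(precision, recall)
        agrees = abs(recomputed - reported_f1) < tolerance
        results.append((recomputed, reported_f1, agrees))
    return results


# Difference between the two accuracies, in percentage points.
accuracy_gap = round(STACKING_ACCURACY - SVM_ACCURACY, 2)

if __name__ == "__main__":
    for recomputed, reported, agrees in check_triples(STACKING_TRIPLES):
        print(f"recomputed F1 = {recomputed:.2f}, "
              f"reported F1 = {reported:.2f}, consistent = {agrees}")
    print(f"Stacking minus SVM accuracy: {accuracy_gap} percentage points")
```

Because F1 is symmetric in P and R, the check does not depend on which of the first two values in each triple is the precision and which is the recall.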
