A Hybrid CNN and Attentive Hierarchical BiLSTM Model with SMO for Intrusion Detection in IIoT | Research Square

Method Article

Sushama L. Pawar, Mandar S. Karyakarte

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-6158243/v1
This work is licensed under a CC BY 4.0 License.
Status: Under Review. Version 1 posted 11

Abstract

Many intrusion detection systems (IDSs) analyse only a fixed-size portion of the packet data for intrusion detection in industrial internet of things (IIoT) networks, which limits detection accuracy. To achieve higher detection accuracy, it is important to design an IDS that can analyse all features present in the packet. Models based on deep learning (DL) are well suited to processing high-dimensional, complex data.
This study introduces a novel IDS called CNN-AH-BiLSTM that employs spider monkey optimization (SMO) to optimize the data, which enables the system not only to deal with high-dimensional data but also to handle uncertainties in the data. A convolutional neural network (CNN) is used for robust feature extraction. For classification, a hierarchical attentive BiLSTM model is presented, which enhances the system’s ability to focus on crucial temporal features. Finally, a self-attention layer is employed to enhance the model’s focus on critical features; it assigns weights to the important parts of the input sequence. With this model we aim to address the problem of low detection accuracy. Performance is assessed on three standard datasets, namely NSL-KDD, X-IIoTID and Edge-IIoTset, with accuracies of 99.96%, 98.75% and 99.82% for multiclass classification and 99.98%, 98.88% and 99.93% for binary classification, respectively. We validated the proposed approach not only through an extensive evaluation but also by comparing the proposed model with various ML and DL models and with other current related research, which highlights the effectiveness of the proposed model.

Keywords: Industrial internet of things (IIoT), Intrusion detection systems (IDS), Machine learning (ML), Deep learning (DL), Bidirectional Long Short-Term Memory (BiLSTM), Spider monkey optimization (SMO)

1. Introduction

The IIoT has played an important role in reshaping global industries such as agriculture, healthcare and transportation, and has even acted as a driving force in the evolution of smart cities worldwide [1–3]. The integration of IIoT into industry has brought major improvements in productivity and operational efficiency [4–9].
Deploying devices capable of real-time remote monitoring and automated control has resulted in efficient resource utilization and cost savings. For instance, in agriculture, field sensors monitor soil moisture and nutrient levels, which helps farmers make precise decisions, resulting in better crop yields while reducing resource wastage. In manufacturing, automation ensures consistent product quality while maintaining energy efficiency. In transportation, optimizing routes and monitoring vehicle health ensures timely delivery, thereby enhancing supply chain efficiency. In healthcare, tracking patient health metrics in real time through smart devices has improved care quality [10–19].

However, as usage of this technology grows, so do the associated vulnerabilities. IIoT devices need constant connectivity for data transfer, which introduces loopholes in the network for malicious actors to exploit. These resource-constrained devices lack the computational power needed to implement traditional security protocols; on top of that, cyber-attacks such as Distributed Denial of Service (DDoS) and unauthorised access are significant threats to IIoT networks. An IDS is a critical layer of defence to mitigate these risks. It monitors network traffic flow to detect and counter threats or malicious activities that may compromise the network. It enhances system security by analysing patterns in the network traffic data to protect against a wide variety of known attacks. It also enhances the scalability of resource-constrained IIoT devices without significantly increasing computational demands. Owing to limited processing power and scalability, traditional network IDSs (NIDSs) often analyse partial packet data of fixed size, which limits detection accuracy, especially for complex and multi-stage attacks. As the number of features increases, the detection accuracy of traditional NIDSs decreases.
Advanced models designed to handle large data can significantly enhance detection accuracy but incur high computational cost. Achieving high accuracy while maintaining scalability remains an open challenge. To overcome the complexities of intrusion detection in IIoT networks, DL and optimization algorithms are used to build hybrid, efficient and scalable models. Hybrid DL models are a viable approach that leverages the strengths of two DL models: CNNs extract spatial features from data to identify patterns in multi-dimensional inputs and detect anomalies, while RNN variants such as LSTM capture temporal dependencies within sequential data for analysing time-series network traffic. Hierarchical models can be used to further enhance performance, since they balance feature complexity against detection accuracy. These models have a multi-layered architecture that can prioritize critical features while reducing noise, which allows them to perform intrusion detection accurately with minimal computational overhead. IIoT environments produce large-scale data with high-dimensional features, which can lead to issues such as overfitting that degrade model performance.

A comparative analysis of existing hybrid models is presented in Table 1, which lists the main strengths of the models and their limitations. Difficulty in handling high-dimensional data, lack of adaptability to new types of attacks, suboptimal feature reduction, and high computational complexity are some of the challenges that current hybrid intrusion detection models face in IIoT environments. To overcome these challenges, such models need better dimensionality management, continuous learning and adaptation to evolving threats, along with efficient feature selection and simplified architectures. Addressing these challenges can reduce computational cost, making the models scalable and effective in IIoT environments.
Table 1 Comparison of existing hybrid IDS models

Refs | Algorithm or Model | Main Strength | Limitation
[1] | IBSWO | High classification accuracy. Effective feature selection using flat crossover and genetic algorithms. | Struggles with high-dimensional data and finding optimal solutions.
[2] | SA-DCNN | Addresses redundancy and underfitting issues. | Unable to handle increasing attack classes.
[3] | ELM + SVM; GA + ELM | Reduces irrelevant features while maintaining high detection performance. | Feature reduction methods like KNN may outperform it in feature quantity reduction.
[4] | XGBoost + CNN + LSTM | High detection rates. Effective feature reduction using XGBoost and CNN. | Challenges with feature dimensionality.
[5] | AE + FL | Handles data imbalance with collaborative learning. | Increased complexity and computation costs.

The proposed hybrid CNN-AH-BiLSTM model is designed to address the following issues in order to improve intrusion detection:

- The proposed model is designed to handle a wide range of attacks that current intrusion detection systems are unable to detect.
- SMO is employed to optimize the data and reduce feature dimensionality, addressing the challenge of excessive feature counts, which minimizes computational cost and time during model training.
- A CNN is used for spatial feature extraction, which captures high-level patterns. During validation testing the CNN can recognize new sequential patterns, which enhances the model's ability to detect new types of attacks.
- A multilayer attentive BiLSTM is integrated, which focuses on critical temporal features, enhancing the detection of sophisticated attacks.

Integrating all these models helps to find a middle ground between comprehensive data analysis and computational throughput.

The paper is structured as follows. Section 2 surveys and summarizes prior research, covering traditional methodologies, recent research and a comparative analysis of recent research.
Section 3 gives an in-depth explanation of the proposed model and describes the supporting methodology. Section 4 presents a detailed analysis of the datasets used. Section 5 presents a detailed analysis of the results and compares the proposed model's results with other approaches.

2. Overview of existing research

Various approaches have been proposed to protect IIoT networks from increasing cyber threats. Each approach addresses particular challenges posed by IIoT environments. ML and DL techniques have shown significant improvement in enhancing cybersecurity for IIoT network environments.

In [22] a multivariate correlations analysis-long short-term memory (MCA-LSTM) model was proposed; for superior classification performance it integrates the triangle area map (TAM) matrix with optimal feature subsets. The model was evaluated on the NSL-KDD and UNSW-NB15 datasets, achieving testing accuracies of 82.15% and 77.74%, respectively. In [23] the authors proposed a deep random neural network (DRaNN) that can classify nine types of attacks with a low false-positive rate. Forty-one features are used to train the model with the GD algorithm. The model was evaluated on the UNSW-NB15 dataset with an accuracy of 99.54%. In [24] the authors introduced a pretraining Wasserstein generative adversarial network-based IDS (PWG-IDS), which integrates WGAN-GP for traffic data generation and LightGBM for classification. The model was evaluated on the NSL-KDD and CIC-IDS2018 datasets, achieving F1 scores of 99% and 89%, respectively. A hybrid CNN-LSTM model was introduced in [25], where CNN is employed for pattern recognition and LSTM for classification; the model was tested on the Edge-IIoTset dataset with 100% accuracy. Furthermore, a lightweight IDS is presented in [26] that uses the Pearson correlation coefficient for feature selection, reducing the 45 features in the TON_IoT dataset to 10 core features.
It is able to detect multiple attack types with 99% accuracy using KNN and RF models, demonstrating high-accuracy attack detection in high-load environments. A CTSF framework is introduced in [27] to address the limitation of transformers in extracting local features. The framework integrates CNNs and an enhanced Transformer in its pre-training phase in order to capture local and global features. For classification it uses SVM with “linear” and “rbf” kernels; the framework was evaluated on the X-IIoTID dataset, achieving an accuracy of 98.88%. In [28] a model integrating a Graph Convolutional Network (GCN) and Long Short-Term Memory (LSTM) is presented; GCN is employed for feature extraction and pattern recognition and LSTM is used for classification. The IoT-23 dataset is used for performance analysis, achieving an accuracy of 99.99%. In [29] a deep feed-forward neural network, DAE-DFFNN, is proposed, which uses hybrid rule-based feature selection with a genetic search algorithm for feature evaluation. The model was evaluated on the NSL-KDD and UNSW-NB15 datasets, achieving 99.0% and 98.9% accuracy, respectively. In [30] XGBoost was applied to the X-IIoTDS and TON_IoT datasets to address class imbalance, achieving F1 scores of 99.9% and 99.87%, respectively. In [31] a Deep-IFS model is proposed for distributed learning in fog computing environments; it employs LocalGRU and MHSA layers for feature extraction on Bot-IoT and UNSW-NB15. It achieved an accuracy of 99.75% for binary classification and 98.1% for multiclass classification on Bot-IoT, and 99.94% for binary classification and 99.75% for multiclass classification on UNSW-NB15.
In [32] the authors presented ensemble models in which the Chi-Square statistical method was used for feature selection and classifiers such as XGBoost, Bagging and Random Forest were used for classification on the ToN_IoT, Fridge, Garage_door, GPS_Tracker, Motion-Light, Modbus, Thermostat and Weather datasets. Among these classifiers XGBoost performed best, with accuracy close to 100%. A hybrid rule-based model with DAE-DFFNN is presented in [33]. Automated dimensionality reduction techniques are used with rule-based feature extraction. The model was validated on the NSL-KDD and UNSW-NB15 datasets, achieving a detection rate of 98.0%. In [34] a federated learning approach was presented for privacy-preserving anomaly detection; deep reinforcement learning is used to train local models without sharing sensitive data. The authors did not use any local dataset. In [35] an anomaly detection in data streams (ASTREAM) approach was proposed, which uses LSHiForest with PCA to identify correlations between attributes, sliding windows to handle the infiniteness of data streams, and change detection to detect changes in the data distribution over time and train a new model. The approach was validated on the KDDCUP99 dataset. In [36] a hybrid model called EvolCostDeep was proposed, which consists of stacked autoencoders (SAE) and CNNs with a cost-dependent loss function to address scalability and class imbalance issues. The model was evaluated on the ToN-IoT and UNSW-NB15 datasets with an F1-score of 95.2%. In [37] LightGBM combined with deep learning algorithms was proposed, used at the lower and upper levels of the network for intrusion detection. The authors focused on reducing training time, making the approach suitable for edge IIoT scenarios. A dense random neural network (DnRaNN) was implemented in [38] for binary and multiclass classification, classifying nine different attacks on the IoT.
The model was evaluated on ToN_IoT, achieving accuracies of 99.14% for binary and 99.05% for multiclass classification. In [39] the authors addressed privacy preservation in ML and the training of local models on non-independent and identically distributed (non-IID) data, proposing federated learning (FL) with instance-based transfer learning and weighted rank aggregation. For rank aggregation, the AdaBoost and Random Forest algorithms were used, achieving accuracies of 95.97% and 73.70%, respectively. Similarly, in [40] the problem of data privacy violation during the training phase was addressed by proposing FL in the context of Software Defined Networking (SDN). The model identified the Syn attack with an accuracy of 98.20%, the MSSQL attack with 99.30% and the NetBios attack with 99.99%. In [41] the authors implemented six ML algorithms (RF, DT, KNN, LR, SVM and NB) to build multiple IDS models. All models were assessed on the WUSTL-IIoT-2021 dataset; among them RF achieved the highest accuracy of 99.97%. SDN-based frameworks consisting of SVM and Decision Tree models also showed a high detection accuracy of 99.7% on the NSL-KDD dataset [42]. In [43] the authors used Genetic Programming (GA) for intrusion detection in RPL-based IIoT environments. A threshold is calculated for each attack by extracting specific features from each node. This threshold modulation demonstrated high attack detection accuracy in simulation, with a true positive rate of 93.3%. In [44] a bidirectional LSTM model with multi-feature layers (B-MLSTM) is proposed. In the training phase, sequence and stage feature layers are introduced, which enable the model to detect threats at different intervals by analysing and learning the corresponding attack interval from historical data; a double-layer reverse unit then updates the detection model to match the new attack interval.
The model was evaluated on the CTU-13, AWID and Gas-Water datasets, with significant reductions in the false positive rate (46.79%) and false negative rate (79.85%). In [45] the authors used six supervised algorithms, namely KNN, SVM, QDA, NB, XGBoost and CatBoost, with min-max normalization. PCA is used for dimensionality reduction; all models were evaluated on the UNSW-NB15 dataset, achieving an average accuracy of 99.9%. In [46] the authors proposed dual LSTMs, one a packet classifier and the other a session classifier, together with a DNN that performs the final classification. Performance evaluation was done on the ISCXID2012 and CICIDS2017 datasets, achieving accuracies of 94.73% and 99.61%, respectively. Pearson-Correlation Coefficient - Convolutional Neural Networks (PCC-CNN) is presented in [47]. Important features are extracted first by linear-based extraction and then by the CNN. The authors first trained five PCC-based ML models (Logistic Regression, Linear Discriminant Analysis, KNN, Classification and Regression Tree, and SVM) to assess performance. For validation the NSL-KDD, CICIDS-2017 and IOTID20 datasets were used, achieving an average accuracy of 99.89% across all datasets. In [48] oversampling and feature selection techniques were explored. The training sample size is reduced to between 39% and 74% using the SMOTE oversampling technique. Two classification models, KNN and RF, are used; a detection accuracy of 99% is achieved on the CICIDS-2017 and UNSW-NB15 datasets with RF and the Tree Parzen Estimator optimization algorithm (BO-TPE-RF). In [49] the authors used four supervised classifiers, namely DT, RF, KNN and SVM, along with two feature selection algorithms, namely Correlation-based Feature Selection (CFS) and the Genetic Algorithm (GA). The IoTID20 dataset is used for performance evaluation, which showed that DT and RF with GA-selected features achieved 100% accuracy across metrics [49].
Lastly, an RNN-based IDS is proposed in [50] for binary and multiclass classification tasks. Performance evaluation shows that the model achieved its highest accuracy on the NSL-KDD dataset. Table 2 provides an overview of recent research on IDS, showing various models and methodologies addressing specific issues in IDS, which highlights the diversity of approaches and datasets used to tackle critical problems.

Table 2 Comparative summary of recent research on IDS

Refs. | Model Used | Issues Addressed | Dataset Used
[1] | IBSWO | Feature selection and handling high-dimensional data and imbalanced datasets | UNSWNB15, TON_IoT, NCTUKM-IIOT
[2] | SA-DCNN | Imbalanced training data, redundant features, underfitting in IIoT IDS | IoTID20, Edge-IIoTset
[3] | GA-ELM with SVM | Feature selection for IoT, detection performance | TON_IoT, UNSWNB15
[4] | XGBoost + LSTM, CNN + LSTM | Tackling data imbalance and low test accuracy, binary and multi-class classification | CICIDS2017, UNSWNB15, NSLKDD, WSNDS
[5] | DRL | Anomaly detection in SCADA systems, monitoring complex environments, real-time detection | WUSTL-IIoT-2018, WUSTL-IIoT-2021
[6] | FL with AE | Zero-day attacks, handling data imbalance in a 5G IIoT environment | X-IIoTID
[7] | BPNN with GA | Enhancing accuracy in fog computing environment by optimizing weights and biases | UNSW-NB15, TON_IoT
[8] | IDS-SIoDL with LSTM | Reducing training and classification times in real-time intrusion detection in IoT-based smart cities | BoT-IoT, Edge-IIoT, NSLKDD
[9] | Ensemble DL | Explainability and robustness in detecting, reducing false-positive rates | TON_IoT
[10] | MAGRU | Imbalanced training data, missing network attacks with fewer samples | Edge-IIoTset, MQTTset
[11] | FGOA + kNN | Detection of botnet attacks, improving hyperparameter tuning | N-BaIoT
[12] | eBF | Memory efficiency and accelerating filtering of malicious URLs in IoT | Real ID Dataset
[13] | RF-PCCIF / RF-IFPCC | Computational cost and prediction time; addresses outliers in feature selection | Bot-IoT, NF-UNSWNB15-v2
[14] | CNN + LSTM | Detection rate, classification accuracy and reducing false detection | KDD CUP99, NSLKDD, UNSWNB15
[15] | CRSF (CNN-RNN + SVM) | Addressed limitations of manual feature extraction in SVM | TON_IoT
[16] | FL with Attention | Scalability and communication overhead in centralized IIoT; detection rate in distributed environments | Edge-IIoTSet
[17] | CNN + LSTM | Detection rate for binary and multi-class classification | UNSWNB15, X-IIoTID
[18] | Centralized and FL | Comprehensive dataset generation with realistic attacks | Edge-IIoTset
[19] | GA-RF | Enhanced accuracy and AUC in binary and multiclass detection | UNSWNB15
[20] | HDRaNN | Detecting 16 types of attacks with robust classification | DS2OS, UNSWNB15
[21] | 5-Layer AE | Addressed data imbalance and reconstruction error handling | NSLKDD

3. Proposed Model

This study proposes a hybrid DL model that integrates a CNN with an Attentive Hierarchical Bi-LSTM for intrusion detection in IIoT networks. As depicted in Fig. 1, the proposed model consists of SMO to reduce data dimension, a CNN for feature extraction and pattern recognition, and a Hierarchical Bi-LSTM to learn crucial temporal features, followed by a self-attention layer to enhance the model’s focus on critical features. The process is divided into three stages: Stage 1: Data pre-processing, Stage 2: Feature Extraction and Stage 3: Model Training & Testing.

3.1 Data Pre-Processing

Preparing the dataset is a crucial step in ensuring that ML and DL models work effectively. The proposed framework starts by pre-processing the raw dataset to prepare it for the training and testing phases. Pre-processing incorporates data processing techniques such as data encoding, normalization and dimension reduction. Here we use three datasets, namely NSL-KDD, Edge-IIoTSet and X-IIoTID.
In the proposed work, pre-processing consists of the following steps:

3.1.1 Data Encoding

Data encoding converts categorical data such as attack types and protocols into a machine-readable format, which makes it possible to process non-numerical values during training. The basic features protocol_type, service and flag in NSL-KDD, device_type and protocol in X-IIoTID, and the attack types in Edge-IIoTSet are encoded using the one-hot encoding technique.

3.1.2 Data Normalization

Data normalization scales features to a uniform range ([0, 1] or [-1, 1]). It reduces the bias caused by differing feature ranges, which improves model performance. Normalization is important in intrusion detection datasets because their features span widely varying ranges, for example byte counts, durations or telemetry values. Here we use Min-Max scaling to rescale features to the [0, 1] range. Normalization ensures that features with large ranges, for example packet size, do not dominate small-scale features such as CPU usage, making neural network models more effective.

3.1.3 Data Dimension Reduction

The SMO algorithm is used for data dimension reduction. SMO is a swarm-based metaheuristic algorithm inspired by the foraging behavior of spider monkeys. In the NSL-KDD dataset it reduces redundant network features, for example packet counts and connection durations, while retaining key intrusion patterns. In the X-IIoTID dataset, device-related features such as device-type, cpu_usage and memory_usage are optimized in order to identify fewer but more impactful attributes. In Edge-IIoTSet, network traffic features such as protocol, flow_duration and total_fwd_packets are optimized, which enhances interpretability while managing large-scale, heterogeneous data.

3.2 Feature Extraction

After pre-processing, the data is processed through a CNN for feature extraction.
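Before the CNN sees any data, the encoding and normalization steps of Sections 3.1.1 and 3.1.2 have to run. The following is a minimal pure-Python sketch of those two steps, not the authors' implementation. The helper names `one_hot_encode` and `min_max_scale` are hypothetical, and the four toy records (a `protocol_type` value plus two numeric columns) are made-up values chosen only to show the mechanics.

```python
def one_hot_encode(values):
    """Map each categorical value to a 0/1 vector, one position per category."""
    categories = sorted(set(values))
    index = {cat: i for i, cat in enumerate(categories)}
    encoded = []
    for v in values:
        row = [0.0] * len(categories)
        row[index[v]] = 1.0
        encoded.append(row)
    return categories, encoded


def min_max_scale(column):
    """Rescale a numeric column to [0, 1]; a constant column maps to zeros."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


# Toy records (hypothetical values): protocol_type, duration, src_bytes
records = [
    ("tcp", 0.0, 491.0),
    ("udp", 2.0, 146.0),
    ("icmp", 10.0, 1032.0),
    ("tcp", 5.0, 0.0),
]

categories, protocol_onehot = one_hot_encode([r[0] for r in records])
durations = min_max_scale([r[1] for r in records])
src_bytes = min_max_scale([r[2] for r in records])

# Final feature rows: one-hot protocol columns followed by the scaled numerics.
features = [oh + [d, b] for oh, d, b in zip(protocol_onehot, durations, src_bytes)]
```

In practice the scaling bounds would be computed on the training split only and reused for the test split, so that no information from the test data leaks into training.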
CNNs' capability to handle time-series and tabular data makes them well suited for pattern recognition in multi-dimensional data such as intrusion detection datasets [4]. In this approach the pre-processed data is reshaped to mimic image-like input, where rows represent samples and columns represent features [14]. First, the convolution layer applies multiple filters over the pre-processed data to extract spatial features. It captures the most relevant features by detecting patterns and correlations between patterns. In the max pooling layer, the spatial dimensions of the feature maps are reduced while the most significant information is retained, which reduces computational complexity and overfitting. During training, the weights of the CNN highlight the features that contribute most to differentiating normal and attack classes [2]. For example, in the X-IIoTID dataset the CNN focuses on device behaviour and protocol anomalies, in NSL-KDD it focuses on traffic-related features, and in Edge-IIoTset network flow metrics are prioritized.

3.3 Model Training and Validation

The extracted features are split into training data and testing data. The training data is processed through the following sequence of layers for model training, and the testing data is then processed to validate the trained model:

3.3.1 Hierarchical Bi-LSTM

Figure 2 illustrates a hierarchical Bi-LSTM network with two forward layers and two backward layers. In this architecture the outputs of the first forward and backward layers are passed to the second forward and backward layers, enabling a deeper understanding of temporal dependencies in sequential data. This structure captures sequential and contextual relationships between features. Here, the input sequence \(\{I_{t-1}, I_t, I_{t+1}\}\) is passed to the first forward layer. The input at time \(t\) is denoted \(I_t\) and generates a hidden state \(a_t^f\).
This hidden state captures temporal information from past inputs in the forward direction. The first forward state is given by

$$a_t^f = \sigma\left(W_f \cdot I_t + U_f \cdot a_{t-1}^f + b_f\right) \tag{1}$$

where \(W_f\) is the weight matrix for the current input \(I_t\), \(a_{t-1}^f\) is the previous hidden state, \(U_f\) is the weight matrix for \(a_{t-1}^f\), and \(b_f\) is the forward-pass bias term. The input sequence is also passed to the first backward layer, which processes the data in reverse order to generate a backward hidden state \(b_t^b\) that captures dependencies from future inputs. The backward hidden state is modelled as

$$b_t^b = \sigma\left(W_b \cdot I_t + U_b \cdot b_{t+1}^b + b_b\right) \tag{2}$$

where \(W_b\) and \(U_b\) are the weight matrices for the current input \(I_t\) and the subsequent hidden state \(b_{t+1}^b\), respectively, and \(b_b\) is the bias term for the backward pass. The output of the first forward layer \(a_t^f\) is passed as input to the second forward layer, which refines the learned temporal information and generates \(a_t^a\):

$$a_t^a = \sigma\left(W_a \cdot a_t^f + U_a \cdot a_{t-1}^a + b_a\right) \tag{3}$$

where \(W_a\) and \(U_a\) are the weight matrices for the output of the first forward layer \(a_t^f\) and the previous refined state \(a_{t-1}^a\), respectively, and \(b_a\) is the bias term for the current layer.
Similarly, the output of the first backward layer \(b_t^b\) is passed to the second backward layer, which refines the temporal information from the first backward layer to compute the refined backward state, also written \(b_t^b\):

$$b_t^b = \sigma\left(W_b' \cdot b_t^b + U_b' \cdot b_{t+1}^b + b_b'\right) \tag{4}$$

where \(W_b'\) and \(U_b'\) are the weight matrices for the output of the first backward layer \(b_t^b\) and the next refined state \(b_{t+1}^b\), respectively, and \(b_b'\) is the bias term for this layer. On the left-hand side and in the recurrent term, \(b^b\) denotes the refined second-layer state; in the first term on the right-hand side it denotes the first-layer output. For every time step \(t\), the output \(O_t\) is generated by combining the second forward layer state \(a_t^a\) and the second backward layer state \(b_t^b\). \(O_t\) is the feature representation at time \(t\):

$$O_t = V \cdot \left[a_t^a ; b_t^b\right] + c \tag{5}$$

where \(\left[a_t^a ; b_t^b\right]\) represents the concatenated forward and backward hidden states, \(V\) represents the output weights and \(c\) represents the biases.

3.3.2 Self-Attention Layer

The final output \(O_t\) combines information from the forward and backward passes, which provides a comprehensive temporal context for the sequence at time \(t\). The outputs of the output layer, \(O = \{O_1, O_2, \dots, O_T\}\), serve as input to the self-attention layer. The attention layer focuses on the most critical parts of the input sequence by computing attention weights. Each \(O_t\) is transformed into Query (Q), Key (K) and Value (V) matrices using learnable weight matrices. The attention score is computed as

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{Q K^{T}}{\sqrt{d_k}}\right) V \tag{6}$$

The attention score represents the importance of each part of the input sequence relative to the others, which enables the model to focus on influential patterns such as intrusion signatures.
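To make the data flow of Eqs. (1) to (6) concrete, the sketch below runs a toy sequence through two stacked forward and backward recurrences, the output projection and scaled dot-product self-attention. It is an illustration under stated assumptions rather than the authors' implementation: it follows the equations as written, which are plain sigmoid recurrences without LSTM gating; the weights are random stand-ins for trained parameters; the sizes `T`, `n_in` and `n_hid` and all helper names are hypothetical; and the bias \(c\) of Eq. (5) is taken as zero.

```python
import math
import random


def matvec(M, v):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def recurrent_pass(inputs, W, U, b, reverse=False):
    """h_t = sigmoid(W x_t + U h_prev + b), the form shared by Eqs. (1)-(4).

    h_prev is h_{t-1} for a forward pass and h_{t+1} for a backward pass.
    """
    h = [0.0] * len(b)
    states = [None] * len(inputs)
    steps = range(len(inputs))
    for t in (reversed(steps) if reverse else steps):
        pre = [wx + uh + bi for wx, uh, bi in zip(matvec(W, inputs[t]), matvec(U, h), b)]
        h = [1.0 / (1.0 + math.exp(-p)) for p in pre]
        states[t] = h
    return states


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(O, Wq, Wk, Wv):
    """Scaled dot-product attention over a sequence of vectors, Eq. (6)."""
    Q = [matvec(Wq, o) for o in O]
    K = [matvec(Wk, o) for o in O]
    V = [matvec(Wv, o) for o in O]
    d_k = len(K[0])
    weights = [
        softmax([sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in K])
        for q in Q
    ]
    context = [
        [sum(w * v[j] for w, v in zip(row, V)) for j in range(len(V[0]))]
        for row in weights
    ]
    return context, weights


def rand_matrix(rows, cols, rng):
    """Random weights stand in for trained parameters (illustration only)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
T, n_in, n_hid = 5, 4, 3  # hypothetical sequence length and layer sizes
seq = [[rng.random() for _ in range(n_in)] for _ in range(T)]
zeros = [0.0] * n_hid

# First layer: forward and backward passes over the input, Eqs. (1)-(2).
fwd1 = recurrent_pass(seq, rand_matrix(n_hid, n_in, rng), rand_matrix(n_hid, n_hid, rng), zeros)
bwd1 = recurrent_pass(seq, rand_matrix(n_hid, n_in, rng), rand_matrix(n_hid, n_hid, rng), zeros, reverse=True)
# Second layer refines each direction, Eqs. (3)-(4).
fwd2 = recurrent_pass(fwd1, rand_matrix(n_hid, n_hid, rng), rand_matrix(n_hid, n_hid, rng), zeros)
bwd2 = recurrent_pass(bwd1, rand_matrix(n_hid, n_hid, rng), rand_matrix(n_hid, n_hid, rng), zeros, reverse=True)
# Eq. (5): project the concatenated states [a_t ; b_t], with c = 0 here.
V_out = rand_matrix(n_hid, 2 * n_hid, rng)
outputs = [matvec(V_out, a + b) for a, b in zip(fwd2, bwd2)]
# Eq. (6): self-attention over the whole output sequence.
context, weights = self_attention(
    outputs,
    rand_matrix(n_hid, n_hid, rng),
    rand_matrix(n_hid, n_hid, rng),
    rand_matrix(n_hid, n_hid, rng),
)
```

Each row of `weights` is a probability distribution over the `T` time steps, so `context[t]` is a weighted average of the value vectors. A production model would normally replace `recurrent_pass` with a gated LSTM cell from a deep learning framework.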
3.3.3 Dropout Layer

The attention-weighted output is passed through a dropout layer in order to reduce overfitting. The dropout layer randomly deactivates a fraction of neurons with a predefined probability \(p\). It improves the generalization ability of the model by reducing the risk of over-reliance on specific neurons. By deactivating some neurons, the dropout layer forces the model to learn more robust and generalized features, which in turn enhances the model's ability to handle unseen data and improves its performance on the testing dataset.

3.3.4 Fully Connected Layer

The dropout layer output is processed by a fully connected (dense) layer, which learns complex patterns and prepares the data for classification. This layer integrates features from the previous layers and maps them to a higher-dimensional space. Each input is transformed using a weight matrix \(W_{fc}\) and bias \(b_{fc}\) and passed through a ReLU activation function, which enables the model to identify patterns related to normal and attack behavior in the input data.

3.3.5 Softmax Layer

The softmax layer converts the raw predictions into probabilities for multi-class classification, such as normal or different types of attacks, and ensures that the probabilities for all classes sum to 1. As shown in (7), the probability that a given output \(O_t\) belongs to class \(i\) is calculated as:

$$P\left(\mathrm{class}_i \mid O_t\right) = \frac{e^{W_o^i \cdot O_t + b_o^i}}{\sum_{j=1}^{C} e^{W_o^j \cdot O_t + b_o^j}} \tag{7}$$

In (7), \(C\) is the total number of classes, \(W_o^i\) and \(b_o^i\) are the weight and bias for class \(i\), and \(P\left(\mathrm{class}_i \mid O_t\right)\) is the probability that \(O_t\) belongs to class \(i\). The class with the highest probability is selected as the model's prediction.

4. Datasets

The NSL-KDD dataset [51][54] addresses the redundancy and class imbalance issues present in the KDD Cup 1999 dataset. It is an upgraded version of the KDD Cup 1999 dataset, designed for reliability in research. As per Table 3, the dataset consists of 41 features divided into basic, content, traffic and host categories. Each record is labelled as either normal or attack. Attacks are grouped into four sub-categories: Denial of Service (DoS), which overwhelms resources; Probe, which scans for vulnerabilities; User-to-Root, for privilege escalation; and Remote-to-Local, which attempts unauthorized access.

The X-IIoTID dataset [53][56] is constructed specifically for IIoT environments, mimicking real-world network behaviours and threats. It focuses on specific attributes such as device type, communication protocols, traffic features and operational data. Attack data includes DoS, Man-in-the-Middle (MITM) and malicious payload injections. Data is labelled as benign or as a specific attack category.

Similarly, the Edge-IIoTset dataset [55] is also designed for IIoT environments, specifically capturing the complexities of edge-based networks. Its features include network traffic, system logs and device telemetry, mimicking real-world IIoT scenarios with various devices and communication protocols such as MQTT and CoAP. The dataset contains traditional network intrusion attacks such as DoS and spoofing as well as IIoT-specific attacks such as data exfiltration and firmware tampering. Edge-IIoTset is a multi-domain dataset, which helps in developing robust security solutions.

Table 3 Comparative summary of datasets

Attributes | NSL-KDD | X-IIoTID | Edge-IIoT
Features | 41 | 68 | 61
Total Data Records | 148,517 | 820,834 | 22,339,021
Main Categories | 4 | 5 | 5

5. Performance Analysis

The trained model is evaluated using the testing data. Data points are classified as normal or attack as per Eq. (8).
For class prediction, the class with the highest probability for the data point at time \(t\) is selected:

$$\widehat{y}_{t}=\arg\max_{i}\,P\left({class}_{i}\mid{O}_{t}\right)$$ (8)

where \(\widehat{y}_{t}\) is the predicted class label for time step \(t\) and \(\arg\max_{i}\) identifies the index \(i\) corresponding to the maximum probability. The softmax layer calculates the probabilities, which are then used for classification. Each data point is classified into a predefined category such as "Normal" or a specific attack type like DoS or Probe. Standard metrics (accuracy, precision, recall, and F1-score) are used to evaluate classification performance.

Figures 3 and 4 illustrate the training and validation performance of the multiclass classification model on the NSL-KDD dataset over multiple epochs. As depicted in Fig. 3, the training and validation accuracy curves increase steadily over 20 epochs and reach 90% accuracy by the 10th epoch, which indicates effective learning with minimal overfitting. The training and validation loss curves in Fig. 4 decrease rapidly until the 10th epoch.

Figure 5 depicts the comparative performance analysis of the proposed model against existing intrusion detection models trained and validated on the NSL-KDD dataset, using the same four metrics. The proposed CNN-AH-BiLSTM model has the highest accuracy of 99.96%, with 99.84% precision, 99.81% recall, and a 99.83% F1-score. These scores indicate the effective detection capability of the proposed model. The DNN [52] model achieved an accuracy of 98.00% with precision and recall of 97.00% each. The RANN [52] model maintained a balance between precision and recall, with scores of 92.18% and 92.35% respectively and an F1-score of 92.29%. The PCC-CNN [47] model achieved an accuracy of 94.00%, but its recall and F1-score are significantly lower at 77.00% and 80.00% respectively.
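The metrics used in these comparisons follow from the counts of true and false positives and negatives. The sketch below is a minimal plain-Python illustration for the binary case; the function name `binary_metrics` and the toy labels are assumptions for illustration, not the authors' evaluation code. It also shows how a detector can reach high recall while accuracy and precision stay lower because of false positives.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int],
                   y_pred: Sequence[int],
                   positive: int = 1) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for one positive class (here: attack)."""
    pairs = list(zip(y_true, y_pred))
    if not pairs:
        raise ValueError("at least one labelled record is required")
    # Confusion-matrix counts with 'positive' treated as the attack class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    # Guard against empty denominators when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical toy labels: 1 = attack, 0 = normal.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 1, 1, 1, 0, 0, 0]

metrics = binary_metrics(y_true, y_pred)
print({name: round(value, 4) for name, value in metrics.items()})
```

Here every attack is caught (recall 1.0), but three normal records are flagged as attacks, so precision is 4/7 and accuracy is 0.7. For multiclass results, per-class values of these metrics are commonly averaged; which averaging is used for the reported scores is not stated in this section.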
The 5-Layer AE [21] model has the lowest accuracy (90.61%) of the compared models but a high recall of 98.43%, which makes it effective at detecting intrusions but prone to false positives. The proposed CNN-AH-BiLSTM achieves balanced scores across all metrics.

Figures 6 and 7 illustrate the training and validation performance for multiclass classification on the Edge-IIoTset dataset. Figure 6 depicts training and validation accuracy over 20 epochs; both curves increase steadily, reaching a training accuracy of 99.98% and a validation accuracy of 99.82% in the final epoch, which indicates effective learning with minimal overfitting. Figure 7 shows the loss curves, which drop sharply in the initial epochs. Both figures indicate that the model converged quickly with no sign of overfitting or underfitting.

Figure 8 illustrates the comparative performance analysis of various existing models and the proposed model on the Edge-IIoTset dataset. Here SA-DCNN [2] achieved an accuracy of 99.96%, precision of 99.83%, recall of 99.79% and an F1-score of 99.81%, followed by MAGRU-IDS [10] with an accuracy of 99.94%. Bi-GRU-CL and Bi-GRU-FL [16] achieved the lowest accuracies, 94.60% and 95.70% respectively. The proposed CNN-AH-BiLSTM achieved an accuracy of 99.82%, precision of 98.43%, recall of 99.10% and an F1-score of 98.76%.

Figures 9 and 10 depict the model's performance on the X-IIoTID dataset over 20 epochs for multiclass classification. Figure 9 shows the training and validation accuracy. Accuracy increases rapidly until epoch 7. After reaching about 98% it plateaus until the 20th epoch, which shows that learning slows down and the model is no longer picking up new patterns by the end of training. Figure 10 shows the loss curves, which decrease steadily. The close alignment between the training and validation curves indicates minimal overfitting.

Figure 11 depicts the comparative performance analysis of existing models and the proposed model on the X-IIoTID dataset using the standard performance metrics.
Here the FL-AE [6] model achieved an accuracy of 99.32% and an F1-score of 99.84%. AP2PFL achieved the lowest accuracy, 96.42%, and its variants AP2PFL-DNN and AP2PFL-MLP achieved accuracies of 97.95% and 97.21% respectively. The proposed CNN-AH-BiLSTM model shows strong performance across all metrics. It is a strong alternative to the FL-AE model, with an accuracy of 98.75% and balanced scores of 98.40% on the other metrics. CNN-AH-BiLSTM offers a balance between accuracy and efficiency.

Table 4 Comparative analysis of the proposed model with existing models for binary classification. Feature selection details were not available for all models and datasets; a placeholder ("-") marks entries where this information was not reported.
Model | Feature Selection | Dataset | Accuracy (%)
XGBoost-LSTM [4] | XGBoost | UNSWNB15 | 94.41
CNN-LSTM [4] | CNN | WSN DS | 91.18
CNN-LSTM [4] | CNN | Edge-IIoTset | 95.21
FL-AE [6] | - | X-IIoTID | 99.32
EHIDS [7] | - | UNSWNB15 | 96.47
EHIDS [7] | - | TON_IoT | 95.36
ELM [9] | - | TON_IoT | 98.77
SHAP-LIME-ELM [9] | - | TON_IoT | 99.69
CNN-AH-BiLSTM | CNN + SMO | NSL-KDD | 99.98
CNN-AH-BiLSTM | CNN + SMO | Edge-IIoTset | 99.93
CNN-AH-BiLSTM | CNN + SMO | X-IIoTID | 98.88

Table 4 and Fig. 12 present various intrusion detection models with their accuracy, the datasets they are evaluated on, and the feature selection technique employed. All results are for binary classification. The proposed CNN-AH-BiLSTM model uses CNN + SMO for feature selection and achieves accuracies of 99.98%, 99.93% and 98.88% on the NSL-KDD, Edge-IIoTset and X-IIoTID datasets respectively. The SHAP-LIME-ELM [9] model achieved 99.69% accuracy on the TON_IoT dataset and the FL-AE [6] model achieved 99.32% on X-IIoTID, while EHIDS [7] achieved 96.47% on UNSWNB15 and 95.36% on TON_IoT.

Table 5 Comparative analysis of the proposed model with existing models for multiclass classification. Feature selection details were not available for all models and datasets; a placeholder ("-") marks entries where this information was not reported.
Model | Feature Selection | Dataset | Accuracy (%)
BSWO [1] | SWO | UNSWNB15 | 97.80
IBSWO [1] | SWO | UNSWNB15 | 98.70
BSWO [1] | SWO | TON_IoT | 99.70
IBSWO [1] | SWO | TON_IoT | 99.90
BSWO [1] | SWO | NCTUKM-IIOT | 99.20
IBSWO [1] | SWO | NCTUKM-IIOT | 99.70
SA-DCNN [2] | DCNN | Edge-IIoTset | 99.95
SA-DCNN [2] | DCNN | IoTID20 | 96.89
GA-ELM [3] | SVM | TON_IoT | 99.00
GA-ELM [3] | SVM | UNSWNB15 | 86.00
XGBoost-LSTM [4] | CNN | UNSWNB15 | 90.71
CNN-LSTM [4] | - | WSN DS | 91.09
DRL [5] | - | WUSTL-IIoT-2018 | 99.36
ELM [9] | - | TON_IoT | 88.23
SHAP-LIME-ELM [9] | - | TON_IoT | 99.63
MAGRU [10] | XGBoost | Edge-IIoTset | 99.94
MAGRU [10] | XGBoost | MQTTset | 99.99
IHHO-NN [11] | GOA | Na-BaIoT | 98.07
CNN-AH-BiLSTM | CNN + SMO | NSL-KDD | 99.96
CNN-AH-BiLSTM | CNN + SMO | Edge-IIoTset | 99.82
CNN-AH-BiLSTM | CNN + SMO | X-IIoTID | 98.75

Table 5 and Fig. 13 present the accuracy of various intrusion detection models, the datasets they are evaluated on, and the feature selection techniques employed. Among these models, IBSWO [1] achieved accuracies of 99.90% on TON_IoT, 99.70% on NCTUKM-IIOT, and 98.70% on UNSWNB15. SA-DCNN [2] achieved 99.95% on Edge-IIoTset and 96.89% on IoTID20. MAGRU [10], which uses XGBoost for feature selection, achieved 99.99% on MQTTset and 99.94% on Edge-IIoTset. The ELM [9] model achieved 88.23% on TON_IoT, while its variant SHAP-LIME-ELM [9] achieved 99.63% on the same dataset, significantly outperforming the base model. Another variant, GA-ELM [3], achieved 99.00% on TON_IoT but performed poorly on UNSWNB15 with an accuracy of 86.00%. DRL [5] achieved 99.36% on WUSTL-IIoT-2018, while XGBoost-LSTM [4] and CNN-LSTM [4] achieved 90.71% on UNSWNB15 and 91.09% on WSN DS, respectively.
CNN-AH-BiLSTM achieved consistent accuracy of around 99% on all three of its evaluation datasets, making it highly effective for intrusion detection across diverse cybersecurity datasets.

6. Conclusion

In this study, we proposed a hybrid IDS for IIoT networks, CNN-AH-BiLSTM, which integrates multiple DL techniques with SMO for feature extraction and classification, thereby enhancing detection accuracy. The model uses SMO for dimension reduction, a CNN for robust feature extraction, and a hierarchical attentive BiLSTM to capture temporal dependencies, followed by a self-attention layer that focuses on critical features. The model is evaluated on three benchmark datasets, namely NSL-KDD, Edge-IIoTset and X-IIoTID, achieving accuracy competitive with the state of the art in both binary and multiclass classification. For binary classification the model achieved 99.98% accuracy on NSL-KDD, 99.93% on Edge-IIoTset and 98.88% on X-IIoTID; for multiclass classification it achieved 99.96% on NSL-KDD, 99.82% on Edge-IIoTset and 98.75% on X-IIoTID. These results show that the model achieved consistent accuracy across all datasets, making it highly effective for intrusion detection across diverse cybersecurity datasets.

While the results are promising, there are several areas for future research. The proposed model can be further optimized to reduce computational complexity, and it needs to be tested on real-world IIoT traffic to validate its robustness. Future research can refine and improve the proposed model, ensuring even more reliable and efficient intrusion detection in IIoT networks.

Declarations

Ethics, Consent to Participate, and Consent to Publish declarations: Not applicable.
Funding: There is no funding.
Conflicts of interest: The corresponding author and all co-authors confirm that there are no conflicts of interest to declare.
Data Availability: The data supporting this study can be made available upon request.
Author Contribution

The primary work of this research, which includes the literature review, conceptualization, feasibility study, design, and methodology, was done by S.P. She led the development of the proposed methodology, conducted the experimental evaluation, evaluated the performance of the IDS, and prepared the manuscript based on the experimental findings. M.K. contributed to designing and refining the experimental setup, analyzed results, provided critical revisions to the manuscript at each step, and approved the final version of the paper.

Acknowledgement

I want to thank my advisor, Dr. Mandar S. Karyakarte, for his essential help and direction during the creation of this paper. His knowledge, thoughtful comments, and constant support have greatly influenced the path and excellence of this project. I deeply appreciate his guidance, which has inspired and driven me throughout this process.

References

1. M. Shtayat et al., "An Improved Binary Spider Wasp Optimization Algorithm for Intrusion Detection for Industrial Internet of Things," in IEEE Open Journal of the Communications Society, doi: 10.1109/OJCOMS.2024.3421647.
2. M. S. Alshehri, O. Saidani, F. S. Alrayes, S. F. Abbasi and J. Ahmad, "A Self-Attention-Based Deep Convolutional Neural Networks for IIoT Networks Intrusion Detection," in IEEE Access, vol. 12, pp. 45762-45772, 2024, doi: 10.1109/ACCESS.2024.3380816.
3. Maseno, E.M., Wang, Z. Hybrid wrapper feature selection method based on genetic algorithm and extreme learning machine for intrusion detection. J Big Data 11, 24 (2024). https://doi.org/10.1186/s40537-024-00887-9
4. Sajid, M., Malik, K.R., Almogren, A. et al. Enhancing intrusion detection: a hybrid machine and deep learning approach. J Cloud Comp 13, 123 (2024). https://doi.org/10.1186/s13677-024-00685-x
5. F. Mesadieu, D. Torre and A. Chennamaneni, "Leveraging Deep Reinforcement Learning Technique for Intrusion Detection in SCADA Infrastructure," in IEEE Access, vol. 12, pp. 63381-63399, 2024, doi: 10.1109/ACCESS.2024.3390722.
6. P. Verma, N. Bharot, J. G. Breslin, D. O'Shea, A. Vidyarthi and D. Gupta, "Zero-Day Guardian: A Dual Model Enabled Federated Learning Framework for Handling Zero-Day Attacks in 5G Enabled IIoT," in IEEE Transactions on Consumer Electronics, vol. 70, no. 1, pp. 3856-3866, Feb. 2024, doi: 10.1109/TCE.2023.3335385.
7. Mohamed, D., Ismael, O. Enhancement of an IoT hybrid intrusion detection system based on fog-to-cloud computing. J Cloud Comp 12, 41 (2023). https://doi.org/10.1186/s13677-023-00420-y
8. C. Hazman, A. Guezzaz, S. Benkirane and M. Azrour, "Enhanced IDS with Deep Learning for IoT-Based Smart Cities Security," in Tsinghua Science and Technology, vol. 29, no. 4, pp. 929-947, August 2024, doi: 10.26599/TST.2023.9010033.
9. M. M. Shtayat, M. K. Hasan, R. Sulaiman, S. Islam and A. U. R. Khan, "An Explainable Ensemble Deep Learning Approach for Intrusion Detection in Industrial Internet of Things," in IEEE Access, vol. 11, pp. 115047-115061, 2023, doi: 10.1109/ACCESS.2023.3323573.
10. S. Ullah, W. Boulila, A. Koubâa and J. Ahmad, "MAGRU-IDS: A Multi-Head Attention-Based Gated Recurrent Unit for Intrusion Detection in IIoT Networks," in IEEE Access, vol. 11, pp. 114590-114601, 2023, doi: 10.1109/ACCESS.2023.3324657.
11. F. Taher, M. Abdel-Salam, M. Elhoseny and I. M. El-Hasnony, "Reliable Machine Learning Model for IIoT Botnet Detection," in IEEE Access, vol. 11, pp. 49319-49336, 2023, doi: 10.1109/ACCESS.2023.3253432.
12. Gebretsadik, F.G., Nayak, S. & Patgiri, R. eBF: an enhanced Bloom Filter for intrusion detection in IoT. J Big Data 10, 102 (2023). https://doi.org/10.1186/s40537-023-00790-9
13. M. Mohy-Eddine, A. Guezzaz, S. Benkirane, M. Azrour and Y. Farhaoui, "An Ensemble Learning Based Intrusion Detection Model for Industrial IoT Security," in Big Data Mining and Analytics, vol. 6, no. 3, pp. 273-287, September 2023, doi: 10.26599/BDMA.2022.9020032.
14. J. Du, K. Yang, Y. Hu and L. Jiang, "NIDS-CNNLSTM: Network Intrusion Detection Classification Model Based on Deep Learning," in IEEE Access, vol. 11, pp. 24808-24821, 2023, doi: 10.1109/ACCESS.2023.3254915.
15. S. Li et al., "CRSF: An Intrusion Detection Framework for Industrial Internet of Things Based on Pretrained CNN2D-RNN and SVM," in IEEE Access, vol. 11, pp. 92041-92054, 2023, doi: 10.1109/ACCESS.2023.3307429.
16. M. Nuaimi, L. C. Fourati and B. Ben Hamed, "A Scalable Intrusion Detection Approach for Industrial Internet of Things Based on Federated Learning and Attention Mechanism," 2023 IEEE Symposium on Computers and Communications (ISCC), Gammarth, Tunisia, 2023, pp. 1-4, doi: 10.1109/ISCC58397.2023.10218054.
17. Hakan Can Altunay, Zafer Albayrak, "A hybrid CNN+LSTM-based intrusion detection system for industrial IoT networks", Engineering Science and Technology, an International Journal, Volume 38, 2023, 101322, ISSN 2215-0986, https://doi.org/10.1016/j.jestch.2022.101322.
18. M. A. Ferrag, O. Friha, D. Hamouda, L. Maglaras and H. Janicke, "Edge-IIoTset: A New Comprehensive Realistic Cyber Security Dataset of IoT and IIoT Applications for Centralized and Federated Learning," in IEEE Access, vol. 10, pp. 40281-40306, 2022, doi: 10.1109/ACCESS.2022.3165809.
19. S. M. Kasongo, "An Advanced Intrusion Detection System for IIoT Based on GA and Tree Based Algorithms," in IEEE Access, vol. 9, pp. 113199-113212, 2021, doi: 10.1109/ACCESS.2021.3104113.
20. Z. E. Huma et al., "A Hybrid Deep Random Neural Network for Cyberattack Detection in the Industrial Internet of Things," in IEEE Access, vol. 9, pp. 55595-55605, 2021, doi: 10.1109/ACCESS.2021.3071766.
21. W. Xu, J. Jang-Jaccard, A. Singh, Y. Wei and F. Sabrina, "Improving Performance of Autoencoder-Based Network Anomaly Detection on NSL-KDD Dataset," in IEEE Access, vol. 9, pp. 140136-140146, 2021, doi: 10.1109/ACCESS.2021.3116612.
22. Dong, R.-H., Li, X.-Y., Zhang, Q.-Y. and Yuan, H. (2020), Network intrusion detection model based on multivariate correlation analysis – long short-time memory network. IET Inf. Secur., 14: 166-174. https://doi.org/10.1049/iet-ifs.2019.0294
23. S. Latif, Z. Idrees, Z. Zou and J. Ahmad, "DRaNN: A Deep Random Neural Network Model for Intrusion Detection in Industrial IoT," 2020 International Conference on UK-China Emerging Technologies (UCET), Glasgow, UK, 2020, pp. 1-4, doi: 10.1109/UCET51115.2020.9205361.
24. Zhang, L., Jiang, S., Shen, X., Gupta, B.B., & Tian, Z. (2021). PWG-IDS: An Intrusion Detection Model for Solving Class Imbalance in IIoT Networks Using Generative Adversarial Networks. ArXiv, abs/2110.03445.
25. A. Khacha, R. Saadouni, Y. Harbi and Z. Aliouat, "Hybrid Deep Learning-based Intrusion Detection System for Industrial Internet of Things," 2022 5th International Symposium on Informatics and its Applications (ISIA), M'sila, Algeria, 2022, pp. 1-6, doi: 10.1109/ISIA55826.2022.9993487.
26. H.-Y. Chuang and R.-M. Chen, "Detection of Attacks on Industrial Internet of Things Using Fewer Features," 2023 Sixth International Symposium on Computer, Consumer and Control (IS3C), Taichung, Taiwan, 2023, pp. 1-4, doi: 10.1109/IS3C57901.2023.00009.
27. Chai, G., Li, S., Yang, Y., Zhou, G., & Wang, Y. (2023). CTSF: An Intrusion Detection Framework for Industrial Internet Based on Enhanced Feature Extraction and Decision Optimization Approach. Sensors, 23(21), 8793. https://doi.org/10.3390/s23218793
28. M. Koca and I. Avci, "A Novel Hybrid Model Detection of Security Vulnerabilities in Industrial Control Systems and IoT Using GCN+LSTM," in IEEE Access, vol. 12, pp. 143343-143351, 2024, doi: 10.1109/ACCESS.2024.3466391.
29. J. B. Awotunde, C. Chakraborty, A. E. Adeniyi, and A. Jolfaei. "Intrusion Detection in Industrial Internet of Things Network-Based on Deep Learning Model with Rule-Based Feature Selection". Wirel. Commun. Mob. Comput. 2021 (2021). https://doi.org/10.1155/2021/7154587
30. Le, T.-T.-H., Oktian, Y. E., & Kim, H. (2022). XGBoost for Imbalanced Multiclass Classification-Based Industrial Internet of Things Intrusion Detection Systems. Sustainability, 14(14), 8707. https://doi.org/10.3390/su14148707
31. M. Abdel-Basset, V. Chang, H. Hawash, R. K. Chakrabortty and M. Ryan, "Deep-IFS: Intrusion Detection Approach for Industrial Internet of Things Traffic in Fog Environment," in IEEE Transactions on Industrial Informatics, vol. 17, no. 11, pp. 7704-7715, Nov. 2021, doi: 10.1109/TII.2020.3025755.
32. Awotunde, J. B., Folorunso, S. O., Imoize, A. L., Odunuga, J. O., Lee, C.-C., Li, C.-T., & Do, D.-T. (2023). An Ensemble Tree-Based Model for Intrusion Detection in Industrial Internet of Things Networks. Applied Sciences, 13(4), 2479. https://doi.org/10.3390/app13042479
33. Potnurwar, A. V., Bongirwar, V. K., Ajani, S., Shelke, N., Dhone, M., & Parati, N. (2023). Deep Learning-Based Rule-Based Feature Selection for Intrusion Detection in Industrial Internet of Things Networks. International Journal of Intelligent Systems and Applications in Engineering, 11(10s), 23–35. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/3231
34. X. Wang et al., "Toward Accurate Anomaly Detection in Industrial Internet of Things Using Hierarchical Federated Learning," in IEEE Internet of Things Journal, vol. 9, no. 10, pp. 7110-7119, May 15, 2022, doi: 10.1109/JIOT.2021.3074382.
35. Y. Yang et al., "ASTREAM: Data-Stream-Driven Scalable Anomaly Detection With Accuracy Guarantee in IIoT Environment," in IEEE Transactions on Network Science and Engineering, vol. 10, no. 5, pp. 3007-3016, Sept.-Oct. 2023, doi: 10.1109/TNSE.2022.3157730.
36. A. Telikani, J. Shen, J. Yang and P. Wang, "Industrial IoT Intrusion Detection via Evolutionary Cost-Sensitive Learning and Fog Computing," in IEEE Internet of Things Journal, vol. 9, no. 22, pp. 23260-23271, Nov. 15, 2022, doi: 10.1109/JIOT.2022.3188224.
37. H. Yao, P. Gao, P. Zhang, J. Wang, C. Jiang and L. Lu, "Hybrid Intrusion Detection System for Edge-Based IIoT Relying on Machine-Learning-Aided Detection," in IEEE Network, vol. 33, no. 5, pp. 75-81, Sept.-Oct. 2019, doi: 10.1109/MNET.001.1800479.
38. S. Latif et al., "Intrusion Detection Framework for the Internet of Things Using a Dense Random Neural Network," in IEEE Transactions on Industrial Informatics, vol. 18, no. 9, pp. 6435-6444, Sept. 2022, doi: 10.1109/TII.2021.3130248.
39. J. Zhang, C. Luo, M. Carpenter and G. Min, "Federated Learning for Distributed IIoT Intrusion Detection Using Transfer Approaches," in IEEE Transactions on Industrial Informatics, vol. 19, no. 7, pp. 8159-8169, July 2023, doi: 10.1109/TII.2022.3216575.
40. P. T. Duy, T. V. Hung, N. H. Ha, H. D. Hoang and V.-H. Pham, "Federated learning-based intrusion detection in SDN-enabled IIoT networks," 2021 8th NAFOSTED Conference on Information and Computer Science (NICS), Hanoi, Vietnam, 2021, pp. 424-429, doi: 10.1109/NICS54270.2021.9701525.
41. A. M. Eid, A. B. Nassif, B. Soudan and M. N. Injadat, "IIoT Network Intrusion Detection Using Machine Learning," 2023 6th International Conference on Intelligent Robotics and Control Engineering (IRCE), Jilin, China, 2023, pp. 196-201, doi: 10.1109/IRCE59430.2023.10255088.
42. Alshahrani, H.; Khan, A.; Rizwan, M.; Reshan, M.S.A.; Sulaiman, A.; Shaikh, A. Intrusion Detection Framework for Industrial Internet of Things Using Software Defined Network. Sustainability 2023, 15, 9001. https://doi.org/10.3390/su15119001
43. K. N. Qureshi, S. S. Rana, A. Ahmed, G. Jeon, "A novel and secure attacks detection framework for smart cities industrial internet of things", Sustainable Cities and Society, Volume 61, 2020, ISSN 2210-6707, https://doi.org/10.1016/j.scs.2020.102343.
44. X. Li, M. Xu, P. Vijayakumar, N. Kumar and X. Liu, "Detection of Low-Frequency and Multi-Stage Attacks in Industrial Internet of Things," in IEEE Transactions on Vehicular Technology, vol. 69, no. 8, pp. 8820-8831, Aug. 2020, doi: 10.1109/TVT.2020.2995133.
45. Y. K. Saheed, A. I. Abiodun, S. Misra, M. K. Holone, R. Colomo-Palacios, "A machine learning-based intrusion detection for detecting internet of things network attacks", Alexandria Engineering Journal, Volume 61, Issue 12, 2022, ISSN 1110-0168, https://doi.org/10.1016/j.aej.2022.02.063.
46. Han, J.; Pak, W. "Hierarchical LSTM-Based Network Intrusion Detection System Using Hybrid Classification". Appl. Sci. 2023, 13, 3089. https://doi.org/10.3390/app13053089
47. Bhavsar, M., Roy, K., Kelly, J. et al. Anomaly-based intrusion detection system for IoT application. Discov Internet Things 3, 5 (2023). https://doi.org/10.1007/s43926-023-00034-5
48. M. Injadat, A. Moubayed, A. B. Nassif and A. Shami, "Multi-Stage Optimized Machine Learning Framework for Network Intrusion Detection," in IEEE Transactions on Network and Service Management, vol. 18, no. 2, pp. 1803-1816, June 2021, doi: 10.1109/TNSM.2020.3014929.
49. Altulaihan, E., Almaiah, M. A., & Aljughaiman, A. (2024). Anomaly Detection IDS for Detecting DoS Attacks in IoT Networks Based on Machine Learning Algorithms. Sensors, 24(2), 713. https://doi.org/10.3390/s24020713
50. C. Yin, Y. Zhu, J. Fei and X. He, "A Deep Learning Approach for Intrusion Detection Using Recurrent Neural Networks," in IEEE Access, vol. 5, pp. 21954-21961, 2017, doi: 10.1109/ACCESS.2017.2762418.
51. Almiani, M., AbuGhazleh, A., Al-Rahayfeh, A., Atiewi, S., & Razaque, A. (2019). Deep Recurrent Neural Network For IoT Intrusion Detection System. Simulation Modelling Practice and Theory, 102031. doi:10.1016/j.simpat.2019.102031
52. Liang, C., Shanmugam, B., Azam, S., Jonkman, M., Boer, F. D., & Narayansamy, G. (2019). Intrusion Detection System for Internet of Things based on a Machine Learning approach. 2019 International Conference on Vision Towards Emerging Trends in Communication and Networking (ViTECoN). doi:10.1109/vitecon.2019.8899448
53. M. Al-Hawawreh, E. Sitnikova and N. Aboutorab, "Asynchronous Peer-to-Peer Federated Capability-Based Targeted Ransomware Detection Model for Industrial IoT," in IEEE Access, vol. 9, pp. 148738-148755, 2021, doi: 10.1109/ACCESS.2021.3124634.
54. Ingre, B., & Yadav, A. (2015). Performance analysis of NSL-KDD dataset using ANN. 2015 International Conference on Signal Processing and Communication Engineering Systems. doi:10.1109/spaces.2015.7058223
55. M. A. Ferrag, O. Friha, D. Hamouda, L. Maglaras and H. Janicke, "Edge-IIoTset: A New Comprehensive Realistic Cyber Security Dataset of IoT and IIoT Applications for Centralized and Federated Learning," in IEEE Access, vol. 10, pp. 40281-40306, 2022, doi: 10.1109/ACCESS.2022.3165809.
56. Muna Al-Hawawreh, Elena Sitnikova, Neda Aboutorab, July 30, 2021, "X-IIoTID: A Connectivity- and Device-agnostic Intrusion Dataset for Industrial Internet of Things", IEEE Dataport, doi: https://dx.doi.org/10.21227/mpb6-py55.

Additional Declarations

No competing interests reported.

Status: Under Review
Editorial decision: Revision requested 04 Apr, 2025
Reviewers agreed at journal 03 Apr, 2025
Reviews received at journal 02 Apr, 2025
Reviewers agreed at journal 28 Mar, 2025
Reviews received at journal 28 Mar, 2025
Reviewers agreed at journal 28 Mar, 2025
Reviewers agreed at journal 28 Mar, 2025
Reviewers invited by journal 28 Mar, 2025
Editor assigned by journal 21 Mar, 2025
Submission checks completed at journal 21 Mar, 2025
First submitted to journal 04 Mar, 2025
Our growing team is made up of researchers and industry professionals working together to solve the most critical problems facing scientific publishing. Also discoverable on Platform About Our Team In Review Editorial Policies Advisory Board Help Center Resources Author Services Accessibility API Access RSS feed Manage Cookie Preferences © Research Square 2026 | ISSN 2693-5015 (online) Privacy Policy Terms of Service Do Not Sell My Personal Information {"props":{"pageProps":{"initialData":{"identity":"rs-6158243","acceptedTermsAndConditions":true,"allowDirectSubmit":false,"archivedVersions":[],"articleType":"Method Article","associatedPublications":[],"authors":[{"id":438346985,"identity":"c956dd60-e6cb-44df-9e1b-b7e984c7f463","order_by":0,"name":"Sushama L. Pawar","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAAA90lEQVRIiWNgGAWjYDADAwYGxocfKoAsZuYGorUwG0ucAWlhJF4LmwRvG4hJQIs5+xnDzxV/Dtubix1+ICE5rzaavx2o5UfFNpxaLHtyjCXPth1O3Dk7zcCgcNvx3BmHGRsYe87cxu2eA7kbJBsbDicY3E4wSJDcdiy3AaiFmbENj5bzbzf/bAA6zOB2+ocDvHOO5c4nqOVG7jbJBrbDjBtu5xg28DbU5G4gpMVyxvtvlo1t6YlALcXMEscO5G4EajmIzy/m/GnJNxv+WIMctv3nh5q63HnnDx988KMCj8PQ+IfB5AGc6rFoqcOneBSMglEwCkYoAACsvWLmhBYBkQAAAABJRU5ErkJggg==","orcid":"","institution":"BRACT’s Vishwakarma Institute of Information Technology, Pune, Maharashtra","correspondingAuthor":true,"prefix":"","firstName":"Sushama","middleName":"L.","lastName":"Pawar","suffix":""},{"id":438346986,"identity":"57772950-9406-4f57-acb3-4ea21997af4f","order_by":1,"name":"Mandar S. 
Karyakarte","email":"","orcid":"","institution":"BRACT’s Vishwakarma Institute of Information Technology, Pune, Maharashtra","correspondingAuthor":false,"prefix":"","firstName":"Mandar","middleName":"S.","lastName":"Karyakarte","suffix":""}],"badges":[],"createdAt":"2025-03-05 03:08:07","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-6158243/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-6158243/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":80785259,"identity":"79b0bb16-5f5c-4922-9801-ac5eea01d220","added_by":"auto","created_at":"2025-04-17 05:40:27","extension":"jpg","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":58901,"visible":true,"origin":"","legend":"\u003cp\u003eProposed IDS with CNN and Attentive Hierarchical BiLSTM Model with Spider Monkey Optimization.\u003c/p\u003e","description":"","filename":"1.jpg","url":"https://assets-eu.researchsquare.com/files/rs-6158243/v1/54a3eaaf3afa37ee758b55d0.jpg"},{"id":80785231,"identity":"35dd0f21-d0a4-4f78-b2a8-960a0d958b74","added_by":"auto","created_at":"2025-04-17 05:40:25","extension":"jpg","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":64573,"visible":true,"origin":"","legend":"\u003cp\u003eBi-LSTM with two forward layers and two backward layers.\u003c/p\u003e","description":"","filename":"2.jpg","url":"https://assets-eu.researchsquare.com/files/rs-6158243/v1/d790edb10381e78eb11b474b.jpg"},{"id":80785233,"identity":"699bdf39-367e-408b-a425-1d070e689f78","added_by":"auto","created_at":"2025-04-17 05:40:26","extension":"jpg","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":22408,"visible":true,"origin":"","legend":"\u003cp\u003eTraining and Validation Accuracy for Multiclass Classification on NSL-KDD 
Dataset\u003c/p\u003e","description":"","filename":"3.jpg","url":"https://assets-eu.researchsquare.com/files/rs-6158243/v1/fb06164444581711585e9b44.jpg"},{"id":80785237,"identity":"5dda0583-29c2-4c95-a6a2-fcd3b47cecd4","added_by":"auto","created_at":"2025-04-17 05:40:26","extension":"jpg","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":21396,"visible":true,"origin":"","legend":"\u003cp\u003eTraining and Validation Loss for Multiclass Classification for NSL-KDD Dataset\u003c/p\u003e","description":"","filename":"4.jpg","url":"https://assets-eu.researchsquare.com/files/rs-6158243/v1/523ab9fc3ada229a4bfeb903.jpg"},{"id":80785235,"identity":"ff7b8c19-d134-4310-9bc0-827a85cc1610","added_by":"auto","created_at":"2025-04-17 05:40:26","extension":"jpg","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":42446,"visible":true,"origin":"","legend":"\u003cp\u003eComparative performance metrics for NSL-KDD dataset\u003c/p\u003e","description":"","filename":"5.jpg","url":"https://assets-eu.researchsquare.com/files/rs-6158243/v1/6b4c20c9966b2f9189a49deb.jpg"},{"id":80785236,"identity":"ba7b6a28-0626-4e4e-b029-7c69d049cfc4","added_by":"auto","created_at":"2025-04-17 05:40:26","extension":"jpg","order_by":6,"title":"Figure 6","display":"","copyAsset":false,"role":"figure","size":26413,"visible":true,"origin":"","legend":"\u003cp\u003eTesting and Validation Accuracy for Multiclass Classification on Edge-IIoTset Dataset\u003c/p\u003e","description":"","filename":"6.jpg","url":"https://assets-eu.researchsquare.com/files/rs-6158243/v1/bc202a9688963d95eaa79fd6.jpg"},{"id":80785234,"identity":"87c8e975-e3fc-4ce9-a12f-d0228bb04869","added_by":"auto","created_at":"2025-04-17 05:40:26","extension":"jpg","order_by":7,"title":"Figure 7","display":"","copyAsset":false,"role":"figure","size":23829,"visible":true,"origin":"","legend":"\u003cp\u003eTraining and Validation Loss for Multiclass Classification for Edge-IIoTset 
Dataset\u003c/p\u003e","description":"","filename":"7.jpg","url":"https://assets-eu.researchsquare.com/files/rs-6158243/v1/04f995e46c67548ea0f9596d.jpg"},{"id":80785270,"identity":"78ec7b28-7584-4466-983e-0e9346e4e297","added_by":"auto","created_at":"2025-04-17 05:40:28","extension":"jpg","order_by":8,"title":"Figure 8","display":"","copyAsset":false,"role":"figure","size":45141,"visible":true,"origin":"","legend":"\u003cp\u003eComparative performance metrics for Edge-IIoTSet dataset\u003c/p\u003e","description":"","filename":"8.jpg","url":"https://assets-eu.researchsquare.com/files/rs-6158243/v1/6241eec34f8f205bbce4f69f.jpg"},{"id":80787118,"identity":"3ae625d9-89cb-4d81-9f28-bc5e89c6687f","added_by":"auto","created_at":"2025-04-17 06:04:27","extension":"jpg","order_by":9,"title":"Figure 9","display":"","copyAsset":false,"role":"figure","size":27631,"visible":true,"origin":"","legend":"\u003cp\u003eTraining and Validation Accuracy for Multiclass Classification for X-IIoTID dataset\u003c/p\u003e","description":"","filename":"9.jpg","url":"https://assets-eu.researchsquare.com/files/rs-6158243/v1/d085766c22caea7e2283edd0.jpg"},{"id":80787119,"identity":"31dbb3db-ef48-4923-a456-a3386253a211","added_by":"auto","created_at":"2025-04-17 06:04:28","extension":"jpg","order_by":10,"title":"Figure 10","display":"","copyAsset":false,"role":"figure","size":40616,"visible":true,"origin":"","legend":"\u003cp\u003eTraining and Validation Loss for Multiclass Classification for X-IIoTID dataset\u003c/p\u003e","description":"","filename":"10.jpg","url":"https://assets-eu.researchsquare.com/files/rs-6158243/v1/d789ae8036a2d28c39078cb4.jpg"},{"id":80785274,"identity":"4025a006-28de-4dbe-a4d2-0801804ac0ea","added_by":"auto","created_at":"2025-04-17 05:40:28","extension":"jpg","order_by":11,"title":"Figure 11","display":"","copyAsset":false,"role":"figure","size":47258,"visible":true,"origin":"","legend":"\u003cp\u003eComparative performance metrics for X-IIoTID 
dataset\u003c/p\u003e","description":"","filename":"11.jpg","url":"https://assets-eu.researchsquare.com/files/rs-6158243/v1/b22064f7b80831dbe29fa294.jpg"},{"id":80785272,"identity":"a6bcb6c2-6320-4aba-8e70-d484c07d411c","added_by":"auto","created_at":"2025-04-17 05:40:28","extension":"jpg","order_by":12,"title":"Figure 12","display":"","copyAsset":false,"role":"figure","size":40327,"visible":true,"origin":"","legend":"\u003cp\u003eComparative Analysis of proposed model with existing models for binary classification.\u003c/p\u003e","description":"","filename":"12.jpg","url":"https://assets-eu.researchsquare.com/files/rs-6158243/v1/42656b5ad3841d53ae357177.jpg"},{"id":80787124,"identity":"09968b90-f0c6-4aa5-8c3a-4340940b6432","added_by":"auto","created_at":"2025-04-17 06:05:09","extension":"jpg","order_by":13,"title":"Figure 13","display":"","copyAsset":false,"role":"figure","size":50198,"visible":true,"origin":"","legend":"\u003cp\u003eComparative Analysis of proposed CNN-AH-BiLSTM model with existing models for multiclass classification.\u003c/p\u003e","description":"","filename":"13.jpg","url":"https://assets-eu.researchsquare.com/files/rs-6158243/v1/17b226ce5b21cb503ce29be3.jpg"},{"id":80787923,"identity":"fb219b94-af43-4bde-ac3b-f33bb763b7e6","added_by":"auto","created_at":"2025-04-17 06:12:35","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":1440774,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-6158243/v1/553815c7-158c-48b9-a373-59f1a8fc0efb.pdf"}],"financialInterests":"No competing interests reported.","formattedTitle":"A Hybrid CNN and Attentive Hierarchical BiLSTM Model with SMO for Intrusion Detection in IIoT","fulltext":[{"header":"1. 
Introduction","content":"\u003cp\u003eThe IIoT has played an important role in reshaping global industries such as agriculture, healthcare and transportation, and has even acted as a driving force in the evolution of smart cities worldwide [\u003cspan additionalcitationids=\"CR2\" citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e]. Integration of the IIoT into industries has brought major improvements in productivity and operational efficiency [\u003cspan additionalcitationids=\"CR5 CR6 CR7 CR8\" citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e]. Deploying devices capable of real-time remote monitoring and automated control has resulted in efficient resource utilization and cost savings. For instance, in agriculture, field sensors monitor soil moisture and nutrient levels, which helps farmers make precise decisions, resulting in better crop yields while reducing resource wastage. In manufacturing, automation ensures consistent product quality while maintaining energy efficiency. In transportation, optimizing routes and monitoring vehicle health ensures timely delivery, thereby enhancing supply chain efficiency. In healthcare, tracking patient health metrics in real time through smart devices has improved care quality [\u003cspan additionalcitationids=\"CR11 CR12 CR13 CR14 CR15 CR16 CR17 CR18\" citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eHowever, as use of this technology grows, the vulnerabilities associated with it also grow. IIoT devices need constant connectivity for data transfer, which introduces loopholes in the network for malicious actors to exploit. 
These resource-constrained devices lack the computational power needed to implement traditional security protocols. On top of that, cyber-attacks such as Distributed Denial of Service (DDoS) and unauthorised access are significant threats to IIoT networks. An IDS is a critical layer of defence to mitigate these risks. An IDS monitors network traffic flow to detect and counter threats or malicious activities that may compromise the network. It enhances system security by analysing patterns in the network traffic data to protect against a wide variety of known attacks. It also enhances scalability of resource-constrained IIoT devices without significantly increasing computational demands.\u003c/p\u003e \u003cp\u003eDue to limited processing power and scalability, traditional NIDS often analyse partial packet data of fixed size, which limits detection accuracy, especially for complex and multi-stage attacks. As the number of features increases, the detection accuracy of traditional NIDS decreases. Advanced models that are designed to handle large data can significantly enhance detection accuracy but incur high computational cost. Achieving high accuracy while maintaining scalability remains an open challenge.\u003c/p\u003e \u003cp\u003eTo overcome the complexities of intrusion detection in IIoT networks, DL and optimization algorithms are used to build hybrid, efficient and scalable models. Hybrid DL models are a viable approach that can leverage the strengths of two DL models: CNNs extract spatial features from data to identify patterns in multi-dimensional inputs and detect anomalies, while RNN variants such as LSTM capture temporal dependencies within sequential data for analysing time-series network traffic. Hierarchical models can be used to further enhance performance. Hierarchical models balance feature complexity and detection accuracy. 
These models have a multi-layered architecture that can prioritize critical features while reducing noise, which allows them to perform intrusion detection accurately with minimal computational overhead. IIoT environments produce data at large scale with high-dimensional features, which can often lead to issues like overfitting that degrade model performance.\u003c/p\u003e \u003cp\u003eThe comparative analysis of existing hybrid models is presented in Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e, which lists the issues addressed by the models and their limitations. Difficulty in handling high-dimensional data, lack of adaptability to new types of attacks, suboptimal feature reduction, and high computational complexity are some of the challenges that current hybrid intrusion detection models face in IIoT environments. In order to overcome these challenges, they need to improve in dimensionality management and continuous learning, and to adapt to new and evolving threats, along with efficient feature selection and simplified architectures. 
Overcoming these challenges can reduce computational cost, making the models scalable and effective in IIoT environments.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eComparison of existing hybrid IDS models\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"4\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eRefs\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eAlgorithm or Model\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eMain Strength\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eLimitation\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e[\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e]\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eIBSWO\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eHigh classification accuracy. 
Effective feature selection using flat crossover and genetic algorithms.\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eStruggles with high-dimensional data and finding optimal solutions.\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e[\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e]\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eSA-DCNN\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eAddresses redundancy and underfitting issues.\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eUnable to handle increasing attack classes.\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e[\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e]\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eELM\u0026thinsp;+\u0026thinsp;SVM GA\u0026thinsp;+\u0026thinsp;ELM\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eReduces irrelevant features while maintaining high detection performance.\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eFeature reduction methods like KNN may outperform it in feature quantity reduction.\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e[\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e]\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eXGBoost\u0026thinsp;+\u0026thinsp;CNN\u0026thinsp;+\u0026thinsp;LSTM\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eHigh detection rates. 
Effective feature reduction using XGBoost and CNN.\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eChallenges with feature dimensionality.\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e[\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e]\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eAE\u0026thinsp;+\u0026thinsp;FL\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eHandles data imbalance with collaborative learning.\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eIncreased complexity and computation costs.\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003eThe proposed hybrid CNN-AH-BiLSTM model is designed to address the following issues to improve intrusion detection:\u003c/p\u003e \u003cp\u003eThe proposed model is designed to handle a wide range of attacks that current intrusion detection systems are unable to detect.\u003c/p\u003e \u003cp\u003eSMO is employed to optimize the data and reduce feature dimensionality, addressing the challenge of excessive feature counts, which minimizes computational cost and time during model training.\u003c/p\u003e \u003cp\u003eA CNN is used for spatial feature extraction, which captures high-level patterns. 
During validation testing, the CNN can recognize new sequential patterns, which enhances the ability of the model to detect new types of attacks.\u003c/p\u003e \u003cp\u003eA multilayer attentive BiLSTM is integrated, which focuses on critical temporal features, enhancing the detection of sophisticated attacks.\u003c/p\u003e \u003cp\u003eIntegration of all these models helps to find a middle ground between comprehensive data analysis and computational throughput.\u003c/p\u003e \u003cp\u003eThis paper is structured as follows. Section 2 provides a survey and summary of prior research, which includes traditional methodologies, recent research and a comparative analysis of recent research. Section 3 gives an in-depth explanation of the proposed model and thoroughly describes the methodology supporting the proposal. Section 4 provides a detailed analysis of the datasets used. Section 5 presents a detailed analysis of the results and a comparison of the proposed model\u0026rsquo;s results with other approaches.\u003c/p\u003e"},{"header":"2. Overview of existing research","content":"\u003cp\u003eVarious approaches have been proposed to protect IIoT networks from increasing cyber threats. Each approach addresses particular challenges posed by IIoT environments. Leveraging ML and DL techniques has shown significant improvement in enhancing cybersecurity for IIoT network environments.\u003c/p\u003e \u003cp\u003eIn [\u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e] a multivariate correlations analysis-long short-term memory (MCA-LSTM) model was proposed. For superior classification performance it integrates the triangle area map (TAM) matrix with optimal feature subsets. The model was evaluated on the NSL-KDD and UNSW-NB15 datasets, achieving testing accuracies of 82.15% and 77.74% respectively. 
In [\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e] the authors proposed a deep random neural network (DRaNN) which can classify nine types of attacks with a low false-positive rate. 41 features are used to train the model with the help of the GD algorithm. The model was evaluated on the UNSW-NB15 dataset with an accuracy of 99.54%. In [\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e] the authors introduced a pretraining Wasserstein generative adversarial network-based IDS (PWG-IDS) which integrates WGAN-GP for traffic data generation and LightGBM for classification. The model was evaluated on the NSL-KDD and CIC-IDS2018 datasets, achieving F1 scores of 99% and 89% respectively. A hybrid CNN-LSTM model was introduced in [\u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e], where a CNN is employed for pattern recognition and an LSTM for classification; the model was tested on the Edge-IIoTset dataset with 100% accuracy. Furthermore, a lightweight IDS is presented in [\u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e] that uses the Pearson correlation coefficient for feature selection, reducing the 45 features in the TON_IoT dataset to 10 core features. It was able to detect multiple attack types with 99% accuracy using KNN and RF models, demonstrating high-accuracy attack detection in high-load environments. A CTSF framework is introduced in [\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e] to address the limitation of transformers in extracting local features. The framework integrates CNNs and an enhanced Transformer in its pre-training phase in order to capture local and global features. For classification it uses SVM with \u0026ldquo;linear\u0026rdquo; and \u0026ldquo;rbf\u0026rdquo; kernels. The framework was evaluated on the X-IIOTID dataset, achieving an accuracy of 98.88%. 
In [\u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e] a model integrating a Graph Convolutional Network (GCN) and Long Short-Term Memory (LSTM) is presented, where the GCN is employed for feature extraction and pattern recognition and the LSTM is used for classification. The IoT-23 dataset is used for performance analysis, achieving an accuracy of 99.99%. In [\u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e] a deep feed forward neural network, DAE-DFFNN, is proposed which uses hybrid rule-based feature selection. It uses a genetic search algorithm for feature evaluation. Model evaluation is done on the NSL-KDD and UNSW-NB15 datasets, achieving 99.0% and 98.9% accuracy, respectively. In [\u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e30\u003c/span\u003e] XGBoost was applied on the X-IIoTDS and TON_IoT datasets to address class imbalance, achieving F1 scores of 99.9% and 99.87%, respectively. In [\u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e] a Deep-IFS model is proposed for distributed learning in fog computing environments, which employs LocalGRU and MHSA layers for feature extraction on Bot-IoT and UNSW-NB15. It achieved an accuracy of 99.75% for binary classification and 98.1% for multiclass classification on Bot-IoT; on the other hand, it achieved an accuracy of 99.94% for binary classification and 99.75% for multiclass classification on the UNSW-NB15 dataset. In [\u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e] the authors presented ensemble models in which the Chi-Square statistical method was used for feature selection and various classifiers such as XGBoost, Bagging, and Random Forest were used for classification on the ToN_IoT Fridge, Garage_door, GPS_Tracker, Motion-Light, Modbus, Thermostat and Weather datasets. Among these classifiers, XGBoost performed best, with accuracy close to 100%. 
A hybrid rule-based model with DAE-DFFNN is presented in [\u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e33\u003c/span\u003e]. Automated dimensionality reduction techniques are used with rule-based feature extraction. The model was validated on the NSL-KDD and UNSW-NB15 datasets, achieving a detection rate of 98.0%. In [\u003cspan citationid=\"CR34\" class=\"CitationRef\"\u003e34\u003c/span\u003e] a federated learning approach was presented for privacy-preserving anomaly detection, in which deep reinforcement learning is used to train local models without sharing sensitive data. The authors did not use any local dataset. In [\u003cspan citationid=\"CR35\" class=\"CitationRef\"\u003e35\u003c/span\u003e] an anomaly detection in data streams (ASTREAM) approach was proposed, which utilizes LSHiForest with PCA for identifying correlations between different attributes, sliding windows to handle the infiniteness of data streams, and change detection to detect changes in the data distribution over time and train a new model. The proposed approach was validated on the KDDCUP99 dataset. In [\u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e36\u003c/span\u003e] a hybrid model called EvolCostDeep was proposed, which consists of stacked autoencoders (SAE) and CNNs with a cost-dependent loss function for addressing scalability and class imbalance issues. Model evaluation was done on the ToN-IoT and UNSW-NB15 datasets with an F1-score of 95.2%. In [\u003cspan citationid=\"CR37\" class=\"CitationRef\"\u003e37\u003c/span\u003e] LightGBM combined with deep learning algorithms was proposed, which are utilized in the lower level and upper level of the network for intrusion detection. The authors focused on reducing training time, making the approach suitable for edge IIoT scenarios. A dense random neural network (DnRaNN) was implemented in [\u003cspan citationid=\"CR38\" class=\"CitationRef\"\u003e38\u003c/span\u003e] for binary and multiclass classification, which classified nine different attacks on the IoT. 
The model was evaluated on ToN_IoT, achieving an accuracy of 99.14% for binary and 99.05% for multiclass classification. In [\u003cspan citationid=\"CR39\" class=\"CitationRef\"\u003e39\u003c/span\u003e] the authors addressed the issue of privacy preservation in ML and the issue of training local models with non-independent and identically distributed (non-IID) data, for which federated learning (FL) with instance-based transfer learning and weighted rank aggregation is proposed. For rank aggregation, the AdaBoost and Random Forest algorithms were used, achieving accuracies of 95.97% and 73.70% respectively. Similarly, in [\u003cspan citationid=\"CR40\" class=\"CitationRef\"\u003e40\u003c/span\u003e] the problem of data privacy violation during the training phase was addressed by proposing FL in the context of Software Defined Networking (SDN). The model identified the Syn attack with an accuracy of 98.20%, the MSSQL attack with 99.30% and the NetBios attack with 99.99%. In [\u003cspan citationid=\"CR41\" class=\"CitationRef\"\u003e41\u003c/span\u003e] the authors implemented six ML algorithms (RF, DT, KNN, LR, SVM and NB) to build multiple IDS models. All models were assessed on the WUSTL-IIoT-2021 dataset; among them, RF achieved the highest accuracy of 99.97%. SDN-based frameworks consisting of SVM and Decision Tree models also showed a high detection accuracy of 99.7% with the NSL-KDD dataset [\u003cspan citationid=\"CR42\" class=\"CitationRef\"\u003e42\u003c/span\u003e]. In [\u003cspan citationid=\"CR43\" class=\"CitationRef\"\u003e43\u003c/span\u003e] the authors used Genetic Programming (GA) for intrusion detection in RPL-based IIoT environments. A threshold is calculated for each attack by extracting specific features from each node. This threshold modulation demonstrated high attack detection accuracy in simulation, with a true positive rate of 93.3%. In [\u003cspan citationid=\"CR44\" class=\"CitationRef\"\u003e44\u003c/span\u003e] a bidirectional LSTM model with multi-feature layers (B-MLSTM) is proposed. 
In the training phase, sequence and stage feature layers are introduced, which enable the model to detect threats in different intervals by analysing and learning the corresponding attack interval from historical data. After that, a double-layer reverse unit updates the detection model in order to match the new attack interval. The model was evaluated on the CTU-13, AWID and Gas-Water datasets, with significant reductions in false positive (46.79%) and false negative (79.85%) rates. In [\u003cspan citationid=\"CR45\" class=\"CitationRef\"\u003e45\u003c/span\u003e] the authors used six supervised algorithms, namely KNN, SVM, QDA, NB, XGBoost and CatBoost, employing min-max normalization. PCA is used for dimensionality reduction. All models were evaluated on the UNSW-NB15 dataset, achieving an average accuracy of 99.9%. In [\u003cspan citationid=\"CR46\" class=\"CitationRef\"\u003e46\u003c/span\u003e] the authors proposed dual LSTMs, one a packet classifier and the second a session classifier. The model also includes a DNN that performs the final classification. Performance evaluation was done on the ISCXID2012 and CICIDS2017 datasets, achieving accuracies of 94.73% and 99.61% respectively. A Pearson-Correlation Coefficient - Convolutional Neural Network (PCC-CNN) model is presented in [\u003cspan citationid=\"CR47\" class=\"CitationRef\"\u003e47\u003c/span\u003e]. Important features are extracted first by linear-based extraction and then by the CNN. The authors first trained five PCC-based ML models (Logistic Regression, Linear Discriminant Analysis, KNN, Classification and Regression Tree, \u0026amp; SVM) to assess the performance. For validation, the NSL-KDD, CICIDS-2017, and IOTID20 datasets are used, achieving an average accuracy of 99.89% across all datasets. In [\u003cspan citationid=\"CR48\" class=\"CitationRef\"\u003e48\u003c/span\u003e] oversampling and feature selection techniques were explored. Training sample size is reduced by a minimum of 39% to a maximum of 74% by using the SMOTE oversampling technique. 
Two classification models, namely KNN and RF, are used, and a detection accuracy of 99% is achieved on the CICIDS-2017 and UNSW-NB15 datasets with RF and the Tree Parzen Estimator (BO-TPE-RF) optimization algorithm. In [\u003cspan citationid=\"CR49\" class=\"CitationRef\"\u003e49\u003c/span\u003e] the authors used four supervised classifier algorithms, namely DT, RF, KNN and SVM, for classification, along with two feature selection algorithms, namely the Correlation-based Feature Selection (CFS) algorithm and the Genetic Algorithm (GA). The IoTID20 dataset is used for performance evaluation, which showed that DT and RF with GA-selected features achieved 100% accuracy across metrics [\u003cspan citationid=\"CR49\" class=\"CitationRef\"\u003e49\u003c/span\u003e]. Lastly, an RNN-based IDS is proposed in [\u003cspan citationid=\"CR50\" class=\"CitationRef\"\u003e50\u003c/span\u003e] for binary and multiclass classification tasks. Performance evaluation shows that the model achieved its highest accuracy on the NSL-KDD dataset.\u003c/p\u003e \u003cp\u003eTable \u003cspan refid=\"Tab2\" class=\"InternalRef\"\u003e2\u003c/span\u003e provides an overview of recent research on IDS, showing various models and methodologies addressing specific issues in IDS, which highlights the diversity in approaches and datasets used to tackle critical problems.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab2\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 2\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eComparative summary of recent research on IDS\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"4\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e 
\u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eRefs.\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eModel Used\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eIssues Addressed\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eDataset Used\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e[\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e]\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eIBSWO\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eFeature selection and Handling high-dimensional data and imbalanced datasets\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eUNSWNB15, TON_IoT, NCTUKM-IIOT\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e[\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e]\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eSA-DCNN\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eImbalanced training data, redundant features, under fitting in IIoT IDS\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eIoTID20,\u003c/p\u003e \u003cp\u003eEdge-IIoTset\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e[\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e]\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eGA-ELM with 
SVM\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eFeature selection for IoT, detection performance\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eTON_IoT, UNSWNB15\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e[\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e]\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eXGBoost+\u003c/p\u003e \u003cp\u003eLSTM, CNN\u0026thinsp;+\u0026thinsp;LSTM\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eTackling data imbalance and low test accuracy, binary and multi-class classification\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eCICIDS2017, UNSWNB15, NSLKDD, WSNDS\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e[\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e]\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eDRL\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eAnomaly detection in SCADA systems, monitoring complex environments, real-time detection\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eWUSTL-IIoT-2018,\u003c/p\u003e \u003cp\u003eWUSTL-IIoT-2021\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e[\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e]\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eFL with AE\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eZero-day attacks, handling data imbalance in a 5G IIoT environment\u003c/p\u003e 
\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eX-IIoTID\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e[\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e]\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eBPNN with GA\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eEnhancing accuracy in fog computing environment by optimizing weights and biases\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eUNSW-NB15, TON_IoT\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e[\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e]\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eIDS-SIoDL with LSTM\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eReducing training and classification times in real-time intrusion detection in IoT-based smart cities\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eBoT-IoT,\u003c/p\u003e \u003cp\u003eEdge-IIoT, NSLKDD\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e[\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e]\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eEnsemble DL\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eExplainability and robustness in detecting, reducing false-positive rates\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eTON_IoT\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e[\u003cspan citationid=\"CR10\" 
class=\"CitationRef\"\u003e10\u003c/span\u003e]\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eMAGRU\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eImbalanced training data, missing network attacks with fewer samples\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eEdge-IIoTset, MQTTset\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e[\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e]\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eFGOA\u0026thinsp;+\u0026thinsp;kNN\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eDetection of botnet attacks, Improving hyperparameter tuning.\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eN-BaIoT\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e[\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e]\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eeBF\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eMemory efficiency and accelerating filtering of malicious URLs in IoT\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eReal ID Dataset\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e[\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e]\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eRF-PCCIF / RF-IFPCC\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eComputational cost and prediction time. 
Addresses outliers in feature selection.\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eBot-IoT, NF-UNSWNB15-v2\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e[\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e]\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eCNN\u0026thinsp;+\u0026thinsp;LSTM\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eDetection rate, classification accuracy and reducing false detection.\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eKDD CUP99, NSLKDD, UNSWNB15,\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e[\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e]\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eCRSF (CNN-RNN\u0026thinsp;+\u0026thinsp;SVM)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eAddressed limitations of manual feature extraction in SVM.\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eTON_IoT\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e[\u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e]\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eFL with Attention\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eScalability and communication overhead in centralized IIoT. 
Detection rate in distributed environments.\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eEdge-IIoTSet\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e[\u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e]\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eCNN\u0026thinsp;+\u0026thinsp;LSTM\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eDetection rate for binary and multi-class classification.\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eUNSWNB15,\u003c/p\u003e \u003cp\u003eX-IIoTID\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e[\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e]\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eCentralized and FL\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eComprehensive dataset generation with realistic attacks\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eEdge-IIoTset\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e[\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e]\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eGA-RF\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eEnhanced accuracy and AUC in binary and multiclass detection\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eUNSWNB15\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e[\u003cspan citationid=\"CR20\" 
class=\"CitationRef\"\u003e20\u003c/span\u003e]\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eHDRaNN\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eDetecting 16 types of attacks with robust classification.\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eDS2OS,\u003c/p\u003e \u003cp\u003eUNSWNB15\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e[\u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e]\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e5-Layer AE\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eAddressed data imbalance and reconstruction error handling\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eNSLKDD\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e"},{"header":"3. Proposed Model","content":"\u003cp\u003eThis study proposes a hybrid DL model that integrates a CNN with an Attentive Hierarchical Bi-LSTM for intrusion detection in IIoT networks. As depicted in Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e, the proposed model consists of SMO to reduce data dimensionality, a CNN for feature extraction and pattern recognition, and a Hierarchical Bi-LSTM to learn crucial temporal features, followed by a self-attention layer that enhances the model\u0026rsquo;s focus on critical features. 
The model process is divided into three stages: Stage 1, data pre-processing; Stage 2, feature extraction; and Stage 3, model training and testing.\u003c/p\u003e \u003cdiv id=\"Sec4\" class=\"Section2\"\u003e \u003ch2\u003e3.1 Data Pre-Processing\u003c/h2\u003e \u003cp\u003ePreparing the dataset is a crucial step in ensuring that ML and DL models work effectively. The proposed framework starts by pre-processing the raw dataset to prepare it for the training and testing phases. Pre-processing incorporates data encoding, normalization and dimension reduction. Three datasets are used: NSL-KDD, Edge-IIoTSet and X-IIoTID. In the proposed work, pre-processing consists of the following steps:\u003c/p\u003e \u003cdiv id=\"Sec5\" class=\"Section3\"\u003e \u003ch2\u003e3.1.1 Data Encoding\u003c/h2\u003e \u003cp\u003eData encoding converts categorical data such as attack types and protocols into a machine-readable numerical format, so that non-numerical values can be processed during training. In NSL-KDD the categorical features include basic features such as protocol_type, service, and flag; in X-IIoTID they include device_type and protocol; in Edge-IIoTSet they include the attack types. These features are encoded using the one-hot encoding technique.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec6\" class=\"Section3\"\u003e \u003ch2\u003e3.1.2 Data Normalization\u003c/h2\u003e \u003cp\u003eData normalization scales features to a uniform range, such as [0, 1] or [-1, 1]. It reduces the bias caused by differing feature ranges, which improves model performance. Normalization is important for intrusion detection datasets because their features, for example byte counts, durations and telemetry values, span widely different ranges. Here, Min-Max scaling is used to rescale features to the [0, 1] range. 
Normalizing features ensures that features with large ranges, for example packet size, do not dominate small-scale features such as CPU usage, which makes neural network models more effective.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec7\" class=\"Section3\"\u003e \u003ch2\u003e3.1.3 Data Dimension Reduction\u003c/h2\u003e \u003cp\u003eThe SMO algorithm is used for data dimension reduction. SMO is a swarm-based metaheuristic algorithm inspired by the foraging behavior of spider monkeys. In the NSL-KDD dataset it removes redundant network features, for example packet counts and connection durations, while retaining key intrusion patterns. In the X-IIoTID dataset, device-related features such as device_type, cpu_usage, and memory_usage are optimized in order to identify fewer but more impactful attributes. In Edge-IIoTSet, network traffic features such as protocol, flow_duration, and total_fwd_packets are optimized, which enhances interpretability while managing large-scale, heterogeneous data.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv id=\"Sec8\" class=\"Section2\"\u003e \u003ch2\u003e3.2 Feature Extraction\u003c/h2\u003e \u003cp\u003eAfter pre-processing, the data is passed through a CNN for feature extraction. The capability of CNNs to handle time-series and tabular data makes them well suited for pattern recognition in multi-dimensional data such as intrusion detection datasets [\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e]. In this approach the pre-processed data is reshaped to mimic image-like input, where rows represent samples and columns represent features [\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eFirst, the convolution layer applies multiple filters over the pre-processed data to extract spatial features. It captures the most relevant features by detecting patterns and the correlations between them. 
In the max pooling layer, the spatial dimensions of the feature maps are reduced while the most significant information is retained, which reduces computational complexity and overfitting. During training, the weights of the CNN highlight the features that contribute the most to differentiating normal and attack classes [\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e]. For example, in the X-IIoTID dataset the CNN focuses on device behaviour and protocol anomalies, in NSL-KDD it focuses on traffic-related features, and in Edge-IIoTSet network flow metrics are prioritized.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec9\" class=\"Section2\"\u003e \u003ch2\u003e3.3 Model Training and Validation\u003c/h2\u003e \u003cp\u003eThe extracted features are split into training data and testing data. The training data is processed through the following sequence of layers to train the model, and the testing data is then processed to validate the trained model:\u003c/p\u003e \u003cdiv id=\"Sec10\" class=\"Section3\"\u003e \u003ch2\u003e3.3.1 Hierarchical Bi-LSTM\u003c/h2\u003e \u003cp\u003eFigure \u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e illustrates a hierarchical Bi-LSTM network with two forward layers and two backward layers. In this architecture the outputs of the first forward and backward layers are passed to the second forward and backward layers, enabling a deeper understanding of temporal dependencies in sequential data. This structure captures sequential and contextual relationships between features.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eHere, the input sequence \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\left\\{{I}_{t-1},{I}_{t},{I}_{t+1}\\right\\}\\)\u003c/span\u003e\u003c/span\u003e is passed to the first forward layer. 
The input at time \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:t\\)\u003c/span\u003e\u003c/span\u003e is denoted as \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{I}_{t}\\)\u003c/span\u003e\u003c/span\u003e, and it generates a hidden state \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{a}_{t}^{f}\\)\u003c/span\u003e\u003c/span\u003e. This hidden state captures temporal information from past inputs in the forward direction. The first forward state is given by\u003cdiv id=\"Equ1\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equ1\" name=\"EquationSource\"\u003e\n$$\\:{a}_{t}^{f}={\\sigma\\:}\\left({W}_{f}\\bullet\\:{I}_{t}+\\:{U}_{f}\\bullet\\:{a}_{t-1}^{f}+{b}_{f}\\right)$$\u003c/div\u003e\u003cdiv class=\"EquationNumber\"\u003e1\u003c/div\u003e\u003c/div\u003e\u003c/p\u003e \u003cp\u003eWhere \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{W}_{f}\\)\u003c/span\u003e\u003c/span\u003e is the weight matrix for the current input \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{I}_{t}\\)\u003c/span\u003e\u003c/span\u003e. 
\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{a}_{t-1}^{f}\\)\u003c/span\u003e\u003c/span\u003e represents the previous hidden state, the weight matrix for \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{a}_{t-1}^{f}\\)\u003c/span\u003e\u003c/span\u003e is represented by \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\:{U}_{f}\\)\u003c/span\u003e\u003c/span\u003e, and \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{b}_{f}\\)\u003c/span\u003e\u003c/span\u003e is the forward pass bias term.\u003c/p\u003e \u003cp\u003eThe input sequence is also passed to the first backward layer, which processes the data in reverse order to generate a backward hidden state \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{b}_{t}^{b}\\)\u003c/span\u003e\u003c/span\u003e that captures dependencies from future inputs. The backward hidden state is modelled as\u003cdiv id=\"Equ2\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equ2\" name=\"EquationSource\"\u003e\n$$\\:{b}_{t}^{b}={\\sigma\\:}\\left({W}_{b}\\bullet\\:{I}_{t}+\\:{U}_{b}\\bullet\\:{b}_{t+1}^{b}+{b}_{b}\\right)$$\u003c/div\u003e\u003cdiv class=\"EquationNumber\"\u003e2\u003c/div\u003e\u003c/div\u003e\u003c/p\u003e \u003cp\u003eWhere \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{W}_{b}\\)\u003c/span\u003e\u003c/span\u003e and \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\:{U}_{b}\\)\u003c/span\u003e\u003c/span\u003e are the weight matrices for the current input \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{I}_{t}\\)\u003c/span\u003e\u003c/span\u003e and the subsequent hidden 
state \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{b}_{t+1}^{b}\\)\u003c/span\u003e\u003c/span\u003e respectively, and \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{b}_{b}\\)\u003c/span\u003e\u003c/span\u003e is the bias term for the backward pass.\u003c/p\u003e \u003cp\u003eThe output of the first forward layer \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{a}_{t}^{f}\\)\u003c/span\u003e\u003c/span\u003e is passed as input to the second forward layer, which refines the learned temporal information and generates \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{a}_{t}^{a}\\)\u003c/span\u003e\u003c/span\u003e, given by\u003cdiv id=\"Equ3\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equ3\" name=\"EquationSource\"\u003e\n$$\\:{a}_{t}^{a}={\\sigma\\:}\\left({W}_{a}\\bullet\\:{a}_{t}^{f}+\\:{U}_{a}\\bullet\\:{a}_{t-1}^{a}+{b}_{a}\\right)$$\u003c/div\u003e\u003cdiv class=\"EquationNumber\"\u003e3\u003c/div\u003e\u003c/div\u003e\u003c/p\u003e \u003cp\u003eWhere \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{W}_{a}\\)\u003c/span\u003e\u003c/span\u003e and \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{U}_{a}\\)\u003c/span\u003e\u003c/span\u003e are the weight matrices for the output of the first forward layer \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{a}_{t}^{f}\\)\u003c/span\u003e\u003c/span\u003e and the previous refined state \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{a}_{t-1}^{a}\\)\u003c/span\u003e\u003c/span\u003e respectively. \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{b}_{a}\\)\u003c/span\u003e\u003c/span\u003e is the bias term for the current layer. 
Similarly, the output of the first backward layer \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{b}_{t}^{b}\\)\u003c/span\u003e\u003c/span\u003e is passed to the second backward layer, which refines the temporal information from the first backward layer to compute \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{b}_{t}^{b}\\)\u003c/span\u003e\u003c/span\u003e\u003cdiv id=\"Equ4\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equ4\" name=\"EquationSource\"\u003e\n$$\\:{b}_{t}^{b}={\\sigma\\:}\\left({W}_{b}^{{\\prime\\:}}\\bullet\\:{b}_{t}^{b}+\\:{U}_{b}^{{\\prime\\:}}\\bullet\\:{b}_{t+1}^{b}+{b}_{b}^{{\\prime\\:}}\\right)$$\u003c/div\u003e\u003cdiv class=\"EquationNumber\"\u003e4\u003c/div\u003e\u003c/div\u003e\u003c/p\u003e \u003cp\u003eWhere \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{W}_{b}^{{\\prime\\:}}\\)\u003c/span\u003e\u003c/span\u003e and \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{U}_{b}^{{\\prime\\:}}\\)\u003c/span\u003e\u003c/span\u003e are the weight matrices for the output of the first backward layer \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{b}_{t}^{b}\\)\u003c/span\u003e\u003c/span\u003e and the next refined state \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{b}_{t+1}^{b}\\)\u003c/span\u003e\u003c/span\u003e respectively, and \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{b}_{b}^{{\\prime\\:}}\\)\u003c/span\u003e\u003c/span\u003e is the bias term of the second backward layer.\u003c/p\u003e \u003cp\u003eFor every time step \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:t\\)\u003c/span\u003e\u003c/span\u003e, the output \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{O}_{t}\\)\u003c/span\u003e\u003c/span\u003e is generated by combining the output of the second forward layer \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{a}_{t}^{a}\\)\u003c/span\u003e\u003c/span\u003e and the output of the 
second backward layer \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{b}_{t}^{b}\\)\u003c/span\u003e\u003c/span\u003e. \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{O}_{t}\\)\u003c/span\u003e\u003c/span\u003e is the feature representation at time \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:t\\)\u003c/span\u003e\u003c/span\u003e.\u003cdiv id=\"Equ5\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equ5\" name=\"EquationSource\"\u003e\n$$\\:{O}_{t}=V\\bullet\\:\\left[{a}_{t}^{a};{b}_{t}^{b}\\right]+c$$\u003c/div\u003e\u003cdiv class=\"EquationNumber\"\u003e5\u003c/div\u003e\u003c/div\u003e\u003c/p\u003e \u003cp\u003eWhere \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\left[{a}_{t}^{a};{b}_{t}^{b}\\right]\\)\u003c/span\u003e\u003c/span\u003e represents the concatenated forward and backward hidden states, \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:V\\)\u003c/span\u003e\u003c/span\u003e represents the output weights and \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:c\\)\u003c/span\u003e\u003c/span\u003e represents the biases.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec11\" class=\"Section3\"\u003e \u003ch2\u003e3.3.2 Self-Attention Layer\u003c/h2\u003e \u003cp\u003eThe final output \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{O}_{t}\\)\u003c/span\u003e\u003c/span\u003e combines information from the forward and backward passes, which provides a comprehensive temporal context for the sequence at time \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:t\\)\u003c/span\u003e\u003c/span\u003e. 
The outputs of the output layer, \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:O=\\{{O}_{1},\\:{O}_{2},\\:\\dots\\:,\\:{O}_{T}\\}\\)\u003c/span\u003e\u003c/span\u003e, serve as input to the self-attention layer. The attention layer focuses on the most critical parts of the input sequence by computing attention weights. Each \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{O}_{t}\\)\u003c/span\u003e\u003c/span\u003e is transformed into Query (Q), Key (K), and Value (V) matrices by using learnable weight matrices. The attention score is calculated as\u003cdiv id=\"Equ6\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equ6\" name=\"EquationSource\"\u003e\n$$\\:Attention\\left(Q,K,V\\right)=softmax\\left(\\frac{Q\\times\\:{K}^{T}}{\\sqrt{{d}_{k}}}\\right)\\times\\:V$$\u003c/div\u003e\u003cdiv class=\"EquationNumber\"\u003e6\u003c/div\u003e\u003c/div\u003e\u003c/p\u003e \u003cp\u003eHere \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{d}_{k}\\)\u003c/span\u003e\u003c/span\u003e is the dimension of the key vectors. The attention scores weight each part of the input sequence relative to the others, which enables the model to focus on influential patterns such as intrusion signatures.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec12\" class=\"Section3\"\u003e \u003ch2\u003e3.3.3 Dropout Layer\u003c/h2\u003e \u003cp\u003eThe attention-weighted output is passed through a dropout layer in order to reduce overfitting. The dropout layer randomly deactivates a fraction of neurons with a predefined probability\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\:\\left(p\\right)\\)\u003c/span\u003e\u003c/span\u003e. It improves the generalization ability of the model by reducing the risk of over-reliance on specific neurons. By deactivating some neurons, the dropout layer forces the model to learn more robust and generalized features, which in turn enhances the model's ability to handle unseen data. 
This helps the model to improve its performance on the testing dataset.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec13\" class=\"Section3\"\u003e \u003ch2\u003e3.3.4 Fully Connected Layer\u003c/h2\u003e \u003cp\u003eThe output of the dropout layer is processed by a fully connected (dense) layer, which learns complex patterns and prepares the data for classification. This layer integrates features from the previous layers and maps them to a higher-dimensional space. Each input is transformed using a weight matrix \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{W}_{fc}\\)\u003c/span\u003e\u003c/span\u003e and bias \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{b}_{fc}\\)\u003c/span\u003e\u003c/span\u003e and then passed through a ReLU activation function, which enables the model to identify patterns related to normal and attack behavior in the input data.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec14\" class=\"Section3\"\u003e \u003ch2\u003e3.3.5 Softmax Layer\u003c/h2\u003e \u003cp\u003eThe softmax layer converts the raw predictions into probabilities for multi-class classification, such as normal or different types of attacks, and ensures that the probabilities for all classes sum to 1. 
As shown in (7), the probability that a given output \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{O}_{t}\\)\u003c/span\u003e\u003c/span\u003e belongs to class \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:i\\)\u003c/span\u003e\u003c/span\u003e is calculated as:\u003cdiv id=\"Equ7\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equ7\" name=\"EquationSource\"\u003e\n$$\\:P\\left({class\\:}_{i}\\right|\\:{O}_{t})\\:=\\:\\frac{{e}^{{W}_{o}^{i}\\cdot{O}_{t}+{b}_{o}^{i}}}{{\\sum\\:}_{j=1}^{C}{e}^{{W}_{o}^{j}\\cdot{O}_{t}+{b}_{o}^{j}}}$$\u003c/div\u003e\u003cdiv class=\"EquationNumber\"\u003e7\u003c/div\u003e\u003c/div\u003e\u003c/p\u003e \u003cp\u003eIn (7) the total number of classes is represented by \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:C\\)\u003c/span\u003e\u003c/span\u003e, and \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{W}_{o}^{i}\\)\u003c/span\u003e\u003c/span\u003e and \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{b}_{o}^{i}\\)\u003c/span\u003e\u003c/span\u003e are the weight and bias for class \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:i\\)\u003c/span\u003e\u003c/span\u003e. \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:P\\left({class\\:}_{i}\\right|\\:{O}_{t})\\)\u003c/span\u003e\u003c/span\u003e is the probability that \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{O}_{t}\\)\u003c/span\u003e\u003c/span\u003e belongs to class \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:i\\)\u003c/span\u003e\u003c/span\u003e. The class with the highest probability is selected as the model's prediction.\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e"},{"header":"4. 
Datasets","content":"\u003cp\u003eThe NSL-KDD dataset [\u003cspan citationid=\"CR51\" class=\"CitationRef\"\u003e51\u003c/span\u003e][\u003cspan citationid=\"CR54\" class=\"CitationRef\"\u003e54\u003c/span\u003e] addresses the redundancy and class imbalance issues that are present in the KDD Cup 1999 dataset. It is an upgraded version of the KDD Cup 1999 dataset, designed to be more reliable for research. As per Table \u003cspan refid=\"Tab3\" class=\"InternalRef\"\u003e3\u003c/span\u003e, the dataset consists of 41 features divided into basic, content, traffic and host features. Each record is labelled as either normal or attack. Attacks are grouped into four sub-categories: Denial of Service (DoS), which overwhelms resources; Probe, which scans for vulnerabilities; User-to-Root, which performs privilege escalation; and Remote-to-Local, which attempts unauthorized access.\u003c/p\u003e \u003cp\u003eThe X-IIoTID dataset [\u003cspan citationid=\"CR53\" class=\"CitationRef\"\u003e53\u003c/span\u003e][\u003cspan citationid=\"CR56\" class=\"CitationRef\"\u003e56\u003c/span\u003e] is constructed specifically for IIoT environments, mimicking real-world network behaviours and threats. It focuses on specific attributes such as device type, communication protocols, traffic features, and operational data. Attack data includes DoS, Man-in-the-Middle (MITM) and malicious payload injections. Data is labelled as benign or as a specific attack category. Similarly, the Edge-IIoTset dataset [\u003cspan citationid=\"CR55\" class=\"CitationRef\"\u003e55\u003c/span\u003e] is designed for IIoT environments, specifically capturing the complexities of edge-based networks. Its features include network traffic, system logs, and device telemetry, mimicking real-world IIoT scenarios with various devices and communication protocols such as MQTT and CoAP. 
Dataset contains different types of traditional network intrusion attacks like DoS, spoofing and IIoT-specific attacks like data exfiltration, firmware tampering. Edge-IIoTset is a multi-domain dataset which helps to develop robust security solutions.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab3\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 3\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eComparative summary Datasets\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"4\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eAttributes\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eNSL-KDD\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eX-IIoTID\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eEdge-IIoT\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eFeatures\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e41\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e68\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e61\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" 
colname=\"c1\"\u003e \u003cp\u003eTotal Data Records\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e148,517\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e820,834\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e22,339,021\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eMain Categories\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e4\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e5\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e5\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e"},{"header":"5. Performance Analysis","content":"\u003cp\u003eTrained model is evaluated by using testing data. Data points are classified as normal or attack as per Eq.\u0026nbsp;(\u003cspan class=\"InternalRef\"\u003e8\u003c/span\u003e). 
For class prediction, the class with the highest probability for the data point at time \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:t\\)\u003c/span\u003e\u003c/span\u003e is selected.\u003c/p\u003e\n\u003cdiv id=\"Equ8\" class=\"Equation\"\u003e\n \u003cdiv class=\"mathdisplay\" id=\"FileID_Equ8\" name=\"EquationSource\"\u003e$$\\:{\\widehat{y}}_{t}={argmax}_{i}\\:P\\left({class}_{i}\\right|{O}_{t})$$\u003c/div\u003e\n \u003cdiv class=\"EquationNumber\"\u003e8\u003c/div\u003e\n\u003c/div\u003e\n\u003cp\u003eWhere the predicted class label for time step \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:t\\)\u003c/span\u003e\u003c/span\u003e is represented by \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{\\widehat{y}}_{t}\\:\\)\u003c/span\u003e\u003c/span\u003e and \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{argmax}_{i}\\:\\)\u003c/span\u003e\u003c/span\u003eidentifies the index \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:i\\)\u003c/span\u003e\u003c/span\u003e corresponding to the maximum probability.\u003c/p\u003e\n\u003cp\u003eThe softmax layer calculates the probabilities, which are then used for classification. Each data point is classified into predefined categories such as \u0026ldquo;Normal\u0026rdquo; or specific attack types like DoS and Probe. Standard metrics like accuracy, precision, recall, and F1-score are used for evaluating classification performance.\u003c/p\u003e\n\u003cp\u003eFigures \u003cspan class=\"InternalRef\"\u003e3\u003c/span\u003e and \u003cspan class=\"InternalRef\"\u003e4\u003c/span\u003e illustrate the training and validation performance of the multiclass classification model on the NSL-KDD dataset over multiple epochs. As depicted in Fig. \u003cspan class=\"InternalRef\"\u003e3\u003c/span\u003e, 
the training and validation accuracy curves increase steadily over 20 epochs and reach 90% accuracy by the 10th epoch, which indicates effective learning with minimal overfitting. The training and validation loss curves decrease rapidly until the 10th epoch.\u003c/p\u003e\n\u003cp\u003eFigure 5 depicts the comparative performance analysis of the proposed model against existing intrusion detection models that are trained and validated on the NSL-KDD dataset. Standard metrics like accuracy, precision, recall, and F1-score are used. The proposed CNN-AH-BiLSTM model has the highest accuracy of 99.96%, with 99.84% precision, 99.81% recall, and 99.83% F1-score. These scores indicate the effective detection capability of the proposed model. The DNN [52] model achieved an accuracy of 98.00% and precision and recall scores of 97.00% each. The RANN [52] model maintained a balance between precision and recall, with scores of 92.18% and 92.35% respectively and an F1-score of 92.29%. The PCC-CNN [47] model achieved an accuracy of 94.00%, but its recall and F1-score are significantly lower at 77.00% and 80.00% respectively. The 5-Layer AE [21] model has the lowest accuracy of all the models at 90.61% and the highest recall among the existing models at 98.43%, which makes it effective for intrusion detection but potentially prone to false positives. The proposed CNN-AH-BiLSTM achieves balanced scores across all metrics.\u003c/p\u003e\n\u003cp\u003eFigures 6 and \u003cspan class=\"InternalRef\"\u003e7\u003c/span\u003e illustrate the training and validation performance for multiclass classification on the Edge-IIoTset dataset. Figure 6 depicts training and validation accuracy over 20 epochs, where both curves increase steadily, reaching a training accuracy of 99.98% and a validation accuracy of 99.82% in the final epoch, which indicates effective model learning with minimal overfitting. Figure \u003cspan class=\"InternalRef\"\u003e7\u003c/span\u003e shows the loss graph, which decreases sharply in the initial epochs. 
Both figures indicate that the model converged quickly, with no sign of overfitting or underfitting.\u003c/p\u003e\n\u003cp\u003eFigure 8 illustrates the comparative performance of existing models and the proposed model on the Edge-IIoTset dataset. SA-DCNN [2] achieved an accuracy of 99.96%, a precision of 99.83%, a recall of 99.79% and an F1-score of 99.81%, followed by MAGRU-IDS [10] with an accuracy of 99.94%. Bi-GRU-CL and Bi-GRU-FL [16] achieved the lowest accuracies, 94.60% and 95.70% respectively. The proposed CNN-AH-BiLSTM achieved an accuracy of 99.82%, a precision of 98.43%, a recall of 99.10% and an F1-score of 98.76%.\u003c/p\u003e\n\u003cp\u003eFigures 9 and \u003cspan class=\"InternalRef\"\u003e10\u003c/span\u003e depict the model's performance on the X-IIoTID dataset over 20 epochs for multiclass classification. Figure 9 shows the training and validation accuracy. Accuracy increases rapidly until epoch 7. After reaching 98%, it plateaus until the 20th epoch, which shows that learning slows down and the model is no longer learning new patterns in the later epochs. Figure \u003cspan class=\"InternalRef\"\u003e10\u003c/span\u003e shows the loss curves, which decrease steadily. The close alignment between the training and validation curves indicates minimal overfitting.\u003c/p\u003e\n\u003cp\u003eFigure \u003cspan class=\"InternalRef\"\u003e11\u003c/span\u003e compares existing models with the proposed model on the X-IIoTID dataset using standard performance metrics. The FL-AE [\u003cspan class=\"CitationRef\"\u003e6\u003c/span\u003e] model achieved an accuracy of 99.32% and an F1-score of 99.84%. AP2PFL achieved the lowest accuracy, 96.42%, and its variants AP2PFL-DNN and AP2PFL-MLP achieved accuracies of 97.95% and 97.21%, respectively. The proposed CNN-AH-BiLSTM model shows strong performance across all metrics. It is a strong alternative to the FL-AE model, with an accuracy of 98.75% and a balanced score of 98.40% on the other metrics. 
CNN-AH-BiLSTM offers a balance between accuracy and efficiency.\u003c/p\u003e\n\u003cdiv class=\"gridtable\"\u003e\u0026nbsp;\u003ctable id=\"Tab4\" border=\"1\"\u003e\n \u003ccaption language=\"En\"\u003e\n \u003cdiv class=\"CaptionNumber\"\u003eTable 4\u003c/div\u003e\n \u003cdiv class=\"CaptionContent\"\u003e\n \u003cp\u003eComparative analysis of the proposed model with existing models for binary classification. Attributes such as feature selection were not reported for all models and datasets, so a placeholder (\u0026quot;-\u0026quot;) is used where the information is not available.\u003c/p\u003e\n \u003c/div\u003e\n \u003c/caption\u003e\n \u003ccolgroup cols=\"4\"\u003e\u003c/colgroup\u003e\n \u003cthead\u003e\n \u003ctr\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eModel\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eFeature Selection\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eDataset\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eAccuracy (%)\u003c/p\u003e\n \u003c/th\u003e\n \u003c/tr\u003e\n \u003c/thead\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eXGBoost-LSTM [\u003cspan class=\"CitationRef\"\u003e4\u003c/span\u003e]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eXGBoost\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eUNSWNB15\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e94.41\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eCNN-LSTM [\u003cspan class=\"CitationRef\"\u003e4\u003c/span\u003e]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eCNN\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eWSN DS\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n 
\u003cp\u003e91.18\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eCNN-LSTM [\u003cspan class=\"CitationRef\"\u003e4\u003c/span\u003e]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eCNN\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eEdge-IIoTset\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e95.21\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eFL-AE [\u003cspan class=\"CitationRef\"\u003e6\u003c/span\u003e]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eX-IIoTID\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e99.32\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eEHIDS [\u003cspan class=\"CitationRef\"\u003e7\u003c/span\u003e]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eUNSWNB15\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e96.47\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eEHIDS [\u003cspan class=\"CitationRef\"\u003e7\u003c/span\u003e]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eTON_IoT\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e95.36\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eELM [\u003cspan class=\"CitationRef\"\u003e9\u003c/span\u003e]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd 
align=\"left\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eTON_IoT\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e98.77\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eSHAP-LIME-ELM [\u003cspan class=\"CitationRef\"\u003e9\u003c/span\u003e]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eTON_IoT\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e99.69\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eCNN-AH-BiLSTM\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eCNN\u0026thinsp;+\u0026thinsp;SMO\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eNSL-KDD\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e99.98\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eCNN-AH-BiLSTM\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eCNN\u0026thinsp;+\u0026thinsp;SMO\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eEdge-IIoTSet\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e99.93\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eCNN-AH-BiLSTM\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eCNN\u0026thinsp;+\u0026thinsp;SMO\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eX-IIoTID\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e98.88\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n 
\u003c/table\u003e\n\u003c/div\u003e\n\u003cp\u003eTable \u003cspan class=\"InternalRef\"\u003e4\u003c/span\u003e and Fig. \u003cspan class=\"InternalRef\"\u003e12\u003c/span\u003e present various intrusion detection models with their accuracy, the datasets they are evaluated on, and the feature selection technique employed.\u003c/p\u003e\n\u003cp\u003eAll results are for binary classification. The proposed CNN-AH-BiLSTM model uses CNN\u0026thinsp;+\u0026thinsp;SMO for feature selection and achieves accuracies of 99.98%, 99.93% and 98.88% on the NSL-KDD, Edge-IIoTSet and X-IIoTID datasets respectively. The SHAP-LIME-ELM [\u003cspan class=\"CitationRef\"\u003e9\u003c/span\u003e] model achieved 99.69% accuracy on the TON_IoT dataset and the FL-AE [\u003cspan class=\"CitationRef\"\u003e6\u003c/span\u003e] model achieved 99.32% accuracy on X-IIoTID, while EHIDS [\u003cspan class=\"CitationRef\"\u003e7\u003c/span\u003e] achieved an accuracy of 96.47% on UNSWNB15 and 95.36% on the TON_IoT dataset.\u003c/p\u003e\n\u003cdiv class=\"gridtable\"\u003e\u0026nbsp;\u003ctable id=\"Tab5\" border=\"1\"\u003e\n \u003ccaption language=\"En\"\u003e\n \u003cdiv class=\"CaptionNumber\"\u003eTable 5\u003c/div\u003e\n \u003cdiv class=\"CaptionContent\"\u003e\n \u003cp\u003eComparative analysis of the proposed model with existing models for multiclass classification. 
Attributes such as feature selection were not reported for all models and datasets, so a placeholder (\u0026quot;-\u0026quot;) is used where the information is not available.\u003c/p\u003e\n \u003c/div\u003e\n \u003c/caption\u003e\n \u003ccolgroup cols=\"4\"\u003e\u003c/colgroup\u003e\n \u003cthead\u003e\n \u003ctr\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eModel\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eFeature Selection\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eDataset\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eAccuracy (%)\u003c/p\u003e\n \u003c/th\u003e\n \u003c/tr\u003e\n \u003c/thead\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eBSWO [\u003cspan class=\"CitationRef\"\u003e1\u003c/span\u003e]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eSWO\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eUNSWNB15\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e97.80\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eIBSWO [\u003cspan class=\"CitationRef\"\u003e1\u003c/span\u003e]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eSWO\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eUNSWNB15\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e98.70\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eBSWO [\u003cspan class=\"CitationRef\"\u003e1\u003c/span\u003e]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eSWO\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eTON_IoT\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n 
\u003cp\u003e99.70\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eIBSWO [\u003cspan class=\"CitationRef\"\u003e1\u003c/span\u003e]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eSWO\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eTON_IoT\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e99.90\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eBSWO [\u003cspan class=\"CitationRef\"\u003e1\u003c/span\u003e]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eSWO\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eNCTUKM-IIOT\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e99.20\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eIBSWO [\u003cspan class=\"CitationRef\"\u003e1\u003c/span\u003e]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eSWO\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eNCTUKM-IIOT\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e99.70\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eSA-DCNN [\u003cspan class=\"CitationRef\"\u003e2\u003c/span\u003e]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eDCNN\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eEdge-IIoTset\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e99.95\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eSA-DCNN [\u003cspan class=\"CitationRef\"\u003e2\u003c/span\u003e]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd 
align=\"left\"\u003e\n \u003cp\u003eDCNN\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eIoTID20\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e96.89\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eGA-ELM [\u003cspan class=\"CitationRef\"\u003e3\u003c/span\u003e]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eSVM\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eTON_IoT\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e99.00\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eGA-ELM [\u003cspan class=\"CitationRef\"\u003e3\u003c/span\u003e]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eSVM\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eUNSWNB15\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e86.00\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eXGBoost-LSTM [\u003cspan class=\"CitationRef\"\u003e4\u003c/span\u003e]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eCNN\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eUNSWNB15\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e90.71\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eCNN-LSTM [\u003cspan class=\"CitationRef\"\u003e4\u003c/span\u003e]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eWSN DS\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e91.09\u003c/p\u003e\n \u003c/td\u003e\n 
\u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eDRL [\u003cspan class=\"CitationRef\"\u003e5\u003c/span\u003e]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eWUSTL-IIoT-2018\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e99.36\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eELM [\u003cspan class=\"CitationRef\"\u003e9\u003c/span\u003e]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eTON_IoT\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e88.23\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eSHAP-LIME-ELM [\u003cspan class=\"CitationRef\"\u003e9\u003c/span\u003e]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eTON_IoT\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e99.63\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eMAGRU [\u003cspan class=\"CitationRef\"\u003e10\u003c/span\u003e]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eXGBoost\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eEdge-IIoTset\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e99.94\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eMAGRU [\u003cspan class=\"CitationRef\"\u003e10\u003c/span\u003e]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n 
\u003cp\u003eXGBoost\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eMQTTset\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e99.99\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eIHHO-NN [\u003cspan class=\"CitationRef\"\u003e11\u003c/span\u003e]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eGOA\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eN-BaIoT\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e98.07\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eCNN-AH-BiLSTM\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eCNN\u0026thinsp;+\u0026thinsp;SMO\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eNSL-KDD\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e99.96\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eCNN-AH-BiLSTM\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eCNN\u0026thinsp;+\u0026thinsp;SMO\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eEdge-IIoTSet\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e99.82\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eCNN-AH-BiLSTM\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eCNN\u0026thinsp;+\u0026thinsp;SMO\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eX-IIoTID\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e98.75\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n 
\u003c/table\u003e\n\u003c/div\u003e\n\u003cp\u003eTable \u003cspan class=\"InternalRef\"\u003e5\u003c/span\u003e and Fig. \u003cspan class=\"InternalRef\"\u003e13\u003c/span\u003e present the accuracy of various intrusion detection models, the datasets they are evaluated on, and the feature selection techniques used. IBSWO [\u003cspan class=\"CitationRef\"\u003e1\u003c/span\u003e] achieved an accuracy of 99.90% on TON_IoT, 99.70% on NCTUKM-IIOT, and 98.70% on UNSWNB15. SA-DCNN [\u003cspan class=\"CitationRef\"\u003e2\u003c/span\u003e] achieved an accuracy of 99.95% on Edge-IIoTset and 96.89% on the IoTID20 dataset. MAGRU [\u003cspan class=\"CitationRef\"\u003e10\u003c/span\u003e], which uses XGBoost for feature selection, achieved an accuracy of 99.99% on MQTTset and 99.94% on Edge-IIoTset. The ELM [\u003cspan class=\"CitationRef\"\u003e9\u003c/span\u003e] model achieved an accuracy of 88.23% on the TON_IoT dataset, while its variant SHAP-LIME-ELM [\u003cspan class=\"CitationRef\"\u003e9\u003c/span\u003e] achieved 99.63% on TON_IoT, significantly outperforming the base model. Another variant, GA-ELM [\u003cspan class=\"CitationRef\"\u003e3\u003c/span\u003e], achieved 99.00% accuracy on TON_IoT but performs poorly on the UNSWNB15 dataset, with an accuracy of 86.00%. DRL [\u003cspan class=\"CitationRef\"\u003e5\u003c/span\u003e] achieved 99.36% accuracy on WUSTL-IIoT-2018, while XGBoost-LSTM [\u003cspan class=\"CitationRef\"\u003e4\u003c/span\u003e] and CNN-LSTM [\u003cspan class=\"CitationRef\"\u003e4\u003c/span\u003e] achieved accuracies of 90.71% on UNSWNB15 and 91.09% on WSN DS, respectively. CNN-AH-BiLSTM achieved consistently high accuracy, between 98.75% and 99.96%, across its three datasets, making it highly effective for intrusion detection across diverse cybersecurity datasets.\u003c/p\u003e"},{"header":"6. 
Conclusion","content":"\u003cp\u003eIn this study, we proposed a hybrid IDS for IIoT networks, CNN-AH-BiLSTM, which integrates multiple DL techniques with SMO for feature extraction and classification, thereby enhancing detection accuracy. The model uses SMO for dimensionality reduction, a CNN for robust feature extraction, a hierarchical attentive BiLSTM to capture temporal dependencies, and a self-attention layer to focus on critical features. The model is evaluated on three benchmark datasets, namely NSL-KDD, Edge-IIoTSet and X-IIoTID, achieving state-of-the-art accuracy in both binary and multiclass classification. For binary classification, the model achieved 99.98% accuracy on NSL-KDD, 99.93% on Edge-IIoTSet and 98.88% on X-IIoTID; for multiclass classification, it achieved 99.96% on NSL-KDD, 99.82% on Edge-IIoTSet and 98.75% on X-IIoTID. These results show that the model achieved consistently high accuracy across all datasets, making it highly effective for intrusion detection across diverse cybersecurity datasets.\u003c/p\u003e \u003cp\u003eWhile the results are promising, there are several areas for future research. The proposed model can be further optimized to reduce computational complexity, and it needs to be tested on real-world IIoT traffic datasets to validate its robustness. 
Future research can refine and improve the proposed model, ensuring even more reliable and efficient intrusion detection in IIoT networks.\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003eEthics, Consent to Participate, and Consent to Publish declarations: Not applicable.\u003c/p\u003e\n\u003cp\u003eFunding: This research received no funding.\u003c/p\u003e\n\u003cp\u003eConflicts of interest: The corresponding author and all co-authors confirm that there are no conflicts of interest to declare.\u003c/p\u003e\n\u003cp\u003eData Availability: The data supporting this study can be made available upon request.\u003c/p\u003e\u003ch2\u003eAuthor Contribution\u003c/h2\u003e\u003cp\u003eThe primary work of this research, including the literature review, conceptualization, feasibility study, design, and methodology, was done by S.P., who led the development of the proposed methodology. She also conducted the experimental evaluation, evaluated the performance of the IDS, and prepared the manuscript based on the experimental findings. M.K. contributed to designing and refining the experimental setup, analyzed the results, provided critical revisions to the manuscript at each step, and approved the final version of the paper.\u003c/p\u003e\u003ch2\u003eAcknowledgement\u003c/h2\u003e\u003cp\u003eI want to thank my advisor, Dr. Mandar S. Karyakarte, for his essential help and direction during the creation of this paper. His knowledge, thoughtful comments, and constant support have greatly influenced the path and excellence of this project. I deeply appreciate his guidance, which has inspired and driven me throughout this process.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n\u003cli\u003eM. Shtayat et al., \u0026quot;An Improved Binary Spider Wasp Optimization Algorithm for Intrusion Detection for Industrial Internet of Things,\u0026quot; in IEEE Open Journal of the Communications Society, doi: 10.1109/OJCOMS.2024.3421647.\u003c/li\u003e\n\u003cli\u003eM. S. 
Alshehri, O. Saidani, F. S. Alrayes, S. F. Abbasi and J. Ahmad, \u0026quot;A Self-Attention-Based Deep Convolutional Neural Networks for IIoT Networks Intrusion Detection,\u0026quot; in IEEE Access, vol. 12, pp. 45762-45772, 2024, doi: 10.1109/ACCESS.2024.3380816.\u003c/li\u003e\n\u003cli\u003eMaseno, E.M., Wang, Z. Hybrid wrapper feature selection method based on genetic algorithm and extreme learning machine for intrusion detection. J Big Data 11, 24 (2024). https://doi.org/10.1186/s40537-024-00887-9\u003c/li\u003e\n\u003cli\u003eSajid, M., Malik, K.R., Almogren, A. et al. Enhancing intrusion detection: a hybrid machine and deep learning approach. J Cloud Comp 13, 123 (2024). https://doi.org/10.1186/s13677-024-00685-x\u003c/li\u003e\n\u003cli\u003eF. Mesadieu, D. Torre and A. Chennamaneni, \u0026quot;Leveraging Deep Reinforcement Learning Technique for Intrusion Detection in SCADA Infrastructure,\u0026quot; in IEEE Access, vol. 12, pp. 63381-63399, 2024, doi: 10.1109/ACCESS.2024.3390722.\u003c/li\u003e\n\u003cli\u003eP. Verma, N. Bharot, J. G. Breslin, D. O\u0026apos;Shea, A. Vidyarthi and D. Gupta, \u0026quot;Zero-Day Guardian: A Dual Model Enabled Federated Learning Framework for Handling Zero-Day Attacks in 5G Enabled IIoT,\u0026quot; in IEEE Transactions on Consumer Electronics, vol. 70, no. 1, pp. 3856-3866, Feb. 2024, doi: 10.1109/TCE.2023.3335385.\u003c/li\u003e\n\u003cli\u003eMohamed, D., Ismael, O. Enhancement of an IoT hybrid intrusion detection system based on fog-to-cloud computing. J Cloud Comp 12, 41 (2023). https://doi.org/10.1186/s13677-023-00420-y\u003c/li\u003e\n\u003cli\u003eC. Hazman, A. Guezzaz, S. Benkirane and M. Azrour, \u0026quot;Enhanced IDS with Deep Learning for IoT-Based Smart Cities Security,\u0026quot; in Tsinghua Science and Technology, vol. 29, no. 4, pp. 929-947, August 2024, doi: 10.26599/TST.2023.9010033.\u003c/li\u003e\n\u003cli\u003eM. M. Shtayat, M. K. Hasan, R. Sulaiman, S. Islam and A. U. R. 
Khan, "An Explainable Ensemble Deep Learning Approach for Intrusion Detection in Industrial Internet of Things," in IEEE Access, vol. 11, pp. 115047-115061, 2023, doi: 10.1109/ACCESS.2023.3323573.
S. Ullah, W. Boulila, A. Koubâa and J. Ahmad, "MAGRU-IDS: A Multi-Head Attention-Based Gated Recurrent Unit for Intrusion Detection in IIoT Networks," in IEEE Access, vol. 11, pp. 114590-114601, 2023, doi: 10.1109/ACCESS.2023.3324657.
F. Taher, M. Abdel-Salam, M. Elhoseny and I. M. El-Hasnony, "Reliable Machine Learning Model for IIoT Botnet Detection," in IEEE Access, vol. 11, pp. 49319-49336, 2023, doi: 10.1109/ACCESS.2023.3253432.
Gebretsadik, F.G., Nayak, S. & Patgiri, R. eBF: an enhanced Bloom Filter for intrusion detection in IoT. J Big Data 10, 102 (2023). https://doi.org/10.1186/s40537-023-00790-9
M. Mohy-Eddine, A. Guezzaz, S. Benkirane, M. Azrour and Y. Farhaoui, "An Ensemble Learning Based Intrusion Detection Model for Industrial IoT Security," in Big Data Mining and Analytics, vol. 6, no. 3, pp. 273-287, September 2023, doi: 10.26599/BDMA.2022.9020032.
J. Du, K. Yang, Y. Hu and L. Jiang, "NIDS-CNNLSTM: Network Intrusion Detection Classification Model Based on Deep Learning," in IEEE Access, vol. 11, pp. 24808-24821, 2023, doi: 10.1109/ACCESS.2023.3254915.
S. Li et al., "CRSF: An Intrusion Detection Framework for Industrial Internet of Things Based on Pretrained CNN2D-RNN and SVM," in IEEE Access, vol. 11, pp. 92041-92054, 2023, doi: 10.1109/ACCESS.2023.3307429.
M. Nuaimi, L. C. Fourati and B. Ben Hamed, "A Scalable Intrusion Detection Approach for Industrial Internet of Things Based on Federated Learning and Attention Mechanism," 2023 IEEE Symposium on Computers and Communications (ISCC), Gammarth, Tunisia, 2023, pp. 1-4, doi: 10.1109/ISCC58397.2023.10218054.
Hakan Can Altunay, Zafer Albayrak, "A hybrid CNN+LSTM-based intrusion detection system for industrial IoT networks", Engineering Science and Technology, an International Journal, Volume 38, 2023, 101322, ISSN 2215-0986, https://doi.org/10.1016/j.jestch.2022.101322.
M. A. Ferrag, O. Friha, D. Hamouda, L. Maglaras and H. Janicke, "Edge-IIoTset: A New Comprehensive Realistic Cyber Security Dataset of IoT and IIoT Applications for Centralized and Federated Learning," in IEEE Access, vol. 10, pp. 40281-40306, 2022, doi: 10.1109/ACCESS.2022.3165809.
S. M. Kasongo, "An Advanced Intrusion Detection System for IIoT Based on GA and Tree Based Algorithms," in IEEE Access, vol. 9, pp. 113199-113212, 2021, doi: 10.1109/ACCESS.2021.3104113.
Z. E. Huma et al., "A Hybrid Deep Random Neural Network for Cyberattack Detection in the Industrial Internet of Things," in IEEE Access, vol. 9, pp. 55595-55605, 2021, doi: 10.1109/ACCESS.2021.3071766.
W. Xu, J. Jang-Jaccard, A. Singh, Y. Wei and F. Sabrina, "Improving Performance of Autoencoder-Based Network Anomaly Detection on NSL-KDD Dataset," in IEEE Access, vol. 9, pp. 140136-140146, 2021, doi: 10.1109/ACCESS.2021.3116612.
Dong, R.-H., Li, X.-Y., Zhang, Q.-Y. and Yuan, H. (2020), Network intrusion detection model based on multivariate correlation analysis – long short-time memory network. IET Inf. Secur., 14: 166-174. https://doi.org/10.1049/iet-ifs.2019.0294
S. Latif, Z. Idrees, Z. Zou and J. Ahmad, "DRaNN: A Deep Random Neural Network Model for Intrusion Detection in Industrial IoT," 2020 International Conference on UK-China Emerging Technologies (UCET), Glasgow, UK, 2020, pp. 1-4, doi: 10.1109/UCET51115.2020.9205361.
Zhang, L., Jiang, S., Shen, X., Gupta, B.B., & Tian, Z. (2021). PWG-IDS: An Intrusion Detection Model for Solving Class Imbalance in IIoT Networks Using Generative Adversarial Networks. ArXiv, abs/2110.03445.
A. Khacha, R. Saadouni, Y. Harbi and Z. Aliouat, "Hybrid Deep Learning-based Intrusion Detection System for Industrial Internet of Things," 2022 5th International Symposium on Informatics and its Applications (ISIA), M'sila, Algeria, 2022, pp. 1-6, doi: 10.1109/ISIA55826.2022.9993487.
H.-Y. Chuang and R.-M. Chen, "Detection of Attacks on Industrial Internet of Things Using Fewer Features," 2023 Sixth International Symposium on Computer, Consumer and Control (IS3C), Taichung, Taiwan, 2023, pp. 1-4, doi: 10.1109/IS3C57901.2023.00009.
Chai, G., Li, S., Yang, Y., Zhou, G., & Wang, Y. (2023). CTSF: An Intrusion Detection Framework for Industrial Internet Based on Enhanced Feature Extraction and Decision Optimization Approach. Sensors, 23(21), 8793. https://doi.org/10.3390/s23218793
M. Koca and I. Avci, "A Novel Hybrid Model Detection of Security Vulnerabilities in Industrial Control Systems and IoT Using GCN+LSTM," in IEEE Access, vol. 12, pp. 143343-143351, 2024, doi: 10.1109/ACCESS.2024.3466391.
J. B. Awotunde, C. Chakraborty, A. E. Adeniyi, and A. Jolfaei. "Intrusion Detection in Industrial Internet of Things Network-Based on Deep Learning Model with Rule-Based Feature Selection". Wirel. Commun. Mob. Comput. 2021 (2021). https://doi.org/10.1155/2021/7154587
Le, T.-T.-H., Oktian, Y. E., & Kim, H. (2022). XGBoost for Imbalanced Multiclass Classification-Based Industrial Internet of Things Intrusion Detection Systems. Sustainability, 14(14), 8707. https://doi.org/10.3390/su14148707
M. Abdel-Basset, V. Chang, H. Hawash, R. K. Chakrabortty and M. Ryan, "Deep-IFS: Intrusion Detection Approach for Industrial Internet of Things Traffic in Fog Environment," in IEEE Transactions on Industrial Informatics, vol. 17, no. 11, pp. 7704-7715, Nov. 2021, doi: 10.1109/TII.2020.3025755.
Awotunde, J. B., Folorunso, S. O., Imoize, A. L., Odunuga, J. O., Lee, C.-C., Li, C.-T., & Do, D.-T. (2023). An Ensemble Tree-Based Model for Intrusion Detection in Industrial Internet of Things Networks. Applied Sciences, 13(4), 2479. https://doi.org/10.3390/app13042479
Potnurwar, A. V., Bongirwar, V. K., Ajani, S., Shelke, N., Dhone, M., & Parati, N. (2023). Deep Learning-Based Rule-Based Feature Selection for Intrusion Detection in Industrial Internet of Things Networks. International Journal of Intelligent Systems and Applications in Engineering, 11(10s), 23–35. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/3231
X. Wang et al., "Toward Accurate Anomaly Detection in Industrial Internet of Things Using Hierarchical Federated Learning," in IEEE Internet of Things Journal, vol. 9, no. 10, pp. 7110-7119, May 15, 2022, doi: 10.1109/JIOT.2021.3074382.
Y. Yang et al., "ASTREAM: Data-Stream-Driven Scalable Anomaly Detection With Accuracy Guarantee in IIoT Environment," in IEEE Transactions on Network Science and Engineering, vol. 10, no. 5, pp. 3007-3016, Sept.-Oct. 2023, doi: 10.1109/TNSE.2022.3157730.
A. Telikani, J. Shen, J. Yang and P. Wang, "Industrial IoT Intrusion Detection via Evolutionary Cost-Sensitive Learning and Fog Computing," in IEEE Internet of Things Journal, vol. 9, no. 22, pp. 23260-23271, Nov. 15, 2022, doi: 10.1109/JIOT.2022.3188224.
H. Yao, P. Gao, P. Zhang, J. Wang, C. Jiang and L. Lu, "Hybrid Intrusion Detection System for Edge-Based IIoT Relying on Machine-Learning-Aided Detection," in IEEE Network, vol. 33, no. 5, pp. 75-81, Sept.-Oct. 2019, doi: 10.1109/MNET.001.1800479.
S. Latif et al., "Intrusion Detection Framework for the Internet of Things Using a Dense Random Neural Network," in IEEE Transactions on Industrial Informatics, vol. 18, no. 9, pp. 6435-6444, Sept. 2022, doi: 10.1109/TII.2021.3130248.
J. Zhang, C. Luo, M. Carpenter and G. Min, "Federated Learning for Distributed IIoT Intrusion Detection Using Transfer Approaches," in IEEE Transactions on Industrial Informatics, vol. 19, no. 7, pp. 8159-8169, July 2023, doi: 10.1109/TII.2022.3216575.
P. T. Duy, T. V. Hung, N. H. Ha, H. D. Hoang and V.-H. Pham, "Federated learning-based intrusion detection in SDN-enabled IIoT networks," 2021 8th NAFOSTED Conference on Information and Computer Science (NICS), Hanoi, Vietnam, 2021, pp. 424-429, doi: 10.1109/NICS54270.2021.9701525.
A. M. Eid, A. B. Nassif, B. Soudan and M. N. Injadat, "IIoT Network Intrusion Detection Using Machine Learning," 2023 6th International Conference on Intelligent Robotics and Control Engineering (IRCE), Jilin, China, 2023, pp. 196-201, doi: 10.1109/IRCE59430.2023.10255088.
Alshahrani, H.; Khan, A.; Rizwan, M.; Reshan, M.S.A.; Sulaiman, A.; Shaikh, A. Intrusion Detection Framework for Industrial Internet of Things Using Software Defined Network. Sustainability 2023, 15, 9001. https://doi.org/10.3390/su15119001
K. N. Qureshi, S. S. Rana, A. Ahmed, G. Jeon, "A novel and secure attacks detection framework for smart cities industrial internet of things", Sustainable Cities and Society, Volume 61, 2020, ISSN 2210-6707, https://doi.org/10.1016/j.scs.2020.102343.
X. Li, M. Xu, P. Vijayakumar, N. Kumar and X. Liu, "Detection of Low-Frequency and Multi-Stage Attacks in Industrial Internet of Things," in IEEE Transactions on Vehicular Technology, vol. 69, no. 8, pp. 8820-8831, Aug. 2020, doi: 10.1109/TVT.2020.2995133.
Y. K. Saheed, A. I. Abiodun, S. Misra, M. K. Holone, R. Colomo-Palacios, "A machine learning-based intrusion detection for detecting internet of things network attacks", Alexandria Engineering Journal, Volume 61, Issue 12, 2022, ISSN 1110-0168, https://doi.org/10.1016/j.aej.2022.02.063.
Han, J.; Pak, W. "Hierarchical LSTM-Based Network Intrusion Detection System Using Hybrid Classification". Appl. Sci. 2023, 13, 3089. https://doi.org/10.3390/app13053089
Bhavsar, M., Roy, K., Kelly, J. et al. Anomaly-based intrusion detection system for IoT application. Discov Internet Things 3, 5 (2023). https://doi.org/10.1007/s43926-023-00034-5
M. Injadat, A. Moubayed, A. B. Nassif and A. Shami, "Multi-Stage Optimized Machine Learning Framework for Network Intrusion Detection," in IEEE Transactions on Network and Service Management, vol. 18, no. 2, pp. 1803-1816, June 2021, doi: 10.1109/TNSM.2020.3014929.
Altulaihan, E., Almaiah, M. A., & Aljughaiman, A. (2024). Anomaly Detection IDS for Detecting DoS Attacks in IoT Networks Based on Machine Learning Algorithms. Sensors, 24(2), 713. https://doi.org/10.3390/s24020713
C. Yin, Y. Zhu, J. Fei and X. He, "A Deep Learning Approach for Intrusion Detection Using Recurrent Neural Networks," in IEEE Access, vol. 5, pp. 21954-21961, 2017, doi: 10.1109/ACCESS.2017.2762418.
Almiani, M., AbuGhazleh, A., Al-Rahayfeh, A., Atiewi, S., & Razaque, A. (2019). Deep Recurrent Neural Network For IoT Intrusion Detection System. Simulation Modelling Practice and Theory, 102031. doi:10.1016/j.simpat.2019.102031
Liang, C., Shanmugam, B., Azam, S., Jonkman, M., Boer, F. D., & Narayansamy, G. (2019). Intrusion Detection System for Internet of Things based on a Machine Learning approach. 2019 International Conference on Vision Towards Emerging Trends in Communication and Networking (ViTECoN). doi:10.1109/vitecon.2019.8899448
M. Al-Hawawreh, E. Sitnikova and N. Aboutorab, "Asynchronous Peer-to-Peer Federated Capability-Based Targeted Ransomware Detection Model for Industrial IoT," in IEEE Access, vol. 9, pp. 148738-148755, 2021, doi: 10.1109/ACCESS.2021.3124634.
Ingre, B., & Yadav, A. (2015). Performance analysis of NSL-KDD dataset using ANN. 2015 International Conference on Signal Processing and Communication Engineering Systems. doi:10.1109/spaces.2015.7058223
M. A. Ferrag, O. Friha, D. Hamouda, L. Maglaras and H. Janicke, "Edge-IIoTset: A New Comprehensive Realistic Cyber Security Dataset of IoT and IIoT Applications for Centralized and Federated Learning," in IEEE Access, vol. 10, pp. 40281-40306, 2022, doi: 10.1109/ACCESS.2022.3165809.
Muna Al-Hawawreh, Elena Sitnikova, Neda Aboutorab, July 30, 2021, "X-IIoTID: A Connectivity- and Device-agnostic Intrusion Dataset for Industrial Internet of Things", IEEE Dataport, doi: https://dx.doi.org/10.21227/mpb6-py55.
Submitted to: Discover Internet of Things (https://www.springer.com/journal/43926)
Posted: April 17th, 2025 (version 1)
Editorial history: submitted 2025-03-05; checks complete 2025-03-21; editor assigned 2025-03-21; reviewers invited 2025-03-28; revision requested 2025-04-04.