Crime Rate Prediction using Cyber Security and Artificial Intelligent

Research Article | Research Square

Asan Mohideen Khansadurai, Anjana Devi Veluchamy, Suganthi Balasubramani, and 1 more

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-3975155/v1
This work is licensed under a CC BY 4.0 License.
Status: Posted (Version 1)

Abstract

We examine the potential of AI in enhancing cybersecurity solutions by highlighting both its advantages and disadvantages. We also discuss directions for future cybersecurity research related to the development of AI approaches across many application domains. Crime is one of our society's most significant and pervasive problems, and numerous crimes are committed every day. The dataset in this work consists of the date and the annual crime rate for the corresponding years. The crime rate used in this project is based only on robberies.
Using historical data, we employ the linear regression algorithm to forecast the crime rate, expressed as a percentage, in the coming years. The algorithm receives a date as input and returns the crime rate for that particular year.

Keywords: Artificial intelligence, Cybersecurity, Cyberattacks, Machine learning, Crime rate, Number of crimes, Regression algorithm

I. INTRODUCTION

Crime poses a serious threat to society. Numerous crimes occur at regular intervals, and crime may be growing and spreading quickly and widely. Crimes occur everywhere, from small towns and villages to large cities. There are many distinct types of crime, including robbery, manslaughter, rape, assault, battery, and false imprisonment. Because crime is rising, cases need to be resolved much more quickly. The police are responsible for containing and reducing criminal activity, which has increased at an accelerated rate. Given the vast amount of crime data available, crime prediction and criminal identification are the police department's two biggest challenges. Technology is required so that cases can be solved more quickly. Extensive documentation and case studies indicate that machine learning and data science can expedite and simplify this work.

The purpose of this project is to forecast crime using the dataset's attributes. The dataset was taken from official websites. With the help of a machine learning system built on Python, we can anticipate the type of crime that will occur in a specific location. The aim is to train a model for prediction; the training performed on the training dataset is validated using the test dataset. Multiple Linear Regression (MLR) is used to predict crime. A dataset is visualized to examine the crimes that may have occurred in a specific year based on population and number of crimes.
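The regression step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: it fits a single-variable least-squares line (year to crime rate) rather than the full multiple regression, the helper names `fit_line` and `predict` are hypothetical, and the yearly rates are invented placeholder values, not the dataset from this study.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed-form solution: slope = cov(x, y) / var(x)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(year, slope, intercept):
    """Forecast the crime rate for a given year from the fitted line."""
    return slope * year + intercept


# Placeholder data (assumption): robbery rate in percent for five past years.
years = [2015, 2016, 2017, 2018, 2019]
rates = [4.0, 4.5, 5.0, 5.5, 6.0]

slope, intercept = fit_line(years, rates)
forecast_2021 = predict(2021, slope, intercept)
print(f"Forecast for 2021: {forecast_2021:.2f}%")
```

Because the placeholder rates rise by exactly 0.5 per year, the fitted slope is 0.5 and the 2021 forecast works out to 7.0. Real crime data would not lie on a perfect line, so the residual error on a held-out test set should be checked before trusting any forecast.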
This work helps law enforcement organizations forecast and identify the per capita crime rate in a region, which can help lower the crime rate.

Cybersecurity is characterized as a collection of procedures, human conduct, and technological frameworks that help protect electronic resources. Moore's law predicts that the number of components on an integrated circuit will double every two years while chip production costs decline; similarly, cybercriminals are doubling the efficiency of their attack tools every few months while cutting the cost in half (Venable D, 2017). Worldwide spending on cybersecurity was projected to reach $1 trillion between 2017 and 2021 (Morgan S, 2019), having already risen by approximately 40% from $66 billion in 2013 (Yuhas D, 2017). Researchers in cybersecurity have recently begun to investigate Artificial Intelligence (AI) methods to enhance cybersecurity. Similarly, criminals use AI to launch increasingly complex cyberattacks while evading detection. In this work, we emphasize how AI-based cybersecurity solutions could better fend off attackers and reduce or avoid data breaches.

Since its inception in the 1950s, AI has produced a variety of fascinating research findings and systems. Machine learning and deep learning emerged as a result of further developments (McFarland M, 2017).
Today, AI is used in a wide range of applications, including manufacturing, law, healthcare, agriculture, and space exploration (Geib C; Winick E, 2017; Morey B, 2017; Morrow S, 2019). Ongoing performance advances in computer hardware and software (along with their falling costs) and new paradigms like big data and cloud computing have led to the development and use of a wide variety of AI systems with different capabilities. Many of these AI systems can now handle a variety of difficult tasks, such as planning, learning, problem-solving, decision-making, and face and speech recognition. A key advancement in AI since the 1980s has been the introduction of machine learning technologies, which enable computer systems to learn and adapt to various environments by leveraging their prior experiences, patterns, and knowledge. Deep learning, an area of machine learning developed about ten years ago, allows computers to find hidden links in the input data, leading to more precise planning and prediction outcomes.

Recently, interest has grown in using machine learning and AI to combat cyberattacks. The constant production of vast volumes of data necessitates these techniques, because analyzing traffic data to spot patterns, anomalies, or intrusions takes a lot of time and resources.

II. CYBERSECURITY THREATS AND LEGACY CYBERSECURITY SOLUTIONS

Many different kinds of cyberthreats have emerged over the past ten years. We briefly review them next.
The top 10 cyberthreats we face today, according to a recent report, include:

1) Denial of Service (DoS) attacks: These seek to exhaust a victim system's computing resources by flooding it with a large volume of requests that must be handled quickly. A single attacker machine can launch a DoS attack against a victim machine by sending a large number of network packets that seem legitimate, thereby evading security measures along the way. When multiple attacker machines take part, the attack is known as a distributed denial of service (DDoS) attack, which has a similar effect on the victim machine. Due to the ready availability of attack tools and the growth of the CyberCrime as a Service (CCaaS) business, DoS attacks are becoming more sophisticated and difficult to detect.

2) Man-in-the-middle (MiTM) attacks: These traditional attacks involve intercepting data transmitted on a channel between two legitimate communicating parties, A and B. The attacker inserts themselves between A and B, either physically or virtually, and pretends to be A when speaking to B by intercepting messages from A to B and replacing them with malicious or altered messages. The attacker then repeats the process in the other direction, pretending to be B when speaking to A. IP address spoofing is one variant of the attack, in which a hostile actor deceives legitimate systems into believing it is a trustworthy entity in order to gain access to the system. In a message replay attack, a hostile actor uses the communication channel to retransmit an old, previously stored message.

3) Phishing and spear-phishing attacks: These are carried out by creating emails that seem valid and sending them to legitimate systems in order to trick unsuspecting end users into clicking a link and disclosing personal information.
Such attacks rely on social engineering: emails are crafted to look trustworthy in order to win end users' trust. In spear phishing, bad actors first research potential victims thoroughly so that they can craft emails that seem highly legitimate and frequently contain trusted email addresses in the "from" field.

4) Drive-by attacks: These are carried out by bad actors who browse the web looking for vulnerable websites so that they can insert harmful scripts into the web servers. Visitors to the website then become infected with the malware, which causes system compromise, the release of sensitive data, and other harm.

5) Password attacks: These include brute-forcing a system with common passwords, spying on user keyboard activity, and generating candidate passwords using artificial intelligence (AI) techniques.

6) Structured Query Language (SQL) injection attacks: These traditional cyberattacks exploit flaws in how a web application handles SQL by injecting SQL query code into a webpage's input fields. When the web server executes the injected SQL code, it may reveal all or part of the data stored on a backend database server, possibly including usernames and passwords.

7) Cross-site scripting attacks: These involve inserting malicious code into a vulnerable web server. When unsuspecting end users later retrieve the hosted webpages, their computers are infected with malware. Such malware may send user information from the victim's computer to the malicious actor's servers, which may then enable web sessions to be hijacked, credentials to be stolen, keystroke loggers to be installed, screenshots to be taken, and even remote control of the victim's computer.

8) Eavesdropping attacks: These are carried out by sniffing the network communication channel and subsequently misusing the data gathered.
Malicious actors may passively sniff the connection and collect user data, or they may actively attack the line, substituting bogus messages and posing as normal users.

9) Birthday attacks: A message digest, also referred to as a hash, can be generated using a common algorithm such as Secure Hash Algorithm 1 (SHA-1). Applying the algorithm to a message of any length produces a hash value of fixed length. A birthday attack is a malicious actor's attempt to identify two distinct messages that generate the same hash value. As a result, the original message may be replaced with another message that generates the same hash value, resulting in data loss and system and service disruption. These attacks may use artificial intelligence to search for messages that have the same hash value as a real message.

10) Malware attacks: One of the biggest challenges for web hosting companies is the possibility that malware can propagate through their websites. According to Symantec's 2016 Threat Report, 78 percent of websites contain a significant vulnerability that an adversary might use to execute malicious code without the need for user involvement. Using the right security controls, such as web proxies, firewalls, and intrusion detection systems, can help bolster a website's defenses. The trade-off between the appropriate amount of security controls and the usability of the hosted websites is a significant problem in this situation: a website's attack surface grows with its level of usability.

Network attacks are launched against the environment in order to obstruct services, steal personal or business data, and gather network intelligence. Malicious actors gain access to and manipulate the operating system (OS) by exploiting flaws in it.
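The birthday attack in item 9 can be illustrated with a small sketch. To keep the search fast, the example assumes the SHA-1 digest is truncated to 16 bits; this truncation, the helper names `truncated_sha1` and `find_collision`, and the `msg-N` message format are illustrative assumptions. A collision on the full 160-bit digest would require vastly more work.

```python
import hashlib
from itertools import count


def truncated_sha1(message: bytes, bits: int = 16) -> int:
    """Return the top `bits` bits of SHA-1(message) as an integer."""
    digest = hashlib.sha1(message).digest()  # 20 bytes = 160 bits
    return int.from_bytes(digest, "big") >> (160 - bits)


def find_collision(bits: int = 16):
    """Hash "msg-0", "msg-1", ... until two messages share a truncated hash.

    Returns (first_message, second_message, number_of_messages_tried).
    The loop always terminates: only 2**bits hash values exist, so a repeat
    must occur within 2**bits + 1 messages (pigeonhole principle).
    """
    seen = {}  # truncated hash -> first message that produced it
    for i in count():
        message = f"msg-{i}".encode()
        h = truncated_sha1(message, bits)
        if h in seen:
            return seen[h], message, i + 1
        seen[h] = message


first, second, tried = find_collision(bits=16)
print(first, second, "collide after", tried, "messages")
```

For an n-bit hash, a collision is expected after roughly 2^(n/2) messages rather than 2^n, which is why the attack is named after the birthday paradox. With 16 bits that is on the order of a few hundred messages.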
Some of these attacks involve the theft of personal data, which can be exploited to access private or business information. In Table 1, we classify numerous network attacks according to the attack objectives, the targeted device or application, the data or information revealed while an attack is in progress, the kind of environment affected, and how the attacks are discovered.

We next briefly review conventional (non-AI) cybersecurity methods for detecting cyberattacks:

1) Game theory: Game theory has previously been applied to cybersecurity. The malicious actor is the first player in the game, and the victim's computer is the other. Each player tries to maximize their payoff through strategic moves, reasoning that each move will achieve the desired result. The actions of each player can either be known in advance or remain hidden. An example is a smart grid setting in which the attacker tries to stop communication between a home and a power system while the defender tries to keep these entities connected. Both the attacker and the defender use techniques to accomplish their respective aims at each stage of the game.

2) Rate control: DoS and DDoS attacks target the availability of systems. Through basic traffic throttling, redefining permission lists, and limiting the amount of incoming network traffic, rate-control approaches can minimize the impact on such systems' ability to function when they are attacked.

3) Heuristics: Firewalls and intrusion detection systems frequently use heuristics to determine the best rule for classifying network data as legitimate or abnormal. One such method uses substring matching as part of a series of operations to find suspicious URLs.
In the second phase of that scheme, the web address is scanned with VirusTotal, a website where one can enter a web address and receive a scored analysis of how malicious it is. The lower score of the two scans is used to decide whether to allow the data packets into the network.

4) Signature-based intrusion detection: A signature-based intrusion detection system makes use of a database that may store legitimate signatures corresponding to normal traffic or attack signatures corresponding to harmful activity. In real time, the intrusion detection system compares the information in arriving network packets with the previously saved signatures. The disadvantage of this method is that, in the absence of relevant signatures, intrusion detection systems cannot reliably detect malicious data entering a network.

5) Anomaly-based intrusion detection: This method builds a model of what is typically observed. The models may take the form of statistical methods, mathematical models, or rule-based systems. Deviations from normal behavior are treated as attacks. Such strategies have the advantage of not depending on signature patterns, which frees them from the administrative effort of collecting signatures that signature-based detection requires.

6) Autonomous systems: These can provide dependability and availability as well as self-protection and self-healing, as in the case of the Bionic Autonomic Nervous System (BANS). The system is made up of four modules: Cyber Neuron, Cyber Axon, Peripheral Nerve, and Central Nerve. Cyber Neuron protects against malware and spyware. Cyber Axon is a sophisticated tool for repairing spyware and malware damage.
Similarly, Peripheral Nerve establishes a communication link between numerous Cyber Neurons placed on various devices to provide strong protection against DoS/DDoS attacks. Finally, Central Nerve provides information to other security devices and acts as a knowledge base for potential attacks. The idea behind collaborative defense through Peripheral Nerve is to have network devices work together to thwart DoS and DDoS attacks.

7) Security controls for end users: Current end-user devices such as mobile phones, iPads, and portable computers (PCs) need built-in security rather than add-ons. While some vendors try to promote automatic updates, end users might not update their devices, so the most recent security patches are never installed. The WannaCry ransomware attack illustrates an attack in which the most recent vendor-provided security fixes had not been installed on all end-user devices. Users frequently are not aware of the consequences of not installing patches. Even users who are aware sometimes fail to take the necessary steps to secure their devices or carry out the wrong procedures, leaving the devices vulnerable to other attacks. One recommended control is "out of sight" security, in which suppliers send automatic updates straight to end-user devices without the user's input. The difficulty is that software developers would have to make sure that security updates protect against new attacks, the so-called "zero-day attacks," and that they integrate seamlessly with all pre-existing software on the end-user device.

III. ARTIFICIAL INTELLIGENCE

AI is concerned with how machines can understand or act correctly, given what they know. This broad definition covers how closely machines can mimic human thought or behavior (Fig. 1).
At one end of the spectrum, machines are considered intelligent if they can maximize the result at every stage of a process. At the other end, the Turing Test sets the bar for artificial intelligence: a computer is deemed to exhibit intelligence when a person conversing with it cannot tell whether the responses are coming from a computer or a human. Across this spectrum, AI encompasses computing disciplines such as natural language processing, knowledge representation, logic, automated reasoning, machine learning, mathematics, and game theory.

The first applications of AI led to thinking machines that solved complex problems in geometry, checkers, and a group of block-world problems. Agent-based AI, or "bots" (software that acts like humans), became increasingly popular after the rise of the Internet in the late 1990s. Search engines, online directories, and recommendation services have all benefited from the creation of benign bots that crawl the Internet. Such bots also defend against vandalism on Wikipedia pages, where anyone can participate as an author. On the other hand, malicious bots have also been developed to transmit malware, publish spam, and cheat at online games. To reverse engineer the game code while simulating online games, bot programmers examined the communication flow between the game console and the server. Instead of posting messages at a constant rate, the bots that sent spam imitated human online activity by browsing pages before posting a message in a forum. Malicious bots hinder the correct operation of cyber services, harming the service providers by discouraging online users. As a result, several cybersecurity studies have looked into ways to identify and defend against malicious bots.
Studies have shown that, compared to humans, game bots are more persistent, less sociable (for example, in exchanging goods or bidding on products), and exhibit less variability in their activity sequences. Additionally, while human players like to work together with other players to fulfill tasks and missions, game bots are more motivated to collect items. Like spambots, malware bots can be identified by their behavior, which is visible in various distinguishing communication patterns.

Intrusion detection systems are where AI is most applicable in the field of cybersecurity. Cybersecurity solutions frequently perform traffic analysis, classifying Internet traffic as either benign or harmful. In the early days of the Internet, cyberattacks were recognized using rule-based systems, in which attacks were found based on their signatures. As the number of Internet-connected devices and their applications grew, it became time-consuming to monitor the massive amounts of network traffic generated in real time and to create rules that analyze this traffic, which caused security systems to act reactively rather than proactively. Technical advances are also aiding the development of new, complex attack techniques that can elude detection by existing security systems. As the landscape of cyberthreats keeps expanding, we require advanced tools and technology that can assist in speedier detection, investigation, and decision-making for new risks.

AI can automatically and intelligently analyze and classify large volumes of Internet traffic. Cybersecurity solutions based on machine learning (ML) are now used to automate the detection of attacks and to improve their capabilities over time. Because they can handle enormous volumes of data and the variety of data properties (for example, a large number of table columns) needed for classification, ML-based solutions are being employed in intrusion detection systems. To discriminate between harmful and valid traffic, machine learning systems learn from collected Internet data. It is important to note that, because machine learning is so widely used to address cybersecurity challenges, the term "machine learning" has come to be used synonymously with "artificial intelligence" in the cybersecurity industry.

A. MACHINE LEARNING

Machine learning techniques are typically divided into two categories: supervised learning and unsupervised learning. In supervised learning, data samples are labeled according to their class (e.g., malicious or legitimate). Data labeling is most often done manually and requires people to match data patterns with their classes. The labeled training data is fed into an algorithm to build a mathematical model that, given fresh data samples, can output the predefined classes. Unsupervised learning requires no data labeling or training. Instead, the algorithms analyze the coherence and dispersion of data samples, systematically grouping them according to the degree of coherence within a class and the degree of separation between classes.

In discussions of machine learning, though, the distinction between supervised and unsupervised algorithms is often blurred. Machine learning approaches use mathematical, statistical, and probabilistic methods that enable unsupervised algorithms to label the data needed by supervised algorithms. Given this convergence, it is no longer strictly necessary to categorize machine learning algorithms as either supervised or unsupervised. An extensive review of machine learning algorithms from a taxonomy viewpoint is available in the literature; in this section, we focus on the most popular machine learning approaches that are useful for cybersecurity solutions.
Machine learning algorithms process data samples based on their defining characteristics, known as features. The processed data is organized as a table in which the rows are data samples and the columns are their features.

Naive Bayes is a machine learning technique that uses Bayes' theorem to classify data under the assumption that each feature results from an independent event. The method starts from the probabilities of each class computed across all training instances and uses them to determine the likelihood that a new data sample belongs to a class. Although Naive Bayes classifiers perform worse as more features are derived from dependent events, they are frequently used because they often produce usable results even under the naive assumption that all features are derived from independent events.

B. DECISION TREES

A decision tree is a method for deriving a collection of rules from training data samples. The algorithm repeatedly selects a feature on which to divide the data samples. The iterative division generates a succession of rules for each side of the split, and it continues until a division yields data samples of just one class, resulting in a tree-like structure. In the example in Fig. 2, a decision tree classifies network traffic into two categories: regular traffic and attack traffic. The tree shows, for instance, that traffic is classified as an attack if the traffic flow is low but the traffic pattern lasts a long time. The method offers an intuitive way to identify cybersecurity problems because it classifies observed cybersecurity events as either legitimate events or attacks, depending on feature values, and displays how the decision was reached. Decision trees have, for instance, used flow rate, size, and duration in addition to source/destination error rates to identify DoS attacks.
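The iterative division described above can be sketched in a few lines. This is a minimal illustration, assuming Gini impurity as the split criterion (the text does not name one) and two hypothetical features, flow rate and duration; the training rows are invented for the example and are not taken from Fig. 2.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 0.0 when all labels belong to one class."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels):
    """Return (feature_index, threshold) that most reduces impurity, or None."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [l for r, l in zip(rows, labels) if r[f] <= t]
            right = [l for r, l in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best


def build_tree(rows, labels):
    """Split recursively until no split improves purity; leaves are class names."""
    split = best_split(rows, labels)
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    f, t = split
    left = [(r, l) for r, l in zip(rows, labels) if r[f] <= t]
    right = [(r, l) for r, l in zip(rows, labels) if r[f] > t]
    return (f, t,
            build_tree([r for r, _ in left], [l for _, l in left]),
            build_tree([r for r, _ in right], [l for _, l in right]))


def classify(tree, row):
    """Follow the rules from the root down to a leaf."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree


# Invented samples (assumption): (flow_rate, duration_seconds)
rows = [(10, 2), (12, 3), (90, 2), (95, 4), (8, 40),
        (11, 55), (9, 60), (85, 3), (90, 50)]
labels = ["normal", "normal", "normal", "normal", "attack",
          "attack", "attack", "normal", "normal"]

tree = build_tree(rows, labels)
print(classify(tree, (10, 50)))  # low flow, long duration
print(classify(tree, (90, 3)))   # high flow, short duration
```

The greedy search stops as soon as no single split lowers the impurity, so it can miss patterns that only appear when two features are combined. Production decision-tree learners add pruning and other safeguards that are omitted here.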
Additionally, decision trees were used to classify values of CPU use, network flow, and the amount of data sent in order to detect command injection attacks against robotic vehicles. The advantage of this method is that intrusion detection systems can classify Internet traffic in real time once the most efficient set of criteria has been identified. One of the key factors in spotting cyberattacks is the quality of the real-time alerts generated.

Rule learning is an alternative technique that, in each iteration, seeks a collection of feature values that maximizes a score describing the quality of the classification result, such as one based on the number of incorrectly classified data samples. Like decision trees, this method creates a set of classification rules. The difference is that a rule-learning technique discovers a set of rules that characterize a class, whereas decision trees locate the best feature values that lead to a class. Rule learning has the advantage of incorporating human expert assistance when producing rules. Consider a study that used 28 features to find DoS attacks in cloud networks. The features included a number of computer and network indicators, such as input/output (IO) reads, memory usage, TCP flag detection, and the number of open system resources. The study created a set of rules based on the features (e.g., IO reads larger than the average IO reads) and used feature-ranking techniques to determine which rules were most important for identifying the class. The study then used human specialists to optimize the rules by, for example, eliminating redundancies. As a result, the method works well with intrusion detection systems whose configurations are mostly rule based. The technique was also frequently used to compare the effectiveness of different machine learning algorithms in identifying network intrusions.

C. K-NEAREST NEIGHBORS

The k-Nearest Neighbor (k-NN) method classifies or clusters data based on neighboring data samples. It was first proposed as a non-parametric pattern analysis technique for determining the proportion of data samples in a neighborhood that yields a consistent estimate of a probability. The neighborhood is specified as the k data samples nearest according to a distance measure, typically the Euclidean distance. A new data sample is assigned to a class by the votes of its k nearest neighbors. Fig. 3 illustrates the approach: a new sample (the red dot) is added to the data, and the majority class among its nearest neighbors decides its class. The sample is classified as Class 2 when k=3 and as Class 1 when k=9. Even for small values of k, the computational complexity of this method is high. However, it appeals to intrusion detection systems because it can learn from new traffic patterns to identify zero-day attacks as one of its unknown classes. There is therefore active research into how k-NN might be employed for real-time cyberattack detection. Recently, the method was used to identify attacks on smart grids and industrial control systems, such as data tampering and false data injection. It works well when the data can be represented using a model that enables the distance between samples to be measured, such as a Gaussian distribution or a vector.

D. SUPPORT VECTOR MACHINES

The Support Vector Machine (SVM) method expands on the linear regression model. SVMs classify data samples by locating a plane that divides them into two classes (as shown in Fig. 4). Depending on the function used (referred to as a kernel), the separating boundary can be linear or nonlinear, for example polynomial, Gaussian radial basis, or sigmoid.
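The kernel choices listed above can be written out explicitly. The sketch below uses the standard textbook definitions of each kernel; the parameter defaults (`degree`, `gamma`, `coef0`), the helper `decision_value`, and the two support vectors are illustrative assumptions, not values from any trained model.

```python
import math


def dot(x, z):
    return sum(a * b for a, b in zip(x, z))


def linear_kernel(x, z):
    return dot(x, z)


def polynomial_kernel(x, z, degree=3, coef0=1.0):
    return (dot(x, z) + coef0) ** degree


def rbf_kernel(x, z, gamma=0.5):
    """Gaussian radial basis function: exp(-gamma * ||x - z||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))


def sigmoid_kernel(x, z, gamma=0.1, coef0=0.0):
    return math.tanh(gamma * dot(x, z) + coef0)


def decision_value(x, support_vectors, coefficients, bias, kernel):
    """f(x) = sum_i c_i * K(s_i, x) + b; the sign of f(x) gives the class.

    Each c_i stands for (alpha_i * y_i) from a trained SVM. Training itself
    is not shown; the values used below are made up.
    """
    return sum(c * kernel(s, x)
               for s, c in zip(support_vectors, coefficients)) + bias


# Hypothetical trained model: one support vector per class.
support_vectors = [(0.0, 0.0), (2.0, 2.0)]
coefficients = [-1.0, 1.0]  # negative class near the origin, positive near (2, 2)
bias = 0.0

value = decision_value((1.9, 2.1), support_vectors, coefficients, bias, rbf_kernel)
print("positive class" if value > 0 else "negative class")
```

Swapping `rbf_kernel` for `linear_kernel` in the call changes the shape of the separating boundary without changing the rest of the decision rule, which is the main practical appeal of the kernel formulation.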
By using more than one plane, SVMs can also separate multiclass data, that is, data that must be divided into more than two classes rather than only two (such as legitimate versus attack, as in the preceding cases). Because Internet traffic patterns frequently include multiple classes, such as HyperText Transfer Protocol (HTTP), File Transfer Protocol (FTP), Post Office Protocol 3 (POP3), and Simple Mail Transfer Protocol (SMTP), SVMs are an appealing technique for analyzing them. SVM is a supervised machine learning technique that uses training data to build classification models. It is therefore used in applications in which attacks can be simulated. For example, network traffic from penetration tests on network systems was used as training data, and an SVM model was built to separate penetration test traffic from regular traffic. The approach can also be modified to produce a one-class model of normal traffic, which then flags irregularities when attack traffic is introduced. In this respect, SVMs make it possible to create simulation-based attack detection models.

E. ARTIFICIAL NEURAL NETWORKS

The Artificial Neural Network (ANN) learning technique is inspired by the functioning of neurons in the brain. ANN approaches model each neuron as a mathematical equation that takes a series of data samples and outputs a target value. The equation closely resembles that of linear regression: a sample's features are weighted to produce an output value. The ANN algorithm iterates until the output value is within the allowable error bounds of the target value.
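The iteration described above can be sketched for a single neuron. This is a deliberately reduced illustration, assuming one linear neuron trained with the delta rule rather than a multilayer network; the function name `train_neuron`, the learning rate, the tolerance, and the four training samples are all illustrative assumptions.

```python
def train_neuron(samples, targets, lr=0.3, tolerance=0.01, max_epochs=10_000):
    """Train one linear neuron until every output is within `tolerance`.

    Returns (weights, bias, epochs_used).
    """
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for epoch in range(1, max_epochs + 1):
        worst_error = 0.0
        for x, target in zip(samples, targets):
            # Weighted sum of the features, as in linear regression.
            output = sum(w * xi for w, xi in zip(weights, x)) + bias
            error = target - output
            # Delta rule: move each weight in proportion to the error.
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
            worst_error = max(worst_error, abs(error))
        if worst_error < tolerance:
            return weights, bias, epoch
    return weights, bias, max_epochs


# Invented samples (assumption); targets follow 0.5*x1 + 0.25*x2 + 0.1
samples = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
targets = [0.1, 0.35, 0.6, 0.85]

weights, bias, epochs = train_neuron(samples, targets)
print("weights:", weights, "bias:", bias, "epochs:", epochs)
```

A practical ANN stacks many such neurons, applies a nonlinear activation to each weighted sum, and propagates the error backward through the layers; the stopping rule based on an allowable error is the same idea.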
In each iteration, the neurons learn from the patterns seen in the data samples by adjusting their weights according to the deviation from the target value. When the error is small enough, the process yields a mathematical equation that, given unknown data samples, produces an informative result such as the class. ANN approaches can identify patterns in noisy or incomplete data samples. They can adapt to new types of communication, making them appropriate for intrusion detection systems.

The Cascade Correlation Neural Network (CCNN), an ANN variant that gradually adds hidden units to the hidden layer, was employed in a cybersecurity study. New hidden nodes are added to the network when new events are found, and only those nodes are trained using the newly gathered data, enabling a runtime-adaptive and scalable system. In that study, the CCNN, trained on desktop-platform traffic patterns, was used to detect port scanning on mobile networks without retraining the entire network on the original data. The proliferation of mobile devices over the past ten years has given rise to new traffic patterns, rendering earlier detection algorithms derived from desktop traffic outdated. The number of ports scanned per second and the frequency of received packets differed for port-scanning operations against mobile devices. The study showed that the performance of ANN port-scanning detection was comparable to that of other techniques, such as decision trees.

Another advantage of ANNs is that, because they can learn from current events, they can identify zero-day attacks. For example, traffic patterns from DoS attack incidents were provided to ANNs as labeled training data, enabling the neurons to adjust their weights and recognize previously unseen DoS attacks.
In other incidents (such as system penetration), attackers can hide their tracks and leave the victim unaware; when incidents like DoS attacks occur, by contrast, the victim can attest that an attack has taken place.

Because the attack class can be identified when an incident such as a DoS attack occurs, ANNs are a good detection tool for cybersecurity applications that can learn from such occurrences.

F. SELF-ORGANIZING MAPS

Self-Organizing Maps (SOMs) extend ANNs in that they self-adjust the weights of the neurons to produce a two- or three-dimensional (2D or 3D) map that illustrates how the data might be organized. The method learns by identifying correlations in data samples. The data are clustered and output as a map in which adjacent data samples have more similar traits than those farther apart. Because of their computational complexity, SOMs are unsuitable for real-time intrusion detection. Their main advantage is that they visualize the data, which makes it possible to discover network irregularities. The results from intrusion-detection systems are difficult to analyze without visualization.

Visualization tools help network administrators see the typical pattern of traffic data (for example, in terms of protocol interactions and traffic volume) and therefore identify anomalies in network traffic, such as zero-day attacks, more easily. Although visualization techniques can effectively highlight anomalous events, skilled eyes are still needed to spot anomalies in the data. SOMs were therefore used as a supplementary tool for identifying cyberattacks. Because they depict data on a 2D or 3D map, SOMs can visualize multidimensional data (e.g., when the data in a table have a large number of columns). In other words, SOMs reduce the dimensionality of the data.
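The way a SOM places multidimensional samples on a 2D map can be sketched as follows. This is a simplified illustration under stated assumptions: the 3x3 grid, the linear decay schedules, the Gaussian neighborhood, and the toy five-attribute "flows" are choices made for this example and are not taken from any study discussed here.

```python
import math
import random


def best_matching_unit(grid, sample):
    """Return (row, col) of the unit whose weight vector is closest to sample."""
    best, best_dist = None, float("inf")
    for r, row in enumerate(grid):
        for c, weights in enumerate(row):
            d = math.dist(weights, sample)
            if d < best_dist:
                best, best_dist = (r, c), d
    return best


def train_som(data, rows=3, cols=3, epochs=50, lr0=0.5, sigma0=1.5, seed=0):
    """Train a small SOM; all hyperparameters here are illustrative defaults."""
    rng = random.Random(seed)
    dim = len(data[0])
    grid = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
            for _ in range(rows)]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1 - frac)                # learning rate shrinks over time
        sigma = sigma0 * (1 - frac) + 0.1    # neighborhood radius shrinks too
        for x in data:
            br, bc = best_matching_unit(grid, x)
            for r in range(rows):
                for c in range(cols):
                    # Units near the winner on the map are pulled harder.
                    grid_dist2 = (r - br) ** 2 + (c - bc) ** 2
                    h = math.exp(-grid_dist2 / (2 * sigma ** 2))
                    w = grid[r][c]
                    for k in range(dim):
                        w[k] += lr * h * (x[k] - w[k])
    return grid


# Toy 5-attribute samples scaled to [0, 1]; two loose groups (hypothetical).
flows = [
    [0.10, 0.20, 0.10, 0.05, 0.10],
    [0.12, 0.18, 0.11, 0.06, 0.09],
    [0.90, 0.80, 0.95, 0.90, 0.85],
    [0.88, 0.82, 0.93, 0.91, 0.87],
]
grid = train_som(flows)
for flow in flows:
    print(best_matching_unit(grid, flow))  # map cell assigned to each sample
```

Each five-attribute sample ends up assigned to one cell of the 2D grid, which is the sense in which the map reduces dimensionality; plotting the cell assignments would give the kind of visualization described above.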
Other dimensionality-reduction methods (such as Principal Component Analysis and Curvilinear Component Analysis) exist, but they do not depict anomalies in a way that is suitable for interpreting cyberattacks. For instance, the protocol, userAgent, acceptEncoding, acceptCharset, and connection fields were the dimensions retrieved from the HTTP request header for the purpose of identifying web attacks. SOMs were used to represent such multidimensional data visually on a 2D map and to identify abnormal web traffic. Similarly, SOMs were used to distinguish botnet traffic from regular traffic on the map by reducing 5D data (such as protocol, source/destination IP, and source/destination port numbers) to a 2D map [30].

G. BIOLOGICALLY INSPIRED TECHNIQUES

In addition to network traffic, offensive human language such as profanity, insults, hate speech, and racist or sexist statements can also constitute a cyberintrusion. Natural Language Processing (NLP) applications have evolved to separate offensive discourse from ordinary language. NLP derives semantics from language patterns such as the use of punctuation, sentence length, or sets of words that frequently appear together in a sentence. By recognizing word groupings that differ from those classified as normal, NLP is able to detect sentiments.

Numerous evolutionary and biologically inspired algorithms can be used to identify offensive human language. The most commonly used is the Deep Neural Network (DNN), a variant of ANNs. DNNs employ multiple hidden layers, which enables them to handle latent variables that would otherwise go unnoticed when only one layer is applied. They are appropriate for NLP applications because they can deduce semantics from linguistic structures.
DNNs made it possible to classify words according to their function in the sentence (e.g., adjective, noun, verb, or conjunction), find phrases (noun phrases and verb phrases), and identify named entities (i.e., persons, companies, and locations).

Generative Adversarial Networks (GANs) are another variant of ANNs. These methods look for features in data samples based on their classes. A GAN is made up of two separate neural networks, one of which produces features while the other assesses how well those features model the data. GANs can be used to detect steganography: one network creates samples of fake images, and the other distinguishes the generated fake images from actual ones. The two networks compete against one another, changing their weights in each iteration either to produce undetected fake images or to correctly tell fake images from real ones.

Overall, this section illustrated how AI approaches can improve cybersecurity solutions. According to the current trend, machine learning methods appear to be the most commonly used AI-based solutions at the moment, particularly for detecting network breaches. As cyberattacks become more complex and sophisticated, however, the effectiveness and efficiency of the other AI-based solutions presented here must be investigated further in order to assess their full potential more accurately. The following section covers the use of AI to improve the cybersecurity posture of various application domains.

IV. APPLYING AI TO STRENGTHEN CYBERSECURITY FOR VARIOUS APPLICATION DOMAINS

The Internet continues to change in its number of users, its size, the variety of its devices, the quantity and type of programs created to operate over it, and other respects. It has now become a crucial utility in people's daily lives all across the world, just as electricity, water, and gas did in the past.
As more devices connect to the Internet, the potential for exposure to cyberattacks rises. Cybersecurity has become essential to safeguarding both these Internet-connected devices and their users. Figure 5 shows how AI can help with cybersecurity in three different contexts: the Internet (sections IV-A to IV-D), the Internet of Things (IoT), and critical infrastructure (section IV-H).

The figure also shows the structure of the subsequent topics in this part. Two key factors, the degree of interconnection and the need for secure systems, drive the growth of AI applications.

A. THE INTERNET

From the perspective of AI, cyberattacks are hostile patterns that are distinct from legitimate Internet traffic. Because AI approaches can review huge quantities of data and adapt to the changing nature of Internet traffic, they have been used to build intrusion-detection systems that distinguish malicious traffic from valid traffic. Recent cyberattacks have targeted people, business logic, and network infrastructure.

B. NETWORK INFRASTRUCTURE (BOTNET)

Communication between clients and servers is common in Internet services. Attackers can block access to servers or, in the case of DoS attacks, stop the server from fulfilling client requests. When creating a botnet, the attackers first compromise a number of hosts (using Trojans or other forms of malware), which they then control and direct to carry out tasks. In a DoS attack, for example, these infected devices can be used to flood a server with requests, leaving no resources to process legitimate users' requests. DoS attacks are becoming a severe concern because the botnets they deploy are growing in complexity and operate across multiple platforms, including PCs, mobile devices, and Internet of Things (IoT) devices.

By using attributes that accurately describe the network behavior of IoT devices, one study was able to identify DoS attacks performed by IoT devices.
Based on the observation that IoT devices communicate with only a small number of endpoints when running apps, the study recommended two attributes: the number of distinct destination IP addresses and the number of distinct IP addresses within a 10-second window. Interpacket arrival times and their first and second derivatives were suggested as additional features; these indicate an unexpected surge of packets transmitted by an IoT device. The study demonstrated that decision trees achieved a detection accuracy of 99 percent.

Because the majority of IoT devices must pass through a single gateway (such as a home router), DoS attacks caused by IoT devices can be prevented when gateways implement the suggested detection approach. New DoS attack techniques are launched as new services emerge; DoS attacks on smart meters are recent instances. In an interconnected network of smart meters, each meter also serves as a router. One study found that injecting an attack packet into a meter could cause the meter to produce a large number of route packets, changing other meters' routing information so that data packets could not reach their destination.

As a result, the network became unavailable, since its meters made laborious attempts to deliver the data packet to its target. Another study noted that the wireless modules of smart meters are susceptible to jamming attacks. To spot a jamming attempt, its authors examined the dispersion of the wireless signal's arrival distance from a location determined to be the network's center. We anticipate that as new services and computing platforms appear, so will new, more sophisticated DoS attack methods. Recent studies have focused on detecting DoS attacks in the Software-Defined Network (SDN) environment.
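The windowed attributes described earlier in this passage for IoT DoS detection (distinct destination IP addresses per 10-second window, and interpacket arrival times with their first and second differences) can be computed with a few lines of code. The sketch below is illustrative only: the packet representation as `(timestamp, destination_ip)` pairs, the function names, and the sample addresses are assumptions, and the discrete differences stand in for the "derivatives" mentioned in the text.

```python
from collections import defaultdict


def distinct_destinations(packets, window=10.0):
    """Count distinct destination IPs in each fixed window of `window` seconds.

    packets: iterable of (timestamp_seconds, destination_ip) pairs (hypothetical format).
    """
    seen = defaultdict(set)
    for ts, dst in packets:
        seen[int(ts // window)].add(dst)
    return {w: len(ips) for w, ips in sorted(seen.items())}


def diffs(values):
    """Differences between consecutive values."""
    return [b - a for a, b in zip(values, values[1:])]


def interarrival_features(timestamps):
    """Interpacket gaps plus their first and second discrete differences."""
    gaps = diffs(sorted(timestamps))
    first = diffs(gaps)
    second = diffs(first)
    return gaps, first, second


# Made-up trace: a quiet window followed by a burst to several new endpoints.
packets = [
    (0, "10.0.0.1"), (1, "10.0.0.1"), (3, "10.0.0.2"),
    (11, "10.0.0.3"), (12, "10.0.0.4"), (13, "10.0.0.5"),
]
per_window = distinct_destinations(packets)
gaps, first, second = interarrival_features([ts for ts, _ in packets])
print(per_window, gaps, first, second)
```

Feature vectors of this kind would then be passed to a classifier such as a decision tree; that training step is omitted here.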
SDN-based network management is distinct from conventional forwarding protocols.

Unlike conventional routers, which forward traffic in accordance with their routing tables, SDN gathers and programmatically analyzes network data before forwarding network traffic. As a result, detecting DoS attacks in an SDN context presents new difficulties. In one study, an SDN switch extracted 68 features from packets in its data plane before forwarding them to the control plane. These features were derived from the ratio, entropy, count, size, and flow of Internet Protocol (IP), Transmission Control Protocol (TCP), User Datagram Protocol (UDP), and Internet Control Message Protocol (ICMP) packets and flags.

The research demonstrated that Deep Learning algorithms can detect DoS attacks with 95.65 percent accuracy. In an SDN environment, deep learning is therefore considered a viable option for identifying DoS threats. Another study used twenty features, including the protocol, port, and packet size, and demonstrated that Long Short-Term Memory, a Deep Learning derivative, can identify DoS attacks with a 99.88% accuracy rate. A further study used a variety of features, including the number of connections made within a 2-second window, the length of connections, the number of connections to the same service (as the current connection), the protocol type, and the volume of data flowing in each direction.

It showed that, in terms of accuracy, DNNs outperformed other AI techniques including SVMs, Naive Bayes, and Decision Trees. Even though only a few features were defined, DNNs worked well because, unlike other machine learning techniques, which do not build features, they were able to produce hidden (latent) variables that acted as additional features. SDN uses AI techniques to react to changes in the computing environment.
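Two of the feature families mentioned above for SDN traffic, ratio and entropy, can be illustrated with a short sketch. The text does not give the exact definitions of the 68 features, so the packet record layout (`proto`, `dst_port`, `size` keys), the function names, and the sample traffic below are assumptions for illustration only.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of values."""
    total = len(values)
    counts = Counter(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def window_features(packets):
    """Summarize one window of packets.

    packets: list of dicts with hypothetical keys 'proto', 'dst_port', 'size'.
    """
    n = len(packets)
    protos = Counter(p["proto"] for p in packets)
    return {
        "count": n,
        "mean_size": sum(p["size"] for p in packets) / n if n else 0.0,
        "tcp_ratio": protos["TCP"] / n if n else 0.0,
        "dst_port_entropy": shannon_entropy([p["dst_port"] for p in packets]),
    }


# Made-up windows: a flood of identical packets versus varied traffic.
flood = [{"proto": "TCP", "dst_port": 80, "size": 60} for _ in range(8)]
mixed = [
    {"proto": "TCP", "dst_port": 80, "size": 500},
    {"proto": "TCP", "dst_port": 443, "size": 700},
    {"proto": "UDP", "dst_port": 53, "size": 80},
    {"proto": "ICMP", "dst_port": 0, "size": 64},
]
print(window_features(flood))
print(window_features(mixed))
```

A window dominated by one protocol and one destination port has low entropy, which may be one signal of flooding; on its own it is not a detector, and the studies above feed many such features to a learned model.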
It learns from previous network data to assess new traffic patterns and forecast security trends. Two constraints have not been addressed in the research on using AI to detect cyberattacks on SDNs.

First, there has been no discussion of how AI can be applied to real-time detection. Detecting DoS attacks requires real-time classification of hostile and normal traffic, but the AI-based approach is evolutionary in nature and takes several computing iterations to produce the desired results. One proposed system was tested in real time, but only after a classification model had been created from training data.

To the best of our knowledge, no study has proposed an AI method that lets SDN identify DoS threats quickly. Second, application-layer threat detection is, by nature, not a concern of SDN. Application-layer protocols must nevertheless be protected against DoS attacks, which calls for deep packet inspection or other non-centralized methods. As discussed in more detail in the following section, this is another situation where AI could be used.

C. APPLICATION LAYER

Because servers run the essential business applications of an organization, targeting servers is an attractive way to attack the company providing the services or the consumers of those services. Application-layer attacks have previously targeted mostly protocols such as HTTP, Domain Name Service (DNS), or Session Initiation Protocol (SIP). For instance, when HTTP/2, the new version of the web browsing communications protocol, was released, a novel DoS attack model and detection method was proposed; the authors showed how to get around intrusion-detection systems.

At the application layer, HTTP/2 has a flow-control mechanism that was missing from HTTP/1.1. Flooding a particular type of flow-control traffic preempted a server implementing HTTP/2 services while keeping the number of connections to the target server low.
This got around established detection systems, which classify network events with a large number of connections as attacks [6]. AI methods (Naive Bayes, Decision Trees, and Rule Learning) produced a higher percentage of false alarms when the proposed HTTP/2 flood traffic was launched against an HTTP/2 service than when the same methods were used to detect HTTP/1.1 DDoS attacks, demonstrating that the attack circumvented well-known intrusion-detection systems. Given a recommended set of features pertinent to HTTP/2 detection, SVMs identified the attacks with no false alarms (Geib C).

The focus of modern application-layer attacks has shifted from blocking the flow of information to changing its meaning. With the rise of online social networks, a new type of cyberattack has evolved that tries to spread misleading information to influence the way recipients behave or make decisions (Morey B, 2019). The influence of fake news on the 2016 US presidential campaign, which affected national security interests, was probably the most significant case of misleading information (Morrow S, 2019). False information can also affect individuals, since it sometimes takes the form of fake news, cyberbullying, or online grooming intended to manipulate the victim's behavior.

The ability to identify false information has therefore emerged as a contemporary application-layer cybersecurity challenge. False information can have a significant impact on both national security and people's wellbeing. Because AI can swiftly examine large amounts of data, it has proved to be a flexible tool for detecting false information. For instance, one study examined a corpus of 11,000 items, including news from Reuters, regional news, and blogs, and found that around 29% of the articles in the corpus were classified as fraudulent.
They used stochastic gradient descent, an iterative optimization algorithm, to classify fake news with 77.2 percent accuracy.

Another study proposed correlation-based classifiers, examined more than 150,000 tweets, and showed that the proposed classifiers classified messages with 47 times more precision than when the method was not applied. Other authors examined 4.4 million Facebook messages and separated the fake ones from the real ones. Using Naive Bayes, Decision Trees, AdaBoost, and RandomForest, it was possible to distinguish between real and fake news with an accuracy of 86.9%. Early detection of fake news is essential. One work therefore proposed an early-detection method for fake news that made use of a family of ANNs.

The research measured the speed and complexity of the news-propagation path. Two ANN variants were used: recurrent neural networks (RNNs), which resemble directed graphs, and convolutional neural networks (CNNs), a derivative of DNNs with more hidden layers. While the DNNs measured the topology of the news propagation path, the CNNs measured how news propagated over time, resulting in a tree-like structure that depicts how news spread from one user to another.

Within 5 minutes of the initial publication of fake news on social media, the work was able to identify it with an accuracy of 85% on Twitter and 92% on Sina Weibo. Classifying texts for false information also draws on linguistic expertise. The text-classification methodologies presented build on the traits and observations needed in cybersecurity to construct automatic detection techniques. Attributes such as grammatical errors and word choice are taken from linguistic cues and mapped into machine learning features.

Additionally, by adopting certain words as linguistic signals, it is possible to recognize bomb threats on Twitter and to assess the veracity of Twitter users, such as online predators.
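One common way to map word choice into numeric machine learning features is term-frequency/inverse-document-frequency weighting (tf-idf). The sketch below uses one standard textbook form, tf = count / document length and idf = ln(N / document frequency); this particular variant, the function name, and the toy documents are assumptions for illustration and are not taken from the studies discussed here.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Compute tf-idf weights.

    docs: list of token lists. Returns one {term: weight} dict per document.
    """
    n_docs = len(docs)
    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Toy corpus (hypothetical messages, already tokenized).
docs = [
    ["the", "bomb", "is", "here"],
    ["the", "news", "is", "here"],
    ["the", "weather", "is", "fine"],
]
weights = tf_idf(docs)
print(weights[0])
```

Terms that occur in every document, such as "the", receive a weight of zero, while a term confined to one document, such as "bomb", receives the highest weight in that document. This is why rare, topic-specific words tend to dominate tf-idf feature vectors.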
These studies demonstrated AI's capacity to exploit new features and showed how automatic detection strategies for false information can improve human wellbeing. Term-frequency and inverse document frequency, or tf-idf, is a feature that is frequently used in text-categorization tasks. The term-frequency value rises the more often a term appears in a document, whereas the inverse-document-frequency value falls as the term appears in more documents.

Numerous false-information-detection algorithms have enhanced tf-idf with other linguistic cues, including phrases, syntax, negation, and punctuation. SVMs can identify irony in text that could be misconstrued as news, while Naive Bayes can categorize topics on Twitter to identify spam or phishing. DNNs have demonstrated a 93 percent accuracy rate for detecting hate speech on Twitter. Despite recent improvements in text-classification tasks, semantic cyberattack detection is still in its early stages. In studies using tf-idf, relevant terms such as "dead" or "bomb" to identify threats and "age," "y," or "year" to identify predators had to be supplied by humans.

This demonstrates that, despite the use of AI, human intelligence is still needed for cyberthreat identification at the application layer. Some investigations also use features other than linguistic cues. Examples of such nonlinguistic features that can be used to identify fake news on Twitter include the presence of URLs in tweets, the ratio of followers to followees, the number of tweets, the presence of hashtags, users' time zones, and the timestamp of when a tweet was sent. These features are specific to social media and are not linguistic cues.

D. HUMAN LINK AND MALWARE

The end user of the Internet, who is a human, is likely the weakest link in cybersecurity.
Humans are concentrating on their work rather than continually defending against the escalating cyberattack surface. While some of the well-known cyber risks can be mitigated by re-engineering machines, humans need ongoing training based on current and prior problems.
This need is one of the key factors contributing to the success of malware distributed via contemporary phishing techniques.

Malware is software that is intended to do harm, such as viruses, Trojan horses, and worms. Phishing is a technique used to get unsuspecting users to do what the attacker wants, such as clicking a link or opening an executable file. Such actions either help spread malware or lead the victims to divulge their private data. Traditionally, phishing tactics take advantage of people's sensory limitations, for example through phony emails or websites that victims find difficult to tell apart from real ones. Modern phishing methods are more complex because they exploit the limits of human knowledge.

To avoid falling for phishing hooks, users must evaluate the credibility of the target. This can often be done by looking at the code hidden behind the links, which may call for specialist knowledge. This is one area where artificial intelligence (AI) can support human intelligence. The rules for judging credibility serve as the features for AI approaches, saving the user from having to learn every rule for phishing detection. One study suggested a method that uses support vector machines to identify links leading to fraudulent financial websites. The method uses five features: the IP address, the SSL certificate, the number of dots in the URL, the length of the web address, and keywords on a blacklist.

Authentic banking websites display a valid domain name rather than an IP address, have an SSL certificate, have a relatively short domain, and are not part of a subdomain (a subdomain means a higher number of dots). The technique also gathered a large number of terms that are frequently used in phishing websites. The findings demonstrated that the approach has a 98.86% accuracy rate for detecting zero-day phishing. This study suggests that AI can be trained to augment human cybersecurity awareness.
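The five URL features listed above can be extracted with the Python standard library, as in the rough sketch below. Several simplifications are assumptions made for this example: the `https` scheme is used as a crude stand-in for "has an SSL certificate" (no certificate is actually checked), the keyword blacklist is hypothetical, and the resulting feature dictionaries would still need to be passed to a trained classifier such as an SVM.

```python
import ipaddress
from urllib.parse import urlparse

# Hypothetical blacklist; the keyword list used in the study is not given in the text.
BLACKLIST = ("login", "verify", "update", "secure")


def url_features(url, blacklist=BLACKLIST):
    """Extract five simple phishing-related features from a URL string."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    try:
        ipaddress.ip_address(host)   # raises ValueError for domain names
        host_is_ip = True
    except ValueError:
        host_is_ip = False
    lowered = url.lower()
    return {
        "host_is_ip": host_is_ip,
        "uses_https": parsed.scheme == "https",   # crude proxy for an SSL certificate
        "dot_count": url.count("."),
        "url_length": len(url),
        "blacklist_hits": sum(word in lowered for word in blacklist),
    }


print(url_features("http://192.168.10.5/secure/login.php"))
print(url_features("https://www.paypal.com/"))
```

A numeric host, the absence of HTTPS, many dots, and several blacklisted keywords together point toward a suspicious link, matching the indicators described above for reputable versus fraudulent banking sites.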
As attacks on contemporary websites and online social media show, adversaries continue to take advantage of human weaknesses. Modern websites use JavaScript to enhance user interaction with the browser and to speed up responses.

Malicious parties can use JavaScript to phish people or inject malware. Because detecting JavaScript-infected websites requires sophisticated coding skills, it is difficult in practice for the average user to find such compromised websites.

Additionally, modern methods for spreading malware through online social media involve tricking people into clicking on a link that unwittingly downloads the malware (also referred to as a drive-by download). In response, AI algorithms have been used to identify drive-by download attempts and malicious JavaScript websites. To overcome human limitations in identifying and analyzing such features, AI approaches have been used to assess JavaScript word sizes, the distribution of coding characters, the frequency of bytecode in strings, the commenting style, and sensitive function calls. Another AI-based method has been used to identify obfuscated malicious JavaScript and to offer fail-safe features that stop malware from spreading after users have been tricked into clicking on dangerous links.

The objective of usable security is to design systems that are easy for the typical person to use while remaining safe. Games are one way to raise the average user's knowledge of cybersecurity. One such game hones the players' awareness of bogus URLs that resemble real ones, for instance by having them differentiate between the fake URL "www.paypa1.com" and the real URL "www.paypal.com." One study reviewed 28 articles that discussed cybersecurity training games. The results from the analyzed studies indicated that the participants enjoyed the games, but they did not demonstrate how effective the games were.
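The static JavaScript features mentioned above (word sizes, character distribution, encoded bytes in strings, commenting style, and sensitive function calls) can be approximated with regular expressions. The sketch below is deliberately crude and rests on several assumptions: the list of "sensitive" calls is hypothetical, `\xNN` escapes stand in for "bytecode in strings", and the comment pattern will also match `//` inside string literals such as URLs.

```python
import re
from collections import Counter

# Hypothetical list of calls often associated with obfuscated scripts.
SENSITIVE_CALLS = ("eval", "unescape", "document.write", "fromCharCode")


def js_features(source, sensitive=SENSITIVE_CALLS):
    """Extract simple static features from JavaScript source text."""
    words = re.findall(r"[A-Za-z_$][\w$]*", source)
    chars = Counter(source)
    total = len(source) or 1
    return {
        "avg_word_len": sum(map(len, words)) / len(words) if words else 0.0,
        "max_word_len": max(map(len, words), default=0),
        # Literal \xNN escapes, used here as a proxy for encoded bytes.
        "hex_escape_count": len(re.findall(r"\\x[0-9A-Fa-f]{2}", source)),
        # Line and block comments (naive: does not skip string literals).
        "comment_count": len(re.findall(r"//[^\n]*|/\*.*?\*/", source, flags=re.S)),
        "digit_ratio": sum(chars[d] for d in "0123456789") / total,
        "sensitive_calls": {name: source.count(name + "(") for name in sensitive},
    }


# Small made-up script that decodes a string and evaluates it.
sample = r'var a = unescape("\x68\x69"); // decode' + "\neval(a);"
features = js_features(sample)
print(features)
```

In practice, such feature dictionaries would be computed for many labeled scripts and used to train a classifier; a proper JavaScript tokenizer would be needed to make the counts robust.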
Their effect sizes were not quantified, the sample sizes were modest, and the participants were chosen rather than randomly recruited.

Furthermore, some contend that these training games have problems with trust and privacy. Such games need algorithms to learn the players' beliefs in their own capacity to complete a task, as well as their attitudes toward software updates, strong password creation, spotting potentially harmful links, and using the right hardware (e.g., for backing up data). If an adversary obtains the information gleaned by the algorithms, it can be used to build phishing attacks aimed specifically at a target. The problem would get worse if such data were made public or made available to unauthorized individuals, which would raise concerns about privacy and trust.

E. THE INTERNET OF THINGS

Computers have become more powerful, portable, compact, and affordable. The IoT era began with the widespread use of mobile devices like phones and tablets. Many modern gadgets, including toys, appliances, cars, and industrial control systems, come with networking features and Internet access, which enable the Internet of Things (IoT). Figure 6 illustrates the technological developments that led to the IoT.

Other paradigms, such as big data, fog computing, and cloud computing, allow mobile devices with constrained resources to access a variety of remote services. Because the demand for higher data speeds is growing, researchers introduced fog computing services, which bring the platform and application closer to the customer. To reduce network round-trip delays, fog computing distributes servers, notably for Content Delivery Networks (CDNs).
Fog computing hence provides real-time energy and carbon footprint control in addition to improving website speed.

Additionally, advancements in telecommunications technology made possible the development of vehicular networking apps, which allow for quick data transfers between mobile devices.

F. PRIVACY

As Internet-connected gadgets become smaller and more prevalent, their ability to collect data is advancing faster than people's capacity to be aware of what the devices are capturing. Devices gather data, including voice, geolocation, ambient temperature, and lighting, to enhance the user experience. However, research indicates that such information might also be gathered with malicious intent. Intelligent virtual assistants (like Google Home, Apple's Siri, and Amazon Alexa) can be used to secretly record conversations or activate smart (garage) doors. According to one study, gadgets can be used to smuggle items, cyberbully people, incite panic, and redirect a user's browsing path to serve adverts.

Devices can also be used to associate a place or a person with criminal activity. In the past, privacy issues have been addressed with secure authentication methods such as encryption and security certificates. These procedures change in the IoT, where devices are mobile and data is stored in the cloud. When routing paths vary dynamically and data is stored by a third party, AI approaches can be used to maintain privacy in communications. For instance, artificial immune system methods were adopted to securely self-organize Wireless Sensor Network (WSN) ad hoc connections that serve mobile devices, and learning automata were used to distribute secure certificates to moving cars. In WSNs, different IoT devices, like mobile devices, join and leave the network dynamically.

For this reason, conventional security methods like port security, which limits traffic to known Media Access Control (MAC) addresses, are no longer effective.
To characterize a device's behavior, the authors suggested metrics including packet receiving rate, packet mismatch rate, and energy usage per packet received from a device. They classified a device's activity as normal or abnormal using artificial immune system algorithms. Unencrypted packets were dropped when abnormal behavior was detected. This demonstrates the need for new privacy measures as the number of Internet-connected devices grows. Furthermore, the significant amounts of data stored in the cloud raise privacy concerns about how sensitive data might be accessed by cloud operators.

To address this problem, intelligent algorithms were used to spread sensitive data across multiple cloud servers, making it difficult for cloud operators to spy on users. Additionally, well-known biometrics and human behavior metrics were used in secure authentication techniques. However, problems occur when authentication devices cannot operate properly in varying operating environments. To solve these problems, AI methods (such as genetic algorithms) have been applied to provide accurate face, fingerprint, and voice recognition under a variety of operating conditions.

Blockchain is a disruptive technology that can get around regulations to support privacy. It enables encrypted data to be stored on a peer-to-peer network of untrusted machines without the intervention of a centralized authority. Combining AI technologies with blockchain facilitates blockchain applications; for example, AI techniques allow blockchain applications to secure the connection between two IoT devices. Traditionally, security measures that permit two IoT devices to communicate remotely have relied on centralized systems. Blockchain technology was proposed to provide secure communication between two remote IoT devices without a centralized infrastructure.
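The tamper-evidence property that lets a blockchain store data on untrusted machines can be illustrated with a hash-chained ledger. The following is a minimal sketch only, not the scheme used in the cited work: it omits consensus, peer-to-peer replication, and encryption, and the function names (`block_hash`, `append_block`, `is_valid`) and the sample payloads are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index, data, prev_hash):
    """Hash the block contents together with the previous block's hash."""
    payload = json.dumps({"index": index, "data": data, "prev": prev_hash},
                         sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()


def append_block(chain, data):
    """Append a block that is linked to the current tail of the chain."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    index = len(chain)
    chain.append({"index": index, "data": data, "prev": prev_hash,
                  "hash": block_hash(index, data, prev_hash)})


def is_valid(chain):
    """Recompute every hash and check every link; any edit breaks the chain."""
    prev_hash = GENESIS
    for i, block in enumerate(chain):
        if block["index"] != i or block["prev"] != prev_hash:
            return False
        if block["hash"] != block_hash(block["index"], block["data"], block["prev"]):
            return False
        prev_hash = block["hash"]
    return True


# Hypothetical payloads standing in for encrypted IoT readings.
ledger = []
append_block(ledger, "sensor-17: ciphertext a1b2")
append_block(ledger, "sensor-42: ciphertext c3d4")
print(is_valid(ledger))   # True

ledger[0]["data"] = "tampered"
print(is_valid(ledger))   # False: the stored hash no longer matches the contents
```

Because each block's hash covers the previous block's hash, changing any stored record invalidates every later link, which is what allows peers that do not trust each other to detect modification.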
To enable automatic resource sharing between IoT devices, information from Reinforcement Learning stored in the blockchain was used to determine whether the communicated data complies with the end devices' access control restrictions.

Research has also described how the healthcare industry may obtain medical data while protecting patients' privacy in order to forecast probable illnesses or medical problems. Classification and prediction algorithms need a lot of data, which conflicts with patients' reluctance to share their health information. Such medical data might be stored on a blockchain, protecting patient privacy and giving patients control over their personal data, for example by regulating access privileges. Patients are then more comfortable storing personal information and biomarkers (such as blood parameters and waist circumference) that can be used to identify risks and reveal their health condition.

Before the data is stored on the blockchain, AI techniques such as DNNs might be used to extract information such as biomarkers and tumor tissues from medical imaging data. RNNs could be used to infer chronic disorders and probable diseases (such as diabetes or cardiovascular disease) from medical information. In other work, smart-contract-based data-trading systems were developed using AI techniques such as similarity learning. A dispute arises, however, when the data the purchaser receives does not match what the provider claimed. Similarity learning was therefore used to measure the distance between the data attributes of the buyer and the provider, thereby confirming the consistency of the data.

This demonstrates that AI work on privacy will need to take legal, ethical, and regulatory frameworks into account, because disclosing personal information can improve human welfare.

G. CYBER-PHYSICAL SYSTEMS

Cyber-physical systems (CPSs) combine monitoring, processing, and communication capabilities.
They use embedded systems and sensor networks to gather data, and software components and actuators to react to the environment. As nations compete to become the dominant player in this industry, the core CPS concepts are being implemented on a global scale. The economic initiatives of Germany's "Industry 4.0", China's "Made in China 2025", and western countries' "Smart Cities", in which manufacturing processes are automated and suppliers at different locations link to one another, are motivated by CPS. CPS might be seen as the next economy powered by AI. The need to produce goods more quickly was one of the initial demands that drove intelligent manufacturing.

AI techniques that autonomously gather data and work together to complete tasks have been used to produce electronic circuit boards, to control systems that perform real-time analysis of remote hydroelectric power plants, and to evaluate the dependability and safety of railway control systems. The education industry, which demands adaptation to individual students, was another key motivator behind the use of AI in intelligent manufacturing. To address this demand, intelligent agents were used to develop instructional software that adjusts the difficulty of exercises to match the learner's pace. Because they produce precise forecasts and output estimates, AI approaches are well suited to CPS needs. The energy management industry was among the early adopters of AI approaches, using them to predict temperature under a changing climate.

In this instance, fuzzy networks were used to regulate airflow in order to maintain the appropriate temperature. Power distribution on a larger scale requires better energy quality, capacity, and dependability. AI methods such as genetic algorithms and neural networks have also been used in this field.
These methods are employed to solve profit management problems when selling to and buying from the grid are subject to different energy rates. The prevalence of small devices makes CPS necessary, since it improves data collection efficiency and opens the door to processing large volumes of data.

This is an area where AI applications in CPS intersect with AI applications in cybersecurity, because data is often acquired remotely via processing systems. Cybersecurity challenges in this setting include how to gather data with a high level of confidence, transmit it securely, and share it while maintaining its integrity and privacy. The AI applications in CPS thus tie in with the earlier themes of dependable data, secure networks, and privacy. The convergence of AI applications in CPS with cybersecurity is plain to see in smart agriculture, where sensors are planted in the soil to collect temperature data and nitrogen and carbon levels.

To make informed decisions about the use of water and fertilizer, farmers combine the sensor data from their equipment with current weather forecasts to create an irrigation monitoring system. The system applies AI techniques, using genetic algorithms to determine the appropriate temperature threshold. Sensor-based systems use cloud applications to store and process the data from the many sensors, giving farmers access to real-time information. In this way, farmers can achieve their desired crop output and quality. If any of these cyber entities is vulnerable to attack, cybersecurity challenges arise, including malware that can infect sensors, the quality of data transmitted over networks, the availability of cloud computing resources for irrigation systems, and whether sensor data can be shared. Crop harvesting could be significantly hampered if such cyber concerns are not addressed.

H. CRITICAL INFRASTRUCTURE

Critical infrastructures are resources that are essential to society and to national security.
These infrastructures include telecommunications, water, air traffic control, and energy (oil, gas, electricity, and nuclear). Because people's everyday lives and activities depend on the availability and integrity of critical infrastructure, protecting it is of the utmost importance. The preceding discussion illustrated how the scope of cybersecurity has widened from network intrusion detection systems to include ways of enhancing human well-being. The change was prompted by various industries, including the health and education sectors.

Additionally, the critical infrastructure industry supports the development of AI methods to improve cybersecurity. The primary function of cybersecurity in critical infrastructures is to protect SCADA systems. These systems, which consist of computing nodes that communicate with one another, are the primary control systems for the infrastructure. Typically, SCADA systems are located on an organization's operational technology (OT) networks. Because these OT networks are interwoven with Information Technology (IT) networks and connected to the Internet, both are more exposed to internal and external cyberattacks. Critical infrastructures must be resilient against such cyberattacks despite these dangers and their inherent weaknesses. Maintaining the business continuity of a critical infrastructure is thus both a requirement and a challenge.

AI methods can be applied to maintain the resilience of SCADA systems. For instance, failures in wind turbine generators could be predicted using Artificial Neural Networks (ANNs) that track ambient temperature, generator speed, and pitch angle of the generator power outputs. AI methods including k-NN, Decision Trees, and SVMs have been used in water system control to classify various anomaly events, such as cyberattacks and hardware malfunctions.
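As a rough illustration of how a k-NN classifier could label anomaly events of the kind just mentioned, the sketch below classifies a reading by majority vote among its nearest labeled neighbors. The three features, the class names, and every data value are invented for illustration and are not taken from the cited studies; a real deployment would need properly scaled features and validated training data.

```python
import math
from collections import Counter


def knn_classify(train, query, k=3):
    """Label `query` by majority vote among its k nearest training samples.

    `train` is a list of (feature_vector, label) pairs; distance is Euclidean.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical readings: (flow rate, tank level, control commands per minute),
# assumed to be already scaled to comparable ranges.
TRAINING = [
    ((0.50, 0.52, 0.10), "normal"),
    ((0.48, 0.50, 0.12), "normal"),
    ((0.52, 0.49, 0.09), "normal"),
    ((0.51, 0.50, 0.95), "cyberattack"),
    ((0.49, 0.53, 0.90), "cyberattack"),
    ((0.47, 0.51, 0.88), "cyberattack"),
    ((0.05, 0.90, 0.10), "hardware_fault"),
    ((0.02, 0.95, 0.11), "hardware_fault"),
    ((0.04, 0.92, 0.08), "hardware_fault"),
]

# A burst of control commands with otherwise ordinary flow and level.
print(knn_classify(TRAINING, (0.50, 0.50, 0.92)))  # cyberattack
```

k-NN needs no training phase, which makes it easy to update as new labeled events arrive, but its results depend heavily on feature scaling and on the choice of k.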
Additionally, AI techniques such as SVMs and ANNs have been used to give SCADA systems access control based on users' dynamic attributes, such as location, time of use, and the user's work shift (when the user works on-site).

Because the critical infrastructure sector is so crucial to society, using AI to create robust resilience will continue to be an active research topic. The field of critical infrastructure protection has also absorbed other AI concepts, such as propositional logic. Because the authentication procedure in this environment requires a sophisticated mapping between user privileges and system rules, one proposal presented a logic-based architecture to implement security standards for system authorization in SCADA systems. In such a framework, rules are distributed among the system nodes to determine the range of actions that the user may carry out on each node.

When a user with a particular privilege issues a command to a target node, an authorization server receives both the command and the user privilege information. Once the server has processed and analyzed this information, it forwards the user privilege, the command, and a token to the target node. To decide whether to approve or reject execution of the command, the node compares the token with its local permission policy. Because the destination nodes make the decision to allow or deny commands, the proposed logic-based architecture supports scalable authorization in SCADA systems. The idea of intelligent algorithms that use logic to self-heal the communications channel of SCADA systems has also been put forth.

SCADA systems use session keys to encrypt their communication with remote nodes. After a failure, it is essential for the node to restart the communications channel as quickly as possible so that no unauthorized user or agent can take over the restart.
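Returning to the authorization flow described above (the server forwards privilege, command, and token; the target node decides locally), a minimal sketch might look as follows. The source does not say how the token is built, so the HMAC tag, the shared key, the privilege names, and the commands here are all assumptions made for illustration.

```python
import hashlib
import hmac

# Hypothetical key shared by the authorization server and the nodes;
# a real system would need proper key distribution and rotation.
SHARED_KEY = b"demo-key-for-illustration-only"


def sign(privilege: str, command: str) -> str:
    """Compute an HMAC tag that binds a privilege to a command."""
    message = f"{privilege}|{command}".encode()
    return hmac.new(SHARED_KEY, message, hashlib.sha256).hexdigest()


def authorization_server(privilege: str, command: str):
    """Server side: forward the privilege, the command, and a token."""
    return privilege, command, sign(privilege, command)


class TargetNode:
    def __init__(self, policy):
        # Local permission policy: privilege -> commands allowed on this node.
        self.policy = policy

    def handle(self, privilege: str, command: str, token: str) -> bool:
        """Node side: approve only if the token is genuine and policy allows it."""
        if not hmac.compare_digest(sign(privilege, command), token):
            return False  # token was not issued for this privilege/command pair
        return command in self.policy.get(privilege, set())


node = TargetNode({
    "operator": {"read_status"},
    "engineer": {"read_status", "open_valve"},
})

print(node.handle(*authorization_server("engineer", "open_valve")))  # True
print(node.handle(*authorization_server("operator", "open_valve")))  # False
```

Keeping the final allow-or-deny decision on each node, as in the described architecture, means the server does not need to know every node's rules, which is what makes the approach scale.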
To produce a fresh session key, the authors suggested distributing re-keying materials to the remote nodes. The re-keying materials consist of a series of numbers produced by a formula (i.e., a bivariate polynomial). A session key is likewise generated through a series of mathematical and logical operations.

As a result, once a remote node has recovered from an unavailability incident on its communication channel, it can generate a session key, essentially self-healing the channel. In addition, self-healing electrical distribution systems have been developed using mathematical models. The self-healing system uses a collection of 22 features, including the cost of power losses, the power demand at each node, and the voltage magnitude at each node, to decide which network zone to isolate in the aftermath of such events. The system used set theory to cluster the features. It then passed these clusters to a number of mathematical models (i.e., backward/forward sweep load-flow algorithms) that simulate the steady state of electrical distribution systems.

Thus, a variety of approaches, both logical and mathematical, are being employed to satisfy the cybersecurity needs of the critical infrastructure sector. The outcomes of the discussion in this section are listed in Table 2. The role of AI in cybersecurity will expand as the Internet develops. AI technology is being used in applications that are essential to human welfare and national security. AI techniques are being used to make machines think and behave like people as well as to solve problems logically.

V. FUTURE CHALLENGES AND RESEARCH OPPORTUNITIES

A. THE RACE BETWEEN DEFENSE, OFFENSE, AND HUMANITY

Recent developments in AI research in cybersecurity have stoked the competition between white hat (defender) and black hat (offender) hackers.
Attackers can use AI to simulate human behavior for personal satisfaction, dominance, or monetary gain. AI has enabled intelligent agents that autonomously click advertisements, play online games, and purchase and resell concert tickets. Additionally, AI has influenced the US presidential election by disseminating customised news and has influenced public opinion in Venezuela by retweeting political content. Future research prospects in cybersecurity will depend on how clearly dividing lines can be drawn between advancements and fundamental needs.

White hat hackers, black hat hackers, and end users (humanity) are the three main stakeholders affected by AI's use in cybersecurity. Both white hat and black hat hackers drive the creation of AI techniques. To manage the deployment of technology, a line must be drawn between the two groups, but this is challenging because one group's advancements lag behind the other's. Therefore, it is crucial to examine how AI might be applied both to meeting basic human needs and to creating cybersecurity measures.

B. INFRASTRUCTURE

The application of AI to cybersecurity is seen as a race between governments and online criminals. The winner will be determined by who has access to the necessary technical knowledge and computing resources. For instance, AI systems are computationally expensive because they are evolutionary in nature. Therefore, current research should focus on creating fast algorithms for the AI solutions presented in Table 2. For instance, hashing methods have been devised to quickly group common data samples as input to k-means clustering algorithms. The recent competition has centered on the creation of suitable algorithms, but hardware development is also an essential component.

C. HARDWARE AND PLATFORM

Access to cutting-edge computing infrastructure will make it easier to solve AI problems effectively and efficiently. As the number of computing devices and the amount of traffic both grow, data analysis will become more urgent. Consequently, advanced computing systems are needed to analyze data using AI approaches. To meet this challenge, cluster computing tools such as Apache Spark and Hadoop have been used to analyze cyber traffic. At the top end, quantum computing is expected to be the ground-breaking innovation that helps resolve challenging computing problems. NASA's quantum computer has been reported to be up to 100 million times faster than a conventional computer on certain problems, solving them in a fraction of the time.

D. RESOURCES

When building effective computing solutions, quick access to the necessary resources is essential. Energy is currently a limited resource for many computing needs. For instance, committing a single block to the Bitcoin blockchain uses as much energy as 29 typical Australian households consume in an entire day. Ethical concerns about the use of AI will surface when intelligent machines begin to consume a significantly greater share of the resources they share with humans. One question is whether intelligent machines should have their own rights. The fact that computers are thought to lack consciousness in some respects makes the question seem unimportant.

Researchers are also debating whether intelligent machines ought to be granted rights regardless of what constitutes consciousness. The use of AI in cybersecurity widens the debate over how to divide scarce resources between intelligent machines and people. Regulators will therefore be motivated to review their assumptions about what constitutes development and fundamental needs. Ethical issues will thus also be central to future challenges in using AI for cybersecurity.

VI. CONCLUSION

AI has emerged as a critical tool in cybersecurity as the pace and sophistication of attacks rise. This article demonstrated how cyberthreats have grown, become more complicated, and expanded in scope. We stressed that historical cyberthreats still shape current risks. We provided a thorough analysis of cyberthreats and available countermeasures. In particular, we discussed the impact of cyberattacks on various network architectures and applications. Even as the community recognizes cyberthreats and creates remedies using a wide range of technologies and methodologies, cyberthreats will continue to increase.

Modern research has demonstrated the potential of AI approaches to counter future cybersecurity threats. These approaches span a range of intelligent behaviors, from machines that think like people to machines that act like people. Recent AI-based cybersecurity proposals have mostly concentrated on machine learning methods that use intelligent agents to distinguish attack traffic from legitimate traffic. In this scenario, intelligent agents take on the role of humans, and their job is to identify the most effective classification criteria. Today's cyberattack landscape, however, is shifting from causing computer disruption to causing social unrest and endangering human welfare. We discussed this shift in terms of how technological advancements are changing how cyberattacks can be launched, detected, and mitigated.

AI's contribution to cybersecurity will increase steadily as a result of these developments. Innovative AI solutions must be created to promptly identify and neutralize risks that jeopardize social stability and human welfare. It is likely that cybersecurity solutions will move beyond intelligent agents that act like humans to ones that think like humans.
Although the role of AI in addressing cybersecurity challenges is still being researched, fundamental questions remain about how and where AI deployment can be governed.

For instance, as intelligent machines become more crucial to mankind, they will gradually deplete life's essential resources. When machines and people compete for limited resources, a new type of government will emerge. This will in turn open up a fresh line of inquiry.

DECLARATIONS

Ethics Approval and Consent to Participate: No human participants were involved in this work.
Human and Animal Rights: No violation of human or animal rights is involved.
Funding: No funding is involved in this work.
Data availability statement: Data sharing is not applicable to this article, as no datasets were generated or analyzed during the current study.
Conflict of Interest: Not applicable to this work.
Authorship contributions: There is no authorship contribution.
Acknowledgement: There is no acknowledgement involved in this work.

REFERENCES

Venable D. 2017. “Cybersecurity in 2017: when Moore’s law attacks.” https://doi.org/10.1111/risa.13687
Morgan S. 2019. “Global cybersecurity spending predicted to exceed $1 trillion from 2017-2021.” Cybercrime Magazine. https://doi.org/10.1016/j.chbr.2022.100167
Yuhas D. 2017. “Doctors have trouble diagnosing Alzheimer’s. AI doesn’t.” NBC News, Oct. 2017. https://doi.org/10.3390/diagnostics11081473
McFarland M. 2017. “Farmers spot diseased crops faster with artificial intelligence.” CNN Business, Dec. 2017. https://doi.org/10.3390/agriculture12010009
Geib C. “NASA-funded research will let unmanned spacecraft "think" using AI and blockchain.” Futurism. https://doi.org/10.1063/1.5007734
Winick E. 2017. “Lawyer-bots are shaking up jobs.” MIT Technology Review. https://doi.org/10.1177/0162243915605575
Morey B. 2019. “Manufacturing and AI: Promises and pitfalls.” SME. https://doi.org/10.1115/1.4047855
Morrow S., Crabtree. 2019. “The future of cybercrime & security.” Juniper Research. https://doi.org/10.1016/S1361-3723(18)30082-4

Tables

Tables 1 and 2 are available in the Supplementary Files section (Tables1and2.docx).

Additional Declarations: No competing interests reported.
Khansadurai","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAABBElEQVRIiWNgGAWjYPCCAzwMBxgYGD9U2MiBuQ+I1cIscSbNGMxNIEILGDHwthxObADx8Wnh7z/87NGNijsyfMebjz2QbEhLnx92+CHQFjs53QbsWiRupJkb55x5xiN55li6QeEOm9yNt9MMgFqSjc0O4LDmBoOZdG7bYR6DGzlmEpJn0nI3zk4AaTmQuA2HFvnzx79BtNx//02Ct+1wuuHs9A94tRgcyIHZwsMG0pIgL52D3xbDGzll0jlnDgP9kmYmDQxkww3SOQUHEgxw+0Xu/PFt0jkVh+35jh9+JgmMSnn52embP3yosJPD6X0sTgWTxCoHAfkGUlSPglEwCkbBSAAAXlhqRBwvQL4AAAAASUVORK5CYII=","orcid":"","institution":"Sudharsan Engineering College","correspondingAuthor":true,"prefix":"","firstName":"Asan","middleName":"Mohideen","lastName":"Khansadurai","suffix":""},{"id":274088046,"identity":"b404684d-46ad-4e4c-a574-c637f5bef068","order_by":1,"name":"Anjana Devi Veluchamy","email":"","orcid":"","institution":"Rajalakshmi Institute of Technology","correspondingAuthor":false,"prefix":"","firstName":"Anjana","middleName":"Devi","lastName":"Veluchamy","suffix":""},{"id":274088047,"identity":"550ec61f-2156-4f37-8fef-ac7af5dd659b","order_by":2,"name":"Suganthi Balasubramani","email":"","orcid":"","institution":"RVS College of Engineering and Technology","correspondingAuthor":false,"prefix":"","firstName":"Suganthi","middleName":"","lastName":"Balasubramani","suffix":""},{"id":274088048,"identity":"77813cee-780a-4eec-bbb2-0c7a9d29dbdf","order_by":3,"name":"Kiruthiga Balasubramaniyan","email":"","orcid":"","institution":"K.Ramakrishnan College of Technology","correspondingAuthor":false,"prefix":"","firstName":"Kiruthiga","middleName":"","lastName":"Balasubramaniyan","suffix":""}],"badges":[],"createdAt":"2024-02-21 10:01:48","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-3975155/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-3975155/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":51539635,"identity":"bfb40d96-3514-4212-a478-c72da893f60d","added_by":"auto","created_at":"2024-02-23 
10:48:38","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":10621,"visible":true,"origin":"","legend":"\u003cp\u003eThe range of intelligent measures ranges from thinking like a human using the Turing Test to acting like a human to maximize the result.\u003c/p\u003e","description":"","filename":"image2.png","url":"https://assets-eu.researchsquare.com/files/rs-3975155/v1/e8339927118118d4364bfd7e.png"},{"id":51539636,"identity":"27c6aafb-7a47-44f8-8598-cda84b6531c6","added_by":"auto","created_at":"2024-02-23 10:48:38","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":20898,"visible":true,"origin":"","legend":"\u003cp\u003ea decision tree example that divides network traffic into attack and non-attack types\u003c/p\u003e","description":"","filename":"image3.png","url":"https://assets-eu.researchsquare.com/files/rs-3975155/v1/4f19c5e7877f2716896d2337.png"},{"id":51539640,"identity":"29622dff-3f05-462d-8375-f848ac3b39eb","added_by":"auto","created_at":"2024-02-23 10:48:38","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":34059,"visible":true,"origin":"","legend":"\u003cp\u003eData are classified into classes 1 and 2 using the k-Nearest Neighbor (k-NN) algorithm based on the k nearby data samples from the new data sample.\u003c/p\u003e","description":"","filename":"image4.png","url":"https://assets-eu.researchsquare.com/files/rs-3975155/v1/232449706795bd1d56973d78.png"},{"id":51539639,"identity":"ba08db68-7c35-485c-96fe-5870bfd1d9ac","added_by":"auto","created_at":"2024-02-23 10:48:38","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":12370,"visible":true,"origin":"","legend":"\u003cp\u003eSVMs, or support vector machines, locate a plane dividing data 
samples.\u003c/p\u003e","description":"","filename":"image5.png","url":"https://assets-eu.researchsquare.com/files/rs-3975155/v1/b4692b72b484023b03f827c7.png"},{"id":51539637,"identity":"ffc7d6f6-e851-45e9-9b26-e8c8225c1ef9","added_by":"auto","created_at":"2024-02-23 10:48:38","extension":"png","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":32451,"visible":true,"origin":"","legend":"\u003cp\u003eapplying AI to cybersecurity across a range of use cases. The increased importance of AI is reflected in larger bubble sizes.\u003c/p\u003e","description":"","filename":"image6.png","url":"https://assets-eu.researchsquare.com/files/rs-3975155/v1/5bcd9c66a8752aa4d70315dc.png"},{"id":51539641,"identity":"65a6dbb8-84ad-4463-8488-1cd08b1b149a","added_by":"auto","created_at":"2024-02-23 10:48:38","extension":"png","order_by":6,"title":"Figure 6","display":"","copyAsset":false,"role":"figure","size":232532,"visible":true,"origin":"","legend":"\u003cp\u003eInternet of Things and Internet of Content (Short Message Service [SMS]). 
devices.\u003c/p\u003e","description":"","filename":"image7.png","url":"https://assets-eu.researchsquare.com/files/rs-3975155/v1/78206ac3e6d460eed241337b.png"},{"id":51657216,"identity":"14951941-3ba1-4780-917f-4a0d90880777","added_by":"auto","created_at":"2024-02-26 17:43:29","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":607411,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-3975155/v1/1962e7b7-20b2-4c17-8097-a055df6900fc.pdf"},{"id":51539638,"identity":"af4eb8e8-0be4-4ab5-af4c-90c6a328a0f4","added_by":"auto","created_at":"2024-02-23 10:48:38","extension":"docx","order_by":1,"title":"","display":"","copyAsset":false,"role":"supplement","size":339256,"visible":true,"origin":"","legend":"","description":"","filename":"Tables1and2.docx","url":"https://assets-eu.researchsquare.com/files/rs-3975155/v1/77213ff0f17297f08491f6ec.docx"}],"financialInterests":"No competing interests reported.","formattedTitle":"Crime Rate Prediction using Cyber Security and Artificial Intelligent","fulltext":[{"header":"I. INTRODUCTION ","content":"\u003cp\u003eCrime poses a serious threat to humanity. Numerous crimes occur frequently at regular periods. Maybe it\u0026apos;s growing and dispersing quickly and widely. From small towns and villages to large metropolis, crimes occur. There are many distinct types of crimes, including robbery, manslaughter, rape, assault, battery, and false imprisonment. There is a need to resolve cases much more quickly because crime is rising. The police agency is responsible for containing and decreasing the crime activities, which have increased at an accelerated rate.Given the vast amount of crime data available, crime prediction and criminal identification are the police department\u0026apos;s two biggest issues. Technology is required so that case solving can be completed more quickly. 
It was discovered through extensive documentation and cases that machine learning and data science may expedite and simplify the work. The purpose of this project is to forecast crime using the dataset\u0026apos;s attributes. The official websites are where the dataset was taken. We can anticipate the type of crime that will occur in a specific location with the help of a machine learning system utilizing Python as the core.The purpose would be to train a model for prediction. The test dataset will be used to validate the training on the training data set. For the purpose of predicting crime, the Multi Linear Regression (MLR) will be utilized. To examine potential crimes that may have happened in a specific year based on population and number of crimes, a dataset is visualized. This work aids law enforcement organizations in forecasting and identifying the crime percapita in a region, which lowers the crime rate. Cybersecurity is characterized as a collection of procedures, human conduct, and technological frameworks that assist protect electronic resources.Similar to Moore\u0026apos;s law, which predicts that the number of components on an integrated circuit will double every two years while chip production costs will decline, cybercriminals are increasingly doubling the efficiency of their attack tools every few months while cutting the cost in half (Venable D, 2017). The amount spent on cybersecurity worldwide is anticipated to reach $1 trillion between 2017 and 2021 (Morgan S, 2019) having already risen by approximately 40% from $66 billion in 2013 (Yuhas D, 2017). Researchers in cybersecurity have recently begun to investigate Artificial Intelligence (AI) methods to enhance cybersecurity. Similar to how thieves use AI to launch increasingly complex cyberattacks while evading detection.In this work, we put more of an emphasis on how AI-based cybersecurity solutions could be able to better fend off attackers and reduce or avoid data breaches. 
Since its inception in the 1950s, developments in AI have produced a variety of fascinating research findings and systems. Machine learning and deep learning emerged as a result of further developments (McFarland M, 2017). Today, AI is being used in a wide range of applications, including manufacturing, law, healthcare, agriculture, and space exploration (Geib C., Winick E, 2017., Morey B., 2017., Morrow S, 2019) A wide variety of AI systems with different capabilities have been developed and used as a result of the ongoing performance advancements in computer hardware and software (along with their falling costs) and new paradigms like big data and cloud computing.Many of these AI systems can currently handle a variety of difficult tasks, such as planning, learning, problem-solving, decision-making, and face- and speech-recognition. The introduction of machine learning technologies, which enable computer systems to learn and adapt to various environments by leveraging their prior experiences, patterns, and knowledge, has been a key advancement in the field of artificial intelligence (AI) since the 1980s. An area of machine learning called deep learning, which was developed ten years ago, allows computers to find hidden links in the input data, leading to more precise planning and prediction outcomes. Recently, there has been a rise in interest in using machine learning and artificial intelligence to combat cyberattacks.The current constant production of vast volumes of data necessitates the adoption of these techniques because it takes a lot of time and resources to analyze and spot any patterns, anomalies, or intrusions in traffic data.\u003c/p\u003e"},{"header":"II. CYBERSECURITY THREATS AND LEGACY CYBERSECURITY SOLUTIONS","content":"\u003cp\u003eThere have been many different kinds of cyberthreats throughout the past ten years. Then, we quickly go over those dangers. 
The top 10 cyberthreats we face today, according to a recent report, include:\u003c/p\u003e\n\u003cp\u003e1) Denial of Service (DoS) attacks: These seek to deplete a victim system\u0026apos;s computing power by flooding it with a large volume of requests that must be handled quickly. A single attacker machine can launch a DoS attack against a victim machine by sending a large number of network traffic packets that seem legitimate, thereby evading security measures along the way. Multiple attacker machines can take part in a distributed-style DoS attack, known as a distributed denial of service (DDoS) attack, which has a similar effect on the victim machine. Due to the ready availability of attacker tools and the growth of the CyberCrime as a Service (CCaaS) business, DoS attacks are getting more sophisticated and difficult to detect.\u003c/p\u003e\n\u003cp\u003e2) Man-in-the-middle (MiTM) attacks: These traditional attacks involve intercepting data transmitted on a channel between two legitimate communicating parties. The attacker inserts themselves between A and B, either physically or virtually, pretending to be A when speaking to B by intercepting messages from A to B and replacing them with malicious or altered messages. The attacker then repeats the process in the other direction, i.e., pretending to be B when speaking to A. IP address spoofing is one variant of the attack, in which a hostile actor deceives legitimate systems into believing it is a trustworthy entity in order to gain access to the system. A message replay attack is when a hostile actor uses the communication channel to retransmit an old, previously stored message.\u003c/p\u003e\n\u003cp\u003e3) Phishing and spear-phishing attacks: These are carried out by creating emails that seem valid and sending them to legitimate users in order to trick unsuspecting end users into clicking a link and disclosing personal information. 
Such attacks take advantage of social engineering concepts, whereby emails are crafted to look trustworthy to end users in order to win their trust. In spear phishing, bad actors first conduct a thorough background check on potential victims in order to craft emails that seem extremely legitimate and frequently contain trusted email addresses in the \u0026quot;from\u0026quot; field.\u003c/p\u003e\n\u003cp\u003e4) Drive-by attacks: These are carried out by bad actors who browse the web looking for vulnerable websites so they can insert harmful scripts into the webservers. Visitors to the website eventually become infected with the malware, which causes system compromise, the release of sensitive data, and other harm.\u003c/p\u003e\n\u003cp\u003e5) Password attacks: These include using common passwords to brute force a way into a system, spying on user keyboard activity, and guessing complex passwords using artificial intelligence (AI) techniques.\u003c/p\u003e\n\u003cp\u003e6) Structured Query Language (SQL) injection attacks: These traditional cyberattacks take advantage of improperly validated input by injecting SQL query code into a webpage\u0026apos;s input fields. When the webserver executes the SQL code, it may reveal all or part of the data stored on a backend database server, possibly including usernames and passwords.\u003c/p\u003e\n\u003cp\u003e7) Cross-site scripting attacks: These involve inserting malicious code into a vulnerable web server. When unsuspecting end users subsequently retrieve the hosted webpages, the victim\u0026apos;s computer is infected with malware. 
Such malware may send user information from the victim\u0026apos;s computer to the servers of the malicious actor, which may then enable web sessions to be hijacked, credentials to be stolen, keystroke loggers to be installed, screenshots to be taken, and even remote control of the victim\u0026apos;s computer.\u003c/p\u003e\n\u003cp\u003e8) Eavesdropping attacks: These are carried out by sniffing the network communication channel and subsequently misusing the data gathered. Malicious actors may actively attack the line, substituting messages with bogus ones and posing as normal users, or they may passively sniff the connection and collect user data.\u003c/p\u003e\n\u003cp\u003e9) Birthday attacks: A message digest, sometimes referred to as a hash, can be generated using a common technique like the Secure Hash Algorithm 1 (SHA-1). A hash value of fixed length is produced when this method is applied to a message of any length. The birthday attack describes a malicious actor\u0026apos;s attempt to identify two distinct messages that generate the same hash value. As a result, the original message may be replaced with another message that generates the same hash value, resulting in data loss and system and service disruption. These attacks can use artificial intelligence to find random messages that have the same hash value as a real message.\u003c/p\u003e\n\u003cp\u003e10) Malware attacks: One of the biggest challenges for web hosting companies is the possibility that malware can propagate through their websites. According to Symantec\u0026apos;s 2016 Threat Report, 78 percent of websites contain a significant vulnerability that an adversary might use to execute malicious code without the need for user involvement. Using the right security controls, such as web proxies, firewalls, and intrusion detection systems, can help to bolster a website\u0026apos;s defenses. 
The trade-off between the appropriate amount of security controls and the usability of the hosted websites is a significant problem in this situation. The attack surface of a website increases with its level of usability.\u003c/p\u003e\n\u003cp\u003eNetwork attacks are launched against the environment in order to obstruct services, steal personal or business data, and gather network intelligence. Malicious actors gain access to and manipulate the operating system (OS) by exploiting a flaw in it. Some of these attacks involve the theft of personal data, which can be exploited to access private or business information. We classify numerous network attacks in Table 1 according to the attack objectives, the anticipated target device or application, the data or information revealed while an attack is in progress, the kind of environment impacted when an attack occurs, and how these attacks are discovered.\u003c/p\u003e\n\u003cp\u003eWe then briefly go over conventional (non-AI) cybersecurity methods for spotting cyberattacks:\u003c/p\u003e\n\u003cp\u003e1) Game theory: Game theory has previously been applied to cybersecurity. The malicious actor acts as the first player in the game, and the victim\u0026apos;s computer is the other participant. Each player tries to maximize their reward through strategic moves, reasoning logically that the move will achieve the desired result. The actions of each participant can either be known in advance or remain a secret. An illustration of a game would be a smart grid setting where the attacker tries to stop communication between a home and a power system while the defender tries to keep these entities connected. Both the attacker and the defender would use techniques to accomplish their respective aims at each stage of the game.\u003c/p\u003e\n\u003cp\u003e2) Rate control: DoS and DDoS attacks target the availability of systems. 
Through basic traffic throttling, redefining permission lists, and limiting the amount of incoming network traffic, rate-control approaches can minimize the impact on such systems\u0026apos; ability to function when they are attacked.\u003c/p\u003e\n\u003cp\u003e3) Heuristics: Heuristics are frequently used by firewalls and intrusion detection systems to determine the best rule for categorizing network data as legitimate or abnormal. One such method uses substring matching as part of a series of operations to find suspicious URL addresses. In the second phase of that scheme, the web address is scanned with VirusTotal, a website where one can enter a website address and receive a scored analysis of how malicious the input website is. The lowest score of the two scans is taken into consideration when determining whether to allow the data packets into the network or not.\u003c/p\u003e\n\u003cp\u003e4) Signature-based intrusion detection: A signature-based intrusion detection system makes use of a database that may store legitimate signatures corresponding to normal traffic or attack signatures corresponding to harmful activity. In real time, the intrusion detection system compares the information in arriving network packets with the signatures that have been previously saved. The disadvantage of this method is that intrusion detection systems cannot reliably detect malicious data entering a network in the absence of relevant signatures.\u003c/p\u003e\n\u003cp\u003e5) Anomaly-based intrusion detection: This method builds a model of what is typically observed. The models may take the form of statistical methodologies, mathematical models, or rule-based systems. Deviations from the ordinary are viewed as attacks. 
Such strategies have the advantage of not being dependent on signature patterns, which frees them from the administrative effort of collecting signatures that signature-based detection requires.\u003c/p\u003e\n\u003cp\u003e6) Autonomous systems: These can provide dependability and availability, and can self-protect and self-heal, as in the case of the Bionic Autonomic Nervous System (BANS). The four modules that make up this system are called Cyber Neuron, Cyber Axon, Peripheral Nerve, and Central Nerve. Cyber Neuron protects against malware and spyware. Cyber Axon is a sophisticated tool to repair spyware and malware damage. Peripheral Nerve establishes a communication link between numerous Cyber Neurons placed on various devices to provide strong protection against DoS/DDoS attacks. Finally, Central Nerve provides information to other security devices and acts as a knowledge base for potential attacks. The idea of collaborative defense by Peripheral Nerve is to have network devices work together to thwart DoS and DDoS attacks.\u003c/p\u003e\n\u003cp\u003e7) Security controls for end users: Current end-user gadgets like mobile phones, iPads, and smart portable devices (PCs) need built-in security rather than add-ons. While some vendors try to promote automatic updates, end users might not update their devices, leaving the most recent security patches uninstalled. The WannaCry ransomware attack is an illustration of an attack where the most recent vendor-provided security fixes were not installed on all end-user devices. Users frequently aren\u0026apos;t aware of the consequences of not installing patches. Even users who are aware of them sometimes either fail to take the necessary steps to secure their devices or carry out the wrong procedures, leaving the devices vulnerable to other attacks. 
Performing \u0026quot;out of sight\u0026quot; security, in which automatic updates are sent by suppliers straight to end-user devices without the user\u0026apos;s input, is one recommended control. The difficulty is that software developers would have to make sure that security updates protect against fresh attacks, or so-called \u0026quot;zero-day attacks,\u0026quot; and that they integrate seamlessly with all pre-existing software on the end-user device.\u003c/p\u003e"},{"header":"III. ARTIFICIAL INTELLIGENCE ","content":"\u003cp\u003eAI is concerned with how machines can understand or act correctly, given what they know. This all-encompassing definition covers how closely machines can mimic human thought or behavior (Fig. 1). At one end of the spectrum, machines are considered intelligent if they can maximize the result at every stage of the process. At the other end, the Turing Test sets the bar for artificial intelligence: when a person conversing with a computer cannot tell whether the responses are coming from a computer or a human, the computer is deemed to exhibit intelligence. Across this spectrum, AI encompasses computing disciplines such as natural language processing, knowledge representation, logic, automated reasoning, machine learning, mathematics, and game theory. The first applications of AI led to thinking machines that solved complex problems in geometry, checkers, and a group of blocks-world problems. Agent-based artificial intelligence (AI) or \u0026quot;bots\u0026quot; (software that acts like humans) became increasingly popular after the rise of the Internet in the late 1990s. Search engines, online directories, and recommendation services have all benefited from the creation of benign bots that crawl the Internet. They also offer vandalism defense in Wikipedia pages, where anyone can participate as an author. 
On the other hand, malevolent bots have also been developed to transmit malware, publish spam, and cheat at online games. In order to reverse engineer the game code while simulating online games, bot programmers examined the communication flow between the game console and server. Instead of consistently submitting messages, the bots that sent spam imitated human online activity by browsing the pages before posting a message in a forum. Malicious bots hinder the correct operation of cyber services, harming the service providers by discouraging online users. As a result, several cybersecurity studies looked into ways to identify and defend against rogue bots. Studies have shown that, compared to humans, game bots are more persistent, less sociable (in exchanging goods or bidding on products), and exhibit less variability in their activity sequences. Additionally, while human players like to work together with other players to fulfill tasks and missions, gaming bots are more motivated to collect items. Like spambots, malware bots can be identified by their behavior, which shows up in various distinguishing communication patterns. Intrusion detection systems are where artificial intelligence is most applicable in the field of cybersecurity. Cybersecurity solutions frequently perform traffic analysis, classifying Internet traffic as either benign or harmful. In the early days of the Internet, cyberattacks were recognized using rule-based systems, where attacks could be found based on their signatures. 
As the number of Internet-connected devices and their applications grew over time, it became time-consuming to monitor the massive amounts of network traffic generated in real time and to create rules that analyze this traffic, which caused security protection systems to act reactively rather than proactively. The development of new, complex attack techniques that can elude detection by existing security systems is another trend that is being aided by technical advancements. As the landscape of cyberthreats keeps expanding, we require cutting-edge tools and technology that can assist in speedier detection, investigation, and decision-making for new risks. Large volumes of Internet traffic can be automatically and intelligently analyzed and classified using AI. ML-based cybersecurity solutions are now used to automate the detection of attacks and to develop and enhance their capabilities over time. Because they can handle enormous volumes of data and a variety of data properties (for example, a large number of table columns) needed for classification, ML-based solutions are being employed in intrusion detection systems. In order to discriminate between harmful and valid traffic, machine learning systems learn from the gathered Internet data. It is important to note that the term \u0026quot;machine learning\u0026quot; has come to be used synonymously with \u0026quot;artificial intelligence\u0026quot; in the cybersecurity industry, since machine learning is so widely used to address cybersecurity challenges.\u003c/p\u003e\n\u003cp\u003eA. MACHINE LEARNING \u003c/p\u003e\n\u003cp\u003eMachine learning techniques are typically divided into two categories: unsupervised learning and supervised learning. In supervised learning, data samples are labeled based on their class (e.g., malicious or legitimate). Data labeling, which produces the training data and is most often done manually, requires people to associate data patterns with their classes. 
The training data is fed into an algorithm to build a mathematical model that, given fresh data samples, can output the predefined classes. Data labeling and training are not necessary for unsupervised learning. Instead, the algorithms analyze the coherence and dispersion of data samples, systematically grouping them according to the degree of data coherence within a class and the degree of data separation between classes. The distinction between supervised and unsupervised machine learning algorithms is often blurred in discussions of machine learning, though. Machine learning approaches use mathematical, statistical, and probabilistic methodologies to enable unsupervised algorithms to label the data needed by supervised algorithms. Due to this convergence of taxonomic viewpoints, it is no longer strictly necessary to categorize machine learning algorithms as either supervised or unsupervised. Rather than giving an extensive review of machine learning algorithms from a taxonomy viewpoint, in this part we focus on the most popular machine learning approaches that are useful for cybersecurity solutions. Data samples are processed by machine learning algorithms based on their defining characteristics, often known as features. The processed data is organized as a table, with the rows acting as samples of the data and the columns as their attributes. Naive Bayes is a machine learning technique that uses Bayes\u0026apos; theorem to categorize data under the assumption that each feature is the result of an independent event. Starting from the computed probabilities of each class across all instances, the method determines the likelihood that new data samples belong to a class. 
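The Naive Bayes idea described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation of any cited study: the function names `train_naive_bayes` and `predict_naive_bayes`, the two toy features (protocol and packet-rate level), and the use of add-one (Laplace) smoothing are all choices made here for the example.

```python
from collections import Counter, defaultdict

def train_naive_bayes(samples, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    # value_counts[(label, feature_index)][value] -> number of occurrences
    value_counts = defaultdict(Counter)
    for features, label in zip(samples, labels):
        for i, value in enumerate(features):
            value_counts[(label, i)][value] += 1
    # Distinct values seen per feature, needed for add-one smoothing (assumption).
    vocab = [set(s[i] for s in samples) for i in range(len(samples[0]))]
    return class_counts, value_counts, vocab

def predict_naive_bayes(model, features):
    """Return the class with the highest prior * product of feature likelihoods."""
    class_counts, value_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -1.0
    for label, count in class_counts.items():
        score = count / total  # class prior
        for i, value in enumerate(features):
            # Smoothed P(value | class); features are treated as independent.
            seen = value_counts[(label, i)][value]
            score *= (seen + 1) / (count + len(vocab[i]))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy data: (protocol, packet-rate level) -> class.
samples = [("tcp", "high"), ("tcp", "high"), ("udp", "high"),
           ("tcp", "low"), ("udp", "low"), ("tcp", "low")]
labels = ["attack", "attack", "attack", "legit", "legit", "legit"]
model = train_naive_bayes(samples, labels)
print(predict_naive_bayes(model, ("udp", "high")))
print(predict_naive_bayes(model, ("tcp", "low")))
```

Multiplying the per-feature probabilities is exactly where the independence assumption enters; with many features, summing log-probabilities instead would avoid numerical underflow.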
Although Naive Bayes classifiers perform worse as more features are derived from dependent events, they are frequently used because they can accept the naive assumption that all features are derived from independent events and still produce usable results.\u003c/p\u003e\n\u003cp\u003eB. DECISION TREES \u003c/p\u003e\n\u003cp\u003eA decision tree is a method for developing a collection of rules from training data samples. The algorithm repeatedly selects a feature on which to split the data samples. The iterative division generates a succession of rules for each side of the split, continuing until a division yields data samples of just one class, and results in a tree-like structure. In Fig. 2, a decision tree is used to categorize network traffic into two categories: regular traffic and attack traffic. The tree illustrates that, for instance, if traffic flow is low but traffic pattern duration is prolonged, the traffic is categorized as an attack. The method offers an intuitive way to identify cybersecurity problems since it categorizes observed cybersecurity events as either legitimate events or attacks, depending on feature values, and displays the outcome of the decision as needed. Decision trees, for instance, have used flow rate, size, and duration in addition to source/destination error rates to identify DoS attacks. Additionally, decision trees were used to classify values of CPU use, network flow, and the amount of data sent in order to detect command injection attacks against robotic vehicles. The advantage of this method is that intrusion detection systems can categorize Internet traffic in real time once the most efficient set of criteria has been identified. One of the key factors in spotting cyberattacks is the quality of the real-time alerts generated. 
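To make the tree-of-rules idea concrete, the following is a minimal sketch of how a learned decision tree could be stored and evaluated. Only the rule "low flow and prolonged duration means attack" comes from the text; the feature names, the threshold values, and the labels on the remaining branches are hypothetical and chosen purely for illustration.

```python
# A tiny hand-written tree. Thresholds and the non-attack branches are
# assumptions for this example, not values taken from Fig. 2.
TREE = {
    "feature": "flow_rate", "threshold": 100.0,
    "below": {
        "feature": "duration", "threshold": 30.0,
        "below": "normal",   # low flow, short duration
        "above": "attack",   # low flow, prolonged duration (rule from the text)
    },
    "above": "normal",       # high flow (assumed label)
}

def classify(node, sample):
    """Walk from the root to a leaf, testing one feature per internal node."""
    while isinstance(node, dict):
        branch = "below" if sample[node["feature"]] < node["threshold"] else "above"
        node = node[branch]
    return node

print(classify(TREE, {"flow_rate": 10.0, "duration": 120.0}))
print(classify(TREE, {"flow_rate": 10.0, "duration": 5.0}))
```

Because classification is only a handful of comparisons per sample, a tree like this is cheap to evaluate, which is consistent with the real-time use noted above once the rules have been learned.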
The Rule-Learning technique is an alternative method that, in each iteration, identifies a collection of feature values while optimizing a score that characterizes the quality of the classification result, such as the number of incorrectly categorized data samples. Such a method creates a set of classification rules, much like decision trees do. A rule-learning technique discovers a set of rules that can characterize a class, whereas decision trees locate the best feature values that lead to a class. A rule-learning technique has the advantage of incorporating expert human assistance when producing rules. Consider a study that used 28 features to find DoS attacks in cloud networks. The features included a number of computer and network indicators, such as input/output (IO) readings, memory usage, TCP flag detection, and the number of open system resources. The study created a set of rules based on the features (e.g., IO reads larger than the average IO reads) and used feature-ranking techniques to determine which rules were most important for identifying the class. Following that, the study used human specialists to optimize the rules by, for example, eliminating redundancies. As a result, the method works well with intrusion detection systems whose setups are mostly based on rules. The technique has frequently been used as a baseline to compare the effectiveness of different machine learning algorithms in identifying network intrusions.\u003c/p\u003e\n\u003cp\u003eC. K-NEAREST NEIGHBORS \u003c/p\u003e\n\u003cp\u003eThe k-Nearest Neighbor (k-NN) method classifies or clusters data based on nearby data samples. It was first proposed as a non-parametric pattern analysis to determine the proportion of data samples in a neighborhood that produces a consistent estimate of a probability. In order to form clusters, the neighborhood is specified as the k nearest data samples according to a distance measure, typically the Euclidean distance. 
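As a minimal sketch of this neighborhood idea, the snippet below classifies a new point by majority vote among its k nearest labeled samples using the Euclidean distance. The function name `knn_classify` and the toy coordinates are assumptions made for this example; the data is arranged so that the choice of k changes the predicted class.

```python
import math
from collections import Counter

def knn_classify(train, query, k):
    """Label `query` by majority vote among its k nearest training samples."""
    # Sort labeled points by Euclidean distance to the query and keep the first k.
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Hypothetical 2D samples: two Class 2 points sit very close to the query,
# while Class 1 points are more numerous but farther away.
train = [
    ((0.1, 0.0), "Class 2"), ((0.0, 0.1), "Class 2"), ((3.0, 3.0), "Class 2"),
    ((0.5, 0.5), "Class 1"), ((1.0, 0.0), "Class 1"), ((0.0, 1.0), "Class 1"),
    ((1.0, 1.0), "Class 1"), ((-1.0, 0.0), "Class 1"), ((0.0, -1.0), "Class 1"),
]
query = (0.0, 0.0)
print(knn_classify(train, query, k=3))
print(knn_classify(train, query, k=9))
```

Every prediction sorts the full training set, so the cost grows with the amount of stored data; practical systems would typically use a spatial index rather than this brute-force search.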
The assignment of additional data samples to the clusters is determined by the votes of all k neighbors. The approach described above is shown in Fig. 3. The data now includes an additional sample (the red dot). In this case, the majority of data samples from one nearby cluster were the deciding factor. Consequently, the sample was classified as Class 2 when k=3 and as Class 1 when k=9. Even for small values of k, the computational complexity of this method is high. However, because it can learn from fresh traffic patterns to identify zero-day attacks as one of its unknown classes, it appeals to intrusion-detection systems. Thus, there is now active research in this field to determine how k-NN might be employed for real-time cyberattack detection. Recently, the method was used to identify attacks on smart grids and industrial control systems, such as data tampering and false data injection. It works well when the data can be represented using a model that enables the measurement of the distance between data samples, such as a Gaussian distribution or a vector.\u003c/p\u003e\n\u003cp\u003eD. SUPPORT VECTOR MACHINES \u003c/p\u003e\n\u003cp\u003eThe Support Vector Machines (SVMs) method expands upon the linear regression model. SVMs classify data samples by locating a plane that divides them into two classes (as shown in Fig. 4). Depending on the function used (referred to as a kernel), the separation surface can be linear or nonlinear (polynomial, Gaussian, radial, sigmoid, and so forth). 
By using more than one plane, SVMs can also separate multiclass data, which is data that needs to be divided into more than two classes rather than only two classes such as the genuine versus attack classes seen in the preceding cases. Because Internet traffic patterns frequently include multiple classes, including HyperText Transfer Protocol (HTTP), File Transfer Protocol (FTP), Post Office Protocol 3 (POP3), and Simple Mail Transfer Protocol (SMTP), SVMs are an appealing technique for analyzing Internet traffic patterns. SVM is a type of supervised machine learning that uses training data to build classification models. As a result, it is used in applications that allow for the simulation of attacks. As an illustration, network traffic from penetration tests on network systems was used as the training data. A mathematical model was developed using SVM to separate penetration test traffic from regular traffic. Its use can be modified to produce a one-class model of typical traffic, which can then detect irregularities when attack traffic is introduced. From these angles, the advantage of SVMs is that they make it possible to create simulation-based attack detection models.\u003c/p\u003e\n\u003cp\u003eE. ARTIFICIAL NEURAL NETWORKS\u003c/p\u003e\n\u003cp\u003eThe functioning of neurons in the brain serves as inspiration for the Artificial Neural Networks (ANNs) learning technique. ANN approaches model neurons as a mathematical equation that outputs a target value from a series of data samples. The formula closely matches the equation for linear regression, in which a sample\u0026apos;s data properties are weighted to produce an output value. The ANN algorithm cycles through its iterations until the output value is within the allowable error bounds of the target value. 
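The iterative weight adjustment of a single neuron can be sketched as follows. This is an illustrative assumption-laden example rather than the method of any study cited here: it uses one neuron with a linear output (no activation function, to mirror the linear-regression analogy), the simple delta rule for updates, and hypothetical names (`train_neuron`, `lr`, `tol`) and toy data.

```python
def train_neuron(samples, targets, lr=0.1, tol=1e-3, max_epochs=10000):
    """Train one linear neuron until every output is within `tol` of its target."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(max_epochs):
        worst = 0.0
        for x, target in zip(samples, targets):
            # Weighted sum of the sample's features, as in linear regression.
            output = bias + sum(w * xi for w, xi in zip(weights, x))
            error = target - output
            worst = max(worst, abs(error))
            # Delta rule: nudge each weight in proportion to the error.
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
        if worst <= tol:  # stop once all errors are within the allowed bound
            break
    return weights, bias

# Hypothetical data generated from target = 2*x1 - 1*x2 + 0.5.
samples = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
targets = [0.5, 2.5, -0.5, 1.5]
weights, bias = train_neuron(samples, targets)
print(weights, bias)
```

A full ANN stacks many such units with nonlinear activations and propagates the error backward through the layers, but the stopping criterion (error within a tolerance) is the same idea.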
When given specific patterns seen in the data samples, the neurons learn by adjusting their weights in each iteration according to the calculated deviation from the target value. When the error is small enough, the process produces a mathematical equation that, when given unknown data samples, gives an informative result such as the class. ANN approaches are capable of identifying patterns in noisy or incomplete data samples. They can adapt to new types of communication, making them appropriate for intrusion-detection systems. The Cascade Correlation Neural Network (CCNN), an ANN variant that gradually adds additional hidden units to the hidden layer, was employed in a cybersecurity investigation. New hidden nodes are added to the network when new events are found, and only those nodes are trained using the newly gathered data, enabling a runtime-adaptive and scalable system. In that study, the CCNN was used to learn from desktop-platform traffic patterns and then detect port scanning in mobile networks without having to retrain the entire network with the original data. The proliferation of mobile devices over the past ten years has given rise to new traffic patterns, rendering earlier detection algorithms derived from desktop traffic outdated. The number of ports scanned per second and the frequency of received packets differed in port-scanning operations against mobile devices. The study demonstrated that the performance of ANN port-scanning detection was comparable to that of other techniques, such as decision trees. Because ANNs can learn from current events, another advantage is that they can identify zero-day attacks. As an illustration, traffic patterns from incidents involving DoS attacks were provided to ANNs as labeled training data, enabling the neurons to modify their weights and recognize previously unseen DoS attacks. 
In contrast to other incidents (such as system penetration), where the attackers can hide their tracks and the victim remains unaware, when incidents like DoS attacks occur, the victim can testify that an attack has occurred. Since the attack class can be identified when an incident (like a DoS) occurs, ANNs are a good detection tool for cybersecurity applications that can benefit from such labeled incidents.\u003c/p\u003e\n\u003cp\u003eF. SELF-ORGANIZING MAPS \u003c/p\u003e\n\u003cp\u003eSelf-Organizing Maps (SOMs) extend ANNs in that they self-adjust the weights of the neurons to produce a two- or three-dimensional (2D or 3D) map that illustrates how the data might be organized. The method picks up new information by identifying correlations in data samples. It clusters data and produces an output in the form of a map, in which adjacent data samples have more similar traits than those farther apart. Due to their computational complexity, SOMs are inappropriate for real-time intrusion detection. Their main advantage is that they can visualize the data, which makes it possible to discover network irregularities. The results from intrusion-detection systems are challenging to analyze without visualization. Network administrators can more easily identify anomalies in network traffic, such as zero-day attacks, with the aid of visualization tools that help them see the typical pattern of traffic data (for example, in terms of protocol interactions and traffic volume). Although visualization techniques can effectively highlight anomalous events, skilled eyes are still needed to spot anomalies in the data. SOMs were therefore used as an additional tool for identifying cyberattacks. SOMs can visualize multidimensional data (e.g., when the data in a table have a large number of columns) since they depict data in a 2D or 3D map. In other words, SOMs reduce the dimensionality of the data. 
Other dimensionality reduction methods (such as Principal Component Analysis and Curvilinear Component Analysis) do exist; however, they do not depict anomalies in a way that is appropriate for interpreting cyberattacks. For the purpose of identifying web attacks, for instance, the protocol, userAgent, acceptEncoding, acceptCharset, and connection fields were the dimensions retrieved from the HTTP request header. SOMs were used to visually represent such multidimensional data on a 2D map and to identify abnormal web traffic. Similarly, SOMs were used to distinguish between botnets and regular traffic on the map by reducing 5D data (protocol, source/destination IP, and source/destination port numbers) to a 2D map [30].\u003c/p\u003e\n\u003cp\u003eG. BIOLOGICALLY INSPIRED TECHNIQUES\u003c/p\u003e\n\u003cp\u003eIn addition to network traffic, offensive human language such as profanity, insults, hate speech, and racist/sexist statements can also constitute cyberintrusions. Applications of Natural Language Processing (NLP) have evolved to separate offensive discourse from typical language. NLP derives semantics from language patterns such as the usage of punctuation, sentence length, or a collection of words that are frequently used together in a sentence. By recognizing word groupings that are different from those classified as normal, NLP is able to detect sentiments. Numerous evolutionary and biologically inspired algorithms can be used to identify offensive human language. A variant of ANNs called Deep Neural Networks (DNNs) is the most often used algorithm. Multiple hidden layers are employed in DNNs, which enables algorithms to handle latent variables that would otherwise go unnoticed when only one layer is applied. These are appropriate for NLP applications because they can deduce semantics from linguistic structures. 
DNNs made it possible to identify named entities (i.e., persons, companies, and locations), find phrases (noun phrases and verb phrases), and classify words according to their function in the sentence (e.g., adjective, noun, verb, or conjunction). Generative Adversarial Networks (GANs) are another variant of ANNs. The methods look for features in data samples based on their classes. GANs are made up of two separate neural networks, one of which is used to produce features and the other to assess how well those features model the data. They can be used to detect steganography, where one network creates samples of fake images and the other network distinguishes between the generated fake images and actual ones. The two networks compete against one another while changing their weights in each iteration, either to produce undetected fake images or to correctly distinguish fake images from real ones. Overall, in this section we illustrated how AI approaches can improve cybersecurity solutions. According to the current trend, machine learning methods appear to be the most often used AI-based solutions at the moment, particularly when it comes to detecting network breaches. As cyberattacks get more complex and sophisticated, however, the effectiveness and efficiency of the other AI-based solutions presented here must be further investigated in order to assess their full potential more accurately. The use of AI to improve the cybersecurity posture of various application domains is covered in the section that follows.\u003c/p\u003e"},{"header":"IV. APPLYING AI TO STRENGTHEN CYBERSECURITY FOR VARIOUS APPLICATION DOMAINS","content":"\u003cp\u003eThe number of users, the size and variety of devices, the quantity and type of programs being created to operate over the Internet, and other aspects of the Internet continue to change. 
The Internet has now evolved into a crucial utility in people\u0026apos;s daily lives all across the world, just as electricity, water, and gas did in the past. As more devices connect to the Internet, the potential for exposure to cyberattacks rises. Cybersecurity has become essential to safeguarding both these Internet-connected devices and their users. Figure 5 shows how AI can help with cybersecurity in three different contexts: the Internet (sections IV-A to IV-D), the Internet of Things (IoT), and critical infrastructure (section IV-H). The figure also shows the structure of the subsequent topics in this part. Two key factors, the degree of interconnection and the need for secure systems, drive the growth of AI applications.\u003c/p\u003e\n\u003cp\u003eA. THE INTERNET \u003c/p\u003e\n\u003cp\u003eFrom the perspective of AI, cyberattacks are hostile patterns that are distinct from legitimate Internet traffic. Because AI approaches can review a huge quantity of data and adapt to the changing nature of Internet traffic, they have been used to develop intrusion-detection systems that distinguish malicious traffic from valid traffic. Recent cyberattacks have targeted people, business logic, and network infrastructure.\u003c/p\u003e\n\u003cp\u003eB. NETWORK INFRASTRUCTURE (BOTNET) \u003c/p\u003e\n\u003cp\u003eCommunication between clients and servers is common in Internet services. In DoS attacks, attackers block access to servers or stop the server from fulfilling client requests. When creating a botnet, the attackers first compromise a number of hosts (using Trojans or other forms of malware), which they then control and direct to carry out tasks. For example, in a DoS attack, these infected devices can be used to flood a server with requests, leaving no resources for processing the requests of legitimate users. 
DoS attacks are becoming a severe concern due to the growing complexity and multi-platform operation of the botnets that launch them, which include PCs, mobile devices, and Internet of Things (IoT) devices. By using attributes that accurately describe the network behavior of IoT devices, one study was able to identify DoS attacks performed by IoT devices. Based on the observation that IoT devices communicate with only a small number of endpoints when running apps, the study recommended two attributes: the number of distinct destination IP addresses and the number of distinct IP addresses within a 10-second window. Interpacket arrival times and their first and second derivatives were also suggested as additional features, because they indicate an unexpected surge of packets transmitted by an IoT device. The study demonstrated that decision trees achieved a detection accuracy of 99 percent. Since the majority of IoT devices must pass through a single gateway (such as a home router), DoS attacks caused by IoT devices can be prevented when gateways implement the suggested detection approach. New DoS attack techniques are launched as new services emerge; DoS attacks on smart meters are recent instances. In the interconnected network of smart meters, each meter also serves as a router. One study discovered that injecting an attack packet into a meter could cause the meter to produce a large number of route packets, changing other meters\u0026apos; routing information and preventing data packets from reaching their destination. As a result, the network became unavailable, since its meters made repeated attempts to deliver the data packet to its target. Another study noted that the wireless modules of smart meters are susceptible to jamming attacks. To spot a jamming attempt, the authors examined the dispersion of the wireless signal\u0026apos;s arrival distance from a location determined to be the network\u0026apos;s center. 
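As a rough illustration of the IoT traffic attributes described earlier in this subsection (distinct destination addresses per 10-second window, and interpacket arrival times with their first and second differences), the following Python sketch computes them from a list of packet records. The `Packet` record, the function names, and the sample traffic are assumptions made for this example only; the cited study's actual implementation is not given in the text.

```python
from collections import namedtuple

# Hypothetical packet record: arrival time in seconds and destination IP.
Packet = namedtuple("Packet", ["ts", "dst_ip"])


def window_features(packets, window=10.0):
    """Count packets and distinct destination IPs in each fixed time window."""
    buckets = {}
    for p in packets:
        key = int(p.ts // window)  # index of the window this packet falls in
        buckets.setdefault(key, []).append(p)
    features = []
    for key in sorted(buckets):
        group = buckets[key]
        features.append({
            "window_start": key * window,
            "packets": len(group),
            "distinct_dst": len({p.dst_ip for p in group}),
        })
    return features


def diffs(values):
    """Discrete difference of a sequence (a stand-in for a derivative)."""
    return [b - a for a, b in zip(values, values[1:])]


def interarrival_features(packets):
    """Interpacket gaps plus their first and second differences."""
    times = sorted(p.ts for p in packets)
    gaps = diffs(times)
    first = diffs(gaps)
    second = diffs(first)
    return gaps, first, second


if __name__ == "__main__":
    # Made-up traffic, not data from the study.
    sample = [
        Packet(0.0, "10.0.0.1"),
        Packet(1.0, "10.0.0.1"),
        Packet(3.0, "10.0.0.2"),
        Packet(12.0, "10.0.0.3"),
    ]
    print(window_features(sample))
    print(interarrival_features(sample))
```

Feature rows of this kind could then be passed to a classifier such as a decision tree; the sketch stops at feature extraction because the text does not specify the training setup.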
We anticipate that as new services and computing platforms appear, so will new, more sophisticated DoS attack methods. Detecting DoS attacks in the Software-Defined Network (SDN) environment has been the focus of recent studies. SDN-based network management is distinct from conventional forwarding protocols. Unlike conventional routers, which forward traffic in accordance with their routing tables, SDN gathers and programmatically analyzes network data before forwarding network traffic. As a result, detecting DoS attacks in an SDN context presents new difficulties. In one work, an SDN system extracted 68 features from packets in its data plane before forwarding the packets to the control plane. These features were derived from the ratio, entropy, count, size, and flow of Internet Protocol (IP), Transmission Control Protocol (TCP), User Datagram Protocol (UDP), and Internet Control Message Protocol (ICMP) packets and flags. The research demonstrated that Deep Learning algorithms can detect DoS attacks with an accuracy of 95.65 percent. Deep learning is therefore considered a viable option for identifying DoS threats in an SDN environment. Another study used twenty features, including the protocol, port, and packet size, and demonstrated that a Deep Learning derivative known as Long Short-Term Memory can identify DoS attacks with a 99.88% accuracy rate. A further study used a variety of features, including the number of connections made within a 2-second window, the length of connections, the number of connections to the same service (as the current connection), the protocol type, and the volume of data flowing in each direction. It showed that, in terms of accuracy, DNNs outperformed other AI techniques including SVMs, Naive Bayes, and Decision Trees. 
Even though only a few features were defined, the work demonstrated that DNNs performed well because, unlike other machine learning techniques that do not build features, DNNs were able to produce hidden (latent) variables that acted as additional features. SDN uses AI techniques to react to changes in the computing environment: it learns from previous network data to assess new traffic patterns and forecast security trends. Two constraints have not been addressed in the research on using AI to detect cyberattacks on SDNs. First, there has been no discussion of how AI can be applied to real-time detection. Real-time classification of hostile and normal traffic is necessary to detect DoS attacks, but the AI-based approach is evolutionary in nature and takes several computing iterations to provide the desired results. The proposed system was tested in real time, but the test was conducted after a classification model had been created from training data. To the best of our knowledge, no study has suggested an AI method that allows SDN to identify DoS threats quickly. Second, application-layer threat detection is not a concern for SDN by nature. Application-layer protocols must be protected against DoS attacks, which calls for deep packet inspection or other non-centralized methods. As we discuss in more detail in the following section, this is yet another situation where AI could be used.\u003c/p\u003e\n\u003cp\u003eC. APPLICATION LAYER \u003c/p\u003e\n\u003cp\u003eBecause servers run an organization\u0026apos;s essential business applications, targeting servers is an attractive way to attack the company providing the services or the consumers of those services. Application-layer attacks have previously targeted mostly protocols such as HTTP, Domain Name Service (DNS), and Session Initiation Protocol (SIP). 
For instance, when the new HTTP/2 version of the web communications protocol was released, a novel DoS attack model and its detection were proposed, and the authors showed how the attack gets around intrusion detection systems. At the application layer, HTTP/2 has a flow-control mechanism that was missing from HTTP/1.1. By flooding a server with a particular type of flow-control traffic while keeping the number of connections to the target low, the attack tied up a server running HTTP/2 services. This evaded established detection systems, which classify network events with a large number of connections as attacks [6]. When the proposed HTTP/2 flood traffic was launched against an HTTP/2 service, AI methods (Naive Bayes, Decision Trees, and Rule Learning) produced a higher percentage of false alarms than when the same methods were used to detect HTTP/1.1 DDoS attacks, demonstrating that the attack circumvented well-known intrusion-detection systems. Given a recommended set of features pertinent to HTTP/2 detection, SVMs produced no false alarms while identifying attacks (Geib C). The focus of modern application-layer attacks has shifted from blocking information flow to changing the meaning of information. With the rise of online social networks, a new type of cyberattack has emerged that tries to spread misleading information to influence the way recipients behave or make decisions (Morey B, 2019). The influence of fake news on the 2016 US presidential campaign, which had an impact on national security interests, was likely the most significant case of misleading information (Morrow S, 2019). False information can also affect individuals, since it sometimes appears as false news, cyberbullying, or online grooming intended to manipulate the victim\u0026apos;s behavior. The ability to identify fake information has thus emerged as a contemporary application-layer cybersecurity challenge. False information can have a significant impact on both national security and people\u0026apos;s wellness. 
Since AI can swiftly examine a large amount of data, it has proved to be a flexible tool for detecting false information. For instance, one study examined a corpus of 11,000 items, including news from Reuters, regional news, and blogs, and found that around 29% of the corpus\u0026apos; articles were classified as fake. The authors used stochastic gradient descent, an iterative optimization algorithm, to classify fake news with an accuracy of 77.2 percent. Another study proposed correlation-based classifiers, analyzed more than 150,000 tweets, and showed that the proposed classifiers classified messages with 47 times more precision than when the method was not applied. A further study examined 4.4 million Facebook messages and separated the fake ones from the real ones. Using Naive Bayes, Decision Trees, AdaBoost, and Random Forest, it was possible to distinguish between real and fake news with an accuracy of 86.9%. Early detection of fake news is essential. Accordingly, one work proposed an early detection method for fake news that made use of a family of ANNs. The research measured the speed and complexity of the news-propagation path. Two ANN variants were used: recurrent neural networks (RNNs), which resemble directed graphs, and convolutional neural networks (CNNs), a derivative of DNNs with more hidden layers. While the DNNs measured the topology of the news propagation path, the CNNs measured how news propagated over time, resulting in a tree-like structure that depicts how news spread from one user to another. Within 5 minutes of the initial publication of fake news, the method was able to identify it on social media with an accuracy of 85% on Twitter and 92% on Sina Weibo. In addition, classifying texts for false information draws on linguistic expertise. The text classification methodologies presented build on the traits and observations needed in cybersecurity to construct automatic detection techniques. 
Grammatical errors and word choice are examples of attributes that are taken from linguistic cues and mapped into machine learning features. Additionally, it is possible to recognize bomb threats on Twitter and to assess the trustworthiness of Twitter users, for example to identify online predators, by combining certain words with linguistic cues. These studies demonstrated AI\u0026apos;s capacity to utilize new features and showed how automatic detection strategies for false information can improve human wellbeing. Term frequency-inverse document frequency, or tf-idf, is a feature that is frequently used in text categorization tasks. The term-frequency value rises the more often a term occurs in a document, whereas the inverse-document-frequency value falls the more documents contain the term. Numerous false-information detection algorithms have augmented tf-idf with other linguistic cues, including phrases, syntax, negations, and punctuation. SVMs can identify irony in text that could be mistaken for news, while Naive Bayes can categorize topics on Twitter to identify spam or phishing. DNNs have demonstrated a 93 percent accuracy rate for detecting hate speech on Twitter. Despite recent improvements in text classification tasks, semantic cyberattack detection is still in its early stages. In studies using tf-idf, relevant terms such as \u0026quot;dead\u0026quot; or \u0026quot;bomb\u0026quot; to identify threats and \u0026quot;age,\u0026quot; \u0026quot;y,\u0026quot; or \u0026quot;year\u0026quot; to identify predators had to be supplied by humans. This demonstrates that, despite the use of AI, human intelligence is still needed for cyberthreat identification at the application layer. Additionally, some investigations use features other than linguistic cues. 
These nonlinguistic features, which can be used to identify fake news on Twitter, include the presence of URLs in tweets, the ratio of followers to followees, the number of tweets, the presence of hashtags, users\u0026apos; time zones, and the timestamp of when a tweet was sent. These features are specific to social media and are not linguistic cues.\u003c/p\u003e\n\u003cp\u003eD. HUMAN LINK AND MALWARE \u003c/p\u003e\n\u003cp\u003eThe end user of the Internet, who is a human, is likely the weakest link in cybersecurity. Humans concentrate on their work rather than on continually defending against the escalating cyberattack surface. While some of the well-known cyber risks can be mitigated by re-engineering machines, humans need ongoing training based on current and prior problems. This need is one of the key factors contributing to the success of malware distributed via contemporary phishing techniques. Software that is intended to do harm, such as viruses, Trojan horses, and worms, is called malware. Phishing is a technique used to get unsuspecting users to do what the attacker wants, such as clicking a link or opening an executable file. Such actions either spread malware or lead the victims to divulge their private data. Traditionally, phishing tactics have taken advantage of people\u0026apos;s sensory limitations, for example through phony emails or websites that victims find difficult to tell apart from real ones. Modern phishing methods are more sophisticated because they exploit the limits of human knowledge. To avoid falling for phishing hooks, users must evaluate the credibility of the target; frequently this can be done by inspecting the code behind the links, which may call for specialist knowledge. This is one area where artificial intelligence (AI) can support human intelligence. The rules for detecting phishing serve as the features for AI approaches, saving the user from having to learn every rule. 
One study suggested a method, based on support vector machines, for identifying links that lead to fraudulent financial websites. The method makes use of five features: the IP address, the SSL certificate, the number of dots in the URL, the length of the web address, and keywords on a blacklist. Authentic banking websites display a valid domain name rather than an IP address, have an SSL certificate, have a relatively short address within the domain, and do not sit under a subdomain (which would show as a higher number of dots). The technique also gathered a large number of terms that are frequently used on phishing websites. The findings demonstrated that the approach has a 98.86% accuracy rate for detecting zero-day phishing. According to this study, AI training can be used to improve human cybersecurity awareness. As attacks on contemporary websites and online social media show, adversaries continue to take advantage of human weaknesses. Modern websites use JavaScript to enhance user interaction with the browser and to speed up responses. JavaScript can also be used by malicious parties to phish people or inject malware. Since detecting JavaScript-infected websites requires sophisticated coding skills, it is practically impossible for the common user to find such compromised websites. Additionally, modern methods for spreading malware through online social media involve tricking people into clicking on a link that downloads the malware without their knowledge (also referred to as drive-by download). In response, AI algorithms have been used to identify drive-by download attempts and malicious JavaScript websites. To overcome human limitations in identifying and analyzing such features, AI approaches have been used to assess JavaScript word sizes, the distribution of coding characters, the frequency of bytecode in strings, the commenting style, and sensitive function calls. 
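Returning to the five URL-based phishing features listed earlier in this subsection, the following Python sketch shows one way such features might be extracted before being handed to a classifier such as an SVM. It is only an illustration under stated assumptions: the `BLACKLIST` keywords are hypothetical, the use of the `https` scheme is a crude stand-in for actually validating an SSL certificate, and none of the names come from the cited study.

```python
import ipaddress
from urllib.parse import urlsplit

# Hypothetical keyword blacklist, for illustration only.
BLACKLIST = ("login", "verify", "update", "secure")


def host_is_ip(host):
    """Return True if the host part is a literal IP address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def url_features(url):
    """Extract simple lexical features from a URL string."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    lowered = url.lower()
    return {
        "host_is_ip": host_is_ip(host),
        # Assumption: the scheme is used as a proxy; no certificate is checked.
        "uses_https": parts.scheme == "https",
        "dot_count": host.count("."),
        "url_length": len(url),
        "blacklist_hits": sum(word in lowered for word in BLACKLIST),
    }


if __name__ == "__main__":
    for candidate in ("https://www.paypal.com/", "http://192.168.0.1/login/verify"):
        print(candidate, url_features(candidate))
```

Lexical features like these are cheap to compute but easy for an attacker to game, which is one reason the text pairs them with a learned classifier rather than fixed rules.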
Another AI-based method has also been used to identify malicious JavaScript that has been obfuscated and to offer fail-safe features that stop malware from spreading after users have been tricked into clicking on dangerous links. The objective of usable security is to design systems that are easy for the typical person to use while remaining safe. Using games is one way to raise the average user\u0026apos;s knowledge of cybersecurity. Such a game hones the players\u0026apos; awareness of bogus URLs that resemble real ones, for instance by having them differentiate between the fake URL \u0026quot;www.paypa1.com\u0026quot; and the real URL \u0026quot;www.paypal.com.\u0026quot; One study looked at 28 articles that discussed cybersecurity training games. The results from the analyzed studies indicated that the participants enjoyed the games, but they did not demonstrate the games\u0026apos; effectiveness. The effect size (that is, the difference in cyber awareness between the group that played the game and a control group) was not quantified, the sample sizes were small, and the participants were chosen rather than randomly recruited. Furthermore, some contend that these training games have problems with trust and privacy. Such training games need algorithms to learn the players\u0026apos; beliefs in their own capacity to complete a task, as well as their attitudes toward software updates, strong password creation, spotting potentially harmful links, and using the right hardware (e.g., to back up data). When an adversary obtains the information gleaned by the algorithms, it can be used to build phishing attacks tailored to a target. The problem would get worse if such data were made public or made available to unauthorized individuals, which would raise concerns about privacy and trust.\u003c/p\u003e\n\u003cp\u003eE. THE INTERNET OF THINGS \u003c/p\u003e\n\u003cp\u003eComputers are now more powerful, portable, compact, and reasonably priced. 
The IoT era began with the widespread use of mobile devices such as phones and tablets. Many modern gadgets, including toys, appliances, cars, and industrial control systems, come with networking features and Internet access, which enable the Internet of Things (IoT). The evolution of the technology that led to the IoT is illustrated in Figure 6. Other paradigms, such as big data, fog computing, and cloud computing, allow mobile devices with constrained resources to access a variety of remote services. Because the demand for higher data speeds is growing, researchers introduced fog computing services, which bring the platform and application closer to the customer. To reduce network round-trip delays, fog computing distributes servers, notably for Content Delivery Networks (CDNs). Fog computing hence provides real-time energy and carbon footprint control in addition to improving website speed. Additionally, advancements in telecommunications technology made possible the development of vehicular networking applications, which allow for quick data transfers between mobile devices.\u003c/p\u003e\n\u003cp\u003eF. PRIVACY \u003c/p\u003e\n\u003cp\u003eAs Internet-connected gadgets become smaller and more prevalent, their ability to collect data is advancing, outpacing people\u0026apos;s capacity to be aware of what the devices are doing when they capture data. Devices gather data such as voice, geolocation, ambient temperature, and lighting to enhance the user experience. However, research indicates that such information might be gathered with malicious intent. Intelligent virtual assistants (like Google Home, Apple\u0026apos;s Siri, and Amazon Alexa) can be used to secretly record conversations or activate smart (garage) doors. 
According to one study, gadgets can be used to smuggle items, cyberbully people, incite panic, and redirect a user\u0026apos;s browsing path to serve advertisements. Devices can also be used to associate a place or a person with criminal activity. In the past, secure authentication methods such as encryption and security certificates have been used to address privacy issues. With mobile devices and cloud-stored data in the IoT, these procedures must change. When routing paths vary dynamically and data is stored by a third party, AI approaches can be used to maintain privacy in communications. For instance, artificial immune system methods were adopted to securely self-organize Wireless Sensor Network (WSN) ad hoc connections to serve mobile devices, and learning automata were used to distribute secure certificates to moving cars. In WSNs, different IoT devices, such as mobile devices, join and leave the network dynamically. Because of this, conventional security methods such as port security, which limits traffic to known Media Access Control (MAC) addresses, are no longer effective. To characterize a device\u0026apos;s behavior, the authors suggested metrics including the packet receiving rate, the packet mismatch rate, and the energy usage per packet received from a device. They classified a device\u0026apos;s activity as normal or abnormal using artificial immune system algorithms. Unencrypted packets were dropped when abnormal behavior was noticed. This demonstrates the need for new privacy measures as the number of Internet-connected gadgets grows. Furthermore, the significant amounts of data stored in the cloud raise privacy issues regarding how sensitive data might be accessed by cloud operators. To address this problem, intelligent algorithms were used to spread sensitive data among multiple cloud servers, making it difficult for cloud operators to spy on users. 
Additionally, well-known biometrics and measures of human behavior have been used in secure authentication techniques. However, problems occur when authentication devices are unable to operate properly across a variety of operating environments. To solve these problems, AI methods (such as genetic algorithms) have been applied to provide accurate face, fingerprint, and voice recognition under a variety of operating conditions. Blockchain is a disruptive technology that can work around regulations to support privacy. Blockchain enables the storage of encrypted data on a peer-to-peer network of untrusted machines without the intervention of a centralized authority. The use of AI technologies in conjunction with blockchain facilitates blockchain applications. For example, AI techniques allow blockchain applications to ensure secure communication between two IoT devices. Traditionally, security measures that permit two IoT devices to communicate remotely have been based on centralized systems. Blockchain technology was proposed to provide secure communication between two remote IoT devices without the use of a centralized infrastructure. To enable automatic resource sharing between IoT devices, information from Reinforcement Learning saved in the blockchain was used to determine whether the communicated data complies with the end devices\u0026apos; access control restrictions. Other research described how the healthcare industry may obtain medical data while protecting patients\u0026apos; privacy in order to forecast probable illnesses or medical problems. Algorithms for classification and prediction need a lot of data, which conflicts with patients\u0026apos; reluctance to share their health information. Such medical data might be stored on a blockchain, guaranteeing patient privacy and giving patients control over their personal data, for example by regulating access privileges. 
Patients are then more comfortable storing personal information and biomarkers (such as blood parameters and waist circumference) that can be used to identify risks and reveal their health condition. Before medical imaging data is stored on the blockchain, AI techniques such as DNNs might be used to extract information such as biomarkers and tumor tissues from it. RNNs could be used to infer chronic disorders and probable diseases (such as diabetes or cardiovascular disease) from medical information. In other work, smart-contract-based data-trading systems were developed using AI techniques such as similarity learning. A dispute arises when the data the purchaser receives does not match what the provider claimed. Similarity learning was therefore used to determine the distance between the data attributes of the buyer and the provider, thereby confirming the consistency of the data. This demonstrates that AI privacy tasks will need to take legal, ethical, and regulatory frameworks into account, because disclosing personal information can improve human welfare.\u003c/p\u003e\n\u003cp\u003eG. CYBER-PHYSICAL SYSTEMS\u003c/p\u003e\n\u003cp\u003eCyber-physical systems (CPSs) combine monitoring, processing, and communication capabilities. They use embedded systems and sensor networks to gather data, and software components and actuators to react to the environment. As nations compete to become the dominant player in this industry, the core CPS concepts are being implemented on a global scale. CPS concepts drive the economic progress in Germany\u0026apos;s \u0026quot;Industry 4.0\u0026quot;, China\u0026apos;s \u0026quot;Made in China 2025\u0026quot;, and western countries\u0026apos; \u0026quot;Smart Cities\u0026quot;, where manufacturing processes are automated and suppliers at various locations link to one another. CPS might be seen as the next economy powered by AI. 
One of the initial demands that drove intelligent manufacturing was the need to produce things more quickly.\u003c/p\u003e\n\u003cp\u003eAI techniques have been applied to the production of electronic circuit boards, to control systems that perform real-time analysis at remote hydroelectric power plants, and to the evaluation of the dependability and safety of railway control systems, by autonomously gathering data and working together to complete tasks. The education industry, which demands adaptation to individual students, was another key driver of the use of AI in intelligent manufacturing. To address this demand, intelligent agents were used to develop instructional software that can adjust the difficulty of exercises to match the learner\u0026apos;s rate of learning. Because they produce precise forecasts and output estimates, AI approaches are well suited to CPS needs. The energy management industry was among the early adopters of AI approaches, using them to predict temperature under changing climate conditions.\u003c/p\u003e\n\u003cp\u003eIn this instance, fuzzy networks were used to regulate airflow in order to maintain the appropriate temperature. Power distribution on a larger scale requires better energy quality, capacity, and dependability. AI methods such as genetic algorithms and neural networks have also been used in this field, where they are employed to solve profit management problems when selling to and buying from the grid are subject to different energy rates. The prevalence of tiny devices makes CPS necessary, since CPS improves data collection efficiency and opens the door to processing very large volumes of data.\u003c/p\u003e\n\u003cp\u003eThis is an area where AI applications in CPS intersect with AI applications in cybersecurity, because data is often acquired remotely via processing systems. Cybersecurity challenges in this setting include how to gather data with a high level of confidence, transmit it securely, and share it while maintaining its integrity and privacy. 
AI applications in CPS thus tie in with the earlier themes of dependable data, secure networks, and privacy. The convergence of AI applications in CPS with cybersecurity is plain to see in smart agriculture, where sensors are placed in the soil to collect temperature data and levels of nitrogen and carbon.\u003c/p\u003e\n\u003cp\u003eTo make informed decisions about the use of water and fertilizer, farmers combine the sensor data from their equipment with current weather predictions to create an irrigation monitoring system. The AI technique used here is a genetic algorithm, which determines the appropriate temperature threshold. Sensor-based systems use cloud applications to store and process the data from the many sensors, giving farmers access to real-time information. In this way, farmers can achieve their ideal crop yield and quality. If any of these cyber entities is vulnerable to attack, cybersecurity challenges arise, including malware that can infect sensors, the quality of data transmitted over networks, the availability of cloud computing resources for irrigation systems, and whether sensor data can be shared. Crop harvesting could be significantly hampered if such cyber concerns are not addressed.\u003c/p\u003e\n\u003cp\u003eH. CRITICAL INFRASTRUCTURE \u003c/p\u003e\n\u003cp\u003eCritical infrastructures are resources that are essential to society and the nation\u0026apos;s security. These infrastructures include telecommunications, water, air traffic control, and power (oil, gas, electricity, and nuclear). Because people\u0026apos;s everyday lives and activities depend on the availability and integrity of critical infrastructure, protecting it is of the utmost importance. The preceding discussion illustrated how the scope of cybersecurity has widened from network intrusion detection systems to include ways to enhance human well-being. 
The change was prompted by various industries, including the health and education sectors.\u003c/p\u003e\n\u003cp\u003eThe critical infrastructure industry likewise supports the development of AI methods to improve cybersecurity. The primary function of cybersecurity in critical infrastructures is to protect SCADA systems, which are the primary control systems for the infrastructure (consisting of computing nodes that communicate with other nodes). Typically, SCADA systems are located on an organization\u0026apos;s operational technology (OT) networks. Because OT networks and information technology (IT) networks are interwoven and connected to the Internet, they are more exposed to internal and external cyberattacks. Critical infrastructures must be resilient against such cyberattacks notwithstanding these dangers and their inherent weaknesses. Maintaining the business continuity of a critical infrastructure is thus both a requirement and a challenge.\u003c/p\u003e\n\u003cp\u003eAI methods can be applied to maintain the resilience of SCADA systems. For instance, failures in wind turbine generators could be predicted by using Artificial Neural Networks (ANNs) that track ambient temperature, generator speed, and pitch angle of the generator power outputs. AI methods including k-NN, Decision Trees, and SVMs have been used in water system control to categorize various anomaly events, such as cyberattacks and hardware malfunctions. Additionally, AI techniques such as SVMs and ANNs have been used to give SCADA systems access control based on users\u0026apos; dynamic properties, such as location, time of use, and the user\u0026apos;s work shift (when the user works on-site).\u003c/p\u003e\n\u003cp\u003eBecause the critical infrastructure sector is so crucial to society, using AI to create robust resilience will continue to be an active research topic. The field of critical infrastructure protection has also absorbed other AI concepts, such as propositional logic. 
Because the authentication procedure in this environment requires a sophisticated mapping between user privileges and system policies, a logic-based architecture has been presented to implement security standards for system authorization in SCADA systems. In such a framework, rules are distributed among the system nodes to determine the range of actions that the user may carry out on each node.\u003c/p\u003e\n\u003cp\u003eWhen a user with a particular privilege sends a command to a target node, an authorization server receives both the command and the user privilege information. Once the server has processed and analyzed the information received, it forwards the user privilege, the command, and a token to the target node. To decide whether to approve or reject the execution of the command, the node compares the token with its local permission policy. As a result, the proposed logic-based architecture supports scalable authorization in SCADA systems, because the destination nodes themselves decide whether to allow or deny commands. The idea of intelligent algorithms using logic to self-heal the communications channel of SCADA systems has also been put forth.\u003c/p\u003e\n\u003cp\u003eSCADA systems use session keys to encrypt their communication with remote nodes. After a failure, it is essential for the node to restart the communications channel as soon as possible so that no unauthorized users or agents can take over the restart. To produce a fresh session key, it has been suggested that re-keying materials be distributed to the remote nodes. The re-keying materials consist of a series of numbers produced by a formula (i.e., a bivariate polynomial). From this material, a session key is then derived through a series of mathematical and logical operations.\u003c/p\u003e\n\u003cp\u003eAs a result, once a remote node has recovered from an unavailability incident on its communication channel, it can generate a session key and so, in effect, self-heal the channel. 
In addition, self-healing electrical distribution systems have been developed using mathematical models. The self-healing system uses a collection of 22 features, including the cost of power losses, the power demand at each node, and the magnitude of voltage at each node, to decide which network zone to isolate in the aftermath of a failure. The system used set theory to cluster the features. It then passed these clusters to a number of mathematical models (i.e., backward/forward sweep load-flow algorithms) that simulate the steady state of electrical distribution systems.\u003c/p\u003e\n\u003cp\u003eThus, a variety of approaches, both logical and mathematical, are being employed to satisfy the cybersecurity needs of the critical infrastructure sector. The outcomes of the discussion in this section are listed in Table 2. The role of AI in cybersecurity will expand as the Internet develops. Applications that are essential to human welfare and national security are using AI technology. AI techniques are being used to make machines think and behave like people as well as to solve problems logically.\u003c/p\u003e"},{"header":"V. FUTURE CHALLENGES AND RESEARCH OPPORTUNITIES ","content":"\u003cp\u003eA. THE RACE BETWEEN DEFENSE, OFFENSE, AND HUMANITY\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eRecent developments in AI research in cybersecurity have stoked the competition between white hat hackers (defenders) and black hat hackers (offenders). Attackers can use AI to simulate human behavior in order to gain personal satisfaction, dominance, or money. AI has made it possible to develop intelligent agents that autonomously click on advertisements, play online games, and purchase and resell concert tickets. AI has also been used to influence a US presidential election by disseminating customized news and to sway public opinion in Venezuela by retweeting political content. 
Future research prospects in cybersecurity will depend on how clearly lines can be drawn between technological advancement and fundamental needs.\u003c/p\u003e\n\u003cp\u003eWhite hat hackers, black hat hackers, and end users (humanity) are the three main stakeholders affected by the use of AI in cybersecurity. Both white hat and black hat hackers drive the creation of AI techniques. To manage the deployment of technology, a line must be drawn between the two groups, but this is challenging because one group\u0026apos;s advances lag behind the other\u0026apos;s. Therefore, it is crucial to look at how AI might be applied both to meeting basic human needs and to creating cybersecurity measures.\u003c/p\u003e\n\u003cp\u003eB. INFRASTRUCTURE\u003c/p\u003e\n\u003cp\u003eThe application of AI to cybersecurity is seen as a race between governments and online criminals. The winner will be determined by who has access to the necessary technical knowledge and computing resources. For instance, because they are evolutionary in nature, AI systems are computationally expensive. Current research should therefore focus on creating fast algorithms for the AI solutions presented in Table 2. For example, hashing methods have been devised to prepare the input to k-means clustering algorithms so that similar data samples can be grouped quickly. The recent competition has included the creation of such algorithms, but hardware development is also an essential component.\u003c/p\u003e\n\u003cp\u003eC. HARDWARE AND PLATFORM\u003c/p\u003e\n\u003cp\u003eAccess to cutting-edge computing infrastructure will make it easier to solve AI problems effectively and efficiently. Data analysis will become more urgent as the number of computing devices and the volume of traffic both grow. Consequently, advanced computing systems are needed to analyze data using AI approaches. Cluster computing tools such as Apache Spark and Hadoop have been used to analyze cyber traffic in order to meet this challenge. 
At the top end, quantum computing is expected to be the ground-breaking innovation that helps resolve challenging computing problems. NASA\u0026apos;s quantum computer, which has been reported to be 100 million times faster than conventional computers, has been able to solve complex problems in a fraction of the time.\u003c/p\u003e\n\u003cp\u003eD. RESOURCES\u003c/p\u003e\n\u003cp\u003eWhen establishing effective computing solutions, having quick access to the necessary resources is essential. Energy is currently considered a limited resource for many computing requirements. For instance, committing a single block to the Bitcoin blockchain uses as much energy as 29 typical Australian households consume in an entire day. Ethical concerns about the use of AI will surface when intelligent machines begin to use a significantly greater portion of the resources they share with humans. One question is whether intelligent machines should have rights of their own. Because computers are thought to lack consciousness in some ways, the question can seem unimportant. Researchers are nevertheless debating whether intelligent machines ought to be granted rights regardless of what constitutes consciousness. The use of AI in cybersecurity widens the debate over how to divide scarce resources between intelligent machines and people. Regulators will therefore be motivated to review their assumptions about what constitutes development and what constitutes fundamental needs. Future challenges in using AI for cybersecurity will also revolve around ethical issues.\u003c/p\u003e"},{"header":"VI. CONCLUSION ","content":"\u003cp\u003eAI has emerged as a critical tool in the field of cybersecurity as the pace and sophistication of attacks rise. This article showed how cyberthreats have grown, become more complex, and expanded in scope. We stressed how historical cyberthreats still shape current threats. We provided a thorough analysis of cyberthreats and available countermeasures. 
In particular, we discussed the impact of cyberattacks on various network architectures and applications. Even as the community recognizes cyberthreats and creates remedies using a wide range of technologies and methodologies, cyberthreats will continue to increase.\u003c/p\u003e\n\u003cp\u003eRecent research has demonstrated the potential of AI approaches to counter future cybersecurity threats. These approaches exhibit a variety of intelligent behaviors, including machines that think and behave like people. Recent AI-based cybersecurity proposals have mostly concentrated on machine learning methods that use intelligent agents to differentiate between attack traffic and genuine traffic. In this scenario, intelligent agents take on the role of humans, and their job is to identify the most effective classification criteria. Today\u0026apos;s cyberattack landscape, however, is shifting from causing computer disruption to causing social unrest and endangering human welfare. We discussed this topic in terms of how technological advancements are changing how cyberattacks can be launched, detected, and mitigated.\u003c/p\u003e\n\u003cp\u003eAI\u0026apos;s contribution to cybersecurity will increase steadily as a result of these developments. Innovative AI solutions must be created to promptly identify and neutralize risks that jeopardize societal stability and human welfare. Cybersecurity solutions are likely to move beyond intelligent agents that act like humans to ones that think like humans. Although the role of AI in addressing cybersecurity challenges is still being researched, there are fundamental questions about how and where AI deployment can be governed.\u003c/p\u003e\n\u003cp\u003eFor instance, as intelligent machines become more crucial to humankind, they will gradually deplete life\u0026apos;s essential resources. When machines and people compete for limited resources, a new type of governance will emerge. 
This will in turn open up a fresh line of inquiry.\u003c/p\u003e"},{"header":"DECLARATIONS","content":"\u003cp\u003eEthics Approval and Consent to Participate:\u003c/p\u003e\n\u003cp\u003eNo human participation takes place in this implementation process.\u003c/p\u003e\n\u003cp\u003eHuman and Animal Rights:\u003c/p\u003e\n\u003cp\u003eNo violation of Human and Animal Rights is involved.\u003c/p\u003e\n\u003cp\u003eFunding: \u003c/p\u003e\n\u003cp\u003eNo funding is involved in this work.\u003c/p\u003e\n\u003cp\u003eData availability statement:\u003c/p\u003e\n\u003cp\u003eData sharing is not applicable to this article, as no datasets were generated or analyzed during the current study.\u003c/p\u003e\n\u003cp\u003eConflict of Interest:\u003c/p\u003e\n\u003cp\u003eConflict of Interest is not applicable in this work.\u003c/p\u003e\n\u003cp\u003eAuthorship contributions: \u003c/p\u003e\n\u003cp\u003eThere is no authorship contribution.\u003c/p\u003e\n\u003cp\u003eAcknowledgement:\u003c/p\u003e\n\u003cp\u003eThere is no acknowledgement involved in this work.\u003c/p\u003e"},{"header":"REFERENCES","content":"\u003col\u003e\n\u003cli\u003eVenable D. 2017. \u0026ldquo;Cybersecurity in 2017: When Moore\u0026rsquo;s law attacks.\u0026rdquo; https://doi.org/10.1111/risa.13687\u003c/li\u003e\n\u003cli\u003eMorgan S. 2019. \u0026ldquo;Global cybersecurity spending predicted to exceed $1 trillion from 2017-2021.\u0026rdquo; \u003cem\u003eCybercrime Magazine\u003c/em\u003e. https://doi.org/10.1016/j.chbr.2022.100167\u003c/li\u003e\n\u003cli\u003eYuhas D. 2017. \u0026ldquo;Doctors have trouble diagnosing Alzheimer\u0026rsquo;s. AI doesn\u0026rsquo;t.\u0026rdquo; NBC News, Oct. 2017. https://doi.org/10.3390/diagnostics11081473\u003c/li\u003e\n\u003cli\u003eMcFarland M. 2017. \u0026ldquo;Farmers spot diseased crops faster with artificial intelligence.\u0026rdquo; CNN Business, Dec. 
2017. https://doi.org/10.3390/agriculture12010009\u003c/li\u003e\n\u003cli\u003eGeib C. \u0026ldquo;NASA-funded research will let unmanned spacecraft \u0026quot;think\u0026quot; using AI and blockchain.\u0026rdquo; \u003cem\u003eFuturism\u003c/em\u003e. https://doi.org/10.1063/1.5007734\u003c/li\u003e\n\u003cli\u003eWinick E. 2017. \u0026ldquo;Lawyer-bots are shaking up jobs.\u0026rdquo; \u003cem\u003eMIT Technology Review\u003c/em\u003e. https://doi.org/10.1177/0162243915605575\u003c/li\u003e\n\u003cli\u003eMorey B. 2019. \u0026ldquo;Manufacturing and AI: Promises and pitfalls.\u0026rdquo; \u003cem\u003eSME\u003c/em\u003e. https://doi.org/10.1115/1.4047855\u003c/li\u003e\n\u003cli\u003eMorrow S., Crabtree. 2019. \u0026ldquo;The future of cybercrime \u0026amp; security, Juniper Research.\u0026rdquo; https://doi.org/10.1016/S1361-3723(18)30082-4\u003c/li\u003e\n\u003c/ol\u003e"},{"header":"Tables","content":"\u003cp\u003eTables 1 and 2 are available in the Supplementary Files section.\u003c/p\u003e "}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":true,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"
[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"Artificial intelligence, Cybersecurity, Cyberattacks, Machine learning, Crime rate, number of crimes, regression algorithm","lastPublishedDoi":"10.21203/rs.3.rs-3975155/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-3975155/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eWe examine the potential of AI in enhancing cybersecurity solutions by highlighting both its advantages and disadvantages. We also talk about the potential for future research in the realm of cybersecurity related to the development of AI approaches across many application domains. One of our society's most significant and pervasive issues is crime. Numerous crimes are perpetrated often each day. The dataset in this instance consists of the date and the annual crime rate for the corresponding years. The crime rate used in this project is only based on robberies. Utilizing historical data, we employ the linear regression algorithm to forecast the percentage of crime rate in the coming years. The algorithm receives a date as input, and the result is the proportion of crime for that particular year.\u003c/p\u003e","manuscriptTitle":"Crime Rate Prediction using Cyber Security and Artificial Intelligent","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2024-02-23 10:48:33","doi":"10.21203/rs.3.rs-3975155/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"
[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"63b3c611-da02-429d-868f-4cc6a69af49f","owner":[],"postedDate":"February 23rd, 2024","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[],"tags":[],"updatedAt":"2024-02-26T17:35:22+00:00","versionOfRecord":[],"versionCreatedAt":"2024-02-23 10:48:33","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-3975155","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-3975155","identity":"rs-3975155","version":["v1"]},"buildId":"qtupq5eGEP_6zYnWcrvyt","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}