As cloud computing continues to evolve, the security of cloud-based systems remains a paramount concern. This research paper delves into the intricate realm of intrusion detection systems (IDS) within cloud environments, shedding light on their diverse types, associated challenges, and inherent limitations. In parallel, the study dissects the realm of Explainable AI (XAI), unveiling its conceptual essence and its transformative role in illuminating the inner workings of complex AI models. Amidst the dynamic landscape of cybersecurity, this paper unravels the synergistic potential of fusing XAI with intrusion detection, accentuating how XAI can enrich transparency and interpretability in the decision-making processes of AI-driven IDS. The exploration of XAI's promise extends to its capacity to mitigate contemporary challenges faced by traditional IDS, particularly in reducing false positives and false negatives. By fostering an understanding of these challenges and their ramifications, this study elucidates the path forward in enhancing cloud-based security mechanisms. Ultimately, the culmination of insights reinforces the imperative role of Explainable AI in fortifying intrusion detection systems, paving the way for a more robust and comprehensible cybersecurity landscape in the cloud.
Authored by Utsav Upadhyay, Alok Kumar, Satyabrata Roy, Umashankar Rawat, Sandeep Chaurasia
The recent 5G networks aim to provide higher speed, lower latency, and greater capacity; therefore, compared to previous mobile networks, more advanced and intelligent network security is essential for 5G networks. To detect unknown and evolving 5G network intrusions, this paper presents an artificial intelligence (AI)-based network threat detection system that performs data labeling, data filtering, data preprocessing, and data learning on 5G network flow and security event data. The performance evaluations are first conducted on two well-known datasets, NSL-KDD and CICIDS 2017; then, practical testing of the proposed system is performed in 5G industrial IoT environments. To demonstrate detection against network threats in real 5G environments, this study utilizes the 5G model factory, a downscaled version of a real smart factory that comprises a number of 5G industrial IoT-based devices.
Authored by Jonghoon Lee, Hyunjin Kim, Chulhee Park, Youngsoo Kim, Jong-Geun Park
Cloud computing has become increasingly popular in the modern world. While it has brought many positives to the innovative technological era society lives in today, cloud computing has also shown it has some drawbacks. These drawbacks concern the security of the cloud and its many services. Security practices differ in the realm of cloud computing, as the role of securing information systems is passed on to a third party. While this reduces the managerial strain on those who adopt cloud computing, it also puts their data and the services they provide at risk. Cloud services have become a large target for those with malicious intent due to the high density of valuable data stored in one relative location. By enlisting the help of honeynets, cloud service providers can effectively improve their intrusion detection systems and gain the opportunity to study attack vectors used by malicious actors to further improve security controls. Implementing honeynets in cloud-based networks is an investment in cloud security that will provide ever-increasing returns in the hardening of information systems against cyber threats.
Authored by Eric Toth, Md Chowdhury
Anomaly detection and its explanation are important in many research areas such as intrusion detection, fraud detection, and unknown attack detection in network traffic and logs. Identifying the cause or explanation of "why one instance is an anomaly" while another is not is challenging because of the unbounded and unsupervised nature of the problem. Answering this question is possible with the emerging technique of explainable artificial intelligence (XAI). XAI provides tools and techniques to interpret and explain the output and workings of complex models such as Deep Learning (DL). This paper aims to detect and explain network anomalies with XAI, specifically the kernelSHAP method. The same approach is also used to improve the network anomaly detection model in terms of accuracy, recall, precision, and F-score. The experiment is conducted with the latest CICIDS2017 dataset. Two models are created (Model_1 and OPT_Model) and compared. The overall accuracy and F-score of OPT_Model (when trained in an unsupervised way) are 0.90 and 0.76, respectively.
Authored by Khushnaseeb Roshan, Aasim Zafar
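Since the abstract centers on kernelSHAP, the following minimal sketch shows how such an explainer is typically wired around a network anomaly detector. It is illustrative only: the IsolationForest detector, the synthetic flow features, and the background-sample size are assumptions, not the authors' Model_1 or OPT_Model.

```python
# Minimal kernelSHAP sketch for a network anomaly detector (illustrative only).
import numpy as np
import shap
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
X_train = rng.normal(size=(500, 6))   # stand-in for CICIDS2017 flow features
X_test = rng.normal(size=(20, 6))

model = IsolationForest(random_state=0).fit(X_train)

# KernelSHAP treats the model as a black box; a small background set keeps it tractable.
background = shap.sample(X_train, 50)
explainer = shap.KernelExplainer(model.decision_function, background)
shap_values = explainer.shap_values(X_test)

# Per-feature contributions to each anomaly score; large magnitudes point to the
# features that answer "why is this instance an anomaly?".
print(shap_values.shape)  # (20, 6)
```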
Big data and IoT technologies are developing rapidly. Accordingly, network security is increasingly emphasized, and efficient intrusion detection technology is required for detecting increasingly sophisticated network attacks. In this study, we propose an efficient network anomaly detection method based on ensemble and unsupervised learning. The proposed model is built by training an autoencoder, a representative unsupervised deep learning model, using only normal network traffic data. The anomaly score of the detection target data is derived by ensembling the reconstruction loss and the Mahalanobis distances for each layer output of the trained autoencoder. By applying a threshold to this score, network anomaly traffic can be efficiently detected. To evaluate the proposed model, we applied our method to the UNSW-NB15 dataset. The results show that the overall performance of the proposed method is superior to those of the model using only the reconstruction loss of the autoencoder and the model applying the Mahalanobis distance to the raw data.
Authored by Donghun Yang, Myunggwon Hwang
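As a rough sketch of the scoring idea described above, the snippet below trains a small autoencoder on normal data only and ensembles the reconstruction loss with the (squared) Mahalanobis distance of a hidden-layer output. For brevity only the bottleneck layer is used, whereas the paper ensembles distances over each layer; sizes, training settings, and data are placeholder assumptions, not the UNSW-NB15 setup.

```python
# Sketch: ensemble of reconstruction loss and Mahalanobis distance (bottleneck only).
import numpy as np
import torch
import torch.nn as nn
from sklearn.covariance import EmpiricalCovariance

torch.manual_seed(0)
enc = nn.Sequential(nn.Linear(10, 6), nn.ReLU(), nn.Linear(6, 3), nn.ReLU())
dec = nn.Sequential(nn.Linear(3, 6), nn.ReLU(), nn.Linear(6, 10))

X_norm = torch.randn(400, 10)   # stand-in for normal traffic features
opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)
for _ in range(200):            # train the autoencoder on normal data only
    opt.zero_grad()
    ((dec(enc(X_norm)) - X_norm) ** 2).mean().backward()
    opt.step()

with torch.no_grad():
    h_norm = enc(X_norm).numpy()
    r_norm = dec(enc(X_norm)).numpy()
cov = EmpiricalCovariance().fit(h_norm)          # covariance model of the bottleneck
recon_n = ((r_norm - X_norm.numpy()) ** 2).mean(axis=1)
maha_n = cov.mahalanobis(h_norm)                 # squared Mahalanobis distances

def anomaly_score(x):
    with torch.no_grad():
        h = enc(x).numpy()
        r = dec(enc(x)).numpy()
    recon = ((r - x.numpy()) ** 2).mean(axis=1)
    maha = cov.mahalanobis(h)
    # Ensemble: z-normalise each component against normal-data statistics, then sum;
    # thresholding this score flags anomalous traffic.
    return (recon - recon_n.mean()) / recon_n.std() + (maha - maha_n.mean()) / maha_n.std()

X_test = torch.cat([torch.randn(5, 10), torch.randn(5, 10) + 3.0])  # last 5 shifted
print(anomaly_score(X_test))
```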
The Internet of Things (IoT) heralds an innovative generation in communication by enabling everyday devices to send, receive, and share data easily. IoT applications, which prioritise task automation, aim to give inanimate objects autonomy; they promise increased comfort, productivity, and automation. However, strong security, privacy, authentication, and recovery methods are required to realize this goal. To construct end-to-end secure IoT environments, this article meticulously reviews the security issues and risks inherent to IoT applications and emphasises the critical necessity for architectural changes. The paper begins with an examination of security concerns before exploring emerging and advanced technologies aimed at nurturing a sense of trust in Internet of Things (IoT) applications. The primary focus of the discussion revolves around how these technologies aid in overcoming security challenges and fostering a secure ecosystem for IoT.
Authored by Pranav A, Sathya S, HariHaran B
Nowadays, anomaly-based network intrusion detection systems (NIDS) still have limited real-world application; this is particularly due to false alarms, a lack of datasets, and a lack of confidence. In this paper, we propose to use explainable artificial intelligence (XAI) methods for tackling these issues. In our experimentation, we train a random forest (RF) model on the NSL-KDD dataset and use SHAP to generate global explanations. We find that these explanations deviate substantially from domain expertise. To shed light on the potential causes, we analyze the structural composition of the attack classes. There, we observe severe imbalances in the number of records per attack type subsumed in the attack classes of the NSL-KDD dataset, which could lead to generalization and overfitting issues in classification. Hence, we train a new RF classifier and SHAP explainer directly on the attack types. Classification performance is considerably improved, and the new explanations match expectations based on domain knowledge much better. Thus, we conclude that the imbalances in the dataset bias the classification and consequently also the results of XAI methods like SHAP. However, the XAI methods can also be employed to find and debug issues and biases in the data and the applied model. Furthermore, the debugging results in higher trustworthiness of anomaly-based NIDS.
Authored by Eric Lanfer, Sophia Sylvester, Nils Aschenbruck, Martin Atzmueller
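To make the diagnostic loop concrete, here is a small sketch, under placeholder assumptions, of training a random forest at two label granularities and comparing global SHAP importances. Synthetic data stands in for NSL-KDD, and the coarse/fine split mimics attack classes versus attack types.

```python
# Sketch: global SHAP explanations for coarse attack classes vs. fine attack types.
import numpy as np
import shap
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y_types = make_classification(n_samples=1000, n_features=8, n_informative=5,
                                 n_classes=4, n_clusters_per_class=1, random_state=0)
y_classes = (y_types > 1).astype(int)   # coarse "attack class" derived from fine "attack type"

for name, y in [("coarse classes", y_classes), ("attack types", y_types)]:
    rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
    sv = np.abs(np.asarray(shap.TreeExplainer(rf).shap_values(X)))
    # Average |SHAP| over every axis except the feature axis (shap's output layout
    # differs across versions: list of per-class arrays vs. one stacked array).
    feat_ax = sv.shape.index(X.shape[1])
    imp = sv.mean(axis=tuple(i for i in range(sv.ndim) if i != feat_ax))
    print(name, "global SHAP importance:", np.round(imp, 3))
```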
Artificial intelligence used in future networks is vulnerable to biases, misclassifications, and security threats, which invites constant scrutiny of its accountability. Explainable AI (XAI) methods bridge this gap by identifying unaccounted-for biases in black-box AI/ML models. However, scaffolding attacks can hide the internal biases of the model from XAI methods, jeopardizing any auditory or monitoring processes, service provisions, security systems, regulators, auditors, and end-users in future networking paradigms, including Intent-Based Networking (IBN). For the first time, we formalize and demonstrate a framework for how an attacker would adopt scaffoldings to deceive the security auditors of Network Intrusion Detection Systems (NIDS). Furthermore, we propose a detection method that auditors can use to detect the attack efficiently. We rigorously test the attack and detection methods using the NSL-KDD dataset. We then simulate the attack on 5G network data. Our simulation illustrates that the attack adoption method is successful and that the detection method can identify an affected model with extremely high confidence.
Authored by Thulitha Senevirathna, Bartlomiej Siniarski, Madhusanka Liyanage, Shen Wang
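The abstract gives no code, but the generic scaffolding idea it builds on (known from prior work on fooling LIME/SHAP) can be sketched as follows: an out-of-distribution detector routes XAI perturbation samples to an innocuous model, while real traffic hits the biased one. All models and data here are toy assumptions, not the paper's framework.

```python
# Toy sketch of a scaffolding wrapper that hides a model's bias from XAI probes.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X_real = rng.normal(size=(500, 5))                   # in-distribution traffic features

biased = lambda X: (X[:, 0] > 0).astype(int)         # real decision hinges on feature 0
innocuous = lambda X: (X[:, 1] > 0).astype(int)      # what the auditor is shown

# OOD detector trained to flag inputs that look like XAI perturbation samples.
X_pert = rng.uniform(-4, 4, size=(500, 5))           # crude proxy for SHAP's samples
ood = RandomForestClassifier(random_state=0).fit(
    np.vstack([X_real, X_pert]),
    np.r_[np.zeros(500), np.ones(500)].astype(int))

def scaffolded_model(X):
    is_pert = ood.predict(X).astype(bool)
    out = biased(X)
    out[is_pert] = innocuous(X)[is_pert]             # swap outputs on detected probes
    return out
```

A detection approach in the spirit of the paper can then probe for exactly this behavioural inconsistency between real inputs and near-identical perturbed ones.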
Increasing automation in vehicles enabled by increased connectivity to the outside world has exposed vulnerabilities in previously siloed automotive networks like controller area networks (CAN). Attributes of CAN such as broadcast-based communication among electronic control units (ECUs) that lowered deployment costs are now being exploited to carry out active injection attacks like denial of service (DoS), fuzzing, and spoofing attacks. Research literature has proposed multiple supervised machine learning models deployed as Intrusion detection systems (IDSs) to detect such malicious activity; however, these are largely limited to identifying previously known attack vectors. With the ever-increasing complexity of active injection attacks, detecting zero-day (novel) attacks in these networks in real-time (to prevent propagation) becomes a problem of particular interest. This paper presents an unsupervised-learning-based convolutional autoencoder architecture for detecting zero-day attacks, which is trained only on benign (attack-free) CAN messages. We quantise the model using Vitis-AI tools from AMD/Xilinx targeting a resource-constrained Zynq Ultrascale platform as our IDS-ECU system for integration. The proposed model successfully achieves equal or higher classification accuracy (>99.5%) on unseen DoS, fuzzing, and spoofing attacks from a publicly available attack dataset when compared to the state-of-the-art unsupervised learning-based IDSs. Additionally, by cleverly overlapping IDS operation on a window of CAN messages with the reception, the model is able to meet line-rate detection (0.43 ms per window) of high-speed CAN, which, when coupled with the low energy consumption per inference, makes this architecture ideally suited for detecting zero-day attacks on critical CAN networks.
Authored by Shashwat Khandelwal, Shanker Shreejith
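A minimal version of the benign-only training scheme can be sketched as below: a 1D convolutional autoencoder is trained on windows of benign CAN data, and a reconstruction-error threshold derived from benign data flags attacks. The window size, layer sizes, and random data are assumptions; the quantised Vitis-AI deployment on the Zynq platform is out of scope here.

```python
# Sketch: convolutional autoencoder trained on benign CAN windows only.
import torch
import torch.nn as nn

torch.manual_seed(0)
W = 16                                    # messages per window (assumed)
model = nn.Sequential(                    # 1D conv AE over normalised CAN windows
    nn.Conv1d(1, 8, 3, padding=1), nn.ReLU(),
    nn.Conv1d(8, 4, 3, padding=1), nn.ReLU(),
    nn.Conv1d(4, 8, 3, padding=1), nn.ReLU(),
    nn.Conv1d(8, 1, 3, padding=1),
)

benign = torch.rand(256, 1, W)            # stand-in for benign CAN message windows
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(100):
    opt.zero_grad()
    ((model(benign) - benign) ** 2).mean().backward()
    opt.step()

with torch.no_grad():
    err = ((model(benign) - benign) ** 2).mean(dim=(1, 2))
    thresh = err.mean() + 3 * err.std()   # simple benign-only threshold
    attack = torch.rand(10, 1, W) * 2     # crude proxy for DoS/fuzzing windows
    flags = ((model(attack) - attack) ** 2).mean(dim=(1, 2)) > thresh
    print(flags)                          # True => flagged as a (zero-day) attack
```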
In the evolving landscape of Internet of Things (IoT) security, the need for continuous adaptation of defenses is critical. Class Incremental Learning (CIL) can provide a viable solution by enabling Machine Learning (ML) and Deep Learning (DL) models to (i) learn and adapt to new attack types (0-day attacks), (ii) retain their ability to detect known threats, and (iii) safeguard computational efficiency (i.e., no full re-training). In IoT security, where novel attacks frequently emerge, CIL offers an effective tool to enhance Intrusion Detection Systems (IDS) and secure network environments. In this study, we explore how CIL approaches empower DL-based IDS in IoT networks, using the publicly-available IoT-23 dataset. Our evaluation focuses on two essential aspects of an IDS: (a) attack classification and (b) misuse detection. A thorough comparison against a fully-retrained IDS, namely starting from scratch, is carried out. Finally, we place emphasis on interpreting the predictions made by incremental IDS models through eXplainable AI (XAI) tools, offering insights into potential avenues for improvement.
Authored by Francesco Cerasuolo, Giampaolo Bovenzi, Christian Marescalco, Francesco Cirillo, Domenico Ciuonzo, Antonio Pescapè
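One standard CIL mechanism a study like this can draw on is replay: mixing a small memory of past classes into each new task's training data so known threats are not forgotten. The sketch below illustrates this under toy assumptions; the paper's exact CIL methods, model, and the IoT-23 data are not reproduced here.

```python
# Toy sketch of replay-based class-incremental learning for an IDS classifier.
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 4))  # 4 eventual classes
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

replay_x, replay_y = [], []                      # small memory of past tasks
for classes in [(0, 1), (2,), (3,)]:             # new attack types arrive over time
    X = torch.randn(300, 8) + float(classes[0])  # crude per-task feature shift
    y = torch.tensor([classes[i % len(classes)] for i in range(300)])
    if replay_x:                                 # mix in replayed samples to avoid forgetting
        X = torch.cat([X] + replay_x)
        y = torch.cat([y] + replay_y)
    for _ in range(100):
        opt.zero_grad()
        loss_fn(net(X), y).backward()
        opt.step()
    replay_x.append(X[:50])                      # keep a slice of this task as memory
    replay_y.append(y[:50])
```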
Significant progress has been made towards developing Deep Learning (DL) in Artificial Intelligence (AI) models that can make independent decisions. However, this progress has also highlighted the emergence of malicious entities that aim to manipulate the outcomes generated by these models. Due to increasing complexity, this is a concerning issue in various fields, such as medical image classification, autonomous vehicle systems, malware detection, and criminal justice. Recent research advancements have highlighted the vulnerability of these classifiers to both conventional and adversarial assaults, which may skew their results in both the training and testing stages. The Systematic Literature Review (SLR) aims to analyse traditional and adversarial attacks comprehensively. It evaluates 45 published works from 2017 to 2023 to better understand adversarial attacks, including their impact, causes, and standard mitigation approaches.
Authored by Tarek Ali, Amna Eleyan, Tarek Bejaoui
This study presents a novel approach for fortifying network security systems, crucial for ensuring network reliability and survivability against evolving cyber threats. Our approach integrates Explainable Artificial Intelligence (XAI) with an ensemble of autoencoders and Linear Discriminant Analysis (LDA) to create a robust framework for detecting both known and elusive zero-day attacks. We refer to this integrated method as AE-LDA. Our method stands out in its ability to effectively detect both known and previously unidentified network intrusions. By employing XAI for feature selection, we ensure improved interpretability and precision in identifying key patterns indicative of network anomalies. The autoencoder ensemble, trained on benign data, is adept at recognising a broad spectrum of network behaviours, thereby significantly enhancing the detection of zero-day attacks. Simultaneously, LDA aids in the identification of known threats, ensuring a comprehensive coverage of potential network vulnerabilities. This hybrid model demonstrates superior performance in anomaly detection accuracy and complexity management. Our results highlight a substantial advancement in network intrusion detection capabilities, showcasing an effective strategy for bolstering network reliability and resilience against a diverse range of cyber threats.
Authored by Fatemeh Stodt, Fabrice Theoleyre, Christoph Reich
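The division of labour the abstract describes can be sketched as follows: an LDA classifier labels known traffic, while an autoencoder trained on benign data overrides the label when reconstruction error signals novelty. A single sklearn regressor trained to reconstruct its input stands in for the paper's autoencoder ensemble, and the data, threshold, and decision rule are illustrative assumptions.

```python
# Sketch of an AE-LDA-style hybrid: LDA for known attacks, AE for zero-days.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
benign = rng.normal(0, 1, (400, 6))
known_atk = rng.normal(3, 1, (400, 6))
zero_day = rng.normal(-4, 1, (20, 6))      # unseen distribution

# Autoencoder trained on benign traffic only (single AE stands in for the ensemble).
ae = MLPRegressor(hidden_layer_sizes=(3,), max_iter=2000, random_state=0)
ae.fit(benign, benign)
thr = np.quantile(((ae.predict(benign) - benign) ** 2).mean(axis=1), 0.99)

# LDA trained on labelled benign vs. known-attack traffic.
lda = LinearDiscriminantAnalysis().fit(
    np.vstack([benign, known_atk]), np.r_[np.zeros(400), np.ones(400)])

def classify(x):
    err = ((ae.predict(x) - x) ** 2).mean(axis=1)
    out = np.where(lda.predict(x) == 1, "known attack", "benign").astype(object)
    out[err > thr] = "zero-day candidate"  # AE overrides when traffic looks novel
    return out

print(classify(zero_day))
```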
The last decade has shown that networked cyber-physical systems (NCPS) are the future of critical infrastructure such as transportation systems and energy production. However, they have introduced an uncharted territory of security vulnerabilities and a wider attack surface, mainly due to network openness and the deeply integrated physical and cyber spaces. On the other hand, relying on manual analysis of intrusion detection alarms might be effective in stopping run-of-the-mill automated probes but remain useless against the growing number of targeted, persistent, and often AI-enabled attacks on large-scale NCPS. Hence, there is a pressing need for new research directions to provide advanced protection. This paper introduces a novel security paradigm for emerging NCPS, namely Autonomous Cyber-Physical Defense (ACPD). We lay out the theoretical foundations and describe the methods for building autonomous and stealthy cyber-physical defense agents that are able to dynamically hunt, detect, and respond to intelligent and sophisticated adversaries in real time without human intervention. By leveraging the power of game theory and multi-agent reinforcement learning, these self-learning agents will be able to deploy complex cyber-physical deception scenarios on the fly, generate optimal and adaptive security policies without prior knowledge of potential threats, and defend themselves against adversarial learning. Nonetheless, serious challenges including trustworthiness, scalability, and transfer learning are yet to be addressed for these autonomous agents to become the next-generation tools of cyber-physical defense.
Authored by Talal Halabi, Mohammad Zulkernine
Attacks against computer systems are viewed as the most serious threat in the modern world. A zero-day vulnerability is a vulnerability unknown to the vendor of the system. Deep learning techniques are widely used for anomaly-based intrusion detection. These techniques give satisfactory results for known attacks, but for zero-day attacks the models give contradictory results. In this work, two separate environments were first set up to collect training and test data for zero-day attacks. Zero-day attack data were generated by simulating real-time zero-day attacks. A ranking of the features from the training and test data was generated using an explainable AI (XAI) interface. From the collected training data, more attack data were generated by applying a time series generative adversarial network (TGAN) to the top 12 features. The training data was concatenated with the AWID dataset. A hybrid deep learning model using Long Short-Term Memory (LSTM) and Convolutional Neural Network (CNN) layers was developed to test the zero-day data against both the GAN-generated concatenated dataset and the original AWID dataset. Finally, it was found that the concatenated dataset gives better performance, with 93.53% accuracy, whereas the AWID dataset alone gives 84.29% accuracy.
Authored by Md. Asaduzzaman, Md. Rahman
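For illustration, a hybrid CNN-LSTM classifier of the general kind described can be sketched as below. The layer sizes, sequence length, and random data are assumptions, and the TGAN augmentation and AWID preprocessing steps are omitted.

```python
# Sketch of a hybrid CNN + LSTM classifier over windows of traffic features.
import torch
import torch.nn as nn

class CNNLSTM(nn.Module):
    def __init__(self, n_feat=12, n_classes=2):
        super().__init__()
        self.conv = nn.Sequential(nn.Conv1d(n_feat, 32, 3, padding=1), nn.ReLU())
        self.lstm = nn.LSTM(32, 16, batch_first=True)
        self.head = nn.Linear(16, n_classes)

    def forward(self, x):                  # x: (batch, time, features)
        h = self.conv(x.transpose(1, 2)).transpose(1, 2)   # convolve over time
        _, (hn, _) = self.lstm(h)
        return self.head(hn[-1])           # classify from the final LSTM state

torch.manual_seed(0)
model = CNNLSTM()
X = torch.randn(64, 20, 12)                # 64 windows, 20 timesteps, top-12 features
y = torch.randint(0, 2, (64,))             # attack / benign labels
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(50):
    opt.zero_grad()
    nn.functional.cross_entropy(model(X), y).backward()
    opt.step()
```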
Zero Day Threats (ZDT) are novel methods used by malicious actors to attack and exploit information technology (IT) networks or infrastructure. In the past few years, the number of these threats has been increasing at an alarming rate, and they have been costing organizations millions of dollars to remediate. The increasing expansion of network attack surfaces and the exponentially growing number of assets on these networks necessitate a robust AI-based Zero Day Threat detection model that can quickly analyze petabyte-scale data for potentially malicious and novel activity. In this paper, the authors introduce a deep learning based approach to Zero Day Threat detection that can generalize, scale, and effectively identify threats in near real-time. The methodology utilizes network flow telemetry augmented with asset-level graph features, which are passed through a dual-autoencoder structure for anomaly and novelty detection, respectively. The models have been trained and tested on four large-scale datasets that are representative of real-world organizational networks, and they produce strong results with high precision and recall values. The models provide a novel methodology to detect complex threats with low false positive rates that allow security operators to avoid alert fatigue while drastically reducing their mean time to response with near-real-time detection. Furthermore, the authors also provide a novel, labelled, cyber attack dataset generated from adversarial activity that can be used for validation or training of other models. With this paper, the authors’ overarching goal is to provide a novel architecture and training methodology for cyber anomaly detectors that can generalize to multiple IT networks with minimal to no retraining while still maintaining strong performance.
Authored by Christopher Redino, Dhruv Nandakumar, Robert Schiller, Kevin Choi, Abdul Rahman, Edward Bowen, Aaron Shaha, Joe Nehila, Matthew Weeks
Nowadays, companies, critical infrastructure and governments face cyber attacks every day, ranging from simple denial-of-service and password guessing attacks to complex nation-state attack campaigns, so-called advanced persistent threats (APTs). Defenders employ intrusion detection systems (IDSs), among other tools, to detect malicious activity and protect network assets. With the evolution of threats, detection techniques have followed, with modern systems usually relying on some form of artificial intelligence (AI) or anomaly detection as part of their defense portfolio. While these systems are able to achieve higher accuracy in detecting APT activity, they cannot provide much context about the attack, as the underlying models are often too complex to interpret. This paper presents an approach to explain single predictions (i.e., detected attacks) of any graph-based anomaly detection system. By systematically modifying the input graph of an anomaly and observing the output, we leverage a variation of permutation importance to identify parts of the graph that are likely responsible for the detected anomaly. Our approach treats the anomaly detection function as a black box and is thus applicable to any whole-graph explanation problem. Our results on two established datasets for APT detection (StreamSpot & DARPA TC Engagement Three) indicate that our approach can identify nodes that are likely part of the anomaly. We quantify this through our area under baseline (AuB) metric and show how the AuB is higher for anomalous graphs. Further analysis via the Wilcoxon rank-sum test confirms that these results are statistically significant with a p-value of 0.0041%.
Authored by Felix Welter, Florian Wilkens, Mathias Fischer
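Because the method treats the detector as a black box, its core loop is easy to sketch: perturb the input graph node by node, re-score it, and rank nodes by the score change. The trivial scoring function below is a placeholder assumption standing in for an actual graph-based anomaly detector, not the paper's method.

```python
# Sketch of black-box whole-graph explanation via node-level perturbation.
import networkx as nx

def anomaly_score(g):                      # placeholder black-box detector
    return g.number_of_edges() / max(g.number_of_nodes(), 1)

def node_importance(g):
    base = anomaly_score(g)
    imp = {}
    for n in g.nodes():
        h = g.copy()
        h.remove_node(n)                   # permutation-style perturbation: drop the node
        imp[n] = base - anomaly_score(h)   # large drop => node likely drives the anomaly
    return imp

g = nx.karate_club_graph()                 # stand-in for a provenance graph
ranked = sorted(node_importance(g).items(), key=lambda kv: -kv[1])
print(ranked[:5])                          # candidate nodes behind the detected anomaly
```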
Cybersecurity is an increasingly critical aspect of modern society, with cyber attacks becoming more sophisticated and frequent. Artificial intelligence (AI) and neural network models have emerged as promising tools for improving cyber defense. This paper explores the potential of AI and neural network models in cybersecurity, focusing on their applications in intrusion detection, malware detection, and vulnerability analysis. Intrusion detection is the process of identifying unauthorized access to a computer system. AI-based intrusion detection systems (IDS) use packet-level network traffic analysis and intrusion detection patterns to signal an attack. Neural network models can also be used to improve IDS accuracy by modeling the behavior of legitimate users and detecting anomalies. Malware detection involves identifying malicious software on a computer system. AI-based malware detection systems use machine-learning algorithms to assess the behavior of software and recognize patterns that indicate malicious activity. Neural network models can also serve to hone the precision of malware identification by modeling the behavior of known malware and identifying new variants. Vulnerability analysis involves identifying weaknesses in a computer system that could be exploited by attackers. AI-based vulnerability analysis systems use machine learning algorithms to analyze system configurations and identify potential vulnerabilities. Neural network models can also be used to improve the accuracy of vulnerability analysis by modeling the behavior of known vulnerabilities and identifying new ones. Overall, AI and neural network models have significant potential in cybersecurity. By improving intrusion detection, malware detection, and vulnerability analysis, they can help organizations better defend against cyber attacks. However, these technologies also present challenges, including a lack of understanding of the importance of data in machine learning and the potential for attackers to use AI themselves. As such, careful consideration is necessary when implementing AI and neural network models in cybersecurity.
Authored by D. Sugumaran, Y. John, Jansi C, Kireet Joshi, G. Manikandan, Geethamanikanta Jakka
As vehicles increasingly embed digital systems, new security vulnerabilities are also being introduced. Computational constraints make it challenging to add security oversight layers on top of core vehicle systems, especially when the security layers rely on additional deep learning models for anomaly detection. To improve security-aware decision-making for autonomous vehicles (AV), this paper proposes a bi-level security framework. The first security level consists of a one-shot resource allocation game that enables a single vehicle to fend off an attacker by optimizing the configuration of its intrusion prevention system based on risk estimation. The second level relies on a reinforcement learning (RL) environment where an agent is responsible for forming and managing a platoon of vehicles on the fly while also dealing with a potential attacker. We solve the first problem using a minimax algorithm to identify optimal strategies for each player. Then, we train RL agents and analyze their performance in forming security-aware platoons. The trained agents demonstrate superior performance compared to our baseline strategies that do not consider security risk.
Authored by Dominic Phillips, Talal Halabi, Mohammad Zulkernine
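The first level reduces to a standard zero-sum matrix game, so a worked sketch of computing the defender's minimax mixed strategy with a linear program is given below. The payoff (risk) matrix is an illustrative assumption, not the paper's game.

```python
# Worked sketch: defender's minimax mixed strategy in a one-shot zero-sum game.
import numpy as np
from scipy.optimize import linprog

# Rows: defender IPS configurations; columns: attacker actions.
# Entries: defender loss (risk) for each pairing (illustrative values).
L = np.array([[0.2, 0.9, 0.5],
              [0.7, 0.3, 0.6],
              [0.5, 0.5, 0.4]])

m, n = L.shape
# Variables: defender mixed strategy p (m values) plus the game value v.
# Minimise v subject to, for every attacker column j: sum_i p_i * L[i, j] <= v.
c = np.r_[np.zeros(m), 1.0]
A_ub = np.c_[L.T, -np.ones(n)]
b_ub = np.zeros(n)
A_eq = np.r_[np.ones(m), 0.0].reshape(1, -1)     # probabilities sum to 1
res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
              bounds=[(0, None)] * m + [(None, None)])
p, v = res.x[:m], res.x[m]
print("defender strategy:", np.round(p, 3), "guaranteed max loss:", round(v, 3))
```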
The rapid growth of the number of devices connected to internet of things (IoT) networks increases the severity of the security problems that need to be solved in order to provide a safe environment for network data exchange. The discovery of new vulnerabilities is an everyday challenge for security experts, and many novel methods for detection and prevention of intrusions are being developed to deal with this issue. To overcome these shortcomings, artificial intelligence (AI) can be used in the development of advanced intrusion detection systems (IDS). This allows such a system to adapt to emerging threats, react in real-time, and adjust its behavior based on previous experience. On the other hand, the traffic classification task becomes more difficult because of the large amount of data generated by network systems and the high processing demands. For this reason, a feature selection (FS) process is applied to reduce data complexity by removing less relevant data for the active classification task, therefore improving the algorithm's accuracy. In this work, a hybrid version of the recently proposed sand cat swarm optimizer algorithm is proposed for feature selection, with the goal of increasing the performance of an extreme learning machine classifier. The performance improvements are demonstrated by validating the proposed method on two well-known datasets, UNSW-NB15 and CICIDS-2017, and comparing the results with those reported for other cutting-edge algorithms that address the same problems and work in a similar configuration.
Authored by Dijana Jovanovic, Marina Marjanovic, Milos Antonijevic, Miodrag Zivkovic, Nebojsa Budimirovic, Nebojsa Bacanin
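The wrapper-style FS loop described above can be sketched generically: a binary search over feature masks scored by classifier accuracy. A simple random-flip local search stands in for the hybrid sand cat swarm optimizer, and an MLP stands in for the extreme learning machine, since neither is spelled out here.

```python
# Generic wrapper feature selection: metaheuristic search over feature masks.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=400, n_features=20, n_informative=6, random_state=0)

def fitness(mask):
    if not mask.any():
        return 0.0
    clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=300, random_state=0)
    return cross_val_score(clf, X[:, mask], y, cv=3).mean()

mask = rng.random(20) < 0.5                # random initial feature subset
best = fitness(mask)
for _ in range(30):                        # stand-in for the swarm search loop
    cand = mask.copy()
    cand[rng.integers(20)] ^= True         # flip one feature in/out of the subset
    f = fitness(cand)
    if f >= best:
        mask, best = cand, f
print("selected features:", np.flatnonzero(mask), "cv accuracy:", round(best, 3))
```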
Recently, research on AI-based network intrusion detection has been actively conducted. In previous studies, machine learning models such as SVM (Support Vector Machine) and RF (Random Forest) showed consistently high performance, whereas NB (Naïve Bayes) showed varying performance with large deviations. In this paper, after analyzing the causes of the NB models' varying performance reported in several studies, we measured the performance of the Gaussian NB model according to the smoothing factor, which is closely related to these causes. Furthermore, we compared the performance of the Gaussian NB model with that of the other models as a zero-day attack detection system. As a result of the experiment, the accuracy was 38.80% and 87.99% when the smoothing factor was 0 and the default value, respectively, and the highest accuracy was 94.53% when the smoothing factor was 1e-01. In the experiment, we used only some types of the attack data in the NSL-KDD dataset. The experiments show the applicability of the Gaussian NB model as a future zero-day attack detection system. In addition, it is clarified that the smoothing factor of the Gaussian NB model determines the shape of the Gaussian distribution used for the likelihood.
Authored by Kijung Bong, Jonghyun Kim
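A minimal version of the smoothing-factor sweep can be reproduced with scikit-learn, whose var_smoothing parameter (a stabiliser added to the per-feature variances) plausibly corresponds to the smoothing factor discussed. Synthetic data stands in for the NSL-KDD subset, so the numbers will differ from the paper's.

```python
# Sketch: sweep Gaussian NB's variance-smoothing factor and compare accuracy.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=2000, n_features=15, n_informative=8, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for vs in [1e-12, 1e-9, 1e-3, 1e-1, 1.0]:   # 1e-9 is scikit-learn's default
    acc = GaussianNB(var_smoothing=vs).fit(X_tr, y_tr).score(X_te, y_te)
    print(f"var_smoothing={vs:g}: accuracy={acc:.4f}")
```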
Network Intrusion Detection Systems (IDSs) have been used to increase the level of network security for many years. The main purpose of such systems is to detect and block malicious activity in the network traffic. Researchers have been improving the performance of IDS technology for decades by applying various machine-learning techniques. From the perspective of academia, obtaining a quality dataset (i.e. a sufficient amount of captured network packets that contain both malicious and normal traffic) to support machine learning approaches has always been a challenge. There are many datasets publicly available for research purposes, including NSL-KDD, KDDCUP 99, CICIDS 2017 and UNSW-NB15. However, these datasets are becoming obsolete over time and may no longer be adequate or valid to model and validate IDSs against state-of-the-art attack techniques. As attack techniques are continuously evolving, datasets used to develop and test IDSs also need to be kept up to date. Proven performance of an IDS tested on old attack patterns does not necessarily mean it will perform well against new patterns. Moreover, existing datasets may lack certain data fields or attributes necessary to analyse some of the new attack techniques. In this paper, we argue that academia needs up-to-date high-quality datasets. We compare publicly available datasets and suggest a way to provide up-to-date high-quality datasets for researchers and the security industry. The proposed solution is to utilize the network traffic captured from the Locked Shields exercise, one of the world’s largest live-fire international cyber defence exercises held annually by the NATO CCDCOE. During this three-day exercise, red team members consisting of dozens of white-hat hackers selected by the governments of over 20 participating countries attempt to infiltrate the networks of over 20 blue teams, who are tasked to defend a fictional country called Berylia. After the exercise, network packets captured from each blue team’s network are handed over to each team. However, the countries are not willing to disclose the packet capture (PCAP) files to the public, since these files contain specific information that could reveal how a particular nation might react to certain types of cyberattacks. To overcome this problem, we propose to create a dedicated virtual team, capture all the traffic from this team’s network, and disclose it to the public so that academia can use it for unclassified research and studies. In this way, the organizers of Locked Shields can effectively contribute to the advancement of future artificial intelligence (AI) enabled security solutions by providing annual datasets of up-to-date attack patterns.
Authored by Maj. Halisdemir, Hacer Karacan, Mauno Pihelgas, Toomas Lepik, Sungbaek Cho