Malware has long been a major security threat to enterprises, government organizations and end-users. Besides traditional malware, such as viruses, worms and trojans, new types of malware, such as botnets, ransomware, IoT malware and crypto-jacking, are released daily. To cope with malware threats, several measures for monitoring, detecting and preventing malware have been developed and deployed in practice, such as signature-based detection and static and dynamic file analysis. This paper proposes two malware detection models based on statistics and machine learning using opcode n-grams. The proposed models aim to achieve high detection accuracy while reducing the time required for training and detection. Experimental results show that our proposed models outperform previous proposals. Specifically, the proposed statistics-based model is very fast and achieves a high detection accuracy of 92.75\%, while the random forest-based model produces the highest detection accuracy of 96.29\%.
Authored by Xuan Hoang, Ba Nguyen, Thi Ninh
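A minimal sketch of an opcode n-gram pipeline of the kind described above, assuming disassembled opcode sequences are already available; the toy sequences, 2-gram size, and forest size are illustrative, not the authors' settings:

```python
# Sketch: opcode n-gram features feeding a random forest detector.
# Each sample is a pre-extracted opcode stream; data is illustrative.
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Toy opcode sequences (label 1 = malware, 0 = benign).
samples = [
    "mov push call pop ret", "push mov xor call ret",   # benign-like
    "xor xor jmp call jmp", "jmp xor jmp xor call",     # malicious-like
] * 25
labels = [0, 0, 1, 1] * 25

# Turn opcode streams into 2-gram count vectors.
vectorizer = CountVectorizer(analyzer="word", ngram_range=(2, 2))
X = vectorizer.fit_transform(samples)

X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.3, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
print("accuracy:", accuracy_score(y_te, clf.predict(X_te)))
```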
The term Internet of Things (IoT) describes a network of real-world items, gadgets, structures, and other things that are equipped with communication and sensors for gathering and exchanging data online. The likelihood of Android malware attacks on IoT devices has risen due to their widespread use. Regular security precautions might not be practical for these devices because they frequently have limited resources. Machine learning (ML) approaches have shown promise for detecting malware attacks in IoT environments. In this paper, several ML approaches are utilized to detect Android malware threats in IoT. The method uses a collection of Android malware samples and benign apps to build an ML model. Using the Android Malware dataset, several ML techniques, including Naive Bayes (NB), K-Nearest Neighbour (KNN), Decision Tree (DT), and Random Forest (RF), are used to detect malware in IoT. The DT model achieves the highest accuracy rate at 95\%, while the NB, KNN, and RF models have accuracy rates of 84\%, 89\%, and 92\%, respectively.
Authored by Anshika Sharma, Himanshi Babbar
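A compact sketch of the four-classifier comparison described above, using scikit-learn with synthetic features standing in for the Android Malware dataset:

```python
# Sketch: comparing the four classifiers named above on a feature matrix of
# app attributes (e.g., requested permissions). Data here is synthetic.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=42)

models = {
    "NB": GaussianNB(),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "DT": DecisionTreeClassifier(random_state=42),
    "RF": RandomForestClassifier(n_estimators=100, random_state=42),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    print(name, "accuracy:", accuracy_score(y_te, model.predict(X_te)))
```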
With the development of network technologies, network intrusion has become increasingly complex, which makes intrusion detection challenging. Traditional intrusion detection algorithms detect intrusion traffic through handcrafted traffic characteristics or machine learning. These methods are inefficient due to their dependence on manual work. Therefore, to improve both efficiency and accuracy, we propose an intrusion detection method based on deep learning. We integrate Transformer and LSTM modules into the intrusion detection model to automatically detect network intrusions. The Transformer and LSTM can capture the temporal information in the traffic data, which helps distinguish abnormal data from normal data. We conduct experiments on the publicly available NSL-KDD dataset to evaluate the performance of our proposed model. The experimental results show that the proposed model outperforms other deep learning based models.
Authored by Zhipeng Zhang, Xiaotian Si, Linghui Li, Yali Gao, Xiaoyong Li, Jie Yuan, Guoqiang Xing
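The exact architecture is not given in the abstract; the following is a hedged sketch of one plausible Transformer-plus-LSTM detector in PyTorch, with the 41-feature input chosen to match NSL-KDD's feature count and all other sizes illustrative:

```python
# Sketch: a Transformer-encoder block feeding an LSTM over per-flow feature
# sequences; sizes are illustrative, not the paper's configuration.
import torch
import torch.nn as nn

class TransformerLSTMDetector(nn.Module):
    def __init__(self, feat_dim=41, d_model=64, n_heads=4, n_classes=2):
        super().__init__()
        self.embed = nn.Linear(feat_dim, d_model)
        enc_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=2)
        self.lstm = nn.LSTM(d_model, d_model, batch_first=True)
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, x):           # x: (batch, seq_len, feat_dim)
        h = self.encoder(self.embed(x))
        _, (h_n, _) = self.lstm(h)  # take the final LSTM hidden state
        return self.head(h_n[-1])

model = TransformerLSTMDetector()
logits = model(torch.randn(8, 10, 41))  # 8 flows, 10 timesteps, 41 features
print(logits.shape)                     # torch.Size([8, 2])
```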
Network intrusion detection is a crucial task in ensuring the security and reliability of computer networks. In recent years, machine learning algorithms have shown promising results in identifying anomalous activities indicative of network intrusions. In the context of intrusion detection systems, novelty detection often receives limited attention within machine learning communities. This oversight can be attributed to the historical emphasis on optimizing performance metrics using established datasets, which may not adequately represent the evolving landscape of cyber threats. This research compares four widely used novelty detection algorithms for network intrusion detection, namely SGDOneClassSVM, Local Outlier Factor, Elliptic Envelope, and Isolation Forest. Our experiments with the UNSW-NB15 dataset show that Isolation Forest was the best-performing algorithm, with an F1-score of 0.723. The result shows that network-based intrusion detection systems remain challenging for novelty detection algorithms.
Authored by Maxmilian Halim, Baskoro Pratomo, Bagus Santoso
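A minimal sketch of the comparison above using scikit-learn's implementations of the four detectors, fit on benign-only data and scored with the anomaly-class F1 (synthetic traffic stands in for UNSW-NB15):

```python
# Sketch: the four novelty/outlier detectors compared above, fit on benign
# traffic only and evaluated on mixed traffic (synthetic, illustrative data).
import numpy as np
from sklearn.linear_model import SGDOneClassSVM
from sklearn.neighbors import LocalOutlierFactor
from sklearn.covariance import EllipticEnvelope
from sklearn.ensemble import IsolationForest
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)
X_benign = rng.normal(0, 1, size=(500, 10))            # normal traffic
X_test = np.vstack([rng.normal(0, 1, size=(200, 10)),
                    rng.normal(4, 1, size=(50, 10))])  # 50 intrusions
y_test = np.array([1] * 200 + [-1] * 50)               # -1 = anomaly

detectors = {
    "SGDOneClassSVM": SGDOneClassSVM(random_state=0),
    "LocalOutlierFactor": LocalOutlierFactor(novelty=True),
    "EllipticEnvelope": EllipticEnvelope(random_state=0),
    "IsolationForest": IsolationForest(random_state=0),
}
for name, det in detectors.items():
    det.fit(X_benign)
    pred = det.predict(X_test)       # +1 inlier, -1 outlier
    print(name, "F1 (anomaly):", f1_score(y_test, pred, pos_label=-1))
```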
In the face of a large number of network attacks, an intrusion detection system can issue early warnings, indicating the emergence of network attacks. To improve on traditional machine learning network intrusion detection models in identifying attack behavior, and to improve detection precision and accuracy, a convolutional neural network is used to construct the intrusion detection model, offering better ability to solve complex problems and better algorithmic adaptability. To solve problems such as the dimension explosion caused by the input data, a whitening PCA algorithm is used to extract data features and reduce data dimensionality. To address overfitting, a common problem of convolutional neural networks in intrusion detection, Dropout layers are added before and after the fully connected layer of the CNN, and Sigmoid is selected as the intrusion classification prediction function. This reduces overfitting, improves the robustness of the intrusion detection model, and enhances the fault tolerance and generalization ability of the model, improving its accuracy. The effectiveness of the proposed method in intrusion detection is verified by comparison and analysis of numerical examples.
Authored by Peiqing Zhang, Guangke Tian, Haiying Dong
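A hedged sketch of the described design: whitening PCA for dimensionality reduction, then a small CNN with Dropout before and after the fully connected layer and a Sigmoid prediction function; layer sizes and data are illustrative, not the paper's configuration:

```python
# Sketch: whitening PCA feeding a small 1-D CNN with Dropout around the
# fully connected layer and a Sigmoid output.
import torch
import torch.nn as nn
import numpy as np
from sklearn.decomposition import PCA

# Whitening PCA: decorrelate features and reduce dimension before the CNN.
X_raw = np.random.rand(256, 122).astype(np.float32)  # e.g., encoded flows
X = PCA(n_components=64, whiten=True).fit_transform(X_raw).astype(np.float32)

cnn = nn.Sequential(
    nn.Conv1d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool1d(2),
    nn.Flatten(),
    nn.Dropout(0.5),                # Dropout before the fully connected layer
    nn.Linear(16 * 32, 64), nn.ReLU(),
    nn.Dropout(0.5),                # ...and after it
    nn.Linear(64, 1), nn.Sigmoid()  # intrusion probability
)
probs = cnn(torch.from_numpy(X).unsqueeze(1))  # (batch, 1, 64) -> (batch, 1)
print(probs.shape)
```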
The use of computers and the internet has spread rapidly over the course of the past few decades. Every day, more and more people come to rely heavily on the internet. In the field of information security, security is becoming an increasingly important focus. It is vital to design a powerful intrusion detection system (IDS) in order to prevent computer hackers and other intruders from effectively getting into computer networks or systems. An IDS builds threat and attack detection capabilities into the computer system: an intrusion can be identified when there is a deviation between a preset pattern of intrusion and an observed pattern. An intrusion detection system is a piece of hardware (or software) that generates reports for a management station and monitors network and/or system activities for unethical behaviour or policy violations. In the current study, machine learning is suggested as a paradigm for the development of a network intrusion detection system. The results of the experiment show that the suggested strategy improves intrusion detection capability.
Authored by Ajmeera Kiran, Wilson Prakash, Anand Kumar, Likhitha, Tammana Sameeratmaja, Ungarala Charan
Intelligent environments rely heavily on the Internet of Things, which can be targeted by malicious attacks. Therefore, the autonomous capabilities of agents in intelligent health-care environments, and the agents' characteristics (accuracy, reliability, efficiency and responsiveness), should be exploited to devise an autonomous intelligent agent that can safeguard the entire environment from malicious attacks. Hence, this paper contributes to achieving this aim by selecting the eight most valuable features out of 50 features from the adopted dataset using the Chi-squared test. Then, three well-known machine learning classifiers (i.e., naive Bayes, random forest and logistic regression) are compared in classifying malicious attacks from non-attacks in an intelligent health-care environment. The highest classification accuracy achieved was for the random forest classifier (99.92\%).
Authored by Abdulkreem Alzahrani
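A minimal sketch of the pipeline above: Chi-squared selection of the eight most valuable features out of 50, followed by the three classifiers (synthetic data stands in for the adopted dataset):

```python
# Sketch: SelectKBest with the chi2 score, then the three classifiers
# compared above. chi2 requires non-negative feature values, hence the
# MinMaxScaler step; all data is synthetic and illustrative.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import MinMaxScaler

X, y = make_classification(n_samples=1000, n_features=50, random_state=1)
X = MinMaxScaler().fit_transform(X)          # chi2 needs non-negative inputs
X_sel = SelectKBest(chi2, k=8).fit_transform(X, y)

for name, clf in [("NB", GaussianNB()),
                  ("RF", RandomForestClassifier(random_state=1)),
                  ("LR", LogisticRegression(max_iter=1000))]:
    scores = cross_val_score(clf, X_sel, y, cv=5, scoring="accuracy")
    print(name, "mean accuracy:", scores.mean())
```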
Organizations strive to secure their valuable data and minimise potential damage, recognising that critical operations are susceptible to attacks. This research paper seeks to elucidate the concept of proactive cyber threat hunting. The proposed framework helps organisations check their preparedness against upcoming threats and their probable mitigation plan. While traditional threat detection methods have been implemented, they often fail to address the evolving landscape of advanced cyber threats. Organisations must adopt proactive threat-hunting strategies to safeguard business operations and to identify and mitigate unknown or undetected network threats. This research proposes a conceptual model based on a review of the literature. The proposed framework will help an organisation recover from an attack; as the recovery time is reduced, the financial loss for the company will also be reduced. Moreover, the attacker will have less time to gather data, so less confidential information will be stolen. Cybersecurity companies use proactive cyber defence strategies to reduce an attacker's time on the network. The frameworks discussed include SANS, MITRE, Hunting ELK (HELK), Logstash, the Cyber Kill Chain, the Diamond Model, and the NIST Cybersecurity Framework, all of which support a proactive approach. It is beneficial for a defensive security team to assess its capabilities to defend against Advanced Persistent Threats (APTs) and a wide range of attack vectors.
Authored by Mugdha Kulkarni, Dudhia Ashit, Chauhan Chetan
Advanced persistent threats (APTs) have novel features such as multi-stage penetration, highly-tailored intention, and evasive tactics. APTs defense requires fusing multi-dimensional cyber threat intelligence data to identify attack intentions and conducting efficient knowledge discovery strategies via data-driven machine learning to recognize entity relationships. However, data-driven machine learning lacks generalization ability on fresh or unknown samples, reducing the accuracy and practicality of the defense model. Besides, the private deployment of these APT defense models in heterogeneous environments and on various network devices requires significant investment in context awareness (such as known attack entities, continuous network states, and current security strategies). In this paper, we propose a few-shot multi-domain knowledge rearming (FMKR) scheme for context-aware defense against APTs. By completing multiple small tasks generated from different network domains with meta-learning, FMKR first trains a model with good discrimination and generalization ability for fresh and unknown APT attacks. In each FMKR task, both threat intelligence and local entities are fused into the support/query sets in meta-learning to identify possible attack stages. Second, to rearm current security strategies, a fine-tuning-based deployment mechanism is proposed to transfer learned knowledge into the student model while minimizing the defense cost. Compared to multiple model replacement strategies, FMKR provides a faster response to attack behaviors while consuming less scheduling cost. Based on feedback from multiple real users of the Industrial Internet of Things (IIoT) over 2 months, we demonstrate that the proposed scheme can improve the defense satisfaction rate.
Authored by Gaolei Li, Yuanyuan Zhao, Wenqi Wei, Yuchen Liu
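FMKR's meta-learning procedure is not spelled out in the abstract; purely to illustrate training across small per-domain tasks, here is a generic first-order (Reptile-style) meta-update sketch in PyTorch, with sample_task() as a hypothetical stand-in for task generation from one network domain:

```python
# Sketch: a generic first-order (Reptile-style) meta-update over small tasks
# drawn from different "domains" -- an illustration of meta-learning across
# tasks, not the authors' FMKR algorithm itself.
import copy
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 2))
meta_lr, inner_lr, inner_steps = 0.1, 0.01, 5
loss_fn = nn.CrossEntropyLoss()

def sample_task():
    # Hypothetical stand-in for a small task built from one domain's data.
    X = torch.randn(32, 20)
    y = torch.randint(0, 2, (32,))
    return X, y

for _ in range(100):                       # meta-iterations
    task_model = copy.deepcopy(model)
    opt = torch.optim.SGD(task_model.parameters(), lr=inner_lr)
    X, y = sample_task()
    for _ in range(inner_steps):           # adapt to the task
        opt.zero_grad()
        loss_fn(task_model(X), y).backward()
        opt.step()
    # Reptile meta-update: move meta-parameters toward adapted parameters.
    with torch.no_grad():
        for p, q in zip(model.parameters(), task_model.parameters()):
            p += meta_lr * (q - p)
```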
As the use of machine learning continues to grow in prominence, so does the need for increased knowledge of the threats posed by artificial intelligence. Now more than ever, people are worried about poison attacks, one of the many AI-related dangers that have already been made public. To fool a classifier during testing, an attacker may "poison" it by altering a portion of the dataset it utilised for training. The poison-resistance strategy presented in this article is novel. The approach uses a recently developed primitive called the keyed nonparametric normality test to determine whether or not the training input is consistent with a previously learnt distribution (D), even when the odds are stacked against the model. We use an adversary-unknown secret key in our operation. Since the key is kept hidden, an adversary cannot use it to fool the keyed nonparametric normality test into concluding that a (substantially) modified dataset really originates from the designated distribution (D).
Authored by Ramesh Saini
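The paper's construction is not reproduced here; as an illustration of the "keyed" idea, the sketch below hides a secret random projection from the adversary and applies a two-sample KS test to decide whether incoming training data matches the learnt distribution D (whether a given poison is caught depends on how the secret projection aligns with it):

```python
# Sketch: one way to realize a "keyed" nonparametric test -- project data
# through a secret random direction (the key) and compare the projection of
# incoming training data against the reference distribution D with a
# two-sample KS test. An illustration, not the paper's construction.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(7)
key = rng.normal(size=20)            # secret key: hidden projection direction

D_ref = rng.normal(0, 1, size=(5000, 20))          # samples from the learnt D
clean = rng.normal(0, 1, size=(500, 20))
poisoned = np.vstack([clean[:450], rng.normal(2, 1, size=(50, 20))])

def keyed_test(batch, alpha=0.01):
    # Flag as poisoned if the keyed projections differ from D's projections.
    stat, p = ks_2samp(D_ref @ key, batch @ key)
    return p < alpha

print("clean flagged:   ", keyed_test(clean))
print("poisoned flagged:", keyed_test(poisoned))
```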
This survey paper provides an overview of the current state of AI attacks and risks for AI security and privacy as artificial intelligence becomes more prevalent in various applications and services. The risks associated with AI attacks and security breaches are becoming increasingly apparent and cause significant financial and social losses. This paper categorizes the different types of attacks on AI models, including adversarial attacks, model inversion attacks, poisoning attacks, data poisoning attacks, data extraction attacks, and membership inference attacks. The paper also emphasizes the importance of developing secure and robust AI models to ensure the privacy and security of sensitive data. Through a systematic literature review, this survey paper comprehensively analyzes the current state of AI attacks and risks for AI security and privacy, as well as detection techniques.
Authored by Md Rahman, Aiasha Arshi, Md Hasan, Sumayia Mishu, Hossain Shahriar, Fan Wu
AI technology is widely used in different fields due to the effectiveness and accuracy of its results. This diversity of usage attracts many attackers who target AI systems to reach their goals. One of the most important and powerful attacks launched against AI models is the label-flipping attack, which allows the attacker to compromise the integrity of the dataset, degrading the accuracy of ML models or forcing specific outputs targeted by the attacker. Therefore, this paper studies the robustness of several machine learning models against targeted and non-targeted label-flipping attacks on the dataset during the training phase. It also checks the repeatability of the results obtained in the existing literature. The results are observed and explained within the cyber security domain.
Authored by Alanoud Almemari, Raviha Khan, Chan Yeun
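A minimal sketch of a non-targeted label-flipping experiment of the kind studied above: flip a growing fraction of training labels and measure the accuracy degradation (the model, flip rates, and data are illustrative):

```python
# Sketch: non-targeted label flipping -- flip a fraction of training labels
# and measure the test-accuracy drop on synthetic binary data.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=2000, n_features=20, random_state=3)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=3)

rng = np.random.default_rng(3)
for flip_rate in [0.0, 0.1, 0.2, 0.4]:
    y_poisoned = y_tr.copy()
    idx = rng.choice(len(y_tr), size=int(flip_rate * len(y_tr)), replace=False)
    y_poisoned[idx] = 1 - y_poisoned[idx]       # flip binary labels
    acc = accuracy_score(y_te, SVC().fit(X_tr, y_poisoned).predict(X_te))
    print(f"flip rate {flip_rate:.0%}: test accuracy {acc:.3f}")
```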
Federated learning is a typical distributed AI technique proposed to protect user privacy and data security; it is based on decentralized datasets and trains machine learning models by sharing model gradients rather than sharing user data. However, while this particular machine learning approach safeguards data from being shared, it also increases the likelihood that servers will be attacked. Federated learning models are sensitive to poisoning attacks, which can pose a serious threat to the global model when an attacker directly contaminates it by passing poisoned gradients. In this paper, we propose a federated learning poisoning attack method based on feature selection. Unlike traditional poisoning attacks, it modifies only the important features of the data and ignores the others, which ensures the effectiveness of the attack while remaining highly stealthy and able to bypass general defense methods. Our experiments demonstrate the feasibility of the method.
Authored by Zhengqi Liu, Ziwei Liu, Xu Yang
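A hedged, centralized stand-in for the idea described above: estimate feature importance, then perturb only the top-ranked features of the poisoned samples so the attack stays stealthy (the federated gradient-sharing machinery is omitted, and all parameters are illustrative):

```python
# Sketch: feature-selection-guided poisoning -- rank feature importance,
# then perturb only the top-k features of the poisoned samples.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=1000, n_features=30, random_state=5)

# Attacker estimates which features matter most.
importance = RandomForestClassifier(random_state=5).fit(X, y).feature_importances_
top_k = np.argsort(importance)[-5:]              # 5 most important features

# Craft poison: copy some samples, perturb only the important features,
# and flip their labels.
rng = np.random.default_rng(5)
poison_idx = rng.choice(len(X), size=50, replace=False)
X_poison = X[poison_idx].copy()
X_poison[:, top_k] += rng.normal(0, 2.0, size=(50, len(top_k)))
y_poison = 1 - y[poison_idx]

X_train = np.vstack([X, X_poison])
y_train = np.concatenate([y, y_poison])
print("training set now contains", len(X_poison), "stealthy poisoned samples")
```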
Machine learning models are susceptible to a class of attacks known as adversarial poisoning where an adversary can maliciously manipulate training data to hinder model performance or, more concerningly, insert backdoors to exploit at inference time. Many methods have been proposed to defend against adversarial poisoning by either identifying the poisoned samples to facilitate removal or developing poison agnostic training algorithms. Although effective, these proposed approaches can have unintended consequences on the model, such as worsening performance on certain data sub-populations, thus inducing a classification bias. In this work, we evaluate several adversarial poisoning defenses. In addition to traditional security metrics, i.e., robustness to poisoned samples, we also adapt a fairness metric to measure the potential undesirable discrimination of sub-populations resulting from using these defenses. Our investigation highlights that many of the evaluated defenses trade decision fairness to achieve higher adversarial poisoning robustness. Given these results, we recommend our proposed metric to be part of standard evaluations of machine learning defenses.
Authored by Nathalie Baracaldo, Farhan Ahmed, Kevin Eykholt, Yi Zhou, Shriti Priya, Taesung Lee, Swanand Kadhe, Mike Tan, Sridevi Polavaram, Sterling Suggs, Yuyang Gao, David Slater
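One simple way to instantiate the sub-population fairness measurement discussed above is a best-versus-worst group accuracy gap; the sketch below is an assumption about the form of such a metric, not the paper's exact definition:

```python
# Sketch: a sub-population fairness measure -- the gap between the best- and
# worst-group accuracy of a model after a defense is applied.
import numpy as np

def accuracy_gap(y_true, y_pred, groups):
    """Max difference in accuracy across data sub-populations."""
    accs = [np.mean(y_pred[groups == g] == y_true[groups == g])
            for g in np.unique(groups)]
    return max(accs) - min(accs)

rng = np.random.default_rng(9)
y_true = rng.integers(0, 2, 1000)
groups = rng.integers(0, 3, 1000)          # three sub-populations
y_pred = y_true.copy()
# Simulate a defense that disproportionately hurts group 2.
mask = (groups == 2) & (rng.random(1000) < 0.3)
y_pred[mask] = 1 - y_pred[mask]
print("accuracy gap across groups:", accuracy_gap(y_true, y_pred, groups))
```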
Wireless Sensor Networks (WSNs) have gained prominence in technology for diverse applications, such as environmental monitoring, health care, smart agriculture, and industrial automation. Comprising small, low-power sensor nodes that sense and collect data from the environment, process it locally, and communicate wirelessly with a central sink or gateway, WSNs face challenges related to limited energy resources, communication constraints, and data processing requirements. This paper presents a comprehensive review of the current state of research in WSNs, focusing on aspects such as network architecture, communication protocols, energy management techniques, data processing and fusion, security and privacy, and applications. Existing solutions are critically analysed regarding their strengths, weaknesses, research gaps, and future directions for WSNs.
Authored by Santosh Jaiswal, Anshu Dwivedi
Theoretical Limits of Provable Security Against Model Extraction by Efficient Observational Defenses
Can we hope to provide provable security against model extraction attacks? As a step towards a theoretical study of this question, we unify and abstract a wide range of “observational” model extraction defenses (OMEDs) - roughly, those that attempt to detect model extraction by analyzing the distribution over the adversary's queries. To accompany the abstract OMED, we define the notion of complete OMEDs - when benign clients can freely interact with the model - and sound OMEDs - when adversarial clients are caught and prevented from reverse engineering the model. Our formalism facilitates a simple argument for obtaining provable security against model extraction by complete and sound OMEDs, using (average-case) hardness assumptions for PAC-learning, in a way that abstracts current techniques in the prior literature. The main result of this work establishes a partial computational incompleteness theorem for the OMED: any efficient OMED for a machine learning model computable by a polynomial size decision tree that satisfies a basic form of completeness cannot satisfy soundness, unless the subexponential Learning Parity with Noise (LPN) assumption does not hold. To prove the incompleteness theorem, we introduce a class of model extraction attacks called natural Covert Learning attacks based on a connection to the Covert Learning model of Canetti and Karchmer (TCC '21), and show that such attacks circumvent any defense within our abstract mechanism in a black-box, nonadaptive way. As a further technical contribution, we extend the Covert Learning algorithm of Canetti and Karchmer to work over any “concise” product distribution (albeit for juntas of a logarithmic number of variables rather than polynomial size decision trees), by showing that the technique of learning with a distributional inverter of Binnendyk et al. (ALT '22) remains viable in the Covert Learning setting.
Authored by Ari Karchmer
Vehicular Ad Hoc Networks (VANETs) allow the nodes of individual vehicles to exchange data while driving and traveling on the roadside. A VANET-connected vehicle can send and receive data such as requests for emergency assistance, current traffic conditions, etc., so VANET support for vehicle communication is much needed. The routing method relies on trust-based features at a specific node to ensure safe routing. When malicious activity is uncovered, intrusion detection systems (IDS) are crucial tools for mitigating the damage. Collaboration between vehicles in a VANET enhances detection precision by spreading information about interactions across the nodes. This makes a distributed machine learning system feasible, scalable, and usable for creating VANET-based cooperative detection techniques. Privacy considerations are a major impediment to collaborative learning due to the data flow between nodes: a malicious node can learn private details about other nodes by observing them. This study proposes a cooperative IDS for VANETs that safeguards the data used by machine learning. In the intrusion detection phase, the selected optimal characteristics are used to detect network intrusion via a hybrid Deep Neural Network and Bidirectional Long Short-Term Memory approach. A trust-based routing protocol then performs the intrusion prevention process, stopping the hostile node by selecting the most efficient routing path possible.
Authored by Raghunath Kawale, Ritesh Patil, Lalit Patil
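A hedged sketch of a hybrid Deep Neural Network plus Bidirectional LSTM classifier of the kind named above, in PyTorch; feature and layer sizes are illustrative:

```python
# Sketch: a BiLSTM summarizes the traffic sequence and dense layers
# classify it; sizes are illustrative, not the paper's configuration.
import torch
import torch.nn as nn

class DNNBiLSTM(nn.Module):
    def __init__(self, feat_dim=30, hidden=64, n_classes=2):
        super().__init__()
        self.bilstm = nn.LSTM(feat_dim, hidden, batch_first=True,
                              bidirectional=True)
        self.dnn = nn.Sequential(
            nn.Linear(2 * hidden, 64), nn.ReLU(),
            nn.Linear(64, n_classes))

    def forward(self, x):                  # x: (batch, seq_len, feat_dim)
        out, _ = self.bilstm(x)
        return self.dnn(out[:, -1, :])     # classify from the last timestep

model = DNNBiLSTM()
print(model(torch.randn(4, 12, 30)).shape)   # torch.Size([4, 2])
```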
Entering the critical year of the 14th Five-Year Plan, China's information security industry has entered a new stage of development. With the increasing importance of information security, industrial development has received attention, but the data on China's information security industry are seriously fragmented, and there are few corresponding summaries and predictions. To predict the development of the industry, this article studies the intelligent prediction of information security industry data based on machine learning and a new adaptive weighted fusion, and develops a system based on the research results to promote industry development. First, industry data are collected, filtered, integrated, and preprocessed. Based on the characteristics of the data, machine learning algorithms such as linear regression, ridge regression, logistic regression, polynomial regression and random forest are selected to predict the data, and the corresponding optimal parameters are found and set during model creation. An improved adaptive weighted fusion model based on model prediction performance is also proposed. Its principle is to adaptively select the models with the lowest mean square error (MSE) values for fusion, based on the real-time prediction performance of multiple machine learning models, with the weights also calculated adaptively to improve prediction accuracy. Second, using technologies such as Matplotlib and Pyecharts to visualize the data and predicted results, it was found that the development trend of the information security industry is closely related to factors such as national information security laws and regulations, relations between countries, and social emergencies. According to the predicted results, both industry input and output have shown an upward trend in recent years. In the future, China's information security industry is expected to maintain stable and rapid growth driven by the domestic market.
Authored by Lijiao Ding, Ting Wang, Jinze Sun, Changqiang Jing
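A minimal sketch of adaptive weighted fusion in the spirit described above, here using inverse-MSE weights normalized to sum to one (the paper's exact weighting rule may differ; models and data are illustrative):

```python
# Sketch: adaptive weighted fusion -- each model's weight is derived from its
# validation MSE, so better predictors dominate the fused forecast.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error

X, y = make_regression(n_samples=500, n_features=8, noise=10, random_state=2)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=2)

models = [LinearRegression(), Ridge(), RandomForestRegressor(random_state=2)]
preds, mses = [], []
for m in models:
    p = m.fit(X_tr, y_tr).predict(X_val)
    preds.append(p)
    mses.append(mean_squared_error(y_val, p))

# Adaptive weights: inverse-MSE, normalized to sum to 1.
inv = 1.0 / np.array(mses)
weights = inv / inv.sum()
fused = np.average(np.stack(preds), axis=0, weights=weights)
print("per-model MSE:", [round(m, 1) for m in mses])
print("fused MSE:    ", round(mean_squared_error(y_val, fused), 1))
```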
The term "Internet of things (IoT) security" refers to the software industry concerned with protecting the IoT and connected devices. Internet of Things (IoT) is a network of devices connected with computers, sensors, actuators, or users. In IoT, each device has a distinct identity and is required to automatically transmit data over the network. Allowing computers to connect to the Internet exposes them to a number of major vulnerabilities if they are not properly secured. IoT security concerns must be monitored and analyzed to ensure the proper working of IoT models. Protecting personal safety while ensuring accessibility is the main objective of IoT security. This article has surveyed some of the methods and techniques used to secure data. Accuracy, precision, recall, f1 score, and area under the Receiver Operating Characteristic Curve are the assessment metrics utilized to compare the performance of the existing techniques. Further the utilization of machine learning algorithms like Decision Tree, Random Forest, and ANN tests have resulted in an accuracy of 99.4\%. Despite the results, Random Forest (RF) performs significantly better. This study will help to gain more knowledge on the smart home automation and its security challenges.
Authored by Robinson Joel, G. Manikandan, G Bhuvaneswari
Machine-learning-based approaches have emerged as viable solutions for automatic detection of container-related cyber attacks. Choosing the best anomaly detection algorithms to identify such cyber attacks can be difficult in practice, and it becomes even more difficult for zero-day attacks for which no prior attack data has been labeled. In this paper, we aim to address this issue by adopting an ensemble learning strategy: a combination of different base anomaly detectors built using conventional machine learning algorithms. The learning strategy provides a highly accurate zero-day container attack detection. We first architect a testbed to facilitate data collection and storage, model training and inference. We then perform two case studies of cyber attacks. We show that, for both case studies, despite the fact that individual base detector performance varies greatly between model types and model hyperparameters, the ensemble learning can consistently produce detection results that are close to the best base anomaly detectors. Additionally, we demonstrate that the detection performance of the resulting ensemble models is on average comparable to the best-performing deep learning anomaly detection approaches, but with much higher robustness, shorter training time, and much less training data. This makes the ensemble learning approach very appealing for practical real-time cyber attack detection scenarios with limited training data.
Authored by Shuai Guo, Thanikesavan Sivanthi, Philipp Sommer, Maëlle Kabir-Querrec, Nicolas Coppik, Eshaan Mudgal, Alessandro Rossotti
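A compact sketch of the ensemble strategy above: several base anomaly detectors are trained on benign data only, their scores rank-normalized, and the ensemble averages them (detector choices and data are illustrative):

```python
# Sketch: an ensemble of base anomaly detectors -- rank-normalize each
# detector's score and average, so no labeled attack data is required.
import numpy as np
from scipy.stats import rankdata
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(4)
X_train = rng.normal(0, 1, size=(1000, 12))            # benign container metrics
X_test = np.vstack([rng.normal(0, 1, size=(190, 12)),
                    rng.normal(5, 1, size=(10, 12))])  # 10 attacks

detectors = [IsolationForest(random_state=4),
             LocalOutlierFactor(novelty=True),
             OneClassSVM(nu=0.05)]

scores = []
for det in detectors:
    det.fit(X_train)
    s = -det.score_samples(X_test)       # higher = more anomalous
    scores.append(rankdata(s) / len(s))  # rank-normalize to [0, 1]

ensemble_score = np.mean(scores, axis=0)
print("top-10 most anomalous indices:", np.argsort(ensemble_score)[-10:])
```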
Deploying Connected and Automated Vehicles (CAVs) on top of 5G and Beyond networks (5GB) makes them vulnerable to increasing vectors of security and privacy attacks. In this context, a wide range of advanced machine/deep learning-based solutions have been designed to accurately detect security attacks. Specifically, supervised learning techniques have been widely applied to train attack detection models. However, the main limitation of such solutions is their inability to detect attacks that differ from those seen during the training phase, i.e., new or zero-day attacks. Moreover, training the detection model requires significant data collection and labeling, which increases the communication overhead and raises privacy concerns. To address the aforementioned limits, we propose in this paper a novel detection mechanism that leverages the ability of the deep auto-encoder method to detect attacks relying only on the benign network traffic pattern. Using federated learning, the proposed intrusion detection system can be trained with large and diverse benign network traffic, while preserving the CAVs' privacy and minimizing the communication overhead. The in-depth experiment on a recent network traffic dataset shows that the proposed system achieved a high detection rate while minimizing the false positive rate and the detection delay.
Authored by Abdelaziz Korba, Abdelwahab Boualouache, Bouziane Brik, Rabah Rahal, Yacine Ghamri-Doudane, Sidi Senouci
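A centralized sketch of the core detection idea above: an auto-encoder trained only on benign traffic flags inputs whose reconstruction error exceeds a threshold (the federated training loop and dataset are omitted; sizes and the 3-sigma threshold are illustrative):

```python
# Sketch: auto-encoder anomaly detection -- train on benign traffic only,
# then flag inputs with large reconstruction error.
import torch
import torch.nn as nn

ae = nn.Sequential(
    nn.Linear(30, 16), nn.ReLU(), nn.Linear(16, 8),              # encoder
    nn.ReLU(), nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 30))   # decoder
opt = torch.optim.Adam(ae.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

X_benign = torch.randn(2048, 30)          # benign network-traffic features
for _ in range(200):                      # train to reconstruct benign traffic
    opt.zero_grad()
    loss_fn(ae(X_benign), X_benign).backward()
    opt.step()

with torch.no_grad():
    err = ((ae(X_benign) - X_benign) ** 2).mean(dim=1)
    threshold = err.mean() + 3 * err.std()   # e.g., 3-sigma on benign error
    attack = torch.randn(1, 30) + 5.0        # out-of-distribution sample
    attack_err = ((ae(attack) - attack) ** 2).mean()
    print("flagged as attack:", bool(attack_err > threshold))
```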
An intrusion detection system (IDS) is a crucial software or hardware application that employs security mechanisms to identify suspicious activity in a system or network. According to the detection technique, IDS are divided into two types, namely signature-based and anomaly-based. Signature-based detection is said to be incapable of handling zero-day attacks, while anomaly-based detection is able to handle them. Machine learning techniques play a vital role in the development of IDS. There are differences of opinion regarding the most optimal algorithm for IDS classification in several previous studies, such as Random Forest, J48, and AdaBoost. Therefore, this study evaluates the performance of the three algorithm models, using the NSL-KDD and UNSW-NB15 datasets used in previous studies. Empirical results demonstrate that utilizing AdaBoost+J48 with NSL-KDD achieves an accuracy of 99.86\%, along with precision, recall, and F1-score rates of 99.9\%. These results surpass previous studies using AdaBoost+Random Tree, which reported an accuracy of 98.45\%. Furthermore, this research explores the effectiveness of anomaly-based systems in dealing with zero-day attacks. Remarkably, the results show that anomaly-based systems perform admirably in such scenarios. For instance, employing Random Forest with the UNSW-NB15 dataset yielded the highest performance, with an accuracy of 99.81\%.
Authored by Nurul Fauzi, Fazmah Yulianto, Hilal Nuha
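A scikit-learn sketch of the AdaBoost+J48 combination evaluated above, approximating Weka's J48 (C4.5) with a CART decision tree on synthetic data:

```python
# Sketch: AdaBoost over decision trees, a scikit-learn analogue of
# AdaBoost+J48; Weka's J48 (C4.5) is approximated by a CART tree here.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

X, y = make_classification(n_samples=2000, n_features=25, random_state=7)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=7)

clf = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_depth=3),  # `base_estimator` pre-1.2
    n_estimators=100, random_state=7).fit(X_tr, y_tr)
print(classification_report(y_te, clf.predict(X_te), digits=4))
```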
Android is the most popular smartphone operating system, with a market share of 68.6\% as of April 2023, making it a tempting target for cybercriminals. This research aims to contribute to the ongoing efforts to enhance the security of Android applications and protect users from the ever-increasing sophistication of malware attacks. Zero-day attacks pose a significant challenge to traditional signature-based malware detection systems, as they exploit previously unknown vulnerabilities. In this context, static analysis can be an encouraging approach for detecting malware in Android applications, leveraging machine learning (ML) and deep learning (DL)-based models. In this research, we have used single features and combinations of features extracted from the static properties of mobile apps as inputs to the ML and DL based models, enabling them to learn and differentiate between normal and malicious behavior. We have evaluated the performance of those models on a diverse dataset (DREBIN) comprising real-world Android application features, including both benign and zero-day malware samples. We achieved an F1 score of 96\% with the multi-view model (DL model) in the zero-day malware scenario, so this research can be helpful for mitigating the risk of unknown malware.
Authored by Jabunnesa Sara, Shohrab Hossain
The most serious risk to network security can arise from a zero-day attack. Zero-day attacks are challenging to identify as they exhibit unseen behavior. Intrusion detection systems (IDS) have gained considerable attention as an effective tool for detecting such attacks; they are deployed in network systems to monitor the network and detect any potential threats. Recently, many machine learning (ML) and deep learning (DL) techniques have been employed in intrusion detection systems, and it has been found that these techniques can detect zero-day attacks efficiently. This paper provides an overview of the background, importance, and different types of ML and DL techniques adopted for detecting zero-day attacks. It then conducts a comprehensive review of recent ML and DL techniques for detecting zero-day attacks and discusses the associated issues. Further, we analyze the results and highlight the research challenges and future scope for improving the ML and DL approaches for zero-day attack detection.
Authored by Nowsheen Mearaj, Arif Wani
Explainable AI (XAI) techniques are used for understanding the internals of AI algorithms and how they produce a particular result. Several software packages implementing XAI techniques are available; however, their use requires deep knowledge of the AI algorithms, and their output is not intuitive for non-experts. In this paper we present a framework, XAI4PublicPolicy, that provides customizable and reusable dashboards for XAI, ready to be used both by data scientists and by general users with no coding. Models and data sets are selected by dragging and dropping from repositories, while dashboards are generated by selecting the type of charts. The framework can work with structured data and images in different formats. This XAI framework was developed and is being used in the context of the AI4PublicPolicy European project for explaining the decisions made by machine learning models applied to the implementation of public policies.
Authored by Marta Martínez, Ainhoa Azqueta-Alzúaz
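As an example of the kind of XAI computation such dashboards can wrap, the sketch below computes SHAP values for a tree model with the shap package (one widely used XAI library; the data and model are illustrative and unrelated to the project's policy models):

```python
# Sketch: SHAP feature attributions for a tree model, the sort of
# explanation a no-code XAI dashboard can render as a chart.
import shap
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=300, n_features=10, random_state=6)
model = RandomForestRegressor(random_state=6).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)   # (n_samples, n_features) contributions
shap.summary_plot(shap_values, X)        # global feature-importance view
```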