Malware poses a significant threat to global cy-bersecurity, with machine learning emerging as the primary method for its detection and analysis. However, the opaque nature of machine learning s decision-making process of-ten leads to confusion among stakeholders, undermining their confidence in the detection outcomes. To enhance the trustworthiness of malware detection, Explainable Artificial Intelligence (XAI) is employed to offer transparent and comprehensible explanations of the detection mechanisms, which enable stakeholders to gain a deeper understanding of detection mechanisms and assist in developing defensive strategies. Despite the recent XAI advancements, several challenges remain unaddressed. In this paper, we explore the specific obstacles encountered in applying XAI to malware detection and analysis, aiming to provide a road map for future research in this critical domain.
Authored by L. Rui, Olga Gadyatskaya
Security vulnerabilities are weaknesses of software due for instance to design flaws or implementation bugs that can be exploited and lead to potentially devastating security breaches. Traditionally, static code analysis is recognized as effective in the detection of software security vulnerabilities but at the expense of a high human effort required for checking a large number of produced false positive cases. Deep-learning methods have been recently proposed to overcome such a limitation of static code analysis and detect the vulnerable code by using vulnerability-related patterns learned from large source code datasets. However, the use of these methods for localizing the causes of the vulnerability in the source code, i.e., localize the statements that contain the bugs, has not been extensively explored. In this work, we experiment the use of deep-learning and explainability methods for detecting and localizing vulnerability-related statements in code fragments (named snippets). We aim at understanding if the code features adopted by deep-learning methods to identify vulnerable code snippets can also support the developers in debugging the code, thus localizing the vulnerability’s cause Our work shows that deep-learning methods can be effective in detecting the vulnerable code snippets, under certain conditions, but the code features that such methods use can only partially face the actual causes of the vulnerabilities in the code.CCS Concepts• Security and privacy \rightarrow Vulnerability management; Systems security; Malware and its mitigation; \cdot Software and its engineering \rightarrow Software testing and debugging.
Authored by Alessandro Marchetto
Cybersecurity is an increasingly critical aspect of modern society, with cyber attacks becoming more sophisticated and frequent. Artificial intelligence (AI) and neural network models have emerged as promising tools for improving cyber defense. This paper explores the potential of AI and neural network models in cybersecurity, focusing on their applications in intrusion detection, malware detection, and vulnerability analysis. Intruder detection, or "intrusion detection," is the process of identifying Invasion of Privacy to a computer system. AI-based security systems that can spot intrusions (IDS) use AI-powered packet-level network traffic analysis and intrusion detection patterns to signify an assault. Neural network models can also be used to improve IDS accuracy by modeling the behavior of legitimate users and detecting anomalies. Malware detection involves identifying malicious software on a computer system. AI-based malware machine-learning algorithms are used by detecting systems to assess the behavior of software and recognize patterns that indicate malicious activity. Neural network models can also serve to hone the precision of malware identification by modeling the behavior of known malware and identifying new variants. Vulnerability analysis involves identifying weaknesses in a computer system that could be exploited by attackers. AI-based vulnerability analysis systems use machine learning algorithms to analyze system configurations and identify potential vulnerabilities. Neural network models can also be used to improve the accuracy of vulnerability analysis by modeling the behavior of known vulnerabilities and identifying new ones. Overall, AI and neural network models have significant potential in cybersecurity. By improving intrusion detection, malware detection, and vulnerability analysis, they can help organizations better defend against cyber attacks. However, these technologies also present challenges, including a lack of understanding of the importance of data in machine learning and the potential for attackers to use AI themselves. As such, careful consideration is necessary when implementing AI and neural network models in cybersecurity.
Authored by D. Sugumaran, Y. John, Jansi C, Kireet Joshi, G. Manikandan, Geethamanikanta Jakka
Cyber security is a critical problem that causes data breaches, identity theft, and harm to millions of people and businesses. As technology evolves, new security threats emerge as a result of a dearth of cyber security specialists equipped with up-to-date information. It is hard for security firms to prevent cyber-attacks without the cooperation of senior professionals. However, by depending on artificial intelligence to combat cyber-attacks, the strain on specialists can be lessened. as the use of Artificial Intelligence (AI) can improve Machine Learning (ML) approaches that can mine data to detect the sources of cyberattacks or perhaps prevent them as an AI method, it enables and facilitates malware detection by utilizing data from prior cyber-attacks in a variety of methods, including behavior analysis, risk assessment, bot blocking, endpoint protection, and security task automation. However, deploying AI may present new threats, therefore cyber security experts must establish a balance between risk and benefit. While AI models can aid cybersecurity experts in making decisions and forming conclusions, they will never be able to make all cybersecurity decisions and judgments.
Authored by Safiya Alawadhi, Areej Zowayed, Hamad Abdulla, Moaiad Khder, Basel Ali
This paper focuses on the challenges and issues of detecting malware in to-day s world where cyberattacks continue to grow in number and complexity. The paper reviews current trends and technologies in malware detection and the limitations of existing detection methods such as signaturebased detection and heuristic analysis. The emergence of new types of malware, such as file-less malware, is also discussed, along with the need for real-time detection and response. The research methodology used in this paper is presented, which includes a literature review of recent papers on the topic, keyword searches, and analysis and representation methods used in each study. In this paper, the authors aim to address the key issues and challenges in detecting malware today, the current trends and technologies in malware detection, and the limitations of existing methods. They also explore emerging threats and trends in malware attacks and highlight future directions for research and development in the field. To achieve this, the authors use a research methodology that involves a literature review of recent papers related to the topic. They focus on detecting and analyzing methods, as well as representation and ex-traction methods used in each study. Finally, they classify the literature re-view, and through reading and criticism, highlight future trends and problems in the field of malware detection.
Authored by Anas AliAhmad, Derar Eleyan, Amna Eleyan, Tarek Bejaoui, Mohamad Zolkipli, Mohammed Al-Khalidi
With the continuous improvement of the current level of information technology, the malicious software produced by attackers is also becoming more complex. It s difficult for computer users to protect themselves against malicious software attacks. Malicious software can steal the user s privacy, damage the user s computer system, and often cause serious consequences and huge economic losses to the user or the organization. Hence, this research study presents a novel deep learning-based malware detection scheme considering packers and encryption. The proposed model has 2 aspects of innovations: (1) Generation steps of the packer malware is analyzed. Packing involves adding code to the program to be protected, and original program is compressed and encrypted during the packing process. By understanding this step, the analysis of the software will be efficient. (2) The deep learning based detection model is designed. Through the experiment compared with the latest methods, the performance is proven to be efficient.
Authored by Weixiang Cai
Malware detection constitutes a fundamental step in safe and secure computational systems, including industrial systems and the Internet of Things (IoT). Modern malware detection is based on machine learning methods that classify software samples as malware or benign, based on features that are extracted from the samples through static and/or dynamic analysis. State-of-the-art malware detection systems employ Deep Neural Networks (DNNs) whose accuracy increases as more data are analyzed and exploited. However, organizations also have significant privacy constraints and concerns which limit the data that they share with centralized security providers or other organizations, despite the malware detection accuracy improvements that can be achieved with the aggregated data. In this paper we investigate the effectiveness of federated learning (FL) methods for developing and distributing aggregated DNNs among autonomous interconnected organizations. We analyze a solution where multiple organizations use independent malware analysis platforms as part of their Security Operations Centers (SOCs) and train their own local DNN model on their own private data. Exploiting cross-silo FL, we combine these DNNs into a global one which is then distributed to all organizations, achieving the distribution of combined malware detection models using data from multiple sources without sample or feature sharing. We evaluate the approach using the EMBER benchmark dataset and demonstrate that our approach effectively reaches the same accuracy as the non-federated centralized DNN model, which is above 93\%.
Authored by Dimitrios Serpanos, Georgios Xenos
IBMD(Intelligent Behavior-Based Malware Detection) aims to detect and mitigate malicious activities in cloud computing environments by analyzing the behavior of cloud resources, such as virtual machines, containers, and applications.The system uses different machine learning methods like deep learning and artificial neural networks, to analyze the behavior of cloud resources and detect anomalies that may indicate malicious activity. The IBMD system can also monitor and accumulate the data from various resources, such as network traffic and system logs, to provide a comprehensive view of the behavior of cloud resources. IBMD is designed to operate in a cloud computing environment, taking advantage of the scalability and flexibility of the cloud to detect malware and respond to security incidents. The system can also be integrated with existing security tools and services, such as firewalls and intrusion detection systems, to provide a comprehensive security solution for cloud computing environments.
Authored by Jibu Samuel, Mahima Jacob, Melvin Roy, Sayoojya M, Anu Joy
In today s digital landscape, the task of identifying various types of malicious files has become progressively challenging. Modern malware exhibits increasing sophistication, often evading conventional anti-malware solutions. The scarcity of data on distinct and novel malware strains further complicates effective detection. In response, this research presents an innovative approach to malware detection, specifically targeting multiple distinct categories of malicious software. In the initial stage, Principal Component Analysis (PCA) is performed and achieved a remarkable accuracy rate of 95.39\%. Our methodology revolves around leveraging features commonly accessible from user-uploaded files, aligning with the contextual behavior of typical users seeking to identify malignancy. This underscores the efficacy of the unique featurebased detection strategy and its potential to enhance contemporary malware identification methodologies. The outcomes achieved attest to the significance of addressing emerging malware threats through inventive analytical paradigms.
Authored by Sanyam Jain, Sumaiya Thaseen
Malwares have been being a major security threats to enterprises, government organizations and end-users. Beside traditional malwares, such as viruses, worms and trojans, new types of malwares, such as botnets, ransomwares, IoT malwares and crypto-jackings are released daily. To cope with malware threats, several measures for monitoring, detecting and preventing malwares have been developed and deployed in practice, such as signature-based detection, static and dynamic file analysis. This paper proposes 2 malware detection models based on statistics and machine learning using opcode n-grams. The proposed models aim at achieving high detection accuracy as well as reducing the amount of time for training and detection. Experimental results show that our proposed models give better performance measures than previous proposals. Specifically, the proposed statistics-based model is very fast and it achieves a high detection accuracy of 92.75\% and the random forest-based model produces the highest detection accuracy of 96.29\%.
Authored by Xuan Hoang, Ba Nguyen, Thi Ninh
With the rapid development of science and technology, information security issues have been attracting more attention. According to statistics, tens of millions of computers around the world are infected by malicious software (Malware) every year, causing losses of up to several USD billion. Malware uses various methods to invade computer systems, including viruses, worms, Trojan horses, and others and exploit network vulnerabilities for intrusion. Most intrusion detection approaches employ behavioral analysis techniques to analyze malware threats with packet collection and filtering, feature engineering, and attribute comparison. These approaches are difficult to differentiate malicious traffic from legitimate traffic. Malware detection and classification are conducted with deep learning and graph neural networks (GNNs) to learn the characteristics of malware. In this study, a GNN-based model is proposed for malware detection and classification on a renewable energy management platform. It uses GNN to analyze malware with Cuckoo Sandbox malware records for malware detection and classification. To evaluate the effectiveness of the GNN-based model, the CIC-AndMal2017 dataset is used to examine its accuracy, precision, recall, and ROC curve. Experimental results show that the GNN-based model can reach better results.
Authored by Hsiao-Chung Lin, Ping Wang, Wen-Hui Lin, Yu-Hsiang Lin, Jia-Hong Chen
The term Internet of Things(IoT) describes a network of real-world items, gadgets, structures, and other things that are equipped with communication and sensors for gathering and exchanging data online. The likelihood of Android malware attacks on IoT devices has risen due to their widespread use. Regular security precautions might not be practical for these devices because they frequently have limited resources. The detection of malware attacks on IoT environments has found hope in ML approaches. In this paper, some machine learning(ML) approaches have been utilized to detect IoT Android malware threats. This method uses a collection of Android malware samples and good apps to build an ML model. Using the Android Malware dataset, many ML techniques, including Naive Bayes (NB), K-Nearest Neighbour (KNN), Decision Tree (DT), and Random Forest (RF), are used to detect malware in IoT. The accuracy of the DT model is 95\%, which is the highest accuracy rate, while that of the NB, KNN, and RF models have accuracy rates of 84\%, 89\%, and 92\%, respectively.
Authored by Anshika Sharma, Himanshi Babbar
The motive of this paper is to detect the malware from computer systems in order to protect the confidential data, information, documents etc. from being accessing. The detection of malware is necessary because it steals the data from that system which is affected by malware. There are different malware detection techniques (cloud-based, signature-based, Iot-based, heuristic based etc.) and different malware detection tools (static, dynamic) area used in this paper to detect new generation malware. It is necessary to detect malware because the attacks of malware badly affect our economy and no one sector is untouched by it. The detection of malware is compulsory because it exploits goal devices vulnerabilities, along with a Trojan horse in valid software e.g. browser that may be hijacked. There are also different tools used for detection of malware like static or dynamic that we see in this paper. We also see different methods of detection of malware in android.
Authored by P.A. Selvaraj, M. Jagadeesan, T.M. Saravanan, Aniket Kumar, Anshu Kumar, Mayank Singh
In cybersecurity, Intrusion Detection Systems (IDS) protect against emerging cyber threats. Combining signature-based and anomaly-based detection methods may improve IDS accuracy and reduce false positives. This research analyzes hybrid intrusion detection systems signature-based components performance and limitations. The paper begins with a detailed history of signature-based detection methods responding to changing threat situations. This research analyzes signature databases to determine their capacity to identify and guard against current threats and cover known vulnerabilities. The paper also examines the intricate relationship between signature-based detection and anomalybased techniques in hybrid IDS systems. This investigation examines how these two methodologies work together to uncover old and new attack strategies, focusing on zero-day vulnerabilities and polymorphic malware. A diverse dataset of network traffic and attack scenarios is used to test. Detection, false positives, and response times assess signature-based components. Comparative examinations investigate how signature-based detection affects system accuracy and efficiency. This research illuminates the role of signature-based aspects in hybrid intrusion detection systems. This study recommends integrating signature-based detection techniques with anomaly-based methods to improve hybrid intrusion detection systems (IDS) at recognizing and mitigating various cyber threats.
Authored by Moorthy Agoramoorthy, Ahamed Ali, D. Sujatha, Michael F, G. Ramesh
Past Advanced Persistent Threat (APT) attacks on Industrial Internet-of-Things (IIoT), such as the 2016 Ukrainian power grid attack and the 2017 Saudi petrochemical plant attack, have shown the disruptive effects of APT campaigns while new IIoT malware continue to be developed by APT groups. Existing APT detection systems have been designed using cyberattack TTPs modelled for enterprise IT networks and leverage specific data sources (e.g., Linux audit logs, Windows event logs) which are not found on ICS devices. In this work, we propose RAPTOR, a system to detect APT campaigns in IIoT. Using cyberattack TTPs modelled for ICS/OT environments and focusing on ‘invariant’ attack phases, RAPTOR detects and correlates various APT attack stages in IIoT leveraging data which can be readily collected from ICS devices/networks (packet traffic traces, IDS alerts). Subsequently, it constructs a high-level APT campaign graph which can be used by cybersecurity analysts towards attack analysis and mitigation. A performance evaluation of RAPTOR’s APT attack-stage detection modules shows high precision and low false positive/negative rates. We also show that RAPTOR is able to construct the APT campaign graph for APT attacks (modelled after real-world attacks on ICS/OT infrastructure) executed on our IIoT testbed.
Authored by Ayush Kumar, Vrizlynn Thing
As a recent breakthrough in generative artificial intelligence, ChatGPT is capable of creating new data, images, audio, or text content based on user context. In the field of cybersecurity, it provides generative automated AI services such as network detection, malware protection, and privacy compliance monitoring. However, it also faces significant security risks during its design, training, and operation phases, including privacy breaches, content abuse, prompt word attacks, model stealing attacks, abnormal structure attacks, data poisoning attacks, model hijacking attacks, and sponge attacks. This paper starts from the risks and events that ChatGPT has recently faced, proposes a framework for analyzing cybersecurity in cyberspace, and envisions adversarial models and systems. It puts forward a new evolutionary relationship between attackers and defenders using ChatGPT to enhance their own capabilities in a changing environment and predicts the future development of ChatGPT from a security perspective.
Authored by Chunhui Hu, Jianfeng Chen
Android is the most popular smartphone operating system with a market share of 68.6\% in Apr 2023. Hence, Android is a more tempting target for cybercriminals. This research aims at contributing to the ongoing efforts to enhance the security of Android applications and protect users from the ever-increasing sophistication of malware attacks. Zero-day attacks pose a significant challenge to traditional signature-based malware detection systems, as they exploit vulnerabilities that are unknown to all. In this context, static analysis can be an encouraging approach for detecting malware in Android applications, leveraging machine learning (ML) and deep learning (DL)-based models. In this research, we have used single feature and combination of features extracted from the static properties of mobile apps as input(s) to the ML and DL based models, enabling it to learn and differentiate between normal and malicious behavior. We have evaluated the performance of those models based on a diverse dataset (DREBIN) comprising of real-world Android applications features, including both benign and zero-day malware samples. We have achieved F1 Score 96\% from the multi-view model (DL Model) in case of Zero-day malware scenario. So, this research can be helpful for mitigating the risk of unknown malware.
Authored by Jabunnesa Sara, Shohrab Hossain
The Internet as a whole is a large network of interconnected computer networks and their supporting infrastructure which is divided into 3 parts. The web is a list of websites that can be accessed using search engines like Google, Firefox, and others, this is called as Surface Web. The Internet’s layers stretch well beyond the surface material that many people can quickly reach in their everyday searches. The Deep Web material, which cannot be indexed by regular search engines like Google, is a subset of the internet. The Dark Web, which extends to the deepest reaches of the Deep Web, contains data that has been purposefully hidden. Tor may be used to access the dark web. Tor employs a network of volunteer devices to route users web traffic via a succession of other users computers, making it impossible to track it back to the source. We will analyze and include results about the Dark Web’s presence in various spheres of society in this paper. Further we take dive into about the Tor metrics how the relay list is revised after users are determined based on client requests for directories (using TOR metrics). Other way we can estimate the number of users in anonymous networks. This analysis discusses the purposes for which it is frequently used, with a focus on cybercrime, as well as how law enforcement plays the adversary position. The analysis discusses these secret Dark Web markets, what services they provide, and the events that take place there such as cybercrime, illegal money transfers, sensitive communication etc. Before knowing anything about Dark Web, how a rookie can make mistake of letting any threat or malware into his system. This problem can be tackled by knowing whether to use Windows, or any other OS, or any other service like VPN to enter Dark world. The paper also goes into the agenda of how much of illegal community is involved from India in these markets and what impact does COVID-19 had on Dark Web markets. Our analysis is carried out by searching scholarly journal databases for current literature. By acting as a reference guide and presenting a research agenda, it contributes to the field of the dark web in an efficient way. This paper is totally built for study purposes and precautionary measures for accessing Dark Web.
Authored by Hardik Gulati, Aman Saxena, Neerav Pawar, Poonam Tanwar, Shweta Sharma
Malware Classification - The past decades witness the development of various Machine Learning (ML) models for malware classification. Semantic representation is a crucial basis for these classifiers. This paper aims to assess the effect of semantic representation methods on malware classifier performance. Two commonly-used semantic representation methods including N-gram and GloVe. We utilize diverse ML classifiers to conduct comparative experiments to analyze the capability of N-gram, GloVe and image-based methods for malware classification. We also analyze deeply the reason why the GloVe can produce negative effects on malware static analysis.
Authored by Bingchu Jin, Zesheng Hu, Jianhua Wang, Monong Wei, Yawei Zhao, Chao Xue
Malware Classification - Automated malware classification assigns unknown malware to known families. Most research in malware classification assumes that the defender has access to the malware for analysis. Unfortunately, malware can delete itself after execution. As a result, analysts are only left with digital residue, such as network logs or remnant artifacts of malware in memory or on the file system. In this paper, a novel malware classification method based on the Windows prefetch mechanism is presented and evaluated, enabling analysts to classify malware without a corresponding executable. The approach extracts features from Windows prefetch files, a file system artifact that contains historical process information such as loaded libraries and process dependencies. Results show that classification using these features with two different algorithms garnered F-Scores between 0.80 and 0.82, offering analysts a viable option for forensic analysis.
Authored by Adam Duby, Teryl Taylor, Yanyan Zhuang
Malware Classification - Mobile devices play a crucial role and have become an essential part of people's life particularly with online applications such as shopping, learning, mailing, etc. Android OS has continued to drive the market for other operating systems since 2012. Traditional Android malware detection methods, such as static, dynamic, hybrid analysis, or the Bayesian model, may show less accuracy to detect recent Android malware. We propose a deep learning method for Android malware detection using Convolutional Neural Network (CNN) and Long Short Term Memory (LSTM). CNN provides efficient feature extraction from data and the use of additional LSTM layers improves prediction accuracy. According to the test results, CNN-LSTM can provide reliable malware prediction in Android applications. We train and test our approach using the CICMalDroid2020 dataset. The test results show that the CNN-LSTM classifier exceeds with an accuracy of 94%.
Authored by Shakhnaz Amenova, Cemil Turan, Dinara Zharkynbek
Malware Classification - Due to the constant updates of malware and its variants and the continuous development of malware obfuscation techniques. Malware intrusions targeting Windows hosts are also on the rise. Traditional static analysis methods such as signature matching mechanisms have been difficult to adapt to the detection of new malware. Therefore, a novel visual detection method of malware is proposed for first-time to convert the Windows API call sequence with sequential nature into feature images based on the Gramian Angular Field (GAF) idea, and train a neural network to identify malware. The experimental results demonstrate the effectiveness of our proposed method. For the binary classification of malware, the GAF visualization image of the API call sequence is compared with its original sequence. After GAF visualization, the classification accuracy of the classic machine learning model MLP is improved by 9.64%, and the classification accuracy of the deep learning model CNN is improved by 4.82%. Furthermore, our experiments show that the proposed method is also feasible and effective for the multi-class classification of malware.
Authored by Hongmei Zhang, Xiaoqian Yun, Xiaofang Deng, Xiaoxiong Zhong
Malware Classification - Methodologies used for the detection of malicious applications can be broadly classified into static and dynamic analysis based approaches. With traditional signature-based methods, new variants of malware families cannot be detected. A combination of deep learning techniques along with image-based features is used in this work to classify malware. The data set used here is the ‘Malimg’ dataset, which contains a pictorial representation of well-known malware families. This paper proposes a methodology for identifying malware images and classifying them into various families. The classification is based on image features. The features are extracted using the pre-trained model namely VGG16. The samples of malware are depicted as byteplot grayscale images. Features are extracted employing the convolutional layer of a VGG16 deep learning network, which uses ImageNet dataset for the pre-training step. The features are used to train different classifiers which employ SVM, XGBoost, DNN and Random Forest for the classification task into different malware families. Using 9339 samples from 25 different malware families, we performed experimental evaluations and demonstrate that our approach is effective in identifying malware families with high accuracy.
Authored by K. Deepa, K. Adithyakumar, P. Vinod
Malware Analysis - The rapid development of network information technology, individual’s information networks security has become a very critical issue in our daily life. Therefore, it is necessary to study the malware propagation model system. In this paper, the traditional integer order malware propagation model system is extended to the field of fractional-order. Then we analyze the asymptotic stability of the fractional-order malware propagation model system when the equilibrium point is the origin and the time delay is 0. Next, the asymptotic stability and bifurcation analysis of the fractional-order malware propagation model system when the equilibrium point is the origin and the time delay is not 0 are carried out. Moreover, we study the asymptotic stability of the fractional-order malware propagation model system with an interior equilibrium point. In the end, so as to verify our theoretical results, many numerical simulations are provided.
Authored by Zhe Zhang, Yaonan Wang, Jing Zhang, Xu Xiao
Malware Analysis - Detection of malware and security attacks is a complex process that can vary in its details and analysis activities. As part of the detection process, malware scanners try to categorize a malware once it is detected under one of the known malware categories (e.g. worms, spywares, viruses, etc.). However, many studies and researches indicate problems with scanners categorizing or identifying a particular malware under more than one malware category. This paper, and several others, show that machine learning can be used for malware detection especially with ensemble base prediction methods. In this paper, we evaluated several custom-built ensemble models. We focused on multi-label malware classification as individual or classical classifiers showed low accuracy in such territory.This paper showed that recent machine models such as ensemble and deep learning can be used for malware detection with better performance in comparison with classical models. This is very critical in such a dynamic and yet important detection systems where challenges such as the detection of unknown or zero-day malware will continue to exist and evolve.
Authored by Izzat Alsmadi, Bilal Al-Ahmad, Mohammad Alsmadi