Publications | Science of Security Virtual Organization

Year

Type

An Analysis of Signature-Based Components in Hybrid Intrusion Detection Systems

In cybersecurity, Intrusion Detection Systems (IDS) protect against emerging cyber threats. Combining signature-based and anomaly-based detection methods may improve IDS accuracy and reduce false positives. This research analyzes hybrid intrusion detection systems signature-based components performance and limitations. The paper begins with a detailed history of signature-based detection methods responding to changing threat situations. This research analyzes signature databases to determine their capacity to identify and guard against current threats and cover known vulnerabilities. The paper also examines the intricate relationship between signature-based detection and anomalybased techniques in hybrid IDS systems. This investigation examines how these two methodologies work together to uncover old and new attack strategies, focusing on zero-day vulnerabilities and polymorphic malware. A diverse dataset of network traffic and attack scenarios is used to test. Detection, false positives, and response times assess signature-based components. Comparative examinations investigate how signature-based detection affects system accuracy and efficiency. This research illuminates the role of signature-based aspects in hybrid intrusion detection systems. This study recommends integrating signature-based detection techniques with anomaly-based methods to improve hybrid intrusion detection systems (IDS) at recognizing and mitigating various cyber threats.

Authored by Moorthy Agoramoorthy, Ahamed Ali, D. Sujatha, Michael F, G. Ramesh

Neural Network Malware Detection Verification for Feature and Image Datasets

Authored by Preston Robinette, Diego Lopez, Serena Serbinowska, Kevin Leach, Taylor Johnson

Heterogeneous Graph Transformer for Advanced Persistent Threat Classification in Wireless Networks

Advanced Persistent Threats (APTs) have significantly impacted organizations over an extended period with their coordinated and sophisticated cyberattacks. Unlike signature-based tools such as antivirus and firewalls that can detect and block other types of malware, APTs exploit zero-day vulnerabilities to generate new variants of undetectable malware. Additionally, APT adversaries engage in complex relationships and interactions within network entities, necessitating the learning of interactions in network traffic flows, such as hosts, users, or IP addresses, for effective detection. However, traditional deep neural networks often fail to capture the inherent graph structure and overlook crucial contextual information in network traffic flows. To address these issues, this research models APTs as heterogeneous graphs, capturing the diverse features and complex interactions in network flows. Consequently, a hetero-geneous graph transformer (HGT) model is used to accurately distinguish between benign and malicious network connections. Experiment results reveal that the HGT model achieves better performance, with 100 \% accuracy and accelerated learning time, outperferming homogeneous graph neural network models.

Authored by Kazeem Saheed, Shagufta Henna

RAPTOR: Advanced Persistent Threat Detection in Industrial IoT via Attack Stage Correlation

Past Advanced Persistent Threat (APT) attacks on Industrial Internet-of-Things (IIoT), such as the 2016 Ukrainian power grid attack and the 2017 Saudi petrochemical plant attack, have shown the disruptive effects of APT campaigns while new IIoT malware continue to be developed by APT groups. Existing APT detection systems have been designed using cyberattack TTPs modelled for enterprise IT networks and leverage specific data sources (e.g., Linux audit logs, Windows event logs) which are not found on ICS devices. In this work, we propose RAPTOR, a system to detect APT campaigns in IIoT. Using cyberattack TTPs modelled for ICS/OT environments and focusing on ‘invariant’ attack phases, RAPTOR detects and correlates various APT attack stages in IIoT leveraging data which can be readily collected from ICS devices/networks (packet traffic traces, IDS alerts). Subsequently, it constructs a high-level APT campaign graph which can be used by cybersecurity analysts towards attack analysis and mitigation. A performance evaluation of RAPTOR’s APT attack-stage detection modules shows high precision and low false positive/negative rates. We also show that RAPTOR is able to construct the APT campaign graph for APT attacks (modelled after real-world attacks on ICS/OT infrastructure) executed on our IIoT testbed.

Authored by Ayush Kumar, Vrizlynn Thing

A Dimensional Perspective Analysis on the Cybersecurity Risks and Opportunities of ChatGPT-Like Information Systems

As a recent breakthrough in generative artificial intelligence, ChatGPT is capable of creating new data, images, audio, or text content based on user context. In the field of cybersecurity, it provides generative automated AI services such as network detection, malware protection, and privacy compliance monitoring. However, it also faces significant security risks during its design, training, and operation phases, including privacy breaches, content abuse, prompt word attacks, model stealing attacks, abnormal structure attacks, data poisoning attacks, model hijacking attacks, and sponge attacks. This paper starts from the risks and events that ChatGPT has recently faced, proposes a framework for analyzing cybersecurity in cyberspace, and envisions adversarial models and systems. It puts forward a new evolutionary relationship between attackers and defenders using ChatGPT to enhance their own capabilities in a changing environment and predicts the future development of ChatGPT from a security perspective.

Authored by Chunhui Hu, Jianfeng Chen

EL-GRILLO: Leaking Data Ultrasonically from Air-Gapped PCs via the Tiny Motherboard Buzzer

Air-gapped workstations are separated from the Internet because they contain confidential or sensitive information. Studies have shown that attackers can leak data from air-gapped computers with covert ultrasonic signals produced by loudspeakers. To counteract the threat, speakers might not be permitted on highly sensitive computers or disabled altogether - a measure known as an ’audio gap.’ This paper presents an attack enabling adversaries to exfiltrate data over ultrasonic waves from air-gapped, audio-gapped computers without external speakers. The malware on the compromised computer uses its built-in buzzer to generate sonic and ultrasonic signals. This component is mounted on many systems, including PC workstations, embedded systems, and server motherboards. It allows software and firmware to provide error notifications to a user, such as memory and peripheral hardware failures. We examine the different types of internal buzzers and their hardware and software controls. Despite their limited technological capabilities, such as 1-bit sound, we show that sensitive data can be encoded in sonic and ultrasonic waves. This is done using pulse width modulation (PWM) techniques to maintain a carrier wave with a dynamic range. We also show that malware can evade detection by hiding in the frequency bands of other components (e.g., fans and power supplies). We implement the attack using a PC transmitter and smartphone app receiver. We discuss transmission protocols, modulation, encoding, and reception and present the evaluation of the covert channel as well. Based on our tests, sensitive data can be exfiltrated from air-gapped computers through its built- in buzzer. A smartphone can receive data from up to six meters away at 100 bits per second.

Authored by Mordechai Guri

Detection and Notification of Zero-Day attack to Prevent Cybercrime

This paper seeks to understand how zero- day vulnerabilities relate to traded markets. People in trade and development are reluctant to talk about zero-day vulnerabilities. Thanks to years of research, in addition to interviews, The majority of thepublic documentation about Mr. Cesar Cerrudo s 0-day vulnerabilities are examinedby him, and he talks to experts in many computer security domains about them. In this research, we gave a summary of the current malware detection technologies and suggest a fresh zero-day malware detection and prevention model that is capable of efficiently separating malicious from benign zero-day samples. We also discussed various methods used to detect malicious files and present the results obtained from these methods.

Authored by Atharva Deshpande, Isha Patil, Jayesh Bhave, Aum Giri, Nilesh Sable, Gurunath Chavan

Static Analysis Based Malware Detection for Zero-Day Attacks in Android Applications

Android is the most popular smartphone operating system with a market share of 68.6\% in Apr 2023. Hence, Android is a more tempting target for cybercriminals. This research aims at contributing to the ongoing efforts to enhance the security of Android applications and protect users from the ever-increasing sophistication of malware attacks. Zero-day attacks pose a significant challenge to traditional signature-based malware detection systems, as they exploit vulnerabilities that are unknown to all. In this context, static analysis can be an encouraging approach for detecting malware in Android applications, leveraging machine learning (ML) and deep learning (DL)-based models. In this research, we have used single feature and combination of features extracted from the static properties of mobile apps as input(s) to the ML and DL based models, enabling it to learn and differentiate between normal and malicious behavior. We have evaluated the performance of those models based on a diverse dataset (DREBIN) comprising of real-world Android applications features, including both benign and zero-day malware samples. We have achieved F1 Score 96\% from the multi-view model (DL Model) in case of Zero-day malware scenario. So, this research can be helpful for mitigating the risk of unknown malware.

Authored by Jabunnesa Sara, Shohrab Hossain

Fast Detection of Advanced Persistent Threats for Smart Grids: A Deep Reinforcement Learning Approach

Data management systems in smart grids have to address advanced persistent threats (APTs), where malware injection methods are performed by the attacker to launch stealthy attacks and thus steal more data for illegal advantages. In this paper, we present a hierarchical deep reinforcement learning based APT detection scheme for smart grids, which enables the control center of the data management system to choose the APT detection policy to reduce the detection delay and improve the data protection level without knowing the attack model. Based on the state that consists of the size of the gathered power usage data, the priority level of the data, and the detection history, this scheme develops a two-level hierarchical structure to compress the high-dimensional action space and designs four deep dueling networks to accelerate the optimization speed with less over-estimation. Detection performance bound is provided and simulation results show that the proposed scheme improves both the data protection level and the utility of the control center with less detection delay.

Authored by Shi Yu

Ethereal Networks and Honeypots for Breach Detection

Network Reconnaissance - With increasing number of data thefts courtesy of new and complex attack mechanisms being used everyday, declaring the internet as unsafe would be the understatement of the century. For current security experts the scenario is equivalent to an endless cat-and-mouse game across a constantly changing landscape. Hence relying on ﬁrewalls and anti-virus softwares is like trying to ﬁght a modern, well-equipped army using sticks and stones. All that an attacker needs to successfully breach our system is the right social networking or the right malware used like a packing or encoding technique that our tools won’t detect. Therefore it is the need of the hour to shift our focus beyond edge defense, which largely involves validating the tools, and move towards identiﬁcation of a breach followed by an appropriate response. This is achieved by implementing an ethereal network which is an end-to-end host and network approach that can actually scale as well as provide true breach detection. The objective is not just blocking; it is signiﬁcant time reduction. When mundane methods involving ﬁrewalls and antiviruses fail, we need to determine what happened and respond. Any industry report uses the term weeks, months, and even years to determine the time of response, which is not good enough. Our goal is to bring it down to hours. We are talking about dramatic time reduction to improve our response, hence an effective breach detection approach is mandatory. A MHN (Modern Honey Network) with a honeypot system has been used to make management and deployment easier and to secure the honeypots. We have used various honeypots such as Glastopf, Dionaea honeypots, Kippo. The dubious activity will be recorded and the attacks details detected in MHN server. The ﬁnal part of our research is reconnaissance. Since it can be awfully complicated we simplify the process by having our main focus on reconnaissance. Because if a malware or an insider threat breaks into something, they don’t know what they now have access to. This makes them feel the need to do reconnaissance. So, focusing on that behaviour provides us a simple way to determine that we have some unusual activity - whether it is an IOT device that has been compromised or whatever it may be, that has breached our network. Finally we deploy MHN, deploy Dionaea, Kippo, Snort honeypots and Splunk integration for analyzing the captured attacks which reveals the service port under attack and the source IP address of the attacker.

Authored by Sourav Mishra, Vijay Chaurasiya

Machine Learning Based Obfuscated Malware Detection in the Cloud Environment with Nature-Inspired Feature Selection

Nearest Neighbor Search - One of the most significant and widely used IT breakthroughs nowadays is cloud computing. Today, the majority of enterprises use private or public cloud computing services for their computing infrastructure. Cyber-attackers regularly target Cloud resources by inserting malicious code or obfuscated malware onto the server. These malware programmes that are obfuscated are so clever that they often manage to evade the detection technology that is in place. Unfortunately, they are discovered long after they have done significant harm to the server. Machine Learning (ML) techniques have shown to be effective at finding malware in a wide range of fields. To address feature selection (FS) challenges, this study uses the wrapperbased Binary Bat Algorithm (BBA), Cuckoo Search Algorithm (CSA), Mayfly Algorithm (MA), and Particle Swarm Optimization (PSO), and then k-Nearest Neighbor (kNN), Random Forest (RF), and Support Vector Machine (SVM) are used to classify the benign and malicious records to measure the performance in terms of various metrics. CIC-MalMem-2022, the most recent malware memory dataset, is used to evaluate and test the proposed approach and it is found that the proposed system is an acceptable solution to detect malware.

Authored by Mohd. Ghazi, N. Raghava

Detection and Isolation Malware by Dynamic Routing Moving Target Defense with Proxies

Moving Target Defense - In recent years, many companies and organizations have introduced internal networks. While such internal networks propose availability and convenience, there have been many cases in which malicious outsiders have intruded on these local networks, and leaked customer information through cyber attacks. In addition, there have recently been reports of a type of attack called ”Advanced Persistent Threats (APT)”. Unlike conventional cyber attacks, these attacks target speciﬁc objectives. And they use sophisticated techniques to penetrate the target’s system. Once malware successes to intrude into the system, malware does not immediately attack the target but hides for a long time to investigate the system and gather information. Moving Target Defense, MTD is a technology that dynamically changes the conﬁgurations of systems targeted by cyber attacks. In this study, we implemented a model using a proxy-based network-level MTD to detect and quarantine malware in internal networks. And we can conﬁrm that the proposed method is effective in the detection and quarantine of malware.

Authored by Kouki Inoue, Hiroshi Koide

Malware Detection Classification using Recurrent Neural Network

Malware Classification - Nowadays, increasing numbers of malicious programs are becoming a serious problem, which increases the need for automated detection and categorization of potential threats. These attacks often use undetected malware that is not recognized by the security vendor, making it difficult to protect the endpoints from viruses. Existing methods have been proposed to detect malware. However, as malware variations develop, they can lead to misdiagnosis and are difficult to diagnose accurately. To address this problem, in this work introduces a Recurrent Neural Network (RNN) to identify the malware or benign based on extract features using Information Gain Absolute Feature Selection (IGAFS) technique. First, Malware detection dataset is collected from kaggle repository. Then the proposed pre-process the dataset for removing null and noisy values to prepare the dataset. Next, the proposed Information Gain Absolute Feature Selection (IGAFS) technique is used to select most relevant features for malware from the pre-processed dataset. Selected features are trained into Recurrent Neural Network (RNN) method to classify as malware or not with better accuracy and false rate. The experimental result provides greater performance compared with previous methods.

Authored by Suresh Kumar, Umi B., Isa Mishra, Shitharth S., Diwakar Tripathi, Siva T.

Android Malware Classification by CNN-LSTM

Malware Classification - Mobile devices play a crucial role and have become an essential part of people's life particularly with online applications such as shopping, learning, mailing, etc. Android OS has continued to drive the market for other operating systems since 2012. Traditional Android malware detection methods, such as static, dynamic, hybrid analysis, or the Bayesian model, may show less accuracy to detect recent Android malware. We propose a deep learning method for Android malware detection using Convolutional Neural Network (CNN) and Long Short Term Memory (LSTM). CNN provides efficient feature extraction from data and the use of additional LSTM layers improves prediction accuracy. According to the test results, CNN-LSTM can provide reliable malware prediction in Android applications. We train and test our approach using the CICMalDroid2020 dataset. The test results show that the CNN-LSTM classifier exceeds with an accuracy of 94%.

Authored by Shakhnaz Amenova, Cemil Turan, Dinara Zharkynbek

Malware Classification based on a Light-weight Architecture of CNN: MalShuffleNet

Malware Classification - Traditional methods of malware detection have difficulty in detecting massive malware variants. Malware detection based on malware visualization has been proved an effective method for identifying unknown malware variants. In order to improve the accuracy and reduce the detection time of above methods, a novel method for malware classification in a light-weight CNN architecture named MalshuffleNet is proposed. The model is customized based on ShuffleNet V2 by adjusting the numbers of the fully connected layer for adopting to malware classification. Empirical results on Malimg dataset indicate that our model achieves 99.03% in accuracy, and identify an unknown malware only taking 5.3 milliseconds on average.

Authored by Lingfeng Qiu, Shuo Wang, Jian Wang, Yifei Wang, Wei Huang

Markov Image with Transfer Learning for Malware Detection and Classification

Malware Classification - Malware attack is a severe problem that can cause a considerable loss. To prevent the malware attack, different malware detection and classification method have been implemented in recent years. This paper proposed a new method based on Markov image and transfer learning on machine learning. Also, an experience comparing the performance on malware detection and classification between the proposed and grayscale methods was done. The accuracy and loss of malware detection and classification by using the proposed method are 0.973 and 0.076, 0.987 and 0.062 respectively. The accuracy and loss of malware detection and classification using the grayscale method are 0.989 and 0.037, 0.973 and 0.202 respectively. Although the grayscale method has done better in malware detection, the proposed method's accuracy is over 0.97. Therefore, the result shows that the proposed method are suitable for malware detection and classification.

Authored by Lok Kwan

Malware Classification Based on GAF Visualization of Dynamic API Call Sequences

Malware Classification - Due to the constant updates of malware and its variants and the continuous development of malware obfuscation techniques. Malware intrusions targeting Windows hosts are also on the rise. Traditional static analysis methods such as signature matching mechanisms have been difficult to adapt to the detection of new malware. Therefore, a novel visual detection method of malware is proposed for first-time to convert the Windows API call sequence with sequential nature into feature images based on the Gramian Angular Field (GAF) idea, and train a neural network to identify malware. The experimental results demonstrate the effectiveness of our proposed method. For the binary classification of malware, the GAF visualization image of the API call sequence is compared with its original sequence. After GAF visualization, the classification accuracy of the classic machine learning model MLP is improved by 9.64%, and the classification accuracy of the deep learning model CNN is improved by 4.82%. Furthermore, our experiments show that the proposed method is also feasible and effective for the multi-class classification of malware.

Authored by Hongmei Zhang, Xiaoqian Yun, Xiaofang Deng, Xiaoxiong Zhong

Malware Image Classification using VGG16

Malware Classification - Methodologies used for the detection of malicious applications can be broadly classified into static and dynamic analysis based approaches. With traditional signature-based methods, new variants of malware families cannot be detected. A combination of deep learning techniques along with image-based features is used in this work to classify malware. The data set used here is the ‘Malimg’ dataset, which contains a pictorial representation of well-known malware families. This paper proposes a methodology for identifying malware images and classifying them into various families. The classification is based on image features. The features are extracted using the pre-trained model namely VGG16. The samples of malware are depicted as byteplot grayscale images. Features are extracted employing the convolutional layer of a VGG16 deep learning network, which uses ImageNet dataset for the pre-training step. The features are used to train different classifiers which employ SVM, XGBoost, DNN and Random Forest for the classification task into different malware families. Using 9339 samples from 25 different malware families, we performed experimental evaluations and demonstrate that our approach is effective in identifying malware families with high accuracy.

Authored by K. Deepa, K. Adithyakumar, P. Vinod

Malware analysis and multi-label category detection issues: Ensemble-based approaches

Malware Analysis - Detection of malware and security attacks is a complex process that can vary in its details and analysis activities. As part of the detection process, malware scanners try to categorize a malware once it is detected under one of the known malware categories (e.g. worms, spywares, viruses, etc.). However, many studies and researches indicate problems with scanners categorizing or identifying a particular malware under more than one malware category. This paper, and several others, show that machine learning can be used for malware detection especially with ensemble base prediction methods. In this paper, we evaluated several custom-built ensemble models. We focused on multi-label malware classification as individual or classical classifiers showed low accuracy in such territory.This paper showed that recent machine models such as ensemble and deep learning can be used for malware detection with better performance in comparison with classical models. This is very critical in such a dynamic and yet important detection systems where challenges such as the detection of unknown or zero-day malware will continue to exist and evolve.

Authored by Izzat Alsmadi, Bilal Al-Ahmad, Mohammad Alsmadi

DynaMalDroid: Dynamic Analysis-Based Detection Framework for Android Malware Using Machine Learning Techniques

Malware Analysis - Android malware is continuously evolving at an alarming rate due to the growing vulnerabilities. This demands more effective malware detection methods. This paper presents DynaMalDroid, a dynamic analysis-based framework to detect malicious applications in the Android platform. The proposed framework contains three modules: dynamic analysis, feature engineering, and detection. We utilized the well-known CICMalDroid2020 dataset, and the system calls of apps are extracted through dynamic analysis. We trained our proposed model to recognize malware by selecting features obtained through the feature engineering module. Further, with these selected features, the detection module applies different Machine Learning classifiers like Random Forest, Decision Tree, Logistic Regression, Support Vector Machine, Naïve-Bayes, K-Nearest Neighbour, and AdaBoost, to recognize whether an application is malicious or not. The experiments have shown that several classifiers have demonstrated excellent performance and have an accuracy of up to 99\%. The models with Support Vector Machine and AdaBoost classifiers have provided better detection accuracy of 99.3\% and 99.5\%, respectively.

Authored by Hashida Manzil, Manohar S

Disparity Analysis Between the Assembly and Byte Malware Samples with Deep Autoencoders

Malware Analysis - Malware attacks in the cyber world continue to increase despite the efforts of Malware analysts to combat this problem. Recently, Malware samples have been presented as binary sequences and assembly codes. However, most researchers focus only on the raw Malware sequence in their proposed solutions, ignoring that the assembly codes may contain important details that enable rapid Malware detection. In this work, we leveraged the capabilities of deep autoencoders to investigate the presence of feature disparities in the assembly and raw binary Malware samples. First, we treated the task as outliers to investigate whether the autoencoder would identify and justify features as samples from the same family. Second, we added noise to all samples and used Deep Autoencoder to reconstruct the original samples by denoising. Experiments with the Microsoft Malware dataset showed that the byte samples features differed from the assembly code samples.

Authored by Muhammed Abdullah, Yongbin Yu, Jingye Cai, Yakubu Imrana, Nartey Tettey, Daniel Addo, Kwabena Sarpong, Bless Lord Y. Agbley, Benjamin Appiah

Effective of Obfuscated Android Malware Detection using Static Analysis

Malware Analysis - The effective security system improvement from malware attacks on the Android operating system should be updated and improved. Effective malware detection increases the level of data security and high protection for the users. Malicious software or malware typically finds a means to circumvent the security procedure, even when the user is unaware whether the application can act as malware. The effectiveness of obfuscated android malware detection is evaluated by collecting static analysis data from a data set. The experiment assesses the risk level of which malware dataset using the hash value of the malware and records malware behavior. A set of hash SHA256 malware samples has been obtained from an internet dataset and will be analyzed using static analysis to record malware behavior and evaluate which risk level of the malware. According to the results, most of the algorithms provide the same total score because of the multiple crime inside the malware application.

Authored by Teddy Mantoro, Muhammad Fahriza, Muhammad Bhakti

PDF Malware Analysis

Malware Analysis - This document addresses the issue of the actual security level of PDF documents. Two types of detection approaches are utilized to detect dangerous elements within malware: static analysis and dynamic analysis. Analyzing malware binaries to identify dangerous strings, as well as reverse-engineering is included in static analysis for t1he malware to disassemble it. On the other hand, dynamic analysis monitors malware activities by running them in a safe environment, such as a virtual machine. Each method has its own set of strengths and weaknesses, and it is usually best to employ both methods while analyzing malware. Malware detection could be simplified without sacrificing accuracy by reducing the number of malicious traits. This may allow the researcher to devote more time to analysis. Our worry is that there is no obvious need to identify malware with numerous functionalities when it isn t necessary. We will solve this problem by developing a system that will identify if the given file is infected with malware or not.

Authored by Md Khalil, Vivek, Kumar Anand, Antarlina Paul, Rahul Grover

ProcGuard: Process Injection Behaviours Detection Using Fine-grained Analysis of API Call Chain with Deep Learning

Information Reuse and Security - New malware increasingly adopts novel fileless techniques to evade detection from antivirus programs. Process injection is one of the most popular fileless attack techniques. This technique makes malware more stealthy by writing malicious code into memory space and reusing the name and port of the host process. It is difficult for traditional security software to detect and intercept process injections due to the stealthiness of its behavior. We propose a novel framework called ProcGuard for detecting process injection behaviors. This framework collects sensitive function call information of typical process injection. Then we perform a fine-grained analysis of process injection behavior based on the function call chain characteristics of the program, and we also use the improved RCNN network to enhance API analysis on the tampered memory segments. We combine API analysis with deep learning to determine whether a process injection attack has been executed. We collect a large number of malicious samples with process injection behavior and construct a dataset for evaluating the effectiveness of ProcGuard. The experimental results demonstrate that it achieves an accuracy of 81.58\% with a lower false-positive rate compared to other systems. In addition, we also evaluate the detection time and runtime performance loss metrics of ProcGuard, both of which are improved compared to previous detection tools.

Authored by Juan Wang, Chenjun Ma, Ziang Li, Huanyu Yuan, Jie Wang

GNN-Based Malicious Network Entities Identification In Large-Scale Network Data

Malware Analysis and Graph Theory - A reliable database of Indicators of Compromise (IoC’s) is a cornerstone of almost every malware detection system. Building the database and keeping it up-to-date is a lengthy and often manual process where each IoC should be manually reviewed and labeled by an analyst. In this paper, we focus on an automatic way of identifying IoC’s intended to save analysts’ time and scale to the volume of network data. We leverage relations of each IoC to other entities on the internet to build a heterogeneous graph. We formulate a classification task on this graph and apply graph neural networks (GNNs) in order to identify malicious domains. Our experiments show that the presented approach provides promising results on the task of identifying high-risk malware as well as legitimate domains classification.

Authored by Stepan Dvorak, Pavel Prochazka, Lukas Bajer