Publications | Science of Security Virtual Organization

Machine Learning Based Obfuscated Malware Detection in the Cloud Environment with Nature-Inspired Feature Selection

Nearest Neighbor Search - Cloud computing is one of the most significant and widely used IT breakthroughs today. The majority of enterprises now use private or public cloud services for their computing infrastructure. Cyber-attackers regularly target cloud resources by inserting malicious code or obfuscated malware onto the server. Obfuscated malware is often sophisticated enough to evade the detection technology in place, and it is frequently discovered only long after it has done significant harm to the server. Machine Learning (ML) techniques have been shown to be effective at finding malware in a wide range of fields. To address feature selection (FS) challenges, this study uses the wrapper-based Binary Bat Algorithm (BBA), Cuckoo Search Algorithm (CSA), Mayfly Algorithm (MA), and Particle Swarm Optimization (PSO). k-Nearest Neighbor (kNN), Random Forest (RF), and Support Vector Machine (SVM) classifiers are then used to separate benign from malicious records, and performance is measured with several metrics. The proposed approach is evaluated and tested on CIC-MalMem-2022, the most recent malware memory dataset, and is found to be an acceptable solution for detecting malware.

Authored by Mohd. Ghazi, N. Raghava
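
The wrapper idea in the abstract above (score a candidate feature subset by the accuracy of a classifier trained on it) can be illustrated with a minimal sketch. This is not the authors' implementation: an exhaustive search over feature masks stands in for the nature-inspired optimizers (BBA, CSA, MA, PSO), leave-one-out kNN accuracy is an assumed fitness function, and the toy data and all function names are hypothetical.

```python
from itertools import product
from math import dist


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k training rows closest to x."""
    nearest = sorted(zip(train_X, train_y), key=lambda pair: dist(pair[0], x))[:k]
    votes = [label for _, label in nearest]
    return max(set(votes), key=votes.count)


def apply_mask(row, mask):
    """Keep only the features whose mask bit is 1."""
    return [value for value, keep in zip(row, mask) if keep]


def wrapper_fitness(X, y, mask, k=3):
    """Leave-one-out kNN accuracy using only the masked-in features (assumed fitness)."""
    if not any(mask):
        return 0.0  # an empty feature subset cannot classify anything
    Xm = [apply_mask(row, mask) for row in X]
    correct = 0
    for i in range(len(Xm)):
        train_X = Xm[:i] + Xm[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if knn_predict(train_X, train_y, Xm[i], k) == y[i]:
            correct += 1
    return correct / len(Xm)


def best_mask(X, y, k=3):
    """Exhaustive stand-in for a metaheuristic: best fitness, fewest features on ties."""
    n_features = len(X[0])
    return max(
        product([0, 1], repeat=n_features),
        key=lambda m: (wrapper_fitness(X, y, m, k), -sum(m)),
    )


# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
X = [[0.10, 5.0], [0.20, 1.0], [0.15, 9.0], [0.05, 3.0],
     [0.90, 4.8], [1.00, 1.2], [0.95, 8.7], [0.85, 3.1]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

print(best_mask(X, y))  # selects only the informative feature: (1, 0)
```

Exhaustive search is only feasible for a handful of features; a metaheuristic such as those named in the abstract would explore the same mask space while calling a fitness function of this kind far fewer times.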

An Efficient and Accurate Encrypted Image Retrieval Scheme via Ball Tree

Nearest Neighbor Search - With the rise and development of cloud computing, more and more companies are trying to outsource computing and storage to the cloud in order to save on storage and computing costs. Because images carry rich information, the rapid growth in image volume is driving a boom in image outsourcing. However, images may contain a lot of sensitive information, and cloud servers cannot always be trusted. Outsourcing images directly may lead to data breaches and raise privacy and security concerns. This has partly led to renewed interest in privacy-preserving encrypted image retrieval. However, many challenges remain, such as low search accuracy and inefficiency caused by the hundreds of high-dimensional features extracted from a single image and by the large scale of image collections. To address these challenges, in this paper, we propose an efficient, scalable, and privacy-preserving image retrieval scheme via ball tree. First, a pre-trained Convolutional Neural Network (CNN) model is employed to extract image feature vectors to improve search accuracy. Next, an encrypted ball tree is constructed using a Learning With Errors (LWE)-based secure k-Nearest Neighbor (kNN) algorithm. Finally, we conduct comprehensive experiments on real-world datasets and give a brief security analysis. The results show that our scheme is practical in terms of security, accuracy, and efficiency.

Authored by Xianxian Li, Jie Lei, Zhenkui Shi, Feng Yu
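
For readers unfamiliar with the underlying index, the following is a minimal plaintext ball tree sketch. It omits the encryption layer entirely (no LWE, no secure kNN) and is not the scheme described above; it only shows how ball-shaped nodes let a k-nearest-neighbor query skip subtrees that cannot contain a closer point. The class and function names, the median split rule, and the sample points are assumptions chosen for illustration.

```python
import heapq
from math import dist


class Ball:
    """A node covering `points` with a bounding ball (center, radius)."""

    def __init__(self, points, leaf_size=2):
        dim = len(points[0])
        self.center = [sum(p[d] for p in points) / len(points) for d in range(dim)]
        self.radius = max(dist(self.center, p) for p in points)
        self.points, self.left, self.right = points, None, None
        if len(points) > leaf_size:
            # Split at the median along the dimension with the greatest spread.
            d = max(range(dim),
                    key=lambda j: max(p[j] for p in points) - min(p[j] for p in points))
            ordered = sorted(points, key=lambda p: p[d])
            mid = len(ordered) // 2
            self.left = Ball(ordered[:mid], leaf_size)
            self.right = Ball(ordered[mid:], leaf_size)
            self.points = None  # internal nodes keep no points of their own


def knn(node, query, k, heap=None):
    """Collect the k closest points in a max-heap of (-distance, point)."""
    if heap is None:
        heap = []
    # No point inside the ball can be closer to the query than this bound.
    lower_bound = max(0.0, dist(query, node.center) - node.radius)
    if len(heap) == k and lower_bound >= -heap[0][0]:
        return heap  # prune: this ball cannot improve the current k best
    if node.points is not None:
        for p in node.points:
            d = dist(query, p)
            if len(heap) < k:
                heapq.heappush(heap, (-d, p))
            elif d < -heap[0][0]:
                heapq.heapreplace(heap, (-d, p))
    else:
        # Visit the child whose center is closer first to tighten the bound early.
        for child in sorted((node.left, node.right),
                            key=lambda c: dist(query, c.center)):
            knn(child, query, k, heap)
    return heap


def nearest(root, query, k):
    """Return the k nearest points, closest first."""
    return [p for _, p in sorted(knn(root, query, k), reverse=True)]


points = [(0, 0), (1, 1), (2, 2), (5, 5), (6, 5), (9, 9), (10, 10)]
root = Ball(points)
print(nearest(root, (5.2, 5.1), 2))  # [(5, 5), (6, 5)]
```

In a retrieval setting the points would be CNN feature vectors rather than 2-D coordinates; the pruning test is the same in any dimension.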

Classification of Traffic in 5G Internet of Things Networks: A New Framework

Nearest Neighbor Search - A network is the organization formed by connections established between computers, typically by cable, for the purpose of communicating and transmitting data. A computer network is a collection of interconnected computers that allows resources such as data, programs, and files to be shared. When people think of computer networks, they think of the Internet. In this paper, we propose a new technique for categorizing computer network traffic that is based on deep sparse autoencoders and a k-nearest-neighbor (KNN) classifier optimized with Grid Search. The autoencoders take the input data, extract high-level features from it, and pass those features to the KNN. The KNN is used to divide the features into three distinct kinds of attacks (normal and abnormal). In comparison to other investigations, the proposed approach demonstrated an accuracy of 98.23%.

Authored by Sarmad Al-Jawashee, Mesüt Çevik

Research on Web Components and Fingerprint Intelligent Mining Methods

Nearest Neighbor Search - A Web component fingerprint library is the basis for solving the problem of Web component identification. A complete and accurate Web component fingerprint library can effectively improve Web component identification capability. At present, Web component fingerprint databases are still expanded mainly through manual mining based on expert experience, which makes them difficult to extend and update. There is therefore an urgent need for a method that extends the Web component fingerprint library efficiently. To solve this problem, an intelligent method for mining Web components and fingerprints is proposed. The method draws on the way new components are mined manually and uses the characteristics of search engine results for Web components to mine new Web components intelligently. At the same time, the fingerprints of Web components are obtained automatically through data mining on the websites where the new components are used. The experimental results show that the intelligent mining method found 22 new components and 102 component fingerprints in a short time, so it can mine Web components and fingerprints efficiently. Compared with the current mainstream manual mining methods, the efficiency of this method is greatly improved, which demonstrates its feasibility.

Authored by Kaiming Yang, Tianyang Zhou, Guoren Zhong, Junhu Zhu, Ziqiao Zhou

Analysis of the Optimized KNN Algorithm for the Data Security of DR Service

Nearest Neighbor Search - The data of large-scale distributed demand-side IoT devices are gradually being migrated to the cloud. This cloud deployment mode makes it convenient for IoT devices to participate in the interaction between supply and demand, but at the same time it exposes various vulnerabilities of IoT devices to the Internet, where hackers can easily access and manipulate them to launch large-scale DDoS attacks. As an easy-to-understand supervised learning classification algorithm, KNN can obtain fairly accurate classification results without many tuning parameters, and it has produced many research results in the field of DDoS detection. However, on high-dimensional data this method has a high operation cost and is not practical. To address this disadvantage, this chapter explores the potential of the classical KNN algorithm in terms of data storage structure, K-nearest neighbor search, and hyperparameter optimization, and proposes an improved KNN algorithm for DDoS attack detection on demand-side IoT devices.

Authored by Kun Shi, Songsong Chen, Dezhi Li, Ke Tian, Meiling Feng

A Feature Selection Technique for Network Intrusion Detection based on the Chaotic Crow Search Algorithm

Nearest Neighbor Search - Network security is one of the main challenges faced by network administrators and owners, especially with the increasing numbers and types of attacks. This rapid increase creates a need to develop different protection techniques and methods. Network Intrusion Detection Systems (NIDS) are a method to detect and analyze network traffic in order to identify attacks and notify network administrators. Recently, machine learning (ML) techniques have been extensively applied in developing detection systems. Because of the high complexity of the data exchanged over networks, applying ML techniques negatively impacts system performance, as many features need to be analyzed. A feature selection technique is used to select the most relevant subset of features from the input data, which enhances the overall performance of the NIDS. In this paper, we propose a wrapper approach to feature selection based on a Chaotic Crow Search Algorithm (CCSA) for anomaly-based network intrusion detection systems. Experiments were conducted on the LITNET2020 dataset. To the best of our knowledge, our proposed method can be considered the first selection algorithm applied on this dataset based on swarm intelligence optimization to find a special subset of features for binary and multiclass classifications that optimizes the performance for all classes at the same time. The model was evaluated using several ML classifiers, namely K-nearest neighbors (KNN), Decision Tree (DT), Random Forest (RF), Support Vector Machine (SVM), Multi-layer perceptron (MLP), and Long Short-Term Memory (LSTM). The results showed that the proposed algorithm is more efficient in improving the performance of NIDS in terms of accuracy, detection rate, precision, F-score, specificity, and false alarm rate, outperforming state-of-the-art feature selection techniques recently proposed in the literature.

Authored by Hussein Al-Zoubi, Samah Altaamneh

Private Approximate Nearest Neighbor Search with Sublinear Communication

Nearest Neighbor Search - Nearest neighbor search is a fundamental building block for a wide range of applications. A privacy-preserving protocol for nearest neighbor search involves a set of clients who send queries to a remote database. Each client retrieves the nearest neighbor(s) to its query in the database without revealing any information about the query. To ensure database privacy, clients must learn as little as possible beyond the query answer, even if they behave maliciously by deviating from the protocol.

Authored by Sacha Servan-Schreiber, Simon Langowski, Srinivas Devadas

A Fast System for Person Description Search in Videos

Nearest Neighbor Search - Security CCTV cameras are important for public safety. These cameras record continuously 24/7 and produce a large amount of video data. If the videos are not reviewed immediately after an incident, it can be difficult and time-consuming to find a specific person in many hours of recording. In this work we present a system that can search a video for people who fit a textual description. It uses an image-text multimodal deep learning model to calculate the similarity between an image of a person and a text description and to find the top matches. Normally this would require calculating the text-image similarity scores between one text description and every person in the video, which is O(n) in the number of people in the video and therefore impractical for real-time search. We propose a solution to this by pre-calculating embeddings of person images and applying approximate nearest neighbor vector search. At inference time, only one forward pass through the deep learning model is needed, so the computational cost is the time to embed a text description, O(1), plus the time to perform an approximate nearest neighbor search, O(log(n)). This makes real-time interactive search possible.

Authored by Sumeth Yuenyong
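
The split between offline indexing and query-time lookup described above can be sketched as follows. This is an illustration under stated assumptions, not the system from the paper: random-hyperplane hashing is swapped in as a simple approximate nearest neighbor index (the abstract does not say which index is used, and this one does not by itself give the O(log(n)) behavior quoted), and the three-dimensional vectors are hypothetical stand-ins for embeddings that a real image-text model would produce.

```python
import random
from collections import defaultdict
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))


class HyperplaneIndex:
    """Single-table random-hyperplane LSH (an assumed stand-in ANN index)."""

    def __init__(self, dim, n_planes=4, seed=0):
        rng = random.Random(seed)
        self.planes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_planes)]
        self.buckets = defaultdict(list)

    def _key(self, vector):
        # One bit per hyperplane: which side of the plane the vector falls on.
        return tuple(
            sum(p_i * v_i for p_i, v_i in zip(plane, vector)) >= 0
            for plane in self.planes
        )

    def add(self, item_id, vector):
        """Offline step: store a pre-computed person-image embedding."""
        self.buckets[self._key(vector)].append((item_id, vector))

    def search(self, query, top_k=3):
        """Query step: rank only the candidates sharing the query's bucket."""
        candidates = self.buckets.get(self._key(query), [])
        ranked = sorted(candidates, key=lambda c: cosine(query, c[1]), reverse=True)
        return [item_id for item_id, _ in ranked[:top_k]]


# Hypothetical pre-computed image embeddings (a real model would output these).
gallery = {
    "person_1": [0.9, 0.1, 0.0],
    "person_2": [0.1, 0.9, 0.2],
    "person_3": [0.0, 0.2, 0.9],
}
index = HyperplaneIndex(dim=3)
for person_id, embedding in gallery.items():
    index.add(person_id, embedding)

# Hypothetical text embedding; here it coincides with person_2's image embedding.
text_embedding = [0.1, 0.9, 0.2]
print(index.search(text_embedding, top_k=1))  # ['person_2']
```

Because only one bucket is scanned, a near neighbor that hashes to a different bucket is missed; that loss of recall is the price of approximate search, and practical indexes reduce it with multiple tables or graph-based structures.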

Analysis of the Optimized KNN Algorithm for the Data Security of DR Service

Internet-scale Computing Security - The data of large-scale distributed demand-side IoT devices are gradually being migrated to the cloud. This cloud deployment mode makes it convenient for IoT devices to participate in the interaction between supply and demand, but at the same time it exposes various vulnerabilities of IoT devices to the Internet, where hackers can easily access and manipulate them to launch large-scale DDoS attacks. As an easy-to-understand supervised learning classification algorithm, KNN can obtain fairly accurate classification results without many tuning parameters, and it has produced many research results in the field of DDoS detection. However, on high-dimensional data this method has a high operation cost and is not practical. To address this disadvantage, this chapter explores the potential of the classical KNN algorithm in terms of data storage structure, K-nearest neighbor search, and hyperparameter optimization, and proposes an improved KNN algorithm for DDoS attack detection on demand-side IoT devices.

Authored by Kun Shi, Songsong Chen, Dezhi Li, Ke Tian, Meiling Feng