The Internet has evolved to the point that gigabytes and even terabytes of data are generated and processed on a daily basis. Such a stream of data is characterised by high volume, velocity and variety and is referred to as Big Data. Traditional data processing tools can no longer be used to process big data, because they were not designed to handle such a massive amount of data. This problem concerns also cyber security, where tools like intrusion detection systems employ classification algorithms to analyse the network traffic. Achieving a high accuracy attack detection becomes harder when the amount of data increases and the algorithms must be efficient enough to keep up with the throughput of a huge data stream. Due to the challenges posed by a big data environment, some monitoring systems have already shifted from deep packet inspection to flow-level inspection. The goal of this paper is to evaluate the applicability of an existing intrusion detection technique that performs deep packet inspection in a big data setting. We have conducted several experiments with Apache Spark to assess the performance of the technique when classifying anomalous packets, showing that it benefits from the use of Spark.
Authored by Fabrizio Angiulli, Angelo Furfaro, Domenico Saccá, Ludovica Sacco
Attack detection in enterprise networks is increasingly faced with large data volumes, in part high data bursts, and heavily fluctuating data flows that often cause arbitrary discarding of data packets in overload situations which can be used by attackers to hide attack activities. Attack detection systems usually configure a comprehensive set of signatures for known vulnerabilities in different operating systems, protocols, and applications. Many of these signatures, however, are not relevant in each context, since certain vulnerabilities have already been eliminated, or the vulnerable applications or operating system versions, respectively, are not installed on the involved systems. In this paper, we present an approach for clustering data flows to assign them to dedicated analysis units that contain only signature sets relevant for the analysis of these flows. We discuss the performance of this clustering and show how it can be used in practice to improve the efficiency of an analysis pipeline.
Authored by Michael Vogel, Franka Schuster, Fabian Kopp, Hartmut König
Though several deep learning (DL) detectors have been proposed for the network attack detection and achieved high accuracy, they are computationally expensive and struggle to satisfy the real-time detection for high-speed networks. Recently, programmable switches exhibit a remarkable throughput efficiency on production networks, indicating a possible deployment of the timely detector. Therefore, we present Soter, a DL enhanced in-network framework for the accurate real-time detection. Soter consists of two phases. One is filtering packets by a rule-based decision tree running on the Tofino ASIC. The other is executing a well-designed lightweight neural network for the thorough inspection of the suspicious packets on the CPU. Experiments on the commodity switch demonstrate that Soter behaves stably in ten network scenarios of different traffic rates and fulfills per-flow detection in 0.03s. Moreover, Soter naturally adapts to the distributed deployment among multiple switches, guaranteeing a higher total throughput for large data centers and cloud networks.
Authored by Guorui Xie, Qing Li, Chupeng Cui, Peican Zhu, Dan Zhao, Wanxin Shi, Zhuyun Qi, Yong Jiang, Xi Xiao
Network attacks become more complicated with the improvement of technology. Traditional statistical methods may be insufficient in detecting constantly evolving network attack. For this reason, the usage of payload-based deep packet inspection methods is very significant in detecting attack flows before they damage the system. In the proposed method, features are extracted from the byte distributions in the payload and these features are provided to characterize the flows more deeply by using N-Gram analysis methods. The proposed procedure has been tested on IDS 2012 and 2017 datasets, which are widely used in the literature.
Authored by Süleyman Özdel, Pelin Ateş, Çağatay Ateş, Mutlu Koca, Emin Anarım
Internet service providers (ISP) rely on network traffic classifiers to provide secure and reliable connectivity for their users. Encrypted traffic introduces a challenge as attacks are no longer viable using classic Deep Packet Inspection (DPI) techniques. Distinguishing encrypted from non-encrypted traffic is the first step in addressing this challenge. Several attempts have been conducted to identify encrypted traffic. In this work, we compare the detection performance of DPI, traffic pattern, and randomness tests to identify encrypted traffic in different levels of granularity. In an experimental study, we evaluate these candidates and show that a traffic pattern-based classifier outperforms others for encryption detection.
Authored by Hossein Doroud, Ahmad Alaswad, Falko Dressler
The growing number of cybersecurity incidents and the always increasing complexity of cybersecurity attacks is forcing the industry and the research community to develop robust and effective methods to detect and respond to network attacks. Many tools are either built upon a large number of rules and signatures which only large third-party vendors can afford to create and maintain, or are based on complex artificial intelligence engines which, in most cases, still require personalization and fine-tuning using costly service contracts offered by the vendors.This paper introduces an open-source network traffic monitoring system based on the concept of cyberscore, a numerical value that represents how a network activity is considered relevant for spotting cybersecurity-related events. We describe how this technique has been applied in real-life networks and present the result of this evaluation.
Authored by Luca Deri, Alfredo Cardigliano
Current intrusion detection techniques cannot keep up with the increasing amount and complexity of cyber attacks. In fact, most of the traffic is encrypted and does not allow to apply deep packet inspection approaches. In recent years, Machine Learning techniques have been proposed for post-mortem detection of network attacks, and many datasets have been shared by research groups and organizations for training and validation. Differently from the vast related literature, in this paper we propose an early classification approach conducted on CSE-CIC-IDS2018 dataset, which contains both benign and malicious traffic, for the detection of malicious attacks before they could damage an organization. To this aim, we investigated a different set of features, and the sensitivity of performance of five classification algorithms to the number of observed packets. Results show that ML approaches relying on ten packets provide satisfactory results.
Authored by Idio Guarino, Giampaolo Bovenzi, Davide Di Monda, Giuseppe Aceto, Domenico Ciuonzo, Antonio Pescapè
The Manufacturer Usage Description (MUD) standard aims to reduce the attack surface for IoT devices by locking down their behavior to a formally-specified set of network flows (access control entries). Formal network behaviors can also be systematically and rigorously verified in any operating environment. Enforcing MUD flows and monitoring their activity in real-time can be relatively effective in securing IoT devices; however, its scope is limited to endpoints (domain names and IP addresses) and transport-layer protocols and services. Therefore, misconfigured or compromised IoTs may conform to their MUD-specified behavior but exchange unintended (or even malicious) contents across those flows. This paper develops PicP-MUD with the aim to profile the information content of packet payloads (whether unencrypted, encoded, or encrypted) in each MUD flow of an IoT device. That way, certain tasks like cyber-risk analysis, change detection, or selective deep packet inspection can be performed in a more systematic manner. Our contributions are twofold: (1) We analyze over 123K network flows of 6 transparent (e.g., HTTP), 11 encrypted (e.g., TLS), and 7 encoded (e.g., RTP) protocols, collected in our lab and obtained from public datasets, to identify 17 statistical features of their application payload, helping us distinguish different content types; and (2) We develop and evaluate PicP-MUD using a machine learning model, and show how we achieve an average accuracy of 99% in predicting the content type of a flow.
Authored by Arman Pashamokhtari, Arunan Sivanathan, Ayyoob Hamza, Hassan Gharakheili