Publications | Science of Security Virtual Organization

A No Code XAI Framework for Policy Making

Explainable AI (XAI) techniques are used to understand the internals of AI algorithms and how they produce a particular result. Several software packages implement XAI techniques; however, their use requires deep knowledge of the AI algorithms, and their output is not intuitive for non-experts. In this paper we present a framework (XAI4PublicPolicy) that provides customizable and reusable XAI dashboards that both data scientists and general users can use without writing code. Models and data sets are selected by dragging and dropping them from repositories, while dashboards are generated by selecting the types of charts. The framework can work with structured data and images in different formats. This XAI framework was developed and is being used in the context of the AI4PublicPolicy European project for explaining the decisions made by machine learning models applied to the implementation of public policies.

Authored by Marta Martínez, Ainhoa Azqueta-Alzúaz

Twin eye Authentication Gateway Architecture Resilient to DDoS attacks in 6LoWPAN IoT Network Using Machine Learning Techniques

IoT technology establishes a platform for automating services by connecting diverse objects through the Internet backbone. However, the integration of IoT networks also introduces security challenges, rendering IoT infrastructure susceptible to cyber-attacks. Notably, Distributed Denial of Service (DDoS) attacks breach the authorization conditions, and these attacks have the potential to disrupt the physical functioning of the IoT infrastructure, leading to significant financial losses and even endangering human lives. Yet, maintaining availability even when networking elements malfunction has not received much attention. This research paper introduces a novel Twin eye Architecture, which includes dual gateways connecting every IoT access network to provide reliability even when a connected gateway fails or becomes inaccessible. It includes a module called DDoS Manager that is built into the gateway to recognize the dangling of the gateway. The effectiveness of the proposed model is evaluated using a dataset simulated in the NS3 environment. The results highlight the outstanding performance of the proposed model, achieving high accuracy rates. These findings demonstrate that the proposed network architecture continues to provide critical authentication services even upon the failure of the assigned gateway.

Authored by Manjula L, G Raju

Semi-supervised Trojan Nets Classification Using Anomaly Detection Based on SCOAP Features

Recently, hardware Trojans have become a serious security concern in the integrated circuit (IC) industry. Due to the globalization of semiconductor design and fabrication processes, ICs are highly vulnerable to hardware Trojan insertion by malicious third-party vendors. Therefore, the development of effective hardware Trojan detection techniques is necessary. Testability measures have been proven to be efficient features for Trojan net classification. However, most of the existing machine-learning-based techniques use supervised learning methods, which involve time-consuming training processes, need to deal with the class imbalance problem, and are not pragmatic in real-world situations. Furthermore, no works have explored the use of anomaly detection for hardware Trojan detection tasks. This paper proposes a semi-supervised hardware Trojan detection method at the gate level using anomaly detection. We improve the existing computation of the Sandia Controllability/Observability Analysis Program (SCOAP) values by considering all types of D flip-flops and adopt semi-supervised anomaly detection techniques to detect Trojan nets. Finally, a novel topology-based location analysis is utilized to improve the detection performance. Testing on 17 Trust-Hub Trojan benchmarks, the proposed method achieves an overall 99.47% true positive rate (TPR), 99.99% true negative rate (TNR), and 99.99% accuracy.

Authored by Pei-Yu Lo, Chi-Wei Chen, Wei-Ting Hsu, Chih-Wei Chen, Chin-Wei Tien, Sy-Yen Kuo

Hardware Trojan Detection at LUT: Where Structural Features Meet Behavioral Characteristics

This work proposes a novel hardware Trojan detection method that leverages static structural features and behavioral characteristics in field programmable gate array (FPGA) netlists. Mapping of hardware design sources to look-up-table (LUT) networks makes these features explicit, allowing automated feature extraction and further effective Trojan detection through machine learning. Four-dimensional features are extracted for each signal and a random forest classifier is trained for Trojan net classification. Experiments using Trust-Hub benchmarks show promising Trojan detection results with accuracy, precision, and F1-measure of 99.986%, 100%, and 99.769% respectively on average.

Authored by Lingjuan Wu, Xuelin Zhang, Siyi Wang, Wei Hu

Obtaining a Highly Informative Digital Image of Information Signals of Cyber-Physical Systems Using Time-Frequency Spectral Analysis and Digital Filtering

The paper presents the stages of constructing a highly informative digital image of the time-frequency representation of information signals of cyber-physical systems. Signal visualization includes the stage of displaying the signal on the frequency-time plane, the stage of two-dimensional digital filtering, and the stage of extracting highly informative components of the signal image. Two-dimensional digital filtering makes it possible to select the most informative component of the image of a complex analyzed information signal. The resulting digital image of the cyber-physical system's signal provides highly informative initial data for solving a wide range of information security problems in cyber-physical systems with the subsequent use of machine learning technologies.

Authored by Andrey Ragozin, Anastasiya Pletenkova

Development of Threat Hunting Model Using Machine Learning Algorithms for Cyber Attacks Mitigation

Threat hunting has become very popular due to the present dynamic cyber security environment. As the attack landscape continues to grow, the traditional way of monitoring threats is no longer scalable. Consequently, a threat hunting modeling technique is implemented as an emergent activity using machine learning (ML) paradigms. ML predictive analytics was carried out on the OSTO-CID dataset using four algorithms to develop the model. A cross-validation ratio of 80:20 was used to train and test the model. The decision tree classifier (DTC) gives the best metrics among the four ML algorithms, with 99.30% accuracy. Therefore, DTC can be used for developing a threat hunting model to mitigate cyber-attacks using a data mining approach.

Authored by Akinsola T., Olajubu A., Aderounmu A.
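
The 80:20 ratio mentioned in the abstract above reads like a holdout split. The following is a minimal sketch of such a split, assuming the data is a list of rows; the function name `holdout_split` and the fixed seed are illustrative choices and are not taken from the paper.

```python
import random


def holdout_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of `rows` and split it into (train, test) parts.

    With test_fraction=0.2 this gives the 80:20 ratio from the abstract.
    The seed only makes the illustration repeatable.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    test_size = int(round(len(shuffled) * test_fraction))
    cut = len(shuffled) - test_size
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    train, test = holdout_split(range(100))
    print(len(train), len(test))
```

Any classifier can then be fitted on the first part and scored on the second; which four algorithms the authors used beyond the decision tree is not stated in the abstract.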

Detection of Intrusions using Support Vector Machines and Deep Neural Networks

An intrusion detection system (IDS) plays a role in network intrusion detection through network data analysis, and high detection accuracy, precision, and recall are required to detect intrusions. Various techniques such as expert systems, data mining, and state transition analysis are used for network data analysis. This paper compares the detection performance of two data-mining-based IDS methods. The first technique is a support vector machine (SVM), a machine learning algorithm; the second is a deep neural network (DNN), one of the artificial neural network models. The accuracy, precision, and recall were calculated and compared using the NSL-KDD training and validation data, which is widely used in intrusion detection, to compare the detection performance of the two techniques. DNN shows slightly higher accuracy than the SVM model. The risk of recognizing an actual intrusion as normal data is much greater than the risk of considering normal data as an intrusion, so DNN proves to be much more effective in intrusion detection than SVM.

Authored by N Patel, B Mehtre, Rajeev Wankar
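
The three metrics named in the abstract above can be computed from a confusion matrix. The sketch below assumes binary labels (1 = intrusion, 0 = normal); the function names and the toy labels are illustrative and do not come from the paper or from NSL-KDD.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 means intrusion."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1          # intrusion correctly flagged
        elif truth == 0 and pred == 1:
            fp += 1          # normal traffic flagged as intrusion
        elif truth == 1 and pred == 0:
            fn += 1          # intrusion missed (treated as normal)
        else:
            tn += 1          # normal traffic correctly passed
    return tp, fp, fn, tn


def detection_metrics(y_true, y_pred):
    """Accuracy, precision, and recall; zero denominators yield 0.0."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


if __name__ == "__main__":
    truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    preds = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
    print(detection_metrics(truth, preds))
```

Recall is the metric that penalizes missed intrusions (false negatives), which is the error type the abstract describes as the riskier one.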

Cyber Threat Intelligence and Machine Learning

Cyber Threat Intelligence has been demonstrated to be an effective element of defensive security and cyber protection, with examples dating back to the founding of the Financial Sector Information Sharing and Analysis Center (FS ISAC) in 1998. Automated methods are needed today in order to stay current with the magnitude of attacks across the globe. Threat information must be actionable, current, and credibly validated if it is to be ingested into computer-operated defense systems. False positives degrade the value of the system. This paper outlines some of the progress made in applying artificial intelligence techniques as well as the challenges associated with utilizing machine learning to refine the flow of threat intelligence. A variety of methods have been developed to create learning models that can be integrated with firewalls, rules, and heuristics. In addition, more work is needed to effectively support the limited number of expert human hours available to evaluate the prioritized threat landscape flagged as malicious in a Security Operations Center (SOC) environment.

Authored by Jon Haass

Low-rank Defenses Against Adversarial Attacks in Recommender Systems

Recommender systems are powerful tools which touch on numerous aspects of everyday life, from shopping to consuming content, and beyond. However, like other machine learning models, recommender system models are vulnerable to adversarial attacks, and their performance could drop significantly with a slight modification of the input data. Most of the studies in the area of adversarial machine learning are focused on the image and vision domain. There are very few works that study adversarial attacks on recommender systems and even fewer that study ways to make recommender systems robust and reliable. In this study, we explore two state-of-the-art adversarial attack methods proposed by Tang et al. [1] and Christakopoulou et al. [2], and we report our proposed defenses and experimental evaluations against these attacks. In particular, we observe that low-rank reconstruction and/or transformation of the attacked data has a significant alleviating effect on the attack, and we present extensive experimental evidence to demonstrate the effectiveness of this approach. We also show that a simple classifier is able to learn to distinguish fake users from real users and can successfully discard them from the dataset. This observation indicates that the threat model does not generate fake users that mimic the behavior of real users, so they can be easily distinguished from real users’ behavior. We also examine how transforming latent factors of the matrix factorization model into a low-dimensional space impacts its performance. Furthermore, we combine fake users from both attacks to examine how our proposed defense is able to defend against multiple attacks at the same time. Local low-rank reconstruction was able to reduce the hit ratio of target items from 23.54% to 15.69% while the overall performance of the recommender system was preserved.

Authored by Negin Entezari, Evangelos Papalexakis
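
To illustrate why low-rank reconstruction can dampen injected ratings, here is a minimal sketch that computes a global rank-1 approximation of a small rating matrix by power iteration. This is a simplification chosen for illustration: the abstract reports a local low-rank reconstruction, whose details are not given here, and the toy matrix and function name are assumptions.

```python
import math


def rank1_approximation(matrix, iterations=100):
    """Rank-1 approximation of `matrix` via power iteration on M^T M.

    Starts from a uniform vector, which has non-zero overlap with the
    leading singular vector of a non-negative rating matrix.
    """
    rows, cols = len(matrix), len(matrix[0])
    v = [1.0 / math.sqrt(cols)] * cols
    for _ in range(iterations):
        # u = M v, then w = M^T u, then normalise w to get the next v
        u = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        w = [sum(matrix[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return [[0.0] * cols for _ in range(rows)]
        v = [x / norm for x in w]
    # u = M v equals sigma * (left singular vector), so u v^T is the rank-1 part
    u = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    return [[u[i] * v[j] for j in range(cols)] for i in range(rows)]


if __name__ == "__main__":
    ratings = [
        [5, 4, 0, 1],   # three genuine users with a shared taste
        [5, 4, 0, 1],
        [5, 4, 0, 1],
        [0, 0, 5, 0],   # hypothetical fake user promoting item 2
    ]
    for row in rank1_approximation(ratings):
        print([round(x, 3) for x in row])
```

In this contrived matrix the fake profile does not align with the dominant pattern, so its injected rating largely disappears after reconstruction while the genuine rows are preserved.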

Automatic Mapping of Unstructured Cyber Threat Intelligence: An Experimental Study: (Practical Experience Report)

Proactive approaches to security, such as adversary emulation, leverage information about threat actors and their techniques (Cyber Threat Intelligence, CTI). However, most CTI still comes in unstructured forms (i.e., natural language), such as incident reports and leaked documents. To support proactive security efforts, we present an experimental study on the automatic classification of unstructured CTI into attack techniques using machine learning (ML). We contribute with two new datasets for CTI analysis, and we evaluate several ML models, including both traditional and deep learning-based ones. We present several lessons learned about how ML can perform at this task, which classifiers perform best and under which conditions, which are the main causes of classification errors, and the challenges ahead for CTI analysis.

Authored by Vittorio Orbinato, Mariarosaria Barbaraci, Roberto Natella, Domenico Cotroneo

Threat Modeling for Machine Learning-Based Network Intrusion Detection Systems

Network Intrusion Detection Systems (NIDS) monitor networking environments for suspicious events that could compromise the availability, integrity, or confidentiality of the network’s resources. To ensure NIDSs play their vital roles, it is necessary to identify how they can be attacked by adopting a viewpoint similar to the adversary's to identify vulnerabilities and gaps in defenses. Accordingly, effective countermeasures can be designed to thwart any potential attacks. Machine learning (ML) approaches have been adopted widely for network anomaly detection. However, it has been found that ML models are vulnerable to adversarial attacks. In such attacks, subtle perturbations are inserted into the original inputs at inference time in order to evade the classifier's detection, or at training time to degrade its performance. Yet, modeling adversarial attacks and the associated threats of employing machine learning approaches for NIDSs has not been addressed. One of the growing challenges is to avoid ML-based systems’ diversity and ensure their security and trust. In this paper, we conduct threat modeling for ML-based NIDS using STRIDE and Attack Tree approaches to identify the potential threats on different levels. We model the threats that can be potentially realized by exploiting vulnerabilities in ML algorithms through a simplified structural attack tree. To provide holistic threat modeling, we apply the STRIDE method to the systems’ data flow to uncover further technical threats. Our models revealed 46 possible threats to consider. These models can help to understand the different ways that an ML-based NIDS can be attacked; hence, hardening measures can be developed to prevent these potential attacks from achieving their goals.

Authored by Huda Alatwi, Charles Morisset

Prior Knowledge based Advanced Persistent Threats Detection for IoT in a Realistic Benchmark

The number of Internet of Things (IoT) devices being deployed into networks is growing at a phenomenal pace, which makes IoT networks more vulnerable in the wireless medium. Advanced Persistent Threat (APT) is malicious to most of the network facilities, and the available attack data for training the machine learning-based Intrusion Detection System (IDS) is limited when compared to the normal traffic. Therefore, it is quite challenging to enhance the detection performance in order to mitigate the influence of APT. To address this, Prior Knowledge Input (PKI) models are proposed and tested using the SCVIC-APT2021 dataset. To obtain prior knowledge, the proposed PKI model pre-classifies the original dataset with an unsupervised clustering method. Then, the obtained prior knowledge is incorporated into the supervised model to decrease training complexity and assist the supervised model in determining the optimal mapping between the raw data and true labels. The experimental findings indicate that the PKI model outperforms the supervised baseline, with the best macro average F1-score of 81.37%, which is 10.47% higher than the baseline.

Authored by Yu Shen, Murat Simsek, Burak Kantarci, Hussein Mouftah, Mehran Bagheri, Petar Djukic
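
The macro average F1-score reported in the abstract above weights every class equally, which matters when attack samples are scarce. The following is a minimal sketch of the metric; the label names and toy predictions are illustrative and are not drawn from SCVIC-APT2021.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0.0 when the class never occurs
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    truth = ["normal"] * 8 + ["apt"] * 2
    preds = ["normal"] * 8 + ["apt", "normal"]
    accuracy = sum(t == p for t, p in zip(truth, preds)) / len(truth)
    print(f"accuracy={accuracy:.3f} macro_f1={macro_f1(truth, preds):.3f}")
```

In this toy case a single missed attack barely moves accuracy but noticeably lowers the macro F1, because the rare class counts as much as the common one.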

Combating Advanced Persistent Threats for Imminent Low Earth Orbit Cognitive Communications Systems

With the proliferation of Low Earth Orbit (LEO) spacecraft constellations comes the rise of space-based wireless cognitive communications systems (CCS) and the need to safeguard and protect data against potential hostiles to maintain widespread communications for enabling science, military, and commercial services. For example, known adversaries are using advanced persistent threats (APT) or highly progressive intrusion mechanisms to target high priority wireless space communication systems. Specialized threats continue to evolve with the advent of machine learning and artificial intelligence, where computer systems inherently can identify system vulnerabilities expeditiously over naive human threat actors due to increased processing resources and unbiased pattern recognition. This paper presents a disruptive abuse case for an APT-attack on such a CCS and describes a trade-off analysis that was performed to evaluate a variety of machine learning techniques that could aid in the rapid detection and mitigation of an APT-attack. The trade results indicate that with the employment of neural networks, the CCS's resiliency would increase its operational functionality and, therefore, the reliability of on-demand communication services would increase. Further, modelling, simulation, and analysis (MS&A) was achieved using the Knowledge Discovery and Data Mining (KDD) Cup 1999 data set as a means to validate a subset of the trade study results against Training Time and Number of Parameters selection criteria. Training and cross-validation learning curves were computed to model the learning performance over time to yield a reasonable conclusion about the application of neural networks.

Authored by Suzanna LaMar, Jordan Gosselin, Lisa Happel, Anura Jayasumana

An Efficient User Trust Computation Using Machine Learning Methods in Online Social Networks

Social networks are good platforms for like-minded people to exchange their views and thoughts. With the rapid growth of web applications, social networks have become huge networks with millions of users. On the other hand, the number of malicious activities by untrustworthy users has also increased. Users must estimate other people's trustworthiness before sharing their personal information with them. Since social networks are huge and complex, estimating a user's trust value is not a trivial task and has become a focus for researchers. Some mathematical methods have been proposed to estimate the user trust value, but they still lack efficient ways to analyze user activities. In this paper, “An Efficient Trust Computation Method Using Machine Learning in Online Social Networks (TCML)” is proposed. Here, Twitter user activities are considered to estimate the user's direct trust value. The trust values of unknown users are computed through the recommendations of common friends. The available Twitter data set is unlabeled; hence, unsupervised methods are used to categorize users into clusters and to compute their trust values. In the experimental results, the silhouette score is used to assess cluster quality. The performance of the proposed method is compared with existing methods like mole and tidal, which it outperforms.

Authored by Anitha Yarava, Shoba Bindu
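
The silhouette score used in the abstract above to assess cluster quality can be computed without labels. The following is a minimal sketch using Euclidean distance; the toy points are illustrative and the clustering algorithm itself is left out, since the abstract does not name it.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance).

    For each point: a = mean distance to its own cluster, b = lowest mean
    distance to any other cluster, s = (b - a) / max(a, b).
    Points in singleton clusters contribute 0 by convention.
    """
    clusters = {}
    for index, label in enumerate(labels):
        clusters.setdefault(label, []).append(index)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    def mean_distance(i, members):
        others = [j for j in members if j != i]
        return sum(math.dist(points[i], points[j]) for j in others) / len(others)

    total = 0.0
    for i, label in enumerate(labels):
        own = clusters[label]
        if len(own) == 1:
            continue
        a = mean_distance(i, own)
        b = min(mean_distance(i, m) for other, m in clusters.items() if other != label)
        total += (b - a) / max(a, b) if max(a, b) > 0 else 0.0
    return total / len(points)


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (10, 0), (10, 1)]
    print(silhouette_score(pts, [0, 0, 1, 1]))  # compact, well separated
    print(silhouette_score(pts, [0, 1, 0, 1]))  # clusters cut across the gap
```

Scores close to 1 indicate compact, well-separated clusters, while negative scores suggest points sit closer to another cluster than to their own.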

Artificial Intelligence and its Influence on E-Commerce

COVID-19 has taught us the need to practice social distancing. In 2020, because of sudden lockdowns across the globe, e-commerce websites and e-shopping were the only way to fulfill our basic needs, and with the advancement of technology, putting a business online has become a necessity. Be it food, groceries, or our favorite outfit, all these things are now available online. It was noticed during the lockdown period that businesses that had no online presence suffered heavy losses. On the other hand, those that had established their presence on the internet saw a sudden boom in their overall sales. This project discusses how recent advancements in the field of Machine Learning and Artificial Intelligence have led to an increase in the sales of various businesses. The machine learning model analyses the patterns of customer behavior that affect sales, builds a dataset after many observations, and finally helps generate an algorithm that serves as an efficient recommendation system. This project also discusses how cyber security helps provide secure and authenticated transactions, which have aided e-commerce business growth by building customers' trust.

Authored by Tanya Pahadi, Abhishek Verma, Raju Ranjan

Trustworthy Autonomous Systems (TAS): Engaging TAS experts in curriculum design

Recent advances in artificial intelligence, specifically machine learning, contributed positively to enhancing the autonomous systems industry, along with introducing social, technical, legal, and ethical challenges to make them trustworthy. Although Trustworthy Autonomous Systems (TAS) is an established and growing research direction that has been discussed in multiple disciplines, e.g., Artificial Intelligence, Human-Computer Interaction, Law, and Psychology, the impact of TAS on education curricula and required skills for future TAS engineers has rarely been discussed in the literature. This study brings together the collective insights from a number of leading TAS experts to highlight significant challenges for curriculum design and potential TAS required skills posed by the rapid emergence of TAS. Our analysis is of interest not only to the TAS education community but also to other researchers, as it offers ways to guide future research toward operationalising TAS education.

Authored by Mohammad Naiseh, Caitlin Bentley, Sarvapali Ramchurn

Trustworthy Machine Learning for Securing IoT Systems

This paper first describes the security and privacy challenges for Internet of Things (IoT) systems and then discusses some of the solutions that have been proposed. It also describes aspects of Trustworthy Machine Learning (TML) and then discusses how TML may be applied to handle some of the security and privacy challenges for IoT systems.

Authored by Bhavani Thuraisingham

Data Trustworthiness for UWB Ranging in IoT

The computation of data trustworthiness during double-sided two-way-ranging with ultra-wideband signals between IoT devices is proposed. It relies on machine learning based ranging error correction, in which the certainty of the correction value is used to quantify trustworthiness. In particular, the trustworthiness score and error correction value are calculated from channel impulse response measurements, either using a modified k-nearest neighbor (KNN) or a modified random forest (RF) algorithm. The proposed scheme is easily implemented using commercial ultra-wideband transceivers and it enables real time surveillance of malicious or unintended modification of the propagation channel. The results on experimental data show an improvement of 47% in RMSE on the test set when only trustworthy measurements are considered.

Authored by Philipp Peterseil, Bernhard Etzlinger, David Marzinger, Roya Khanzadeh, Andreas Springer
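
The idea of using the certainty of a correction as a trust score can be sketched with a plain k-nearest-neighbor regressor. Everything below is an assumption made for illustration: the paper's modified KNN is not described in the abstract, so this sketch uses the spread of the neighbors' ranging errors as a stand-in for certainty, and the feature vectors, threshold, and function names are hypothetical.

```python
import math
import statistics


def knn_correct(train, query, k=3):
    """Return (correction, spread) from the k nearest training samples.

    `train` holds (feature_vector, ranging_error) pairs. The correction is
    the mean neighbour error; the spread (population standard deviation)
    serves here as an inverse trust score: small spread = more trust.
    """
    nearest = sorted(train, key=lambda sample: math.dist(sample[0], query))[:k]
    errors = [error for _, error in nearest]
    return statistics.fmean(errors), statistics.pstdev(errors)


def rmse(residuals):
    """Root mean square of a non-empty list of residuals."""
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))


def evaluate(train, test, k=3, max_spread=0.1):
    """Return (rmse_all, rmse_trusted); rmse_trusted is None if nothing passes."""
    residuals, trusted = [], []
    for features, true_error in test:
        correction, spread = knn_correct(train, features, k)
        residual = true_error - correction
        residuals.append(residual)
        if spread <= max_spread:   # hypothetical trust threshold
            trusted.append(residual)
    return rmse(residuals), (rmse(trusted) if trusted else None)
```

Comparing the two returned values on held-out data shows how much the residual error drops when low-trust measurements are discarded, which is the kind of comparison the abstract reports.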

On the Security of Python Virtual Machines: An Empirical Study

Python continues to be one of the most popular programming languages and has been used in many safety-critical fields such as medical treatment, autonomous driving systems, and data science. These fields place higher security requirements on Python ecosystems. However, existing studies on machine learning systems in Python concentrate on data security, model security, and model privacy, and simply assume the underlying Python virtual machines (PVMs) are secure and trustworthy. Unfortunately, whether such an assumption really holds is still unknown.

Authored by Xinrong Lin, Baojian Hua, Qiliang Fan

A Vision For Hierarchical Federated Learning in Dynamic Service Chaining

We have seen the tremendous expansion of machine learning (ML) technology in Artificial Intelligence (AI) applications, including computer vision, voice recognition, and many others. The availability of a vast amount of data has spurred the rise of ML technologies, especially Deep Learning (DL). Traditional ML systems consolidate all data into a central location, usually a data center, which may breach privacy and confidentiality rules. The Federated Learning (FL) concept has recently emerged as a promising solution for mitigating data privacy, legality, scalability, and unwanted bandwidth loss problems. This paper outlines a vision for leveraging FL for better traffic steering predictions. Specifically, we propose a hierarchical FL framework that will dynamically update service function chains in a network by predicting future user demand and network state using the FL method.

Authored by Abdullah Bittar, Changcheng Huang
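
One way to picture the hierarchical aggregation described in the abstract above is a two-tier weighted average of model parameters. This sketch assumes FedAvg-style averaging weighted by sample count, which the abstract does not specify; the tier names and data layout are illustrative.

```python
def weighted_average(updates):
    """Average (weights, sample_count) pairs, weighted by sample count.

    Returns (averaged_weights, total_samples) so the result can be passed
    on to a higher aggregation tier unchanged.
    """
    total = sum(count for _, count in updates)
    dim = len(updates[0][0])
    averaged = [sum(w[d] * count for w, count in updates) / total for d in range(dim)]
    return averaged, total


def hierarchical_average(groups):
    """Aggregate clients at each edge node, then aggregate the edge results."""
    edge_results = [weighted_average(clients) for clients in groups]
    return weighted_average(edge_results)


if __name__ == "__main__":
    groups = [
        [([1.0, 2.0], 10), ([3.0, 4.0], 30)],   # clients behind edge node A
        [([5.0, 6.0], 60)],                     # client behind edge node B
    ]
    print(hierarchical_average(groups))
```

Because each tier carries its sample count forward, the two-tier result matches a flat average over all clients, while only aggregated parameters travel upward.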

Research on Video Surveillance Violence Detection Technology Based on Deep Convolution Network

In recent years, in order to continuously promote the construction of safe cities, security monitoring equipment has been widely deployed all over the country. Using computer vision technology to realize effective intelligent analysis of violence in video surveillance is very important for maintaining social stability and ensuring the safety of people's lives and property. Video surveillance systems have been widely used because they are intuitive and convenient. However, existing video monitoring systems have relatively limited functionality and generally only support viewing, querying, and playing back surveillance video. In addition, researchers pay less attention to the complex abnormal behavior of violence, and relevant research often ignores the differences between violent behaviors in different scenes. At present, there are two main problems in video abnormal behavior event detection: video data of abnormal behavior is scarce, and the definition of abnormal behavior in different scenes cannot be clearly distinguished. The main existing methods model normal behavior events first and then define videos that do not conform to the normal model as abnormal; among these, learning spatio-temporal video feature representations with deep learning shows good prospects. In the face of massive volumes of surveillance video, it is necessary to use deep learning to identify violent behaviors, so that the machine can learn to identify human actions and raise alarms for violent behavior instead of relying on manual monitoring of camera images. The network is trained mainly on video data sets.

Authored by Xuezhong Wang

Message Passing Graph Neural Networks for Software Security Vulnerability Detection

With the booming development of deep learning and machine learning, the use of neural networks for software source code security vulnerability detection has become a hot topic in the field of software security. As a data structure, graphs can adequately represent the complex syntactic information, semantic information, and dependencies in software source code. In this paper, we propose the MPGVD model based on the idea of text classification in natural language processing. The model uses BERT for source code pre-training, transforms graphs into corresponding feature vectors, uses MPNN (Message Passing Neural Networks) based on graph neural networks in the feature extraction phase, and finally outputs the detection results. Our proposed MPGVD, compared with other existing vulnerability detection models on the same CodeXGLUE dataset, obtains the highest detection accuracy of 64.34%.

Authored by Yang Xue, Junjun Guo, Li Zhang, Huiyu Song
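
The message passing step that MPNNs build on can be sketched without any learned parameters. The example below performs one round of mean aggregation on a small directed graph; it is not the MPGVD model, which uses learned message and update functions on BERT-derived features, and the node names are illustrative.

```python
def message_passing_step(features, edges):
    """One round of mean-aggregation message passing on a directed graph.

    features: dict mapping node -> list of floats (the node state)
    edges: list of (src, dst) pairs; messages flow from src to dst
    Each node's new state is the element-wise mean of its own state and
    the states of its in-neighbours.
    """
    inbox = {node: [vector] for node, vector in features.items()}
    for src, dst in edges:
        inbox[dst].append(features[src])
    return {
        node: [sum(column) / len(messages) for column in zip(*messages)]
        for node, messages in inbox.items()
    }


if __name__ == "__main__":
    states = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
    flows = [("a", "b"), ("b", "c"), ("a", "c")]
    print(message_passing_step(states, flows))
```

Repeating the step lets information from nodes several edges away reach each node, which is how dependencies in a code graph can influence a node's representation.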

Quantum Federated Learning: Remarks and Challenges

As the development of quantum computing hardware is on the rise, its potential application to various research areas, including machine learning, has been investigated. Recently, there have been several initiatives to expand the work to quantum federated learning (QFL). However, challenges arise because quantum computation has different characteristics from classical computation, which makes a federated setting even more challenging. In this paper, we present a high-level overview of the current state of research in QFL. Furthermore, we briefly describe quantum computation and discuss its present limitations in relation to QFL development. Additionally, possible approaches to deploy QFL are explored. Lastly, remarks and challenges of QFL are also presented.

Authored by Harashta Larasati, Muhammad Firdaus, Howon Kim

Automatic Classification of Web and IoT Privacy Policies

Privacy policies, despite the important information they provide about the collection and use of one’s data, tend to be skipped over by most Internet users. In this paper, we seek to make privacy policies more accessible by automatically classifying text samples into web privacy categories. We use natural language processing techniques and multiple machine learning models to determine the effectiveness of each method in the classification task. We also explore the effectiveness of these methods to classify privacy policies of Internet of Things (IoT) devices.

Authored by Jasmine Carson, Lisa DiSalvo, Lydia Ray

Software Defined Perimeter Monitoring and Blockchain-Based Verification of Policy Mapping

With the emergence of Zero Trust (ZT) Architecture, industry leaders have been drawn to the technology because of its potential to handle a high level of security threats. The Zero Trust Architecture (ZTA) is paving the path for a security industrial revolution by eliminating location-based implicit access and focusing on asset, user, and resource security. Software Defined Perimeter (SDP) is a secure overlay network technology that can be used to implement a Zero Trust framework. SDP is a next-generation network technology that allows network architecture to be hidden from the outside world. It also hides the overlay communication from the underlay network by employing encrypted communications. With encrypted information, detecting abnormal behavior of entities on an overlay network becomes exceedingly difficult. Therefore, an automated system is required. In this paper, we propose a method for understanding the normal behavior of deployed policies by mapping network usage behavior to the policy. To do this mapping, Apache Spark collects and processes the streaming overlay monitoring data generated by the built-in fabric API. It sends extracted metrics to Prometheus for storage, and the data is then used for machine learning training and prediction. The model predicts the cluster-id of the link that the data belongs to, and the cluster-ids are mapped onto the policies. To validate the legitimacy of a policy, the labeled policy's hash is compared to the actual policy's hash obtained from the blockchain. Unverified policies are reported to the SDP controller for additional action, such as defining new policy behavior or marking uncertain policies.

Authored by Waleed Akbar, Javier Rivera, Khan Ahmed, Afaq Muhammad, Wang-Cheol Song
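
The final verification step described in the abstract above, comparing a labeled policy's hash with the hash stored on the blockchain, can be sketched as follows. The hash function (SHA-256), the JSON encoding, the policy fields, and the plain dictionary standing in for the blockchain lookup are all assumptions for illustration; the abstract does not specify them.

```python
import hashlib
import json


def policy_hash(policy):
    """SHA-256 over a canonical JSON encoding of a policy dictionary."""
    canonical = json.dumps(policy, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def unverified_policies(labeled, ledger):
    """Return ids of policies whose hash is absent from or differs from the ledger.

    labeled: dict policy_id -> policy dict produced by the mapping step
    ledger: dict policy_id -> hex digest (stand-in for the blockchain record)
    """
    return [
        policy_id
        for policy_id, policy in labeled.items()
        if ledger.get(policy_id) != policy_hash(policy)
    ]


if __name__ == "__main__":
    original = {"p1": {"src": "10.0.0.1", "dst": "10.0.0.2", "allow": True}}
    ledger = {pid: policy_hash(p) for pid, p in original.items()}
    tampered = {"p1": {"src": "10.0.0.1", "dst": "10.0.0.9", "allow": True}}
    print(unverified_policies(original, ledger))
    print(unverified_policies(tampered, ledger))
```

Sorting keys before hashing keeps the digest independent of field order, so only a change in policy content causes a mismatch that would be reported to the controller.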