Forensic science comprises a body of technical and scientific knowledge used to investigate illicit acts. The increasing use of mobile devices, in particular smartphones, as the main computing platform makes the information they hold valuable for forensics. However, the blocking mechanisms imposed by manufacturers and the variety of models and technologies make the task of reconstructing the data for analysis challenging. It is worth noting that concluding a case requires more than simply identifying evidence: it is essential to correlate all the data and sources obtained, either to confirm a suspicion or to uncover new evidence. This work carries out a systematic review of the literature, identifying the existing types of image acquisition and the main extraction and encryption methods used in smartphones running the Android operating system.
Authored by Alessandro Da Costa, Alan de Sá, Raphael Machado
This paper introduces a new type of attack on isolated, air-gapped workstations. Although air-gapped computers have no wireless connectivity, we show that attackers can use the SATA cable as a wireless antenna to transmit radio signals in the 6 GHz frequency band. Serial ATA (SATA) is a bus interface widely used in modern computers that connects the host bus to mass storage devices such as hard disk drives, optical drives, and solid-state drives. The prevalence of the SATA interface makes this attack available to attackers in a wide range of computer systems and IT environments. We discuss related work on this topic and provide technical background. We show the design of the transmitter and receiver and present the implementation of these components. We also demonstrate the attack on different computers and provide an evaluation. The results show that attackers can use the SATA cable to wirelessly transfer a brief amount of sensitive information from highly secured, air-gapped computers to a nearby receiver. Furthermore, we show that the attack can operate from user mode, is effective even from inside a Virtual Machine (VM), and can successfully work with other workloads running in the background. Finally, we discuss defense and mitigation techniques for this new air-gap attack.
Authored by Mordechai Guri
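The following is a minimal, hypothetical sketch of the general idea behind such covert channels: encoding bits by toggling storage activity on and off (on-off keying). The file path, bit period, and payload are invented for illustration only; the paper's actual transmitter, modulation scheme, and 6 GHz signal generation are not reproduced here.

```python
# Hypothetical on-off-keying (OOK) sketch of modulating storage-bus activity.
# NOT the paper's transmitter; all parameters below are illustrative assumptions.
import os, time

BIT_PERIOD = 1.0                    # seconds per bit (assumed)
PAYLOAD = "1010011010"              # example bit string to "transmit"
TMP_FILE = "/tmp/ook_burst.bin"     # hypothetical scratch file

def burst_writes(duration):
    """Generate disk writes for `duration` seconds; fsync forces real device activity."""
    end = time.time() + duration
    block = os.urandom(1 << 20)     # 1 MiB of random data per write
    with open(TMP_FILE, "wb") as f:
        while time.time() < end:
            f.write(block)
            f.flush()
            os.fsync(f.fileno())    # push the data out to the storage device
            f.seek(0)

for bit in PAYLOAD:
    if bit == "1":
        burst_writes(BIT_PERIOD)    # bus activity  -> '1'
    else:
        time.sleep(BIT_PERIOD)      # silence       -> '0'
```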
Highly secure devices are often isolated from the Internet or other public networks due to the confidential information they process. This level of isolation is referred to as an 'air-gap.' In this paper, we present a new technique named ETHERLED, allowing attackers to leak data from air-gapped networked devices such as PCs, printers, network cameras, embedded controllers, and servers. Networked devices have an integrated network interface controller (NIC) that includes status and activity indicator LEDs. We show that malware installed on the device can control the status LEDs by blinking and alternating colors, using documented methods or undocumented firmware commands. Information can be encoded via simple encodings such as Morse code and modulated over these optical signals. An attacker can intercept and decode these signals from tens to hundreds of meters away. We present an evaluation and discuss defensive and preventive countermeasures for this exfiltration attack.
Authored by Mordechai Guri
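As a toy illustration of the simple encodings the abstract mentions, the sketch below maps a short message onto Morse-timed on/off intervals. The set_led function is a placeholder that only prints the state; actually driving NIC status LEDs (via documented interfaces or firmware commands, as in the paper) is not shown, and all timings are assumptions.

```python
# Toy Morse-code modulator over a boolean "LED" signal (illustrative only).
import time

MORSE = {"S": "...", "O": "---"}   # minimal table, enough for the demo message
DOT = 0.2                          # assumed dot duration in seconds

def set_led(on: bool):
    # Placeholder: a real exfiltration channel would toggle the NIC status LED here.
    print("LED ON" if on else "LED OFF")

def blink(symbol):
    set_led(True)
    time.sleep(DOT if symbol == "." else 3 * DOT)   # dash lasts three dots
    set_led(False)
    time.sleep(DOT)                                  # gap between symbols

def send(message):
    for letter in message:
        for symbol in MORSE[letter]:
            blink(symbol)
        time.sleep(2 * DOT)                          # extra gap between letters

send("SOS")
```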
Cyber-security incidents in modern networks have grown significantly and become far more diverse, destructive, and disruptive. According to the 2021 Cyber Security Statistics Report [1], cybercrime is up 600% during the COVID pandemic; the top attacks include, but are not confined to, (a) sophisticated phishing emails, (b) account and DNS hijacking, (c) targeted attacks using stealth and air-gap malware, (d) distributed denial of service (DDoS), and (e) SQL injection. Additionally, 95% of cyber-security breaches result from human error, according to the Cybint Report [2], and the average time to identify a breach is 207 days, as per the Ponemon Institute and IBM 2022 Cost of a Data Breach Report [3]. Preventive controls based on cyber-security risk estimation and awareness can decrease most incidents, but not all. Furthermore, delays in incident detection and passive responses to cyber-security incidents put organizational assets at risk. The cyber-security incident management system has therefore become a vital part of organizational strategy. Thus, the authors propose a framework to converge a "Security Operation Center" (SOC) and a "Network Operations Center" (NOC) into an "Integrated Network Security Operation Center" (INSOC), to overcome cyber-threat detection and mitigation inefficiencies in near-real-time scenarios. We applied the People, Process, Technology, Governance and Compliance (PPTGC) approach to develop the INSOC conceptual framework, according to the requirements we formulated for its operation [4], [5]. The article briefly describes the INSOC conceptual framework and its usefulness, including the central role of the PPTGC approach in the framework's design.
Authored by Deepesh Shahjee, Nilesh Ware
In the context of cybersecurity systems, trust is the firm belief that a system will behave as expected. Trustworthiness is the proven property of a system that is worthy of trust. Therefore, trust is ephemeral, i.e. trust can be broken; trustworthiness is perpetual, i.e. trustworthiness is verified and cannot be broken. The gap between these two concepts is one which is, alarmingly, often overlooked. In fact, the pressure to keep pace with operations in mission-critical cross domain solution (CDS) development has resulted in a status quo of high-risk, ad hoc solutions. Trustworthiness, proven through formal verification, should be an essential property of any hardware and/or software security system. We have shown, in "vCDS: A Virtualized Cross Domain Solution Architecture", that developing a formally verified CDS is possible. The virtual CDS (vCDS) additionally comes with security guarantees, i.e. confidentiality, integrity, and availability, through the use of a formally verified trusted computing base (TCB). In order for a system, defined by an architecture description language (ADL), to be considered trustworthy, the implemented security configuration, i.e. access control and data protection models, must be verified correct. In this paper we present the first and only security auditing tool which seeks to verify the security configuration of a CDS architecture defined through an ADL description. This tool is useful in mitigating the risk of existing solutions by ensuring proper security enforcement. Furthermore, when coupled with the agile nature of vCDS, this tool significantly increases the pace of system delivery.
Authored by Nathan Daughety, Marcus Pendleton, Rebeca Perez, Shouhuai Xu, John Franco
Many organizations process and store classified data within their computer networks. Owing to the value of the data they hold, such organizations are attractive targets for adversaries. Accordingly, sensitive organizations resort to an 'air-gap' approach for their networks to ensure better protection. However, despite the physical and logical isolation, attackers have demonstrated their capabilities by compromising such networks, as the examples of Stuxnet and Agent.btz show. Such attacks were possible due to the successful manipulation of human beings. It has been observed that persistent reconnaissance of employees and the collection of their data often form the first step in building up such attacks. With the rapid integration of social media into our daily lives, the prospects for data-seekers on those platforms are greater. The inherent risks and vulnerabilities of social networking sites/apps have cultivated a rich environment for foreign adversaries to cherry-pick personal information and carry out successful profiling of employees assigned to sensitive appointments. With further targeted social engineering against the identified employees and their families, attackers extract more and more relevant data to assemble an intelligence picture. Finally, all the information is fused to design further sophisticated attacks against the air-gapped facility for data pilferage. In this regard, the adversaries' success in harvesting the personal information of victims largely depends upon the common errors committed by legitimate users while on duty, in transit, and after their retreat. Such errors will keep recurring unless they are aligned with the underlying human behaviors and weaknesses, and the requisite mitigation framework is worked out.
Authored by Rizwan Shaikh, Muhammad Khan, Imran Rashid, Haidar Abbas, Farrukh Naeem, Muhammad Siddiqi
MQTT is widely adopted by IoT devices because it allows for efficient data transfer over a variety of communication links. The security of MQTT has received increasing attention in recent years, and several studies have demonstrated that the configurations of many MQTT brokers are insecure, allowing adversaries to exploit vulnerable brokers and publish malicious messages to subscribers. However, little has been done to understand the security issues on the device side when devices handle unauthorized MQTT messages. To fill this research gap, we propose a fuzzing framework named ShadowFuzzer to find client-side vulnerabilities in the processing of incoming MQTT messages. To avoid ethical issues, ShadowFuzzer redirects traffic destined for the actual broker to a shadow broker under our control while monitoring for vulnerabilities. We select 15 IoT devices communicating with vulnerable brokers and leverage ShadowFuzzer to find vulnerabilities triggered when they parse MQTT messages. For these devices, ShadowFuzzer reports 34 zero-day vulnerabilities in 11 devices. We evaluated the exploitability of these vulnerabilities, received a total of 44,000 USD in bug bounty rewards, and have been assigned 16 CVE/CNVD/CN-NVD identifiers.
Authored by Huikai Xu, Miao Yu, Yanhao Wang, Yue Liu, Qinsheng Hou, Zhenbang Ma, Haixin Duan, Jianwei Zhuge, Baojun Liu
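As a rough illustration of the shadow-broker idea (not ShadowFuzzer itself), the sketch below publishes randomly mutated MQTT payloads to a broker under one's own control, so a device under test receives fuzzed messages without touching third-party infrastructure. The topic, seed message, and iteration count are assumptions.

```python
# Minimal MQTT payload-mutation loop against a broker you control (illustrative).
import random
import paho.mqtt.client as mqtt     # pip install paho-mqtt

BROKER = "127.0.0.1"                # the "shadow" broker under our own control
TOPIC = "device/cmd"                # hypothetical topic the device subscribes to
SEED = b'{"power":"on","level":10}' # hypothetical seed message

def mutate(data: bytes) -> bytes:
    """Flip a few random bytes of the seed message."""
    buf = bytearray(data)
    for _ in range(random.randint(1, 4)):
        buf[random.randrange(len(buf))] = random.randrange(256)
    return bytes(buf)

client = mqtt.Client()              # paho-mqtt 1.x style; 2.x also takes a CallbackAPIVersion
client.connect(BROKER, 1883, 60)
for _ in range(1000):               # fuzzing iterations
    client.publish(TOPIC, mutate(SEED), qos=0)
client.disconnect()
```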
In recent years, we have witnessed notable cyber-attacks targeting industrial automation control systems. Upgrading their cyber security is a challenge, not least due to long equipment lifetimes and legacy protocols originally designed to run in air-gapped networks. Even where multiple data sources are available and collection is established, data interpretation usable across the different data sources remains a challenge. A modern hydro power plant contains data sources that range from classical distributed control systems to newer IoT-based data sources embedded directly within the plant equipment and deeply integrated into the process. Even abundant collected data does not solve the security problems by itself: the interpretation of data semantics is limited because the data is effectively siloed. In this paper, the relevance of semantic integration of diverse data sources is presented in the context of a hydro power plant. The proposed semantic integration would increase data interoperability, unlocking the data silos and thus allowing ingestion of complementary data sources. The principal target of this data interoperability is to support data-enhanced cyber security in an operational hydro power plant context. Furthermore, opening the data silos would enable additional use of the existing data sources in a structured, semantically enriched form.
Authored by Z. Tabak, H. Keko, S. Sučić
Unmanned Aerial Vehicles (UAVs) are drawing enormous attention in both commercial and military applications to facilitate dynamic wireless communications and deliver seamless connectivity due to their flexible deployment, inherent line-of-sight (LOS) air-to-ground (A2G) channels, and high mobility. These advantages, however, render UAV-enabled wireless communication systems susceptible to eavesdropping attempts. Hence, there is a strong need to protect the wireless channel through which most UAV-enabled applications share data with each other. Various error-correction techniques exist, such as Low Density Parity Check (LDPC) and polar codes, that provide safe and reliable data transmission by exploiting the physical layer, but they require high transmission power. Also, the security gap achieved by these error-correction techniques must be reduced to improve the security level. In this paper, we present deep learning (DL) enabled punctured LDPC codes to provide secure and reliable transmission of data for UAVs through the Additive White Gaussian Noise (AWGN) channel, irrespective of the computational power and channel state information (CSI) of the eavesdropper. Numerical analysis shows that the proposed scheme effectively reduces the Bit Error Rate (BER) at Bob compared to Eve, and a Signal to Noise Ratio (SNR) per bit of 3.5 dB is achieved at the maximum BER threshold. Also, the security gap is reduced by 47.22% compared to conventional LDPC codes.
Authored by Himanshu Sharma, Neeraj Kumar, Raj Tekchandani, Nazeeruddin Mohammad
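For context, the "security gap" mentioned above is commonly defined in the physical-layer security literature as the distance, in dB, between the lowest SNR at which the legitimate receiver (Bob) still meets its reliability target and the highest SNR at which the eavesdropper (Eve) remains above the security target; the paper may use a variant of this definition.

```latex
% Common definition of the security gap (assumed here; the paper may differ):
% SNR_{B,min}: minimum SNR (dB) at which Bob achieves BER <= the reliability target
% SNR_{E,max}: maximum SNR (dB) at which Eve's BER stays >= the security target (~0.5)
S_g \;=\; \mathrm{SNR}_{B,\min} \;-\; \mathrm{SNR}_{E,\max} \quad [\mathrm{dB}]
```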
Shipboard marine radar systems are essential for safe navigation, helping seafarers perceive their surroundings by providing bearing and range estimation, object detection, and tracking. Since onboard systems have become increasingly digitized, interconnecting distributed electronics, radars have been integrated into modern bridge systems. But digitization increases the risk of cyberattacks, especially as vessels cannot be considered air-gapped, so in-depth security is crucial. However, radar systems in particular are not sufficiently protected against harmful network-level adversaries. Therefore, we ask: Can seafarers believe their eyes? In this paper, we identify possible attacks on radar communication and discuss, in an attack taxonomy, how these threaten safe vessel operation. Furthermore, we develop a holistic simulation environment with radar, complementary nautical sensors, and prototypically implemented cyberattacks from our taxonomy. Finally, leveraging this environment, we create a comprehensive dataset (RadarPWN) of radar network attacks that provides a foundation for future security research on securing marine radar communication.
Authored by Konrad Wolsing, Antoine Saillard, Jan Bauer, Eric Wagner, Christian van Sloun, Ina Fink, Mari Schmidt, Klaus Wehrle, Martin Henze
As the effects of climate change become more and more evident, the importance of improved situation awareness is also gaining attention, both in the context of preventive environmental monitoring and in the context of acute crisis response. One important aspect of situation awareness is the correct and thorough monitoring of air pollutants. This monitoring is threatened by sensor faults, power or network failures, and other hazards leading to missing or incorrect data. For this reason, in this work we propose two complementary approaches for predicting missing sensor data and a combined technique for detecting outliers. The proposed solution can enhance the performance of low-cost sensor systems, closing the gap of missing measurements due to network unavailability and detecting drift and outliers, thus paving the way to its use as an alert system for reportable events. The techniques have also been deployed and tested in a low-power microcontroller environment, verifying that such computing power is sufficient to perform the inference locally and leading the way to an edge implementation of a virtual sensor digital twin.
Authored by Martina Rasch, Antonio Martino, Mario Drobics, Massimo Merenda
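A minimal sketch, assuming a pandas time series of PM2.5 readings, of the kind of gap filling and outlier flagging the abstract describes; it is not the paper's virtual-sensor method. The file name, column names, sampling rate, and thresholds are all assumptions.

```python
# Illustrative gap filling and rolling-median outlier flagging (not the paper's method).
import pandas as pd

# Hypothetical input: timestamped PM2.5 readings from a low-cost sensor.
df = pd.read_csv("pm25.csv", parse_dates=["timestamp"], index_col="timestamp")
series = df["pm25"].asfreq("10min")                 # regular grid exposes missing samples

filled = series.interpolate(method="time", limit=6) # bridge gaps of up to one hour

rolling_med = filled.rolling(12, center=True, min_periods=1).median()
residual = (filled - rolling_med).abs()
outliers = residual > 3 * residual.std()            # simple 3-sigma-style rule on residuals

print(f"{series.isna().sum()} missing samples, {int(outliers.sum())} flagged outliers")
```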
We introduce AdaMix, an adaptive differentially private algorithm for training deep neural network classifiers using both private and public image data. While pre-training language models on large public datasets has enabled strong differential privacy (DP) guarantees with minor loss of accuracy, a similar practice yields punishing trade-offs in vision tasks. A few-shot or even zero-shot learning baseline that ignores private data can outperform fine-tuning on a large private dataset. AdaMix incorporates few-shot training, or cross-modal zero-shot learning, on public data prior to private fine-tuning, to improve the trade-off. AdaMix reduces the error increase over the non-private upper bound from the baseline's 167–311%, on average across 6 datasets, to 68–92%, depending on the privacy level selected by the user. AdaMix tackles the trade-off arising in visual classification, whereby the most privacy-sensitive data, corresponding to isolated points in representation space, are also critical for high classification accuracy. In addition, AdaMix comes with strong theoretical privacy guarantees and convergence analysis.
Authored by Aditya Golatkar, Alessandro Achille, Yu-Xiang Wang, Aaron Roth, Michael Kearns, Stefano Soatto
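For readers unfamiliar with private fine-tuning, the sketch below shows a generic DP-SGD-style step (per-example gradient clipping plus Gaussian noise), the standard mechanism such guarantees rest on. It is not AdaMix itself and omits the public-data pre-training and adaptive elements; the model, data, and hyperparameters are placeholders.

```python
# Generic DP-SGD-style update: clip each example's gradient, add Gaussian noise.
import torch
import torch.nn.functional as F

CLIP, SIGMA, LR = 1.0, 1.0, 0.05    # clipping norm, noise multiplier, learning rate (assumed)

def dp_sgd_step(model, batch_x, batch_y):
    """batch_x: (N, ...) float tensor; batch_y: (N,) long tensor of class labels."""
    params = [p for p in model.parameters() if p.requires_grad]
    accum = [torch.zeros_like(p) for p in params]
    for x, y in zip(batch_x, batch_y):              # per-example gradients
        loss = F.cross_entropy(model(x.unsqueeze(0)), y.unsqueeze(0))
        grads = torch.autograd.grad(loss, params)
        norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scale = (CLIP / (norm + 1e-12)).clamp(max=1.0)   # clip to norm <= CLIP
        for a, g in zip(accum, grads):
            a += g * scale
    n = len(batch_x)
    with torch.no_grad():
        for p, a in zip(params, accum):
            noise = torch.randn_like(a) * SIGMA * CLIP    # Gaussian mechanism
            p -= LR * (a + noise) / n
```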
This paper proposes a novel approach for privacy-preserving face recognition aimed at formally defining a trade-off optimization criterion between data privacy and algorithm accuracy. In our methodology, real-world face images are anonymized with Gaussian blurring for privacy preservation. The anonymized images are processed for face detection, face alignment, face representation, and face verification. The proposed methodology has been validated with a set of experiments on a well-known dataset and three face recognition classifiers. The results demonstrate the effectiveness of our approach in correctly verifying face images at different levels of privacy and resulting accuracy, and in maximizing privacy with the least negative impact on face detection and face verification accuracy.
Authored by Wisam Abbasi, Paolo Mori, Andrea Saracino, Valerio Frascolla
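A small sketch of the anonymization step described above: detect faces and apply Gaussian blur to each face region. The detector choice, kernel size, and sigma are assumptions; the paper's contribution is in how such blur levels trade off against detection and verification accuracy, which this snippet does not reproduce.

```python
# Gaussian-blur face anonymization sketch using OpenCV (illustrative parameters).
import cv2

img = cv2.imread("input.jpg")                        # hypothetical input image
detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
for (x, y, w, h) in detector.detectMultiScale(gray, 1.1, 5):
    roi = img[y:y + h, x:x + w]
    # Larger kernel / sigma => stronger anonymization, lower downstream accuracy.
    img[y:y + h, x:x + w] = cv2.GaussianBlur(roi, (51, 51), 15)
cv2.imwrite("anonymized.jpg", img)
```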
Modern consumer electronic devices often provide intelligence services with deep neural networks. We have started migrating the computing locations of intelligence services from cloud servers (traditional AI systems) to the corresponding devices (on-device AI systems). On-device AI systems generally have the advantages of preserving privacy, removing network latency, and saving cloud costs. With the emergence of on-device AI systems having relatively low computing power, the inconsistent and varying hardware resources and capabilities pose difficulties. The authors' affiliation has started applying a stream pipeline framework, NNStreamer, to on-device AI systems, saving development costs and hardware resources and improving performance. We want to expand the types of devices and applications offering on-device AI services, in products of both the affiliation and second/third parties. We also want to make each AI service atomic, re-deployable, and shareable among connected devices of arbitrary vendors; as always happens, yet another requirement has now been introduced. The new requirement of “among-device AI” includes connectivity between AI pipelines so that they may share computing resources and hardware capabilities across a wide range of devices regardless of vendors and manufacturers. We propose extensions of the stream pipeline framework, NNStreamer, for on-device AI so that NNStreamer may provide among-device AI capability. This work is a Linux Foundation (LF AI & Data) open source project accepting contributions from the general public.
Authored by MyungJoo Ham, Sangjung Woo, Jaeyun Jung, Wook Song, Gichan Jang, Yongjoo Ahn, Hyoungjoo Ahn
Throughout history, technological evolution has generated less desirable side effects that impact society. In the field of IT&C, there are ongoing discussions about the role of robots in the economy, and also about their impact on the labour market. In the case of digital media systems, we talk about misinformation, manipulation, fake news, and so on. Issues related to protecting citizens' lives in the face of technology were first raised more than 25 years ago. In addition to the many messages, such as “the citizen is at the center of concern” or “privacy must be respected”, transmitted through various channels by different entities and companies in the ICT field, the EU has promoted a number of legislative and normative documents to protect citizens' rights and freedoms.
Authored by Doina Banciu, Carmen Cîrnu
With 3.78 billion social media users worldwide in 2021 (48% of the human population), almost 3 billion images are shared daily. At the same time, the steady evolution of smartphone cameras has led to a photography explosion, with 85% of all new pictures being captured using smartphones. Lately, however, there has been increased discussion of privacy concerns when a person being photographed is unaware of the picture being taken or has reservations about it being shared. These privacy violations are amplified for people with disabilities, who may find it challenging to raise dissent even if they are aware. Such unauthorized image captures may also be misused by third-party organizations to gain sympathy, leading to a privacy breach. Privacy for people with disabilities has so far received comparatively little attention from the AI community. This motivates us to work towards a solution that generates privacy-conscious cues to raise smartphone users' awareness of any sensitivity in their viewfinder content. To this end, we introduce PrivPAS (a real-time Privacy-Preserving AI System), a novel framework to identify sensitive content. Additionally, we curate and annotate a dataset to identify and localize accessibility markers and to classify whether an image is sensitive to a featured subject with a disability. We demonstrate that the proposed lightweight architecture, with a memory footprint of a mere 8.49 MB, achieves a high mAP of 89.52% on resource-constrained devices. Furthermore, our pipeline, trained on face-anonymized data, achieves an F1-score of 73.1%.
Authored by Harichandana S, Vibhav Agarwal, Sourav Ghosh, Gopi Ramena, Sumit Kumar, Barath Raja
Blockchain and artificial intelligence are two technologies that, when combined, have the ability to help each other realize their full potential. Blockchains can guarantee accessibility and consistent access to integrity-safeguarded big data sets from numerous domains, allowing AI systems to learn more effectively and thoroughly. Similarly, artificial intelligence (AI) can be used to offer new consensus processes, and hence new methods of engaging with blockchains. When it comes to sensitive data, such as corporate, healthcare, and financial data, various security and privacy problems arise that must be properly evaluated. Interaction with blockchains raises issues of data credibility checking, transactional data leakage, compliance with data protection rules, on-chain data privacy, and malicious smart contracts. To address these issues, new security and privacy-preserving technologies are being developed. Blockchain data processing that is either based on AI or used to defend AI-based blockchain data processing is emerging to simplify the integration of these two cutting-edge technologies.
Authored by Ramiz Salama, Fadi Al-Turjman
This paper focuses on systems over wireless sensor networks that are linear, operate in discrete time, and are time-varying, known as discrete-time linear time-varying systems (DLTVS). DLTVS are vulnerable to network attacks when exchanging information between sensors in the network, which puts their security at risk. A privacy-preserving DLTVS is designed for this purpose. A set-membership estimator is designed by adding privacy noise obeying the Laplace distribution to the state at the initial moment. Simultaneously, the differential privacy of the system is analyzed. On this basis, the real state of the system and the form of the estimator for the desired distribution are analyzed. Finally, simulation examples are given, which show that the model with differential privacy added can obtain accurate estimates and ensure the security of the system state.
Authored by Xuefeng Yang, Li Liu, Yinggang Zhang, Yihao Li, Pan Liu, Shili Ai
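The additive noise the abstract describes follows the classic Laplace mechanism: perturb the initial state with zero-mean Laplace noise whose scale is set by sensitivity/epsilon. A minimal sketch is below; the state vector, sensitivity, and epsilon are illustrative assumptions, and the set-membership estimator itself is not reproduced.

```python
# Laplace-mechanism perturbation of an initial state (illustrative values).
import numpy as np

def laplace_perturb(x0, sensitivity, epsilon, rng=np.random.default_rng(0)):
    """Return x0 + Lap(0, sensitivity/epsilon) noise applied element-wise."""
    scale = sensitivity / epsilon
    return x0 + rng.laplace(loc=0.0, scale=scale, size=np.shape(x0))

x0 = np.array([1.0, -0.5, 2.0])     # hypothetical initial state of the system
x0_private = laplace_perturb(x0, sensitivity=1.0, epsilon=0.5)
print(x0_private)
```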
Research done in facial privacy so far has established the scope of gleaning race, age, and gender from a human's facial image, as these are classifiable and compliant biometric attributes. Noticeable distortions, morphing, and face-swapping are some of the techniques that have been researched to restore consumers' privacy. By fooling face recognition models, these techniques cater superficially to the needs of user privacy; however, the presence of visible manipulations negatively affects the aesthetics of the image. The objective of this work is to highlight common adversarial techniques that can be used to introduce granular pixel distortions, using white-box and black-box perturbation algorithms, that ensure the privacy of users' sensitive or personal data in face images, fooling AI facial recognition models while maintaining the aesthetics and visual integrity of the image.
Authored by Nishchal Jagadeesha
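One well-known white-box perturbation technique of the kind the abstract alludes to is the fast gradient sign method (FGSM), sketched below; the model, preprocessing, and epsilon are placeholders, and the work itself compares several white-box and black-box algorithms rather than this specific recipe.

```python
# FGSM-style pixel perturbation sketch (illustrative; not the paper's exact method).
import torch
import torch.nn.functional as F

def fgsm_perturb(model, image, true_label, epsilon=0.01):
    """Nudge the image by epsilon * sign(gradient) to degrade a recognition model."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image.unsqueeze(0)), torch.tensor([true_label]))
    loss.backward()
    adversarial = image + epsilon * image.grad.sign()   # granular, low-visibility distortion
    return adversarial.clamp(0, 1).detach()
```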
The integration of the Internet-of-Vehicles (IoV) and fog computing benefits from cooperative computing and analysis of environmental data while avoiding network congestion and latency. However, when private data is shared across fog nodes or the cloud, privacy issues arise that limit the effectiveness of IoV systems and put drivers' safety at risk. To address this problem, we propose a framework called PPIoV, which is based on Federated Learning (FL) and blockchain technologies to preserve the privacy of vehicles in IoV. Typical machine learning methods are not well suited for distributed and highly dynamic systems like IoV since they train on data with local features. Therefore, we use FL to train the global model while preserving privacy. Also, our approach is built on a scheme that evaluates the reliability of vehicles participating in the FL training process. Moreover, PPIoV is built on blockchain to establish trust across multiple communication nodes. For example, when the locally learned model updates from the vehicles and fog nodes are communicated to the cloud to update the global learned model, all transactions take place on the blockchain. The outcome of our experimental study shows that the proposed method improves the global model's accuracy by allowing only reputable vehicles to update the global model.
Authored by Jamal Alotaibi, Lubna Alazzawi
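One plausible reading of the vehicle-reliability scheme mentioned above is a reputation-weighted variant of federated averaging, sketched below as an assumption; the paper's exact reliability rule and the blockchain bookkeeping are not reproduced.

```python
# Hypothetical reputation-weighted federated averaging (illustrative only).
import numpy as np

def aggregate(updates, reputations):
    """updates: list of {param_name: np.ndarray}; reputations: list of non-negative scores."""
    w = np.asarray(reputations, dtype=float)
    w = w / w.sum()                                  # normalize reputation scores
    return {name: sum(wi * u[name] for wi, u in zip(w, updates))
            for name in updates[0]}

clients = [{"w": np.ones(3)}, {"w": 2 * np.ones(3)}, {"w": 10 * np.ones(3)}]
# A low-reputation (possibly unreliable) client contributes little to the global model.
print(aggregate(clients, reputations=[0.9, 0.8, 0.1]))
```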
With the development of artificial intelligence, the need for data sharing is becoming more and more urgent, yet existing data sharing methods can no longer fully meet these needs. Privacy breaches, lack of motivation, and mutual distrust have become obstacles to data sharing. We design a privacy-preserving, decentralized data sharing method based on blockchain smart contracts, named PPDS. To protect data privacy, we transform the data sharing problem into a model sharing problem: the data owner does not directly share the raw data, but rather an AI model trained on that data. The data requester and the data owner interact on the blockchain through a smart contract, and the data owner trains the model on local data according to the requester's requirements. To fairly assess model quality, we set up several model evaluators that assess the validity of the model through voting. After the model is verified, the data owner who trained the model receives a reward through a smart contract. Sharing the model avoids direct exposure of the raw data, and the reasonable incentive motivates the data owner to share it. We describe the design and workflow of PPDS and analyze its security using formal verification technology; that is, we use Coloured Petri Nets (CPN) to build a formal model of our approach, proving its security through simulation execution and model checking. Finally, we demonstrate the effectiveness of PPDS by developing a prototype with a corresponding case application.
Authored by Xuesong Hai, Jing Liu
Nowadays, IoT networks and devices exist in our everyday life, capturing and carrying unlimited data. However, the increasing penetration of connected systems and devices implies rising cybersecurity threats, with IoT systems suffering from network attacks. Artificial Intelligence (AI) and Machine Learning take advantage of huge volumes of IoT network logs to enhance cybersecurity in IoT. However, these data often need to remain private. Federated Learning (FL) provides a potential solution, enabling collaborative training of an attack detection model among a set of federated nodes while preserving privacy, as data remain local and are never disclosed or processed on central servers. While FL is resilient and resolves, up to a point, data governance and ownership issues, it does not guarantee security and privacy by design. Adversaries could interfere with the communication process, expose network vulnerabilities, and manipulate the training process, thus affecting the performance of the trained model. In this paper, we present a federated learning model which can successfully detect network attacks in IoT systems. Moreover, we evaluate its performance under various settings of differential privacy as a privacy-preserving technique and various configurations of the participating nodes. We prove that the proposed model protects privacy without actually compromising performance: it incurs a limited performance impact of only ∼7% lower testing accuracy compared to the baseline while simultaneously guaranteeing security and applicability.
Authored by Zacharias Anastasakis, Konstantinos Psychogyios, Terpsi Velivassaki, Stavroula Bourou, Artemis Voulkidis, Dimitrios Skias, Antonis Gonos, Theodore Zahariadis
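A minimal sketch of the differential-privacy setting evaluated above: each federated node clips its model update and adds Gaussian noise before sharing it with the aggregator. The clipping norm and noise scale are illustrative assumptions, not the paper's configuration.

```python
# Clip-and-noise a local model update before sharing it (illustrative parameters).
import numpy as np

def privatize_update(update, clip_norm=1.0, noise_std=0.1, rng=np.random.default_rng(0)):
    flat = np.concatenate([v.ravel() for v in update.values()])
    scale = min(1.0, clip_norm / (np.linalg.norm(flat) + 1e-12))  # clip the whole update
    return {k: v * scale + rng.normal(0.0, noise_std, size=v.shape)
            for k, v in update.items()}

local_update = {"layer1": np.random.randn(4, 4), "bias": np.random.randn(4)}
print(privatize_update(local_update)["bias"])
```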
Graph-based Semi-Supervised Learning (GSSL) is a practical solution for learning from a limited amount of labelled data together with a vast amount of unlabelled data. However, because these algorithms rely on the known labels to infer the unknown labels, they are sensitive to data quality. It is therefore essential to study the potential threats related to the labelled data, more specifically, label poisoning. In this paper, we propose a novel data poisoning method which efficiently approximates the result of label inference to identify the inputs which, if poisoned, would produce the highest number of incorrectly inferred labels. We extensively evaluate our approach on three classification problems under 24 different experimental settings each. Compared to the state of the art, our influence-driven attack produces an average error-rate increase that is 50% higher, while being faster by multiple orders of magnitude. Moreover, our method can inform engineers of inputs that deserve investigation (relabelling them) before training the learning model. We show that relabelling one-third of the poisoned inputs (selected based on their influence) reduces the poisoning effect by 50%.
Authored by Adriano Franci, Maxime Cordy, Martin Gubri, Mike Papadakis, Yves Le Traon
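As a toy demonstration of why GSSL is sensitive to label poisoning, the sketch below trains scikit-learn's LabelPropagation (standing in for a GSSL model) with a handful of labels and then flips some of them; the paper's influence-driven selection of which inputs to poison is not reproduced, and the dataset and counts are assumptions.

```python
# Label-flipping effect on graph-based semi-supervised learning (illustrative).
import numpy as np
from sklearn.datasets import make_moons
from sklearn.semi_supervised import LabelPropagation

X, y = make_moons(n_samples=300, noise=0.1, random_state=0)
labels = np.full_like(y, -1)                       # -1 marks unlabelled points
labeled_idx = np.random.default_rng(0).choice(len(y), size=30, replace=False)
labels[labeled_idx] = y[labeled_idx]

clean_acc = LabelPropagation().fit(X, labels).score(X, y)

poisoned = labels.copy()
poisoned[labeled_idx[:10]] = 1 - poisoned[labeled_idx[:10]]   # flip a third of the known labels
poisoned_acc = LabelPropagation().fit(X, poisoned).score(X, y)
print(f"accuracy: clean {clean_acc:.2f} -> poisoned {poisoned_acc:.2f}")
```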
Recent trends in the convergence of edge computing and artificial intelligence (AI) have led to the new paradigm of “edge intelligence”, which is more vulnerable to attacks such as data and model poisoning and evasion attacks. This paper proposes a white-box poisoning attack against online regression models in an edge intelligence environment, with the aim of informing future protection methods. First, the method selects data points with maximum loss from the original stream using two selection strategies; second, it pollutes these points with a gradient-ascent strategy; finally, it injects the polluted points into the original stream sent to the target model, completing the attack. We extensively evaluate the proposed attack on an open dataset; the results demonstrate the effectiveness of the novel attack method and the real implications of poisoning attacks in a case-study electric energy prediction application.
Authored by Yanxu Zhu, Hong Wen, Peng Zhang, Wen Han, Fan Sun, Jia Jia
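A minimal sketch of the gradient-ascent pollution step on a single selected sample of a linear regression stream: the point is nudged in the direction that increases the target model's loss. The toy weights, step size, and iteration count are assumptions, and the paper's selection strategies are not reproduced.

```python
# Gradient-ascent poisoning of one regression sample (illustrative toy example).
import numpy as np

rng = np.random.default_rng(0)
w = np.array([2.0, -1.0])                    # current weights of the target linear model

def loss_grad_x(x, y, w):
    """Gradient of the squared error (w.x - y)^2 with respect to the input x."""
    return 2 * (w @ x - y) * w

x, y = rng.normal(size=2), 0.5               # sample selected for poisoning
for _ in range(20):                          # gradient *ascent* on the loss
    x = x + 0.1 * loss_grad_x(x, y, w)

print("poisoned point:", x, "loss:", (w @ x - y) ** 2)
```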
With the widespread deployment of data-driven services, the demand for data volumes continues to grow. At present, many applications lack reliable human supervision during data collection, so the collected data may contain low-quality or even malicious data. Such low-quality or malicious data exposes AI systems to serious security challenges. One of the main security threats in the training phase of machine learning is data poisoning attacks, which compromise model integrity by contaminating training data to make the resulting model skewed or unusable. This paper reviews the relevant research on data poisoning attacks in various task environments: first, the classification of attacks is summarized; then, defense methods against data poisoning attacks are surveyed; and finally, possible future research directions are discussed.
Authored by Jiaxin Fan, Qi Yan, Mohan Li, Guanqun Qu, Yang Xiao