Recommenders are central in many applications today. The most effective recommendation schemes, such as those based on collaborative filtering (CF), exploit similarities between user profiles to make recommendations, but potentially expose private data. Federated learning and decentralized learning systems address this by letting the data stay on user's machines to preserve privacy: each user performs the training on local data and only the model parameters are shared. However, sharing the model parameters across the network may still yield privacy breaches. In this paper, we present Rex, the first enclave-based decentralized CF recommender. Rex exploits Trusted execution environments (TEE), such as Intel software guard extensions (SGX), that provide shielded environments within the processor to improve convergence while preserving privacy. Firstly, Rex enables raw data sharing, which ultimately speeds up convergence and reduces the network load. Secondly, Rex fully preserves privacy. We analyze the impact of raw data sharing in both deep neural network (DNN) and matrix factorization (MF) recommenders and showcase the benefits of trusted environments in a full-fledged implementation of Rex. Our experimental results demonstrate that through raw data sharing, Rex significantly decreases the training time by 18.3 x and the network load by 2 orders of magnitude over standard decentralized approaches that share only parameters, while fully protecting privacy by leveraging trustworthy hardware enclaves with very little overhead.
Authored by Akash Dhasade, Nevena Dresevic, Anne-Marie Kermarrec, Rafael Pires
Based on the analysis of material performance data management requirements, a network-sharing scheme of material performance data is proposed. A material performance database system including material performance data collection, data query, data analysis, data visualization, data security management and control modules is designed to solve the problems of existing material performance database network sharing, data fusion and multidisciplinary support, and intelligent services Inadequate standardization and data security control. This paper adopts hierarchical access control strategy. After logging into the material performance database system, users can standardize the material performance data and store them to form a shared material performance database. The standardized material performance data of the database system shall be queried and shared under control according to the authority. Then, the database system compares and analyzes the material performance data obtained from controlled query sharing. Finally, the database system visualizes the shared results of controlled queries and the comparative analysis results obtained. The database system adopts the MVC architecture based on B/S (client/server) cross platform J2EE. The Third-party computing platforms are integrated in System. Users can easily use material performance data and related services through browsers and networks. MongoDB database is used for data storage, supporting distributed storage and efficient query.
Authored by Cuifang Zheng, Jiaju Wu, Linggang Kong, Shijia Kang, Zheng Cheng, Bin Luo
In recent years, body-worn RFID and NFC (near field communication) devices have become one of the principal technologies concurring to the rise of healthcare internet of thing (H-IoT) systems. Similarly, points of care (PoCs) moved increasingly closer to patients to reduce the costs while supporting precision medicine and improving chronic illness management, thanks to timely and frequent feedback from the patients themselves. A typical PoC involves medical sensing devices capable of sampling human health, personal equipment with communications and computing capabilities (smartphone or tablet) and a secure software environment for data transmission to medical centers. Hybrid platforms simultaneously employing NFC and ultra-high frequency (UHF) RFID could be successfully developed for the first sensing layer. An application example of the proposed hybrid system for the monitoring of acute myocardial infarction (AMI) survivors details how the combined use of NFC and UHF-RFID in the same PoC can support the multifaceted need of AMI survivors while protecting the sensitive data on the patient’s health.
Authored by Giulio Bianco, Emanuele Raso, Luca Fiore, Alessia Riente, Adina Barba, Carolina Miozzi, Lorenzo Bracciale, Fabiana Arduini, Pierpaolo Loreti, Gaetano Marrocco, Cecilia Occhiuzzi
The selection of distribution network faults is of great significance to accurately identify the fault location, quickly restore power and improve the reliability of power supply. This paper mainly studies the fault phase selection method of distribution network based on wavelet singular entropy and deep belief network (DBN). Firstly, the basic principles of wavelet singular entropy and DBN are analyzed, and on this basis, the DBN model of distribution network fault phase selection is proposed. Firstly, the transient fault current data of the distribution network is processed to obtain the wavelet singular entropy of the three phases, which is used as the input of the fault phase selection model; then the DBN network is improved, and an artificial neural network (ANN) is introduced to make it a fault Select the phase classifier, and specify the output label; finally, use Simulink to build a simulation model of the IEEE33 node distribution network system, obtain a large amount of data of various fault types, generate a training sample library and a test sample library, and analyze the neural network. The adjustment of the structure and the training of the parameters complete the construction of the DBN model for the fault phase selection of the distribution network.
Authored by Jinliang You, Di Zhang, Qingwu Gong, Jiran Zhu, Haiguo Tang, Wei Deng, Tong Kang
Automatic speech recognition (ASR) models are used widely in applications for voice navigation and voice control of domestic appliances. ASRs have been misused by attackers to generate malicious outputs by attacking the deep learning component within ASRs. To assess the security and robustnesss of ASRs, we propose techniques within our framework SPAT that generate blackbox (agnostic to the DNN) adversarial attacks that are portable across ASRs. This is in contrast to existing work that focuses on whitebox attacks that are time consuming and lack portability. Our techniques generate adversarial attacks that have no human audible difference by manipulating the input speech signal using a psychoacoustic model that maintains the audio perturbations below the thresholds of human perception. We propose a framework SPAT with three attack generation techniques based on the psychoacoustic concept and frame selection techniques to selectively target the attack. We evaluate portability and effectiveness of our techniques using three popular ASRs and two input audio datasets using the metrics- Word Error Rate (WER) of output transcription, Similarity to original audio, attack Success Rate on different ASRs and Detection score by a defense system. We found our adversarial attacks were portable across ASRs, not easily detected by a state-of the-art defense system, and had significant difference in output transcriptions while sounding similar to original audio.
Authored by Xiaoliang Wu, Ajitha Rajan
Intelligent, smart, Cloud, reconfigurable manufac-turing, and remote monitoring, all intersect in modern industry and mark the path toward more efficient, effective, and sustain-able factories. Many obstacles are found along the path, including legacy machineries and technologies, security issues, and software that is often hard, slow, and expensive to adapt to face unforeseen challenges and needs in this fast-changing ecosystem. Light-weight, portable, loosely coupled, easily monitored, variegated software components, supporting Edge, Fog and Cloud computing, that can be (re)created, (re)configured and operated from remote through Web requests in a matter of milliseconds, and that rely on libraries of ready-to-use tasks also extendable from remote through sub-second Web requests, constitute a fertile technological ground on top of which fourth-generation industries can be built. In this demo it will be shown how starting from a completely virgin Docker Engine, it is possible to build, configure, destroy, rebuild, operate, exclusively from remote, exclusively via API calls, computation networks that are capable to (i) raise alerts based on configured thresholds or trained ML models, (ii) transform Big Data streams, (iii) produce and persist Big Datasets on the Cloud, (iv) train and persist ML models on the Cloud, (v) use trained models for one-shot or stream predictions, (vi) produce tabular visualizations, line plots, pie charts, histograms, at real-time, from Big Data streams. Also, it will be shown how easily such computation networks can be upgraded with new functionalities at real-time, from remote, via API calls.
Authored by Mirco Soderi, Vignesh Kamath, John Breslin
Large capacity, fast-paced, diversified and high-value data are becoming a hotbed of data processing and research. Privacy security protection based on data life cycle is a method to protect privacy. It is used to protect the confidentiality, integrity and availability of personal data and prevent unauthorized access or use. The main advantage of using this method is that it can fully control all aspects related to the information system and its users. With the opening of the cloud, attackers use the cloud to recalculate and analyze big data that may infringe on others' privacy. Privacy protection based on data life cycle is a means of privacy protection based on the whole process of data production, collection, storage and use. This approach involves all stages from the creation of personal information by individuals (e.g. by filling out forms online or at work) to destruction after use for the intended purpose (e.g. deleting records). Privacy security based on the data life cycle ensures that any personal information collected is used only for the purpose of initial collection and destroyed as soon as possible.
Authored by Hongjun Zhang, Shuyan Cheng, Qingyuan Cai, Xiao Jiang
This paper designs a network security protection system based on artificial intelligence technology from two aspects of hardware and software. The system can simultaneously collect Internet public data and secret-related data inside the unit, and encrypt it through the TCM chip solidified in the hardware to ensure that only designated machines can read secret-related materials. The data edge-cloud collaborative acquisition architecture based on chip encryption can realize the cross-network transmission of confidential data. At the same time, this paper proposes an edge-cloud collaborative information security protection method for industrial control systems by combining end-address hopping and load balancing algorithms. Finally, using WinCC, Unity3D, MySQL and other development environments comprehensively, the feasibility and effectiveness of the system are verified by experiments.
Authored by Xiuyun Lu, Wenxing Zhao, Yuquan Zhu
The Internet of Vehicles (IoVs) performs the rapid expansion of connected devices. This massive number of devices is constantly generating a massive and near-real-time data stream for numerous applications, which is known as big data. Analyzing such big data to find, predict, and control decisions is a critical solution for IoVs to enhance service quality and experience. Thus, the main goal of this paper is to study the impact of big data analytics on traffic prediction in IoVs. In which we have used big data analytics steps to predict the traffic flow, and based on different deep neural models such as LSTM, CNN-LSTM, and GRU. The models are validated using evaluation metrics, MAE, MSE, RMSE, and R2. Hence, a case study based on a real-world road is used to implement and test the efficiency of the traffic prediction models.
Authored by Hakima Khelifi, Amani Belouahri
The Software Defined Networking (SDN) is a solution for Data Center Networks (DCN). This solution offers a centralized control that helps to simplify the management and reduce the big data issues of storage management and data analysis. This paper investigates the performance of deploying an SDN controller in DCN. The paper considers the network topology with a different number of hosts using the Mininet emulator. The paper evaluates the performance of DCN based on Python SDN controllers with a different number of hosts. This evaluation compares POX and RYU controllers as DCN solutions using the throughput, delay, overhead, and convergence time. The results show that the POX outperforms the RYU controller and is the best choice for DCN.
Authored by Jellalah Alzarog, Abdalwart Almhishi, Abubaker Alsunousi, Tareg Abulifa, Wisam Eltarjaman, Salem Sati
Code optimization is an essential feature for compilers and almost all software products are released by compiler optimizations. Consequently, bugs in code optimization will inevitably cast significant impact on the correctness of software systems. Locating optimization bugs in compilers is challenging as compilers typically support a large amount of optimization configurations. Although prior studies have proposed to locate compiler bugs via generating witness test programs, they are still time-consuming and not effective enough. To address such limitations, we propose an automatic bug localization approach, ODFL, for locating compiler optimization bugs via differentiating finer-grained options in this study. Specifically, we first disable the fine-grained options that are enabled by default under the bug-triggering optimization levels independently to obtain bug-free and bug-related fine-grained options. We then configure several effective passing and failing optimization sequences based on such fine-grained options to obtain multiple failing and passing compiler coverage. Finally, such generated coverage information can be utilized via Spectrum-Based Fault Localization formulae to rank the suspicious compiler files. We run ODFL on 60 buggy GCC compilers from an existing benchmark. The experimental results show that ODFL significantly outperforms the state-of-the-art compiler bug isolation approach RecBi in terms of all the evaluated metrics, demonstrating the effectiveness of ODFL. In addition, ODFL is much more efficient than RecBi as it can save more than 88% of the time for locating bugs on average.
Authored by Jing Yang, Yibiao Yang, Maolin Sun, Ming Wen, Yuming Zhou, Hai Jin
The increased dissemination of open source software to a broader audience has led to a proportional increase in the dissemination of vulnerabilities. These vulnerabilities are introduced by developers, some intentionally or negligently. In this paper, we work to quantity the relative risk that a given developer represents to a software project. We propose using empirical software engineering based analysis on the vast data made available by GitHub to create a Developer Risk Score (DRS) for prolific contributors on GitHub. The DRS can then be aggregated across a project as a derived vulnerability assessment, we call this the Computational Vulnerability Assessment Score (CVAS). The CVAS represents the correlation between the Developer Risk score across projects and vulnerabilities attributed to those projects. We believe this to be a contribution in trying to quantity risk introduced by specific developers across open source projects. Both of the risk scores, those for contributors and projects, are derived from an amalgamation of data, both from GitHub and outside GitHub. We seek to provide this risk metric as a force multiplier for the project maintainers that are responsible for reviewing code contributions. We hope this will lead to a reduction in the number of introduced vulnerabilities for projects in the Open Source ecosystem.
Authored by Jon Chapman, Hari Venugopalan
Short-packet communication is a key enabler of various Internet of Things applications that require higher-level security. This proposal briefly reviews block orthogonal sparse superposition (BOSS) codes, which are applicable for secure short-packet transmissions. In addition, following the IEEE 802.11a Wi-Fi standards, we demonstrate the real-time performance of secure short packet transmission using a software-defined radio testbed to verify the feasibility of BOSS codes in a multi-path fading channel environment.
Authored by Bowhyung Lee, Donghwa Han, Namyoon Lee
CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) is an important security technique designed to deter bots from abusing software systems, which has broader applications in cyberspace. CAPTCHAs come in a variety of forms, including the deciphering of obfuscated text, transcribing of audio messages, and tracking mouse movement, among others. This paper focuses on using deep learning techniques to recognize text-based CAPTCHAs. In particular, our work focuses on generating training datasets using different CAPTCHA schemes, along with a pre-processing technique allowing for character-based recognition. We have encapsulated the CRABI (CAPTCHA Recognition with Attached Binary Images) framework to give an image multiple labels for improvement in feature extraction. Using real-world datasets, performance evaluations are conducted to validate the efficacy of our proposed approach on several neural network architectures (e.g., custom CNN architecture, VGG16, ResNet50, and MobileNet). The experimental results confirm that over 90% accuracy can be achieved on most models.
Authored by Turhan Kimbrough, Pu Tian, Weixian Liao, Erik Blasch, Wei Yu
Chaos is an interesting phenomenon for nonlinear systems that emerges due to its complex and unpredictable behavior. With the escalated use of low-powered edge-compute devices, data security at the edge develops the need for security in communication. The characteristic that Chaos synchronizes over time for two different chaotic systems with their own unique initial conditions, is the base for chaos implementation in communication. This paper proposes an encryption architecture suitable for communication of on-chip sensors to provide a POC (proof of concept) with security encrypted on the same chip using different chaotic equations. In communication, encryption is achieved with the help of microcontrollers or software implementations that use more power and have complex hardware implementation. The small IoT devices are expected to be operated on low power and constrained with size. At the same time, these devices are highly vulnerable to security threats, which elevates the need to have low power/size hardware-based security. Since the discovery of chaotic equations, they have been used in various encryption applications. The goal of this research is to take the chaotic implementation to the CMOS level with the sensors on the same chip. The hardware co-simulation is demonstrated on an FPGA board for Chua encryption/decryption architecture. The hardware utilization for Lorenz, SprottD, and Chua on FPGA is achieved with Xilinx System Generation (XSG) toolbox which reveals that Lorenz’s utilization is 9% lesser than Chua’s.
Authored by Ravi Monani, Brian Rogers, Amin Rezaei, Ava Hedayatipour
Requirement Elicitation is a key phase in software development. The fundamental goal of security requirement elicitation is to gather appropriate security needs and policies from stakeholders or organizations. The majority of systems fail due to incorrect elicitation procedures, affecting development time and cost. Security requirement elicitation is a major activity of requirement engineering that requires the attention of developers and other stakeholders. To produce quality requirements during software development, the authors suggested a methodology for effective requirement elicitation. Many challenges surround requirement engineering. These concerns can be connected to scope, preconceptions in requirements, etc. Other difficulties include user confusion over technological specifics, leading to confusing system aims. They also don't realize that the requirements are dynamic and prone to change. To protect the privacy of medical images, the proposed image cryptosystem uses a CCM-generated chaotic key series to confuse and diffuse them. A hexadecimal pre-processing technique is used to increase the security of color images utilising a hyper chaos-based image cryptosystem. Finally, a double-layered security system for biometric photos is built employing chaos and DNA cryptography.
Authored by Fahd Al-Qanour, Sivaram Rajeyyagari
Critical infrastructures such as the electricity grid can be severely impacted by cyber-attacks on its supply chain. Hence, having a robust cybersecurity infrastructure and management system for the electricity grid is a high priority. This paper proposes a cyber-security protocol for defense against man-in-the-middle (MiTM) attacks to the supply chain, which uses encryption and cryptographic multi-party authentication. A cyber-physical simulator is utilized to simulate the power system, control system, and security layers. The correctness of the attack modeling and the cryptographic security protocol against this MiTM attack is demonstrated in four different attack scenarios.
Authored by Shuva Paul, Yu-Cheng Chen, Santiago Grijalva, Vincent Mooney
Software supply chain attacks occur during the processes of producing software is compromised, resulting in vulnerabilities that target downstream customers. While the number of successful exploits is limited, the impact of these attacks is significant. Despite increased awareness and research into software supply chain attacks, there is limited information available on mitigating or architecting for these risks, and existing information is focused on singular and independent elements of the supply chain. In this paper, we extensively review software supply chain security using software development tools and infrastructure. We investigate the path that attackers find is least resistant followed by adapting and finding the next best way to complete an attack. We also provide a thorough discussion on how common software supply chain attacks can be prevented, preventing malicious hackers from gaining access to an organization's development tools and infrastructure including the development environment. We considered various SSC attacks on stolen code-sign certificates by malicious attackers and prevented unnoticed malware from passing by security scanners. We are aiming to extend our research to contribute to preventing software supply chain attacks by proposing novel techniques and frameworks.
Authored by Md Faruk, Masrura Tasnim, Hossain Shahriar, Maria Valero, Akond Rahman, Fan Wu
In this work, we discuss data breaches based on the “2012 SocialArks data breach” case study. Data leakage refers to the security violations of unauthorized individuals copying, transmitting, viewing, stealing, or using sensitive, protected, or confidential data. Data leakage is becoming more and more serious, for those traditional information security protection methods like anti-virus software, intrusion detection, and firewalls have been becoming more and more challenging to deal with independently. Nevertheless, fortunately, new IT technologies are rapidly changing and challenging traditional security laws and provide new opportunities to develop the information security market. The SocialArks data breach was caused by a misconfiguration of ElasticSearch Database owned by SocialArks, owned by “Tencent.” The attack methodology is classic, and five common Elasticsearch mistakes discussed the possibilities of those leakages. The defense solution focuses on how to optimize the Elasticsearch server. Furthermore, the ElasticSearch database’s open-source identity also causes many ethical problems, which means that anyone can download and install it for free, and they can install it almost anywhere. Some companies download it and install it on their internal servers, while others download and install it in the cloud (on any provider they want). There are also cloud service companies that provide hosted versions of Elasticsearch, which means they host and manage Elasticsearch clusters for their customers, such as Company Tencent.
Authored by Jun Qian, Zijie Gan, Jie Zhang, Suman Bhunia
Nowadays, network information security is of great concern, and the measurement of the trustworthiness of terminal devices is of great significance to the security of the entire network. The measurement method of terminal device security trust still has the problems of high complexity, lack of universality. In this paper, the device fingerprint library of device access network terminal devices is first established through the device fingerprint mixed collection method; Secondly, the software and hardware features of the device fingerprint are used to increase the uniqueness of the device identification, and the multi- dimensional standard metric is used to measure the trustworthiness of the terminal device; Finally, Block chain technology is used to store the fingerprint and standard model of network access terminal equipment on the chain. To improve the security level of network access devices, a device access method considering the trust of terminal devices from multiple perspectives is implemented.
Authored by Jiaqi Peng, Ke Yang, Jiaxing Xuan, Da Li, Lei Fan
Browsers are one of the most widely used types of software around the world. This prevalence makes browsers a prime target for cyberattacks. To mitigate these threats, users can practice safe browsing habits and take advantage of the security features available to browsers. These protections, however, could be severely crippled if the browser itself were malicious. Presented in this paper is the concept of the evil-twin browser (ETB), a clone of a legitimate browser that looks and behaves identically to the original browser, but discreetly performs other tasks that harm a user's security. To better understand the concept of the evil-twin browser, a prototype ETB named ChroNe was developed. The creation and installation process of ChroN e is discussed in this paper. This paper also explores the motivation behind creating such a browser, examines existing relevant work, inspects the open-source codebase Chromium that assisted in ChroNe's development, and discusses relevant topics like ways to deliver an ETB, the capabilities of an ETB, and possible ways to defend against ETBs.
Authored by Mathew Salcedo, Mehdi Abid, Yoohwan Kim, Ju-Yeon Jo
The authors' industry experiences suggest that compiler warnings, a lightweight version of program analysis, are valuable early bug detection tools. Significant costs are associated with patches and security bulletins for issues that could have been avoided if compiler warnings were addressed. Yet, the industry's attitude towards compiler warnings is mixed. Practices range from silencing all compiler warnings to having a zero-tolerance policy as to any warnings. Current published data indicates that addressing compiler warnings early is beneficial. However, support for this value theory stems from grey literature or is anecdotal. Additional focused research is needed to truly assess the cost-benefit of addressing warnings.
Authored by Gunnar Kudrjavets, Aditya Kumar, Nachiappan Nagappan, Ayushi Rastogi
Binary analysis is pervasively utilized to assess software security and test vulnerabilities without accessing source codes. The analysis validity is heavily influenced by the inferring ability of information related to the code compilation. Among the compilation information, compiler type and optimization level, as the key factors determining how binaries look like, are still difficult to be inferred efficiently with existing tools. In this paper, we conduct a thorough empirical study on the binary's appearance under various compilation settings and propose a lightweight binary analysis tool based on the simplest machine learning method, called DIComP to infer the compiler and optimization level via most relevant features according to the observation. Our comprehensive evaluations demonstrate that DIComP can fully recognize the compiler provenance, and it is effective in inferring the optimization levels with up to 90% accuracy. Also, it is efficient to infer thousands of binaries at a millisecond level with our lightweight machine learning model (1MB).
Authored by Ligeng Chen, Zhongling He, Hao Wu, Fengyuan Xu, Yi Qian, Bing Mao
Neural program embeddings have demonstrated considerable promise in a range of program analysis tasks, including clone identification, program repair, code completion, and program synthesis. However, most existing methods generate neural program embeddings di-rectly from the program source codes, by learning from features such as tokens, abstract syntax trees, and control flow graphs. This paper takes a fresh look at how to improve program embed-dings by leveraging compiler intermediate representation (IR). We first demonstrate simple yet highly effective methods for enhancing embedding quality by training embedding models alongside source code and LLVM IR generated by default optimization levels (e.g., -02). We then introduce IRGEN, a framework based on genetic algorithms (GA), to identify (near-)optimal sequences of optimization flags that can significantly improve embedding quality. We use IRGEN to find optimal sequences of LLVM optimization flags by performing GA on source code datasets. We then extend a popular code embedding model, CodeCMR, by adding a new objective based on triplet loss to enable a joint learning over source code and LLVM IR. We benchmark the quality of embedding using a rep-resentative downstream application, code clone detection. When CodeCMR was trained with source code and LLVM IRs optimized by findings of IRGEN, the embedding quality was significantly im-proved, outperforming the state-of-the-art model, CodeBERT, which was trained only with source code. Our augmented CodeCMR also outperformed CodeCMR trained over source code and IR optimized with default optimization levels. We investigate the properties of optimization flags that increase embedding quality, demonstrate IRGEN's generalization in boosting other embedding models, and establish IRGEN's use in settings with extremely limited training data. Our research and findings demonstrate that a straightforward addition to modern neural code embedding models can provide a highly effective enhancement.
Authored by Zongjie Li, Pingchuan Ma, Huaijin Wang, Shuai Wang, Qiyi Tang, Sen Nie, Shi Wu
The long-living nature and byte-addressability of persistent memory (PM) amplifies the importance of strong memory protections. This paper develops temporal exposure reduction protection (TERP) as a framework for enforcing memory safety. Aiming to minimize the time when a PM region is accessible, TERP offers a complementary dimension of memory protection. The paper gives a formal definition of TERP, explores the semantics space of TERP constructs, and the relations with security and composability in both sequential and parallel executions. It proposes programming system and architecture solutions for the key challenges for the adoption of TERP, which draws on novel supports in both compilers and hardware to efficiently meet the exposure time target. Experiments validate the efficacy of the proposed support of TERP, in both efficiency and exposure time minimization.
Authored by Yuanchao Xu, Chencheng Ye, Xipeng Shen, Yan Solihin