Publications | Science of Security Virtual Organization

EMG Data Collection for Multimodal Keystroke Analysis

User authentication based on muscle tension manifested during password typing seems to be an interesting additional layer of security. It represents another way of verifying a person’s identity, for example in the context of continuous verification. In order to explore the possibilities of such an authentication method, it was necessary to create capturing software that records and stores data from EMG (electromyography) sensors, enabling a subsequent analysis of the recorded data to verify the relevance of the method. The work presented here is devoted to the design, implementation and evaluation of such a solution. The solution consists of a protocol and a software application for collecting multimodal data when typing on a keyboard. Myo armbands on both forearms are used to capture EMG and inertial data while additional modalities are collected from a keyboard and a camera. A user experience evaluation of the solution is also presented.

Authored by Stefan Korecko, Matus Haluska, Matus Pleva, Markus Skudal, Patrick Bours

Pausing While Programming: Insights From Keystroke Analysis

Pauses in typing are generally considered to indicate cognitive processing and so are of interest in educational contexts. While much prior work has looked at typing behavior of Computer Science students, this paper presents results of a study specifically on the pausing behavior of students in Introductory Computer Programming. We investigate the frequency of pauses of different lengths, what last actions students take before pausing, and whether there is a correlation between pause length and performance in the course. We find evidence that frequency of pauses of all lengths is negatively correlated with performance, and that, while some keystrokes initiate pauses consistently across pause lengths, other keystrokes more commonly initiate short or long pauses. Clustering analysis discovers two groups of students, one that takes relatively fewer mid-to-long pauses and performs better on exams than the other.

Authored by Raj Shrestha, Juho Leinonen, Albina Zavgorodniaia, Arto Hellas, John Edwards
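
The pause analysis described in this abstract can be illustrated with a minimal Python sketch. It assumes keystroke logs are available as sorted timestamps in seconds; the bin boundaries in `PAUSE_BINS` and the function names are hypothetical, since the abstract does not state the thresholds or tooling the authors used.

```python
import math

# Hypothetical pause-length bins in seconds (assumed, not from the paper).
PAUSE_BINS = [("short", 2.0, 15.0), ("mid", 15.0, 180.0), ("long", 180.0, math.inf)]


def count_pauses(timestamps):
    """Count pauses per bin from a sorted list of keystroke times in seconds."""
    counts = {name: 0 for name, _, _ in PAUSE_BINS}
    for prev, curr in zip(timestamps, timestamps[1:]):
        gap = curr - prev
        for name, low, high in PAUSE_BINS:
            if low <= gap < high:
                counts[name] += 1
                break  # each gap falls into at most one bin
    return counts


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)
```

Applying `count_pauses` per student and then `pearson` to the per-student pause counts and exam scores would give the kind of correlation the abstract reports; a negative coefficient means more pauses go with lower performance.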

A Resiliency Coordinator Against Malicious Attacks for Cyber-Physical Systems

Resiliency of cyber-physical systems (CPSs) against malicious attacks has been a topic of active research in the past decade due to its widely recognized importance. A resilient CPS is capable of tolerating some attacks, operating at a reduced capacity with core functions maintained, and failing gracefully to avoid any catastrophic consequences. Existing work includes an architecture for hierarchical control systems, which is a subset of CPS with wide applicability, that is tailored for resiliency. Namely, the architecture consists of local, network and supervision layers and features such as simplex structure, resource isolation by hypervisors, redundant sensors/actuators, and software defined network capabilities. Existing work also includes methods of ensuring a level of resiliency at each one of the layers. However, for holistic system-level resiliency, the individual methods at each layer must be coordinated in their deployment because all three layers interact for the operation of the CPS. For this purpose, a resiliency coordinator for CPS is proposed in this work. The resiliency coordinator is the interconnection of a central resiliency coordinator in the supervision layer, a network resiliency coordinator in the network layer, and finally, local resiliency coordinators in multiple physical systems that compose the physical layer. We show, by examples, the operation of the resiliency coordinator and illustrate that the resiliency coordinator accomplishes a level of attack resiliency greater than the sum of resiliency at each one of the layers separately.

Authored by Yongsoon Eun, Jaegeun Park, Yechan Jeong, Daehoon Kim, Kyung-Joon Park

Power System Resiliency Against Windstorms: A Systematic Framework Based on Dynamic and Steady-State Analysis

Power system robustness against high-impact low probability events is becoming a major concern. To depict distinct phases of a system response during these disturbances, an irregular polygon model is derived from the conventional trapezoid model and the model is analytically investigated for transmission system performance, based on which resiliency metrics are developed for the same. Furthermore, the system resiliency to windstorms is evaluated on the IEEE reliability test system (RTS) by performing steady-state and dynamic security assessment incorporating protection modelling and corrective action schemes using the Power System Simulator for Engineering (PSS®E) software. Based on the results of steady-state and dynamic analysis, modified resiliency metrics are quantified. Finally, this paper quantifies the interdependency of operational and infrastructure resiliency as they cannot be considered discrete characteristics of the system.

Authored by Giritharan Iswaran, Ramin Vakili, Mojdeh Khorsand

MPTCP-based Security Schema in Fog Computing

Recently, Cloud Computing became one of today’s great innovations for provisioning Information Technology (IT) resources. Moreover, a new model named Fog Computing has been introduced, which addresses Cloud Computing paradigm issues regarding time delay and high cost. However, security challenges and vulnerabilities are still a big concern for both Cloud and Fog Computing systems. Man-in-the-Middle (MITM) is considered one of the most destructive attacks in a Fog Computing context. Moreover, it is very complex to detect MITM attacks, as they are performed passively at the Software-Defined Networking (SDN) level, and the Fog Computing paradigm is well suited to MITM attacks. In this paper, a MITM mitigation scheme is proposed consisting of an SDN network (Fog Leaders) which controls a layer of Fog Nodes. Furthermore, Multi-Path TCP (MPTCP) has been used between all edge devices and Fog Nodes to improve resource utilization and security. The performance evaluation of the proposed solution has been carried out in a simulation environment using Mininet, the Ryu SDN controller and the MPTCP Linux kernel. The experimental results showed that the proposed solution improves security, network resiliency and resource utilization without any significant overheads compared to the traditional TCP implementation.

Authored by Hossam ELMansy, Khaled Metwally, Khaled Badran

Improving the Prediction Accuracy with Feature Selection for Ransomware Detection

This paper presents a machine learning algorithm to detect whether an executable binary is benign or ransomware. Ransomware cybercriminals have targeted infrastructure, businesses, and many other sectors, which has directly affected national security and daily life. Tackling ransomware threats more effectively is a big challenge. We applied a machine-learning model to classify and identify the security level of a given suspected malware sample for ransomware detection and prevention. We use feature selection as a data preprocessing step to improve the prediction accuracy of the model.

Authored by Chulan Gao, Hossain Shahriar, Dan Lo, Yong Shi, Kai Qian

TEE-based decentralized recommender systems: The raw data sharing redemption

Recommenders are central in many applications today. The most effective recommendation schemes, such as those based on collaborative filtering (CF), exploit similarities between user profiles to make recommendations, but potentially expose private data. Federated learning and decentralized learning systems address this by letting the data stay on users' machines to preserve privacy: each user performs the training on local data and only the model parameters are shared. However, sharing the model parameters across the network may still yield privacy breaches. In this paper, we present Rex, the first enclave-based decentralized CF recommender. Rex exploits trusted execution environments (TEEs), such as Intel Software Guard Extensions (SGX), that provide shielded environments within the processor to improve convergence while preserving privacy. Firstly, Rex enables raw data sharing, which ultimately speeds up convergence and reduces the network load. Secondly, Rex fully preserves privacy. We analyze the impact of raw data sharing in both deep neural network (DNN) and matrix factorization (MF) recommenders and showcase the benefits of trusted environments in a full-fledged implementation of Rex. Our experimental results demonstrate that through raw data sharing, Rex significantly decreases the training time by 18.3x and the network load by 2 orders of magnitude over standard decentralized approaches that share only parameters, while fully protecting privacy by leveraging trustworthy hardware enclaves with very little overhead.

Authored by Akash Dhasade, Nevena Dresevic, Anne-Marie Kermarrec, Rafael Pires

The Research on Material Properties Database System Based on Network Sharing

Based on the analysis of material performance data management requirements, a network-sharing scheme for material performance data is proposed. A material performance database system including material performance data collection, data query, data analysis, data visualization, and data security management and control modules is designed to solve the problems of existing material performance databases: inadequate network sharing, data fusion, multidisciplinary support and intelligent services, as well as insufficient standardization and data security control. This paper adopts a hierarchical access control strategy. After logging into the material performance database system, users can standardize the material performance data and store them to form a shared material performance database. The standardized material performance data of the database system are queried and shared under control according to the user's authority. Then, the database system compares and analyzes the material performance data obtained from controlled query sharing. Finally, the database system visualizes the shared results of controlled queries and the comparative analysis results obtained. The database system adopts an MVC architecture based on B/S (browser/server) cross-platform J2EE. Third-party computing platforms are integrated into the system. Users can easily use material performance data and related services through browsers and networks. A MongoDB database is used for data storage, supporting distributed storage and efficient query.

Authored by Cuifang Zheng, Jiaju Wu, Linggang Kong, Shijia Kang, Zheng Cheng, Bin Luo

Towards a Hybrid UHF RFID and NFC Platform for the Security of Medical Data from a Point of Care

In recent years, body-worn RFID and NFC (near field communication) devices have become one of the principal technologies contributing to the rise of healthcare internet of things (H-IoT) systems. Similarly, points of care (PoCs) have moved increasingly closer to patients to reduce costs while supporting precision medicine and improving chronic illness management, thanks to timely and frequent feedback from the patients themselves. A typical PoC involves medical sensing devices capable of sampling human health, personal equipment with communications and computing capabilities (smartphone or tablet) and a secure software environment for data transmission to medical centers. Hybrid platforms simultaneously employing NFC and ultra-high frequency (UHF) RFID could be successfully developed for the first sensing layer. An application example of the proposed hybrid system for the monitoring of acute myocardial infarction (AMI) survivors details how the combined use of NFC and UHF-RFID in the same PoC can support the multifaceted needs of AMI survivors while protecting the sensitive data on the patient’s health.

Authored by Giulio Bianco, Emanuele Raso, Luca Fiore, Alessia Riente, Adina Barba, Carolina Miozzi, Lorenzo Bracciale, Fabiana Arduini, Pierpaolo Loreti, Gaetano Marrocco, Cecilia Occhiuzzi

Fault phase selection method of distribution network based on wavelet singular entropy and DBN

Fault phase selection in distribution networks is of great significance for accurately identifying the fault location, quickly restoring power and improving the reliability of power supply. This paper mainly studies a fault phase selection method for distribution networks based on wavelet singular entropy and a deep belief network (DBN). The basic principles of wavelet singular entropy and DBN are analyzed, and on this basis, a DBN model for distribution network fault phase selection is proposed. Firstly, the transient fault current data of the distribution network are processed to obtain the wavelet singular entropy of the three phases, which is used as the input of the fault phase selection model. Then the DBN is improved, and an artificial neural network (ANN) is introduced to make it a fault phase selection classifier, and the output labels are specified. Finally, Simulink is used to build a simulation model of the IEEE 33-node distribution network system, obtain a large amount of data for various fault types, and generate a training sample library and a test sample library. Adjusting the structure of the neural network and training its parameters completes the construction of the DBN model for fault phase selection in the distribution network.

Authored by Jinliang You, Di Zhang, Qingwu Gong, Jiran Zhu, Haiguo Tang, Wei Deng, Tong Kang

Catch Me If You Can: Blackbox Adversarial Attacks on Automatic Speech Recognition using Frequency Masking

Automatic speech recognition (ASR) models are used widely in applications for voice navigation and voice control of domestic appliances. ASRs have been misused by attackers to generate malicious outputs by attacking the deep learning component within ASRs. To assess the security and robustness of ASRs, we propose techniques within our framework SPAT that generate blackbox (agnostic to the DNN) adversarial attacks that are portable across ASRs. This is in contrast to existing work that focuses on whitebox attacks that are time consuming and lack portability. Our techniques generate adversarial attacks that have no human audible difference by manipulating the input speech signal using a psychoacoustic model that maintains the audio perturbations below the thresholds of human perception. We propose a framework SPAT with three attack generation techniques based on the psychoacoustic concept and frame selection techniques to selectively target the attack. We evaluate portability and effectiveness of our techniques using three popular ASRs and two input audio datasets using the following metrics: Word Error Rate (WER) of output transcription, Similarity to original audio, attack Success Rate on different ASRs and Detection score by a defense system. We found our adversarial attacks were portable across ASRs, not easily detected by a state-of-the-art defense system, and had significant difference in output transcriptions while sounding similar to original audio.

Authored by Xiaoliang Wu, Ajitha Rajan
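
Word Error Rate, the first metric named in this abstract, is conventionally the word-level edit distance between the reference and the ASR output divided by the number of reference words. The Python sketch below shows that standard definition; it is not taken from the SPAT implementation, and the function name is an assumption.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance (substitutions, insertions, deletions)
    divided by the number of words in the reference transcription."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcription must not be empty")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

For an adversarial attack, a higher WER between the clean transcription and the transcription of the perturbed audio indicates a more effective attack. For example, `word_error_rate("turn on the light", "turn off the light")` is 0.25, one substituted word out of four.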

A Demo of a Software Platform for Ubiquitous Big Data Engineering, Visualization, and Analytics, via Reconfigurable Micro-Services, in Smart Factories

Intelligent, smart, Cloud, reconfigurable manufacturing, and remote monitoring, all intersect in modern industry and mark the path toward more efficient, effective, and sustainable factories. Many obstacles are found along the path, including legacy machineries and technologies, security issues, and software that is often hard, slow, and expensive to adapt to face unforeseen challenges and needs in this fast-changing ecosystem. Light-weight, portable, loosely coupled, easily monitored, variegated software components, supporting Edge, Fog and Cloud computing, that can be (re)created, (re)configured and operated remotely through Web requests in a matter of milliseconds, and that rely on libraries of ready-to-use tasks also extendable remotely through sub-second Web requests, constitute a fertile technological ground on top of which fourth-generation industries can be built. In this demo it will be shown how, starting from a completely virgin Docker Engine, it is possible to build, configure, destroy, rebuild, and operate, exclusively remotely and exclusively via API calls, computation networks that are able to (i) raise alerts based on configured thresholds or trained ML models, (ii) transform Big Data streams, (iii) produce and persist Big Datasets on the Cloud, (iv) train and persist ML models on the Cloud, (v) use trained models for one-shot or stream predictions, (vi) produce tabular visualizations, line plots, pie charts, histograms, in real time, from Big Data streams. Also, it will be shown how easily such computation networks can be upgraded with new functionalities in real time, remotely, via API calls.

Authored by Mirco Soderi, Vignesh Kamath, John Breslin

Privacy security protection based on data life cycle

Large capacity, fast-paced, diversified and high-value data are becoming a hotbed of data processing and research. Privacy security protection based on data life cycle is a method to protect privacy. It is used to protect the confidentiality, integrity and availability of personal data and prevent unauthorized access or use. The main advantage of using this method is that it can fully control all aspects related to the information system and its users. With the opening of the cloud, attackers use the cloud to recalculate and analyze big data that may infringe on others' privacy. Privacy protection based on data life cycle is a means of privacy protection based on the whole process of data production, collection, storage and use. This approach involves all stages from the creation of personal information by individuals (e.g. by filling out forms online or at work) to destruction after use for the intended purpose (e.g. deleting records). Privacy security based on the data life cycle ensures that any personal information collected is used only for the purpose of initial collection and destroyed as soon as possible.

Authored by Hongjun Zhang, Shuyan Cheng, Qingyuan Cai, Xiao Jiang

Research on Network Security Protection System Based on Computer Big Data Era

This paper designs a network security protection system based on artificial intelligence technology from two aspects of hardware and software. The system can simultaneously collect Internet public data and secret-related data inside the unit, and encrypt it through the TCM chip solidified in the hardware to ensure that only designated machines can read secret-related materials. The data edge-cloud collaborative acquisition architecture based on chip encryption can realize the cross-network transmission of confidential data. At the same time, this paper proposes an edge-cloud collaborative information security protection method for industrial control systems by combining end-address hopping and load balancing algorithms. Finally, using WinCC, Unity3D, MySQL and other development environments comprehensively, the feasibility and effectiveness of the system are verified by experiments.

Authored by Xiuyun Lu, Wenxing Zhao, Yuquan Zhu

The Impact of Big Data Analytics on Traffic Prediction

The Internet of Vehicles (IoV) is driving a rapid expansion of connected devices. This massive number of devices is constantly generating a massive and near-real-time data stream for numerous applications, which is known as big data. Analyzing such big data to find, predict, and control decisions is a critical solution for the IoV to enhance service quality and experience. Thus, the main goal of this paper is to study the impact of big data analytics on traffic prediction in the IoV. We use big data analytics steps to predict the traffic flow, based on different deep neural models such as LSTM, CNN-LSTM, and GRU. The models are validated using the evaluation metrics MAE, MSE, RMSE, and R2. A case study based on a real-world road is used to implement and test the efficiency of the traffic prediction models.

Authored by Hakima Khelifi, Amani Belouahri
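
The four evaluation metrics named in this abstract have standard definitions, sketched below in plain Python. The function name and the sample traffic-flow values in the usage note are illustrative assumptions, not data from the paper.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE and R2 for two equally long sequences."""
    n = len(y_true)
    errors = [pred - true for true, pred in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n   # mean absolute error
    mse = sum(e * e for e in errors) / n    # mean squared error
    rmse = math.sqrt(mse)                   # root mean squared error

    # R2 compares the residual error with the variance around the mean.
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((true - mean_true) ** 2 for true in y_true)
    r2 = 1.0 - ss_res / ss_tot

    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}
```

With made-up observed flows `[3, 5, 7, 9]` and predictions `[2, 5, 8, 11]`, this gives an MAE of 1.0, an MSE of 1.5 and an R2 of about 0.7. Lower MAE, MSE and RMSE and an R2 closer to 1 indicate a better traffic prediction model.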

POX Controller Evaluation Based On Tree Topology For Data Centers

Software Defined Networking (SDN) is a solution for Data Center Networks (DCN). This solution offers centralized control that helps to simplify management and reduce the big data issues of storage management and data analysis. This paper investigates the performance of deploying an SDN controller in a DCN. The paper considers a tree network topology with different numbers of hosts using the Mininet emulator. The paper evaluates the performance of the DCN based on Python SDN controllers with different numbers of hosts. This evaluation compares the POX and RYU controllers as DCN solutions using throughput, delay, overhead, and convergence time. The results show that POX outperforms the RYU controller and is the best choice for DCN.

Authored by Jellalah Alzarog, Abdalwart Almhishi, Abubaker Alsunousi, Tareg Abulifa, Wisam Eltarjaman, Salem Sati

Isolating Compiler Optimization Faults via Differentiating Finer-grained Options

Code optimization is an essential feature of compilers, and almost all software products are released with compiler optimizations enabled. Consequently, bugs in code optimization will inevitably have a significant impact on the correctness of software systems. Locating optimization bugs in compilers is challenging as compilers typically support a large number of optimization configurations. Although prior studies have proposed to locate compiler bugs via generating witness test programs, they are still time-consuming and not effective enough. To address such limitations, we propose an automatic bug localization approach, ODFL, for locating compiler optimization bugs via differentiating finer-grained options in this study. Specifically, we first independently disable the fine-grained options that are enabled by default under the bug-triggering optimization levels to obtain bug-free and bug-related fine-grained options. We then configure several effective passing and failing optimization sequences based on such fine-grained options to obtain multiple failing and passing compiler coverage. Finally, such generated coverage information can be utilized via Spectrum-Based Fault Localization formulae to rank the suspicious compiler files. We run ODFL on 60 buggy GCC compilers from an existing benchmark. The experimental results show that ODFL significantly outperforms the state-of-the-art compiler bug isolation approach RecBi in terms of all the evaluated metrics, demonstrating the effectiveness of ODFL. In addition, ODFL is much more efficient than RecBi as it can save more than 88% of the time for locating bugs on average.

Authored by Jing Yang, Yibiao Yang, Maolin Sun, Ming Wen, Yuming Zhou, Hai Jin
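
The final ranking step in this abstract can be sketched with one widely used Spectrum-Based Fault Localization formula, Ochiai. This is an assumption for illustration: the abstract does not say which formula ODFL applies, and the file names and helper names below are hypothetical.

```python
import math


def ochiai(failed_cover, passed_cover, total_failed):
    """Ochiai suspiciousness: ef / sqrt(total_failed * (ef + ep))."""
    denominator = math.sqrt(total_failed * (failed_cover + passed_cover))
    return failed_cover / denominator if denominator else 0.0


def rank_files(failing_runs, passing_runs):
    """Rank compiler files by suspiciousness.

    Each run is the set of file names covered when compiling the test
    program under one failing or passing optimization configuration.
    """
    files = set().union(*failing_runs, *passing_runs)
    scores = {}
    for name in files:
        ef = sum(name in run for run in failing_runs)  # failing runs covering the file
        ep = sum(name in run for run in passing_runs)  # passing runs covering the file
        scores[name] = ochiai(ef, ep, len(failing_runs))
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))
```

A file covered by every failing configuration and by no passing one receives the maximum score of 1.0, which is why generating many passing and failing optimization sequences sharpens the ranking.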

Open Source Software Computed Risk Framework

The increased dissemination of open source software to a broader audience has led to a proportional increase in the dissemination of vulnerabilities. These vulnerabilities are introduced by developers, some intentionally and some negligently. In this paper, we work to quantify the relative risk that a given developer represents to a software project. We propose using empirical software engineering based analysis on the vast data made available by GitHub to create a Developer Risk Score (DRS) for prolific contributors on GitHub. The DRS can then be aggregated across a project as a derived vulnerability assessment, which we call the Computational Vulnerability Assessment Score (CVAS). The CVAS represents the correlation between the Developer Risk Score across projects and vulnerabilities attributed to those projects. We believe this to be a contribution in trying to quantify risk introduced by specific developers across open source projects. Both of the risk scores, those for contributors and projects, are derived from an amalgamation of data, both from GitHub and outside GitHub. We seek to provide this risk metric as a force multiplier for the project maintainers that are responsible for reviewing code contributions. We hope this will lead to a reduction in the number of introduced vulnerabilities for projects in the Open Source ecosystem.

Authored by Jon Chapman, Hari Venugopalan

Demo: Real-Time Implementation of Block Orthogonal Sparse Superposition Codes

Short-packet communication is a key enabler of various Internet of Things applications that require higher-level security. This proposal briefly reviews block orthogonal sparse superposition (BOSS) codes, which are applicable for secure short-packet transmissions. In addition, following the IEEE 802.11a Wi-Fi standards, we demonstrate the real-time performance of secure short packet transmission using a software-defined radio testbed to verify the feasibility of BOSS codes in a multi-path fading channel environment.

Authored by Bowhyung Lee, Donghwa Han, Namyoon Lee

Deep CAPTCHA Recognition Using Encapsulated Preprocessing and Heterogeneous Datasets

CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) is an important security technique designed to deter bots from abusing software systems, which has broader applications in cyberspace. CAPTCHAs come in a variety of forms, including the deciphering of obfuscated text, transcribing of audio messages, and tracking mouse movement, among others. This paper focuses on using deep learning techniques to recognize text-based CAPTCHAs. In particular, our work focuses on generating training datasets using different CAPTCHA schemes, along with a pre-processing technique allowing for character-based recognition. We have encapsulated the CRABI (CAPTCHA Recognition with Attached Binary Images) framework to give an image multiple labels for improvement in feature extraction. Using real-world datasets, performance evaluations are conducted to validate the efficacy of our proposed approach on several neural network architectures (e.g., custom CNN architecture, VGG16, ResNet50, and MobileNet). The experimental results confirm that over 90% accuracy can be achieved on most models.

Authored by Turhan Kimbrough, Pu Tian, Weixian Liao, Erik Blasch, Wei Yu

Implementation of Chaotic Encryption Architecture on FPGA for On-Chip Secure Communication

Chaos is an interesting phenomenon in nonlinear systems that emerges from their complex and unpredictable behavior. With the escalated use of low-powered edge-compute devices, data security at the edge creates a need for secure communication. The characteristic that two different chaotic systems, each with its own unique initial conditions, synchronize over time is the basis for using chaos in communication. This paper proposes an encryption architecture suitable for communication of on-chip sensors to provide a POC (proof of concept) with security encrypted on the same chip using different chaotic equations. In communication, encryption is usually achieved with the help of microcontrollers or software implementations that use more power and have complex hardware implementations. Small IoT devices are expected to operate on low power and are constrained in size. At the same time, these devices are highly vulnerable to security threats, which elevates the need for low power/size hardware-based security. Since the discovery of chaotic equations, they have been used in various encryption applications. The goal of this research is to take the chaotic implementation to the CMOS level with the sensors on the same chip. The hardware co-simulation is demonstrated on an FPGA board for the Chua encryption/decryption architecture. The hardware utilization for Lorenz, SprottD, and Chua on FPGA is obtained with the Xilinx System Generator (XSG) toolbox, which reveals that Lorenz’s utilization is 9% lower than Chua’s.

Authored by Ravi Monani, Brian Rogers, Amin Rezaei, Ava Hedayatipour
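
As a software-only illustration of chaos-based encryption, the Python sketch below derives a keystream from the Lorenz equations and XORs it with the data. It is a toy under stated assumptions, not the paper's FPGA architecture and not a secure cipher: the Euler step size, the burn-in length, the byte quantization and all function names are hypothetical choices, and it does not model the synchronization property the abstract mentions.

```python
def lorenz_keystream(n_bytes, x0, y0, z0, dt=0.01, burn_in=1000):
    """Generate n_bytes from a Lorenz trajectory using forward Euler steps.

    The initial state (x0, y0, z0) acts as the shared secret. The classic
    parameters sigma=10, rho=28, beta=8/3 put the system in its chaotic regime.
    """
    sigma, rho, beta = 10.0, 28.0, 8.0 / 3.0
    x, y, z = x0, y0, z0
    stream = bytearray()
    for step in range(burn_in + n_bytes):
        dx = sigma * (y - x)
        dy = x * (rho - z) - y
        dz = x * y - beta * z
        x, y, z = x + dt * dx, y + dt * dy, z + dt * dz
        if step >= burn_in:
            # Assumed quantization: take low-order digits of |x| as one byte.
            stream.append(int(abs(x) * 1e6) % 256)
    return bytes(stream)


def xor_cipher(data, key_state):
    """Encrypt or decrypt bytes by XOR with the Lorenz keystream."""
    keystream = lorenz_keystream(len(data), *key_state)
    return bytes(b ^ k for b, k in zip(data, keystream))
```

Because XOR is its own inverse, calling `xor_cipher` twice with the same initial state recovers the original bytes, for example `xor_cipher(xor_cipher(b"sensor=42", (1.0, 1.0, 1.0)), (1.0, 1.0, 1.0))`. Both sides must compute the trajectory with identical floating-point behavior, which is one reason hardware designs such as the one in this abstract fix the arithmetic explicitly.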

Managing Information and Network Security using Chaotic Bio Molecular Computing Technique

Requirement Elicitation is a key phase in software development. The fundamental goal of security requirement elicitation is to gather appropriate security needs and policies from stakeholders or organizations. The majority of systems fail due to incorrect elicitation procedures, affecting development time and cost. Security requirement elicitation is a major activity of requirement engineering that requires the attention of developers and other stakeholders. To produce quality requirements during software development, the authors suggested a methodology for effective requirement elicitation. Many challenges surround requirement engineering. These concerns can be connected to scope, preconceptions in requirements, etc. Other difficulties include user confusion over technological specifics, leading to confusing system aims. They also don't realize that the requirements are dynamic and prone to change. To protect the privacy of medical images, the proposed image cryptosystem uses a CCM-generated chaotic key series to confuse and diffuse them. A hexadecimal pre-processing technique is used to increase the security of color images utilising a hyper chaos-based image cryptosystem. Finally, a double-layered security system for biometric photos is built employing chaos and DNA cryptography.

Authored by Fahd Al-Qanour, Sivaram Rajeyyagari

A Cryptographic Method for Defense Against MiTM Cyber Attack in the Electricity Grid Supply Chain

Critical infrastructures such as the electricity grid can be severely impacted by cyber-attacks on its supply chain. Hence, having a robust cybersecurity infrastructure and management system for the electricity grid is a high priority. This paper proposes a cyber-security protocol for defense against man-in-the-middle (MiTM) attacks to the supply chain, which uses encryption and cryptographic multi-party authentication. A cyber-physical simulator is utilized to simulate the power system, control system, and security layers. The correctness of the attack modeling and the cryptographic security protocol against this MiTM attack is demonstrated in four different attack scenarios.

Authored by Shuva Paul, Yu-Cheng Chen, Santiago Grijalva, Vincent Mooney

Investigating Novel Approaches to Defend Software Supply Chain Attacks

Software supply chain attacks occur when the process of producing software is compromised, resulting in vulnerabilities that target downstream customers. While the number of successful exploits is limited, the impact of these attacks is significant. Despite increased awareness and research into software supply chain attacks, there is limited information available on mitigating or architecting for these risks, and existing information is focused on singular and independent elements of the supply chain. In this paper, we extensively review software supply chain security using software development tools and infrastructure. We investigate how attackers follow the path of least resistance, then adapt and find the next best way to complete an attack. We also provide a thorough discussion on how common software supply chain attacks can be prevented, keeping malicious hackers from gaining access to an organization's development tools and infrastructure, including the development environment. We considered various software supply chain (SSC) attacks that use code-signing certificates stolen by malicious attackers, and how to prevent malware from passing security scanners unnoticed. We are aiming to extend our research to contribute to preventing software supply chain attacks by proposing novel techniques and frameworks.

Authored by Md Faruk, Masrura Tasnim, Hossain Shahriar, Maria Valero, Akond Rahman, Fan Wu

Analyzing SocialArks Data Leak - A Brute Force Web Login Attack

In this work, we discuss data breaches based on the “2012 SocialArks data breach” case study. Data leakage refers to security violations in which unauthorized individuals copy, transmit, view, steal, or use sensitive, protected, or confidential data. Data leakage is becoming more and more serious, and traditional information security protection methods such as anti-virus software, intrusion detection, and firewalls find it increasingly challenging to deal with it independently. Fortunately, new IT technologies are rapidly changing and challenging traditional security laws and provide new opportunities to develop the information security market. The SocialArks data breach was caused by a misconfiguration of an ElasticSearch database owned by SocialArks, owned by “Tencent.” The attack methodology is classic, and five common Elasticsearch mistakes are discussed as possible causes of such leaks. The defense solution focuses on how to optimize the Elasticsearch server. Furthermore, the ElasticSearch database’s open-source nature also causes many ethical problems, because anyone can download and install it for free, and they can install it almost anywhere. Some companies download it and install it on their internal servers, while others download and install it in the cloud (on any provider they want). There are also cloud service companies that provide hosted versions of Elasticsearch, which means they host and manage Elasticsearch clusters for their customers, such as Tencent.

Authored by Jun Qian, Zijie Gan, Jie Zhang, Suman Bhunia