The Activity and Event Network (AEN) graph is a new framework that allows modeling and detecting intrusions by capturing ongoing security-relevant activity and events occurring at a given organization using a large time-varying graph model. The graph is generated by processing various network security logs, such as network packets, system logs, and intrusion detection alerts. In this paper, we show how known attack methods can be captured generically using attack fingerprints based on the AEN graph. The fingerprints are constructed by identifying attack idiosyncrasies under the form of subgraphs that represent indicators of compromise (IOes), and then encoded using Property Graph Query Language (PGQL) queries. Among the many attack types, three main categories are implemented as a proof of concept in this paper: scanning, denial of service (DoS), and authentication breaches; each category contains its common variations. The experimental evaluation of the fingerprints was carried using a combination of intrusion detection datasets and yielded very encouraging results.
Authored by Chenyang Nie, Paulo Quinan, Issa Traore, Isaac Woungang
Ethical bias in machine learning models has become a matter of concern in the software engineering community. Most of the prior software engineering works concentrated on finding ethical bias in models rather than fixing it. After finding bias, the next step is mitigation. Prior researchers mainly tried to use supervised approaches to achieve fairness. However, in the real world, getting data with trustworthy ground truth is challenging and also ground truth can contain human bias. Semi-supervised learning is a technique where, incrementally, labeled data is used to generate pseudo-labels for the rest of data (and then all that data is used for model training). In this work, we apply four popular semi-supervised techniques as pseudo-labelers to create fair classification models. Our framework, Fair-SSL, takes a very small amount (10%) of labeled data as input and generates pseudo-labels for the unlabeled data. We then synthetically generate new data points to balance the training data based on class and protected attribute as proposed by Chakraborty et al. in FSE 2021. Finally, classification model is trained on the balanced pseudo-labeled data and validated on test data. After experimenting on ten datasets and three learners, we find that Fair-SSL achieves similar performance as three state-of-the-art bias mitigation algorithms. That said, the clear advantage of Fair-SSL is that it requires only 10% of the labeled training data. To the best of our knowledge, this is the first SE work where semi-supervised techniques are used to fight against ethical bias in SE ML models. To facilitate open science and replication, all our source code and datasets are publicly available at https://github.com/joymallyac/FairSSL. CCS CONCEPTS • Software and its engineering → Software creation and management; • Computing methodologies → Machine learning. ACM Reference Format: Joymallya Chakraborty, Suvodeep Majumder, and Huy Tu. 2022. Fair-SSL: Building fair ML Software with less data. In International Workshop on Equitable Data and Technology (FairWare ‘22), May 9, 2022, Pittsburgh, PA, USA. ACM, New York, NY, USA, 8 pages. https://doi.org/10.1145/3524491.3527305
Authored by Joymallya Chakraborty, Suvodeep Majumder, Huy Tu
Security in the communication systems rely mainly on a trusted Public Key Infrastructure (PKI) and Certificate Authorities (CAs). Besides the lack of automation, the complexity and the cost of assigning a signed certificate to a device, several allegations against CAs have been discovered, which has created trust issues in adopting this standard model for secure systems. The automation of the servers certificate assignment was achieved by the Automated Certificate Management Environment (ACME) method, but without confirming the trust of assigned certificate. This paper presents a complete tested and implemented solution to solve the trust of the Certificates provided to the servers by using the blockchain platform for certificate validation. The Blockchain network provides an immutable data store, holding the public keys of all domain names, while resolving the trust concerns by applying an automated Blockchain-based Domain Control Validation (B-DCV) for the server and client server verification. The evaluation was performed on the Ethereum Rinkeby testnet adopting the Proof of Authority (PoA) consensus algorithm which is an improved version of Proof of Stake (Po \$S\$) applied on Ethereum 2.0 providing superior performance compared to Ethereum 1.0.
Authored by David Khoury, Patrick Balian, Elie Kfoury
Data or information are being transferred at an enormous pace and hence protecting and securing this transmission of data are very important and have been very challenging. Cryptography and Steganography are the most broadly used techniques for safeguarding data by encryption of data and hiding the existence of data. A multi-layered secure transmission can be achieved by combining Cryptography with Steganography and by adding message authentication ensuring the confidentiality of the message. Different approach towards Steganography implementation is proposed using rotations and flips to prevent detection of encoded messages. Compression of multimedia files is set up for increasing the speed of encoding and consuming less storage space. The HMAC (Hash-based Authentication Code) algorithm is chosen for message authentication and integrity. The performance of the proposed Steganography methods is concluded using Histogram comparative analysis. Simulations have been performed to back the reliability of the proposed method.
Authored by Aditya Kotkar, Shreyas Khadapkar, Aniket Gupta, Smita Jangale
In the present innovation, for the trading of information, the internet is the most well-known and significant medium. With the progression of the web and data innovation, computerized media has become perhaps the most famous and notable data transfer tools. This advanced information incorporates text, pictures, sound, video etc moved over the public organization. The majority of these advanced media appear as pictures and are a significant part in different applications, for example, chat, talk, news, website, web-based business, email, and digital books. The content is still facing various challenges in which including the issues of protection of copyright, modification, authentication. Cryptography, steganography, embedding techniques is widely used to secure the digital data. In this present the hybrid model of LSB steganography and Advanced Encryption Standard (AES) cryptography techniques to enhanced the security of the digital image and text that is undeniably challenging to break by the unapproved person. The security level of the secret information is estimated in the term of MSE and PSNR for better hiding required the low MSE and high PSNR values.
Authored by Manish Kumar, Aman Soni, Ajay Shekhawat, Akash Rawat
In our time the rapid growth of internet and digital communications has been required to be protected from illegal users. It is important to secure the information transmitted between the sender and receiver over the communication channels such as the internet, since it is a public environment. Cryptography and Steganography are the most popular techniques used for sending data in secrete way. In this paper, we are proposing a new algorithm that combines both cryptography and steganography in order to increase the level of data security against attackers. In cryptography, we are using affine hill cipher method; while in steganography we are using Hybrid edge detection with LSB to hide the message. Our paper shows how we can use image edges to hide text message. Grayscale images are used for our experiments and a comparison is developed based on using different edge detection operators such as (canny-LoG ) and (Canny-Sobel). Their performance is measured using PSNR (Peak Signal to Noise ratio), MSE (Mean Squared Error) and EC (Embedding Capacity). The results indicate that, using hybrid edge detection (canny- LoG) with LSB for hiding data could provide high embedding capacity than using hybrid edge detection (canny- Sobel) with LSB. We could prove that hiding in the image edge area could preserve the imperceptibility of the Stego-image. This paper has also proved that the secrete message was extracted successfully without any distortion.
Authored by Fatima Yahia, Ahmed Abushaala
While cloud-based deep learning benefits for high-accuracy inference, it leads to potential privacy risks when exposing sensitive data to untrusted servers. In this paper, we work on exploring the feasibility of steganography in preserving inference privacy. Specifically, we devise GHOST and GHOST+, two private inference solutions employing steganography to make sensitive images invisible in the inference phase. Motivated by the fact that deep neural networks (DNNs) are inherently vulnerable to adversarial attacks, our main idea is turning this vulnerability into the weapon for data privacy, enabling the DNN to misclassify a stego image into the class of the sensitive image hidden in it. The main difference is that GHOST retrains the DNN into a poisoned network to learn the hidden features of sensitive images, but GHOST+ leverages a generative adversarial network (GAN) to produce adversarial perturbations without altering the DNN. For enhanced privacy and a better computation-communication trade-off, both solutions adopt the edge-cloud collaborative framework. Compared with the previous solutions, this is the first work that successfully integrates steganography and the nature of DNNs to achieve private inference while ensuring high accuracy. Extensive experiments validate that steganography has excellent ability in accuracy-aware privacy protection of deep learning.
Authored by Qin Liu, Jiamin Yang, Hongbo Jiang, Jie Wu, Tao Peng, Tian Wang, Guojun Wang
Minimizing embedding impact model of steganography has good performance for steganalysis detection. By using effective distortion cost function and coding method, steganography under this model becomes the mainstream embedding framework recently. In this paper, to improve the anti-detection performance, a new steganography optimization model by constructing a reference cover is proposed. First, a reference cover is construed by performing a filtering operation on the cover image. Then, by minimizing the residual between the reference cover and the original cover, the optimization function is formulated considering the effect of different modification directions. With correcting the distortion cost of +1 and \_1 modification operations, the stego image obtained by the proposed method is more consistent with the natural image. Finally, by applying the proposed framework to the cost function of the well-known HILL embedding, experimental results show that the anti-detection performance of the proposed method is better than the traditional method.
Authored by Shichong Fu, Xiaoling Li, Yao Zhao
Steganography is the technique of hiding a confidential message in an ordinary message where the extraction of embedded information is done at its destination. Among the different carrier files formats; digital images are the most popular. This paper presents a Wavelet-based method for hiding secret information in digital images where skin areas are identified and used as a region of interest. The work presented here is an extension of a method published earlier by the authors that utilized a rule-based approach to detect skin regions. The proposed method, proposed embedding the secret data into the integer Wavelet coefficients of the approximation sub-band of the cover image. When compared to the original technique, experimental results showed a lower error percentage between skin maps detected before the embedding and during the extraction processes. This eventually increased the similarity between the original and the retrieved secret image.
Authored by Mennatallah Sadek, Amal Khalifa, Doaa Khafga
With the development of social networks, traditional covert communication requires more consideration of lossy processes of Social Network Platforms (SNPs), which is called robust steganography. Since JPEG compression is a universal processing of SNPs, a method using repeated JPEG compression to fit transport channel matching is recently proposed and shows strong compression-resist performance. However, the repeated JPEG compression will inevitably introduce other artifacts into the stego image. Using only traditional steganalysis methods does not work well towards such robust steganography under low payload. In this paper, we propose a simple and effective method to detect the mentioned steganography by chasing both steganographic perturbations as well as continuous compression artifacts. We introduce compression-forensic features as a complement to steganalysis features, and then use the ensemble classifier for detection. Experiments demonstrate that this method owns a similar and better performance with respect to both traditional and neural-network-based steganalysis.
Authored by Jinliu Feng, Yaofei Wang, Kejiang Chen, Weiming Zhang, Nenghai Yu
Edge detection based embedding techniques are famous for data security and image quality preservation. These techniques use diverse edge detectors to classify edge and non-edge pixels in an image and then implant secrets in one or both of these classes. Image with conceived data is called stego image. It is noticeable that none of such researches tries to reform the original image from the stego one. Rather, they devote their concentration to extract the hidden message only. This research presents a solution to the raised reversibility problem. Like the others, our research, first, applies an edge detector e.g., canny, in a cover image. The scheme next collects \$n\$-LSBs of each of edge pixels and finally, concatenates them with encrypted message stream. This method applies a lossless compression algorithm to that processed stream. Compression factor is taken such a way that the length of compressed stream does not exceed the length of collected LSBs. The compressed message stream is then implanted only in the edge pixels by \$n\$-LSB substitution method. As the scheme does not destroy the originality of non-edge pixels, it presents better stego quality. By incorporation the mechanisms of encryption, concatenation, compression and \$n\$-LSB, the method has enriched the security of implanted data. The research shows its effectiveness while implanting a small sized message.
Authored by Habiba Sultana, A Kamal
Protection of private and sensitive information is the most alarming issue for security providers in surveillance videos. So to provide privacy as well as to enhance secrecy in surveillance video without affecting its efficiency in detection of violent activities is a challenging task. Here a steganography based algorithm has been proposed which hides private information inside the surveillance video without affecting its accuracy in criminal activity detection. Preprocessing of the surveillance video has been performed using Tunable Q-factor Wavelet Transform (TQWT), secret data has been hidden using Discrete Wavelet Transform (DWT) and after adding payload to the surveillance video, detection of criminal activities has been conducted with maintaining same accuracy as original surveillance video. UCF-crime dataset has been used to validate the proposed framework. Feature extraction is performed and after feature selection it has been trained to Temporal Convolutional Network (TCN) for detection. Performance measure has been compared to the state-of-the-art methods which shows that application of steganography does not affect the detection rate while preserving the perceptual quality of the surveillance video.
Authored by Sonali Rout, Ramesh Mohapatra
In recent times, the occurrence of malware attacks are increasing at an unprecedented rate. Particularly, the image-based malware attacks are spreading worldwide and many people get harmful malware-based images through the technique called steganography. In the existing system, only open malware and files from the internet can be identified. However, the image-based malware cannot be identified and detected. As a result, so many phishers make use of this technique and exploit the target. Social media platforms would be totally harmful to the users. To avoid these difficulties, Machine learning can be implemented to find the steganographic malware images (contents). The proposed methodology performs an automatic detection of malware and steganographic content by using Machine Learning. Steganography is used to hide messages from apparently innocuous media (e.g., images), and steganalysis is the approach used for detecting this malware. This research work proposes a machine learning (ML) approach to perform steganalysis. In the existing system, only open malware and files from the internet are identified but in the recent times many people get harmful malware-based images through the technique called steganography. Social media platforms would be totally harmful to the users. To avoid these difficulties, the proposed Machine learning has been developed to appropriately detect the steganographic malware images (contents). Father, the steganalysis method using machine learning has been developed for performing logistic classification. By using this, the users can avoid sharing the malware images in social media platforms like WhatsApp, Facebook without downloading it. It can be also used in all the photo-sharing sites such as google photos.
Authored by Henry Samuel, Santhanam Kumar, R. Aishwarya, G. Mathivanan
In the present work, we intend to present a thorough study developed on a digital library, called HAT corpus, for a purpose of authorship attribution. Thus, a dataset of 300 documents that are written by 100 different authors, was extracted from the web digital library and processed for a task of author style analysis. All the documents are related to the travel topic and written in Arabic. Basically, three important rules in stylometry should be respected: the minimum document size, the same topic for all documents and the same genre too. In this work, we made a particular effort to respect those conditions seriously during the corpus preparation. That is, three lexical features: Fixed-length words, Rare words and Suffixes are used and evaluated by using a centroid based Manhattan distance. The used identification approach shows interesting results with an accuracy of about 0.94.
Authored by S. Ouamour, H. Sayoud
This research evaluates the accuracy of two methods of authorship prediction: syntactical analysis and n-gram, and explores its potential usage. The proposed algorithm measures n-gram, and counts adjectives, adverbs, verbs, nouns, punctuation, and sentence length from the training data, and normalizes each metric. The proposed algorithm compares the metrics of training samples to testing samples and predicts authorship based on the correlation they share for each metric. The severity of correlation between the testing and training data produces significant weight in the decision-making process. For example, if analysis of one metric approximates 100% positive correlation, the weight in the decision is assigned a maximum value for that metric. Conversely, a 100% negative correlation receives the minimum value. This new method of authorship validation holds promise for future innovation in fraud protection, the study of historical documents, and maintaining integrity within academia.
Authored by Jared Nelson, Mohammad Shekaramiz
The range of text analysis methods in the field of natural language processing (NLP) has become more and more extensive thanks to the increasing computational resources of the 21st century. As a result, many deep learning-based solutions have been proposed for the purpose of authorship attribution, as they offer more flexibility and automated feature extraction compared to traditional statistical methods. A number of solutions have appeared for the attribution of English texts, however, the number of methods designed for Hungarian language is extremely small. Hungarian is a morphologically rich language, sentence formation is flexible and the alphabet is different from other languages. Furthermore, a language specific POS tagger, pretrained word embeddings, dependency parser, etc. are required. As a result, methods designed for other languages cannot be directly applied on Hungarian texts. In this paper, we review deep learning-based authorship attribution methods for English texts and offer techniques for the adaptation of these solutions to Hungarian language. As a part of the paper, we collected a new dataset consisting of Hungarian literary works of 15 authors. In addition, we extensively evaluate the implemented methods on the new dataset.
Authored by Laura Oldal, Gábor Kertész
The globalization of the integrated circuit (IC) manufacturing industry has lured the adversary to come up with numerous malicious activities in the IC supply chain. Logic locking has risen to prominence as a proactive defense strategy against such threats. CAS-Lock (proposed in CHES'20), is an advanced logic locking technique that harnesses the concept of single-point function in providing SAT-attack resiliency. It is claimed to be powerful and efficient enough in mitigating existing state-of-the-art attacks against logic locking techniques. Despite the security robustness of CAS-Lock as claimed by the authors, we expose a serious vulnerability and by exploiting the same we devise a novel attack algorithm against CAS-Lock. The proposed attack can not only reveal the correct key but also the exact AND/OR structure of the implemented CAS-Lock design along with all the key gates utilized in both the blocks of CAS-Lock. It simply relies on the externally observable Distinguishing Input Patterns (DIPs) pertaining to a carefully chosen key simulation of the locked design without the requirement of structural analysis of any kind of the locked netlist. Our attack is successful against various AND/OR cascaded-chain configurations of CAS-Lock and reports 100% success rate in recovering the correct key. It has an attack complexity of \$\textbackslashmathcalO(m)\$, where \$m\$ denotes the number of DIPs obtained for an incorrect key simulation.
Authored by Akashdeep Saha, Urbi Chatterjee, Debdeep Mukhopadhyay, Rajat Chakraborty
Sample-then-lock construction is a reusable fuzzy extractor for low-entropy sources. When applied on iris recognition scenarios, many subsets of an iris-code are used to lock the cryptographic key. The security of this construction relies on the entropy of subsets of iris codes. Simhadri et al. reported a security level of 32 bits on iris sources. In this paper, we propose two kinds of attacks to crack existing sample-then-lock schemes. Exploiting the low-entropy subsets, our attacks can break the locked key and the enrollment iris-code respectively in less than 220 brute force attempts. To protect from these proposed attacks, we design an improved sample-then-lock scheme. More precisely, our scheme employs stability and discriminability to select high-entropy subsets to lock the genuine secret, and conceals genuine locker by a large amount of chaff lockers. Our experiment verifies that existing schemes are vulnerable to the proposed attacks with a security level of less than 20 bits, while our scheme can resist these attacks with a security level of more than 100 bits when number of genuine subsets is 106.
Authored by Feng Zhu, Peisong Shen, Kaini Chen, Yucheng Ma, Chi Chen
The rapid complexity growth of electronic systems nowadays increases their vulnerability to hacking, such as fault injection, including insertion of glitches into the system clock to corrupt internal state through timing errors. As a countermeasure, a frequency locked loop (FLL) based clock glitch detector is proposed in this paper. Regulated from an external supply voltage, this FLL locks at 16-36X of the system clock, creating four phases to measure the system clock by oversampling at 64-144X. The samples are then used to sense the frequency and close the frequency locked loop, as well as to detect glitches through pattern matching. Implemented in a 5nm FINFET process, it can detect the glitches or pulse width variations down to 3.125% of the input 40MHz clock cycle with the supply varying from 0.5 to 1.0V.
Authored by Sanquan Song, Stephen Tell, Brian Zimmer, Sudhir Kudva, Nikola Nedovic, Thomas Gray
Lock design is an important mechanism for scheduling management and security protection in operating systems. However, there is no effective way to identify the differences and connections among lock models, and users need to spend considerable time to understand different lock architectures. In this paper, we propose a classification scheme that abstracts lock design into three types of models: basic spinlock, semaphore amount extension, lock chain structure, and verify the effectiveness of these three types of lock models in the context of current mainstream applications. We also investigate the specific details of applying this classification method, which can be used as a reference for developers to design lock models, thus shorten the software development cycle.
Authored by Yi Gong, Minjie Chen, Lihua Song, Yanfei Guo
The burglary of a safe in the city of Jombang, East Java, lost valuables belonging to the Cemerlang Multipurpose Trading Cooperative. Therefore, a security system tool was created in the safe that serves as a place to store valuables and important assets. Change the security system using the security system with a private unique method with modulo arithmetic pattern. The security system of the safe is designed in layers which are attached with the RFID tag by registering and then verifying it on the card. Entering the password on the card cannot be read or is not performed, then the system will refuse to open it. arduino mega type 256 components, RFID tag is attached to the RFID reader, only one validated passive tag can open access to the security system, namely number B9 20 E3 0F. Meanwhile, of the ten passwords entered, only three match the modulo arithmetic format and can open the security system, namely password numbers 22540, 51324 and 91032. The circuit system on the transistor in the solenoid driver circuit works after the safety system opens. The servo motor can rotate according to the input of the open 900 servo angle rotation program.
Authored by Aripin Triyanto, Ariyawan Sunardi, Woro Nurtiyanto, Moch Ihksanudin, Mardiansyah
Network Intrusion Detection Systems (IDSs) have been used to increase the level of network security for many years. The main purpose of such systems is to detect and block malicious activity in the network traffic. Researchers have been improving the performance of IDS technology for decades by applying various machine-learning techniques. From the perspective of academia, obtaining a quality dataset (i.e. a sufficient amount of captured network packets that contain both malicious and normal traffic) to support machine learning approaches has always been a challenge. There are many datasets publicly available for research purposes, including NSL-KDD, KDDCUP 99, CICIDS 2017 and UNSWNB15. However, these datasets are becoming obsolete over time and may no longer be adequate or valid to model and validate IDSs against state-of-the-art attack techniques. As attack techniques are continuously evolving, datasets used to develop and test IDSs also need to be kept up to date. Proven performance of an IDS tested on old attack patterns does not necessarily mean it will perform well against new patterns. Moreover, existing datasets may lack certain data fields or attributes necessary to analyse some of the new attack techniques. In this paper, we argue that academia needs up-to-date high-quality datasets. We compare publicly available datasets and suggest a way to provide up-to-date high-quality datasets for researchers and the security industry. The proposed solution is to utilize the network traffic captured from the Locked Shields exercise, one of the world’s largest live-fire international cyber defence exercises held annually by the NATO CCDCOE. During this three-day exercise, red team members consisting of dozens of white hackers selected by the governments of over 20 participating countries attempt to infiltrate the networks of over 20 blue teams, who are tasked to defend a fictional country called Berylia. After the exercise, network packets captured from each blue team’s network are handed over to each team. However, the countries are not willing to disclose the packet capture (PCAP) files to the public since these files contain specific information that could reveal how a particular nation might react to certain types of cyberattacks. To overcome this problem, we propose to create a dedicated virtual team, capture all the traffic from this team’s network, and disclose it to the public so that academia can use it for unclassified research and studies. In this way, the organizers of Locked Shields can effectively contribute to the advancement of future artificial intelligence (AI) enabled security solutions by providing annual datasets of up-to-date attack patterns.
Authored by Maj. Halisdemir, Hacer Karacan, Mauno Pihelgas, Toomas Lepik, Sungbaek Cho
We propose TaintLock, a lightweight dynamic scan data authentication and encryption scheme that performs per-pattern authentication and encryption using taint and signature bits embedded within the test pattern. To prevent IP theft, we pair TaintLock with truly random logic locking (TRLL) to ensure resilience against both Oracle-guided and Oracle-free attacks, including scan deobfuscation attacks. TaintLock uses a substitution-permutation (SP) network to cryptographically authenticate each test pattern using embedded taint and signature bits. It further uses cryptographically generated keys to encrypt scan data for unauthenticated users dynamically. We show that it offers a low overhead, non-intrusive secure scan solution without impacting test coverage or test time while preventing IP theft.
Authored by Jonti Talukdar, Arjun Chaudhuri, Krishnendu Chakrabarty
Security is a key concern across the world, and it has been a common thread for all critical sectors. Nowadays, it may be stated that security is a backbone that is absolutely necessary for personal safety. The most important requirements of security systems for individuals are protection against theft and trespassing. CCTV cameras are often employed for security purposes. The biggest disadvantage of CCTV cameras is their high cost and the need for a trustworthy individual to monitor them. As a result, a solution that is both easy and cost-effective, as well as secure has been devised. The smart door lock is built on Raspberry Pi technology, and it works by capturing a picture through the Pi Camera module, detecting a visitor's face, and then allowing them to enter. Local binary pattern approach is used for Face recognition. Remote picture viewing, notification, on mobile device are all possible with an IOT based application. The proposed system may be installed at front doors, lockers, offices, and other locations where security is required. The proposed system has an accuracy of 89%, with an average processing time is 20 seconds for the overall process.
Authored by Om Doshi, Hitesh Bendale, Aarti Chavan, Shraddha More
With the advent of technology and owing to mankind’s reliance on technology, it is of utmost importance to safeguard people’s data and their identity. Biometrics have for long played an important role in providing that layer of security ranging from small scale uses such as house locks to enterprises using them for confidentiality purposes. In this paper we will provide an insight into behavioral biometrics that rely on identifying and measuring human characteristics or behavior. We review different types of behavioral parameters such as keystroke dynamics, gait, footstep pressure signals and more.
Authored by Mahipal Choudhry, Vaibhav Jetli, Siddhant Mathur, Yash Saini