Nearest Neighbor Search - Security CCTV cameras are important for public safety. These cameras record continuously 24/7 and produce a large amount of video data. If the videos are not reviewed immediately after an incident, it can be difficult and time-consuming to find a specific person in many hours of recording. In this work we present a system that can search a video for people who fit a textual description. It uses an image-text multimodal deep learning model to calculate the similarity between an image of a person and a text description and to find the top matches. Normally this would require calculating the text-image similarity score between the text description and every person in the video, which is O(n) in the number of people in the video and therefore impractical for real-time search. We propose a solution to this by pre-calculating embeddings of person images and applying approximate nearest neighbor vector search. At inference time, only one forward pass through the deep learning model is needed; the computational cost is therefore the time to embed a text description, O(1), plus the time to perform an approximate nearest neighbor search, O(log(n)). This makes real-time interactive search possible.
Authored by Sumeth Yuenyong
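The split between offline embedding and online search described in this abstract can be illustrated with a minimal sketch. It is not the authors' system: the embeddings are hypothetical toy vectors rather than model outputs, the function names are invented for illustration, and the search is an exact linear scan standing in for the approximate nearest neighbor index that the abstract relies on for O(log(n)) lookups.

```python
import math

def normalize(v):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def build_index(person_embeddings):
    """Offline step: normalize every person-image embedding once."""
    return [normalize(v) for v in person_embeddings]

def search(index, text_embedding, k=3):
    """Online step: rank stored embeddings by cosine similarity to the query.

    This is an exact linear scan used as a stand-in; an approximate
    nearest neighbor index would replace this loop so that not every
    stored vector has to be visited.
    """
    q = normalize(text_embedding)
    scores = [(sum(a * b for a, b in zip(q, v)), i) for i, v in enumerate(index)]
    scores.sort(reverse=True)
    return [(i, s) for s, i in scores[:k]]

# Toy 3-dimensional embeddings (hypothetical values, not from any model).
index = build_index([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.7, 0.7, 0.0]])
top = search(index, [0.9, 0.1, 0.0], k=2)
```

Only the text query is embedded at search time; the person-image vectors are reused unchanged across queries.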
Natural Language Processing - Dissemination of fake news is a matter of major concern that can result in national and social damage with devastating impacts. Misleading information on the internet is dubious and arduous to identify. Machine learning models are becoming an irreplaceable component in the detection of fake news spreading on social media. LSTM is a memory-based machine learning model that can be used for the detection of false news. LSTM is a promising approach and mitigates the vanishing gradient problem of RNNs. The integration of natural language processing and an LSTM model is considered to be effective for false news identification.
Authored by Abina Azees, Geevarghese Titus
Natural Language Processing - Rule-based Web vulnerability detection is the most common method, usually based on analysis of the website code and on the feedback from detection on the target. In the process, a large amount of contaminated data and network pressure is generated, and the false positive rate is high. This study implements a detection platform based on a crawler and NLP. We first use the crawler to obtain the HTTP requests on the target system and classify the dataset according to whether a request has parameters and whether the samples interact with a database. We then convert the text into word vectors, reduce their dimensionality and serialize them, train on the dataset with an NLP algorithm, and finally obtain a model that can accurately predict Web vulnerabilities. Experimental results show that this method can detect Web vulnerabilities efficiently, greatly reduce invalid attack test parameters, and reduce network pressure.
Authored by Xin Ge, Min-Nan Yue
Natural Language Processing - Application code analysis and static rules are the most common methods for Web vulnerability detection, but this process generates a large amount of contaminated data and network pressure, and the false positive rate is high. This study implements a detection system based on a crawler and NLP. The crawler visits pages in imitation of a human, and we collect the HTTP requests and responses as a dataset. We classify the dataset according to parameter characteristics and whether the samples interact with a database, then convert the text into word vectors, reduce their dimensionality and serialize them, train on the dataset with an NLP algorithm, and finally obtain a model that can accurately predict Web vulnerabilities. Experimental results show that this method can detect Web vulnerabilities efficiently, greatly reduce invalid attack test parameters, and reduce network pressure.
Authored by Xin Ge, Minnan Yue
Natural Language Processing - Story Ending Generation (SEG) is a challenging task in natural language generation. Recently, methods based on Pretrained Language Models (PLM) have achieved great success and can produce fluent and coherent story endings. However, the pre-training objective of PLM-based methods is unable to model the consistency between story context and ending. The goal of this paper is to adopt contrastive learning to generate endings more consistent with the story context, but there are two main challenges in contrastive learning for SEG. The first is the negative sampling of wrong endings that are inconsistent with story contexts. The second is the adaptation of contrastive learning to SEG. To address these two issues, we propose a novel Contrastive Learning framework for Story Ending Generation (CLSEG), which has two steps: multi-aspect sampling and story-specific contrastive learning. For the first issue, we use novel multi-aspect sampling mechanisms to obtain wrong endings by considering the consistency of order, causality, and sentiment. To solve the second issue, we design a story-specific contrastive training strategy adapted for SEG. Experiments show that CLSEG outperforms baselines and can produce story endings with stronger consistency and rationality.
Authored by Yuqiang Xie, Yue Hu, Luxi Xing, Yunpeng Li, Wei Peng, Ping Guo
Natural Language Processing - The new capital city (IKN) of the Republic of Indonesia was ratified and inaugurated by President Joko Widodo in January 2022. Unfortunately, there are still many Indonesian citizens who do not understand all the information regarding the determination of the new capital city. Even though the Indonesian Government has created an official website about the new capital city (www.ikn.go.id), the information delivery is still not optimal because web page visitors are unable to interact actively with the information they need. Therefore, the development of a Chatting Robot (Chatbot) application is deemed necessary as an interactive component for obtaining the information users need about the new capital city. In this study, a chatbot application was developed by applying Natural Language Processing (NLP), using the Term Frequency-Inverse Document Frequency (TF-IDF) method for term weighting and the Cosine-Similarity algorithm to calculate the similarity of the questions asked by the user. The research successfully designed and developed a chatbot application using the Cosine-Similarity algorithm. The testing phase of the chatbot model uses several scenarios related to the points of NLP implementation. The test results show that the chatbot responds well to the questions asked in all scenarios.
Authored by Harry Achsan, Deni Kurniawan, Diki Purnama, Quintin Barcah, Yuri Astoria
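The TF-IDF weighting and cosine-similarity matching named in this abstract can be sketched as follows. This is an illustrative sketch only, not the authors' chatbot: the whitespace tokenizer, the smoothed IDF formula, the function names, and the sample questions are all assumptions made for the example.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()

def vectorize(tokens, idf):
    """Build a sparse TF-IDF vector, ignoring terms unseen in the corpus."""
    tf = Counter(tokens)
    return {t: (c / len(tokens)) * idf[t] for t, c in tf.items() if t in idf}

def tfidf_vectors(docs):
    """Return one sparse TF-IDF dict per document plus the IDF table."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    df = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed IDF (an assumption) so terms in every document keep some weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}
    return [vectorize(toks, idf) for toks in tokenized], idf

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def best_match(question, known_questions):
    """Index of the stored question most similar to the user's question."""
    vectors, idf = tfidf_vectors(known_questions)
    q = vectorize(tokenize(question), idf)
    scores = [cosine(q, v) for v in vectors]
    return max(range(len(scores)), key=scores.__getitem__)

# Hypothetical stored questions, invented for this example.
faq = [
    "where is the new capital city located",
    "when was the new capital city inaugurated",
    "who leads the development authority",
]
match = best_match("where is the capital", faq)
```

A chatbot built this way would return the answer stored alongside the best-matching question, typically only when the similarity exceeds some threshold.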
Natural Language Processing - In today's digital era, online attacks are increasing in number and becoming more severe day by day, especially those related to web applications. The data accessible over the web persuades attackers to launch new kinds of attacks. Extensive research on web security has shown that the most hazardous attack affecting web security is Structured Query Language Injection (SQLI). This attack poses a genuine threat to web application security, and several research works have been conducted to defend against it by detecting it when it happens. Traditional methods such as input validation and filtering or the use of parameterized queries are not sufficient to counter these attacks, because they rely solely on the implementation of the code and therefore on the developer's skill set, which in turn gave rise to Machine Learning based solutions. In this study, we propose a novel approach that uses Natural Language Processing (NLP) with BERT for feature extraction, is able to adapt to SQLI variants, and provides an accuracy of 97% with a false positive rate of 0.8% and a false negative rate of 5.8%.
Authored by Sagar Lakhani, Ashok Yadav, Vrijendra Singh
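For readers comparing the three figures reported in this abstract, the following sketch shows how accuracy, false positive rate, and false negative rate are conventionally derived from a confusion matrix. The counts are hypothetical and are not the paper's data; the function name is chosen for illustration.

```python
def detection_rates(tp, fp, tn, fn):
    """Standard detection metrics from confusion-matrix counts.

    tp: attacks flagged as attacks      fn: attacks missed
    tn: benign inputs passed            fp: benign inputs flagged
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "false_positive_rate": fp / (fp + tn),
        "false_negative_rate": fn / (fn + tp),
    }

# Hypothetical counts for illustration only.
rates = detection_rates(tp=90, fp=2, tn=98, fn=10)
```

Because the two error rates use different denominators, a detector can combine a low false positive rate with a noticeably higher false negative rate.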
Natural Language Processing - In today's digital age, businesses create tremendous amounts of data as part of their regular operations. On legacy or cloud platforms, this data is stored mainly in structured, semi-structured, and unstructured formats, and most of the data kept in the cloud is amorphous and contains sensitive information. With the evolution of AI, organizations are using deep learning and natural language processing to extract the meaning of this big data through unstructured data analysis and insights (UDAI). This study aims to investigate the influence of these unstructured big data analyses and insights on the organization's decision-making system (DMS), financial sustainability, customer lifetime value (CLV), and long-term growth prospects, while encouraging a culture of self-service analytics. This study uses a validated survey instrument to collect responses from Fortune 500 organizations to determine the adaptability and influence of UDAI in current data-driven decision making and how it impacts organizational DMS, financial sustainability, and CLV.
Authored by Bibhu Dash, Swati Swayamsiddha, Azad Ali
Natural Language Processing - Natural language processing (NLP) is a field of computing that enables computers to read and understand text and spoken words in the same way that people do. Within NLP, Named Entity Recognition (NER) is a crucial task. It extracts information from given texts and is used in machine translation, text-to-speech synthesis, natural language understanding, and so on. Its main goal is to categorize words in a text that represent names into specified tags such as location, organization, person name, date, time, and measures. In this paper, the proposed method extracts entities from an annotated Hindi Fraud Call corpus (not publicly available) using XLM-Roberta (base-sized model). To build an accurate NER system for the dataset, the authors use the pre-trained XLM-Roberta as a multi-layer bidirectional transformer encoder for learning deep bidirectional Hindi word representations. The proposed method applies fine-tuning: the XLM-Roberta model is fine-tuned to extract nine entities from sentences based on sentence context to achieve better performance. The annotated Hindi corpus uses a tag set of nine Named Entity (NE) classes, defined as part of the NER Shared Task for South and Southeast Asian Languages (SSEAL) at IJCNLP. The obtained micro and macro F1-scores are 0.96 and 0.80, respectively.
Authored by Aditya Choure, Rahul Adhao, Vinod Pachghare
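The gap between the micro and macro F1-scores reported in this abstract is typical when some entity classes are rare. The sketch below shows how the two averages differ. It is a simplified token-level illustration with made-up labels, not the paper's evaluation, which may score whole entity spans instead.

```python
from collections import Counter

def f1(tp, fp, fn):
    """F1 from counts; defined as 0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def micro_macro_f1(gold, pred):
    """Micro F1 pools counts over classes; macro F1 averages per-class F1."""
    labels = sorted(set(gold) | set(pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[g] += 1  # gold class gains a false negative
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    macro = sum(f1(tp[l], fp[l], fn[l]) for l in labels) / len(labels)
    return micro, macro

# Hypothetical labels: the frequent class is right, the rare LOC is missed.
gold = ["PER", "PER", "PER", "PER", "LOC", "ORG"]
pred = ["PER", "PER", "PER", "PER", "PER", "ORG"]
micro, macro = micro_macro_f1(gold, pred)
```

Here the single missed rare class barely moves the micro average but pulls the macro average down sharply, because every class counts equally in the macro score.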
Natural Language Processing - The Internet of Things (IoT) is widely considered a key technology for connecting many devices through the internet. It enables the exchange of data and information and allows devices to receive instructions and act upon them effectively. With the advent of IoT, many devices are connected to the internet, which helps individuals operate the devices virtually, share data, and program required actions. This study focuses on understanding the key determinants of creating smart homes by applying natural language processing (NLP) through IoT. The major determinants considered are integrating voice understanding into devices, the ability to control devices remotely, and support in reducing energy bills.
Authored by Shahanawaj Ahamad, Deepalkumar Shah, R. Udhayakumar, T.S. Rajeswari, Pankaj Khatiwada, Joel Alanya-Beltran
Natural Language Processing - This paper presents a system to identify social engineering attacks using only text as input. The system can be used in different environments in which the input is text, such as SMS, chats, and emails. The system uses Natural Language Processing to extract features from the dialog text, such as URL report and count, spell check, blacklist count, and others. The features are used to train Machine Learning algorithms (Neural Network, Random Forest, and SVM) to classify social engineering attacks. The classification algorithms showed an accuracy of over 80% in detecting this type of attack.
Authored by Juan Lopez, Jorge Camargo
Named Data Network Security - This paper studies the design of an English APP security verification framework based on fused IP-address and MAC data features. APP refers to a client application, including third-party applications on PCs and on mobile terminals, that is, smartphones. At present, Praat has become software commonly used by researchers worldwide in experimental phonetics, linguistics, language investigation, language processing, and other related fields. Against this background, the English APP is selected as our target. In the framework design, each node forms a corresponding topology table according to the neighbor list it detects itself and the topology information obtained from received TC messages. To meet the challenge of high robustness, both IP and MAC data analysis are considered. Through data collection, processing, and further fusion, the comprehensive system is implemented. The proposed model is tested under different testing scenarios.
Authored by Jinxun Yu, Kai Xia
Named Data Network Security - Internet of Things (IoT) is becoming an important approach to accomplish healthcare monitoring where critical medical data retrieval is essential in a secure and private manner. Nevertheless, IoT devices have constrained resources. Therefore, acquisition of efficient, secure and private data is very challenging. The current research on applying architecture of Named Data Networking (NDN) to IoT design reveals very promising results. Therefore, we are motivated to combine NDN and IoT, which we call NDN-IoT architecture, for a healthcare application. Inspired by the idea, we propose a healthcare monitoring groundwork integrating NDN concepts into IoT in Contiki NG OS at the network layer that we call µNDN as it is a micro and light-weight implementation. We quantitatively explore the usage of the NDN-IoT approach to understand its efficiency for medical data retrieval. Reliability and delay performances were evaluated and analyzed for a remote health application. Our results, in this study, show that the µNDN architecture performs better than IP architecture when retrieving medical data. Thus, it is worth exploring the µNDN architecture further.
Authored by Alper Demir, Gokce Manap
Named Data Network Security - This article provides an overview of the security of VANET, which is a vehicle network. When reviewing this topic, publications of various researchers were considered. The article provides information security requirements for VANET, an overview of security research, an overview of existing attacks, methods for detecting attacks and appropriate countermeasures against such threats.
Authored by Halimjon Khujamatov, Amir Lazarev, Nurshod Akhmedov, Nurbek Asenbaev, Aybek Bekturdiev
Named Data Network Security - In networking, the data transmission rate is the core element for measuring network performance capability. A stable network infrastructure should support high transmission capacity with guaranteed network quality. In Named Data Networking (NDN), producer performance has been a much-discussed topic because of its transmission challenges. Hence, in this paper, transmission delay for single and multiple producers is analyzed in detail. The simulation of network transmission delay for a single producer and for multiple producers is carried out using the ndnSIM simulator. The factors that impact network transmission, such as sequence number and retransmission times, are highlighted. The simulation results provide acceptable data to assist the development of more complex topologies for NDN producers.
Authored by Zhang Wenhua, Wan Azamuddin, Azana Aman
Named Data Network Security - This research focuses on the interest flooding attack model and its impact on the consumer in the Named Data Networking (NDN) architecture. NDN is a future internet architecture that has advantages over the current internet architecture. The NDN communication model changes the communication paradigm from packet delivery based on IP addresses to delivery based on names. The data content needed is not taken directly from the provider but is stored in a distributed manner on routers, so requests from other consumers can be served by the nearest router. This increases the speed of data access and reduces delay. The changed communication model also has an impact on the existing security system. One attack that may occur is a denial of service (DoS) threat known as an interest flooding attack, which makes network services unavailable. This paper examines the interest flooding attack model and its impact on the performance of NDN. The results show that interest flooding attacks can decrease the number of satisfied consumer interests.
Authored by Jupriyadi, Syaiful Ahdan, Adi Sucipto, Eki Hamidi, Hasan Arifin, Nana Syambas
Named Data Network Security - Named Data Networking (NDN) is a future internet architecture that changes the point of view in networking from host-centric to data-centric. NDN provides a network system in which routing no longer depends on traditional IP; network packets are routed through nodes by name. When many producers publish packets with different names for several consumers, routing with load balancing is necessary. The case study is a simulation that connects all UIN campuses in Indonesia into a topology named the UIN Topology, using several scenarios to describe the effectiveness of the load balancer on this topology. This study focuses on load balancer applications to reduce delays in Named Data Networking (NDN) on the UIN topology in Indonesia.
Authored by Eki Hamidi, Syaiful Ahdan, Jupriyadi, Adi Sucipto, Hasan Arifin, Nana Syambas
Named Data Network Security - The internet of the future will prioritize content and reduce delays in data transmission. Named Data Networking (NDN) is a content-based future internet concept that changes the paradigm of using IP. Inside the NDN router there are three data structures, namely the Content Store (CS), the Pending Interest Table (PIT), and the Forwarding Information Base (FIB). The PIT contains a list of unfulfilled interests, that is, interests for which the node has not yet received a response after forwarding them. Measurable and fast PIT performance is a challenge in Named Data Networks. In this study, we run a simulation to measure and analyze the performance of the PIT in NDN on the Palapa Ring topology. The research was conducted using the ndnSIM simulator to observe PIT performance. The simulation and analysis of the results show that the granularity of a prefix has an effect on In Satisfied Interests in an NDN network. With 100 interests, the simulation shows a decrease of more than 20% in the percentage of interests served; with 1000 interests, the decrease is more than 30%. The length of the prefix and the number of interests sent by the consumer affect the performance of the PIT, as seen from the number of In Satisfied Interests.
Authored by Adi Sucipto, Jupriyadi, Syaiful Ahdan, Hasan Arifin, Eki Hamidi, Nana Syambas
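The interplay of the three router data structures named in this abstract can be illustrated with a toy model. This is a simplified sketch of the general NDN forwarding idea, not ndnSIM code and not the authors' simulation: the class, the method names, the face numbers, and the content names are hypothetical, and real forwarders add lifetimes, nonces, cache eviction, and forwarding strategies.

```python
class NdnRouter:
    """Toy NDN node with a Content Store, a PIT, and a FIB."""

    def __init__(self):
        self.cs = {}   # Content Store: name -> cached data
        self.pit = {}  # Pending Interest Table: name -> set of requesting faces
        self.fib = {}  # Forwarding Information Base: name prefix -> next-hop face

    def longest_prefix_match(self, name):
        """Return the next hop for the longest FIB prefix matching the name."""
        parts = name.strip("/").split("/")
        for end in range(len(parts), 0, -1):
            prefix = "/" + "/".join(parts[:end])
            if prefix in self.fib:
                return self.fib[prefix]
        return None

    def on_interest(self, name, in_face):
        """Process an incoming interest: CS first, then PIT, then FIB."""
        if name in self.cs:
            return ("data", in_face, self.cs[name])
        if name in self.pit:
            # Already pending: remember the extra requester, do not re-forward.
            self.pit[name].add(in_face)
            return ("aggregated", None, None)
        next_hop = self.longest_prefix_match(name)
        if next_hop is None:
            return ("dropped", None, None)
        self.pit[name] = {in_face}
        return ("forwarded", next_hop, None)

    def on_data(self, name, data):
        """Satisfy pending interests, cache the data, return faces to reply on."""
        faces = self.pit.pop(name, set())
        if faces:  # unsolicited data is ignored in this sketch
            self.cs[name] = data
        return sorted(faces)

router = NdnRouter()
router.fib["/video"] = 2  # hypothetical prefix and next-hop face
first = router.on_interest("/video/a/1", in_face=0)
second = router.on_interest("/video/a/1", in_face=1)
replied = router.on_data("/video/a/1", b"payload")
cached = router.on_interest("/video/a/1", in_face=3)
```

In this model a PIT entry exists only between forwarding an interest and receiving the matching data, which is why the number of outstanding interests drives PIT size.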
Named Data Network Security - With the growing recognition that current Internet protocols have significant security flaws, several ongoing research projects are attempting to design potential next-generation Internet architectures that eliminate the flaws made in the past. These projects treat privacy and security as essential parameters. NDN (Named Data Networking) is a new networking paradigm that is being investigated as a potential alternative to the present host-centric, IP-based Internet architecture. It concentrates on content delivery, which is arguably underserved by IP, and it prioritizes security and privacy. NDN must be resistant to present and upcoming threats in order to become a feasible Internet framework. DDoS (Distributed Denial of Service) attacks are serious attacks that have the potential to interrupt servers, systems, or application layers, and the possibility of such attacks leaves the network security environment vulnerable. The resilience of any new architecture against the DDoS attacks that afflict today's Internet is a critical concern that demands comprehensive consideration. Therefore, research on feature selection approaches was conducted in order to use machine learning techniques to identify DDoS attacks in NDN. In this research, features were chosen using the Information Gain and Data Reduction approach with the aid of the WEKA machine learning tool. The dataset was tested using the K-Nearest Neighbor (KNN), Decision Table, and Artificial Neural Network (ANN) algorithms to classify on the selected features. Experimental results show that the Decision Table classifier outperforms the other classification algorithms, with an accuracy of 85.42% and the highest precision and recall scores, 0.876 and 0.854 respectively.
Authored by Subasri I, Emil R, Ramkumar P
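Information Gain, the feature-ranking criterion named in this abstract, can be sketched for discrete features as the reduction in label entropy after splitting on a feature. This is a generic textbook formulation and does not reproduce WEKA's implementation, which also handles numeric attributes; the feature values and labels below are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a label list, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """Entropy of the labels minus their entropy conditioned on the feature."""
    n = len(labels)
    groups = {}
    for v, y in zip(feature_values, labels):
        groups.setdefault(v, []).append(y)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

# Hypothetical traffic records: one informative and one uninformative feature.
labels = ["attack", "attack", "normal", "normal"]
interest_rate = ["high", "high", "low", "low"]  # separates the classes
face_id = ["x", "y", "x", "y"]                  # carries no class signal
gain_rate = information_gain(interest_rate, labels)
gain_face = information_gain(face_id, labels)
```

Features would then be ranked by gain, and only the top-ranked ones passed to the classifiers.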
Named Data Network Security - With the continuous development of network technology and of science and technology in general, artificial intelligence technology and its related applications have emerged. Artificial intelligence has been widely used in information detection and data processing and remains one of the current hot research topics. Recent research on artificial intelligence has focused on its application to network security data processing as well as fault diagnosis and anomaly detection. Aiming at network security detection for students' real-name data, this paper analyzes the relevant artificial intelligence technology and builds a model. The paper first introduces and analyzes some shortcomings of the clustering algorithm and the mean algorithm, and then proposes a cloning algorithm to obtain the global optimal solution. On this basis, it constructs a network security model for processing students' real-name data based on the trust principle and a trust model.
Authored by Wenyan Ye
Multiple Fault Diagnosis - To solve the problems of low fault diagnosis rate and poor efficiency for AC-DC drive traction converters, a fault diagnosis method based on improved multiscale permutation entropy and wavelet analysis is proposed, building on the multiple fault characteristics of the input current curve in the frequency domain. First, the current curve of the traction converter is decomposed by wavelet transform to obtain modal components at different time scales. Then the fault characteristic parameters of the different components are calculated by improved multiscale permutation entropy. Finally, a multivariable support vector machine algorithm based on a decision tree is used to obtain the tree-like optimal fault interval surface through small-sample training, so as to classify the faults of traction converters. The experimental results show that this method can effectively distinguish the fault types of traction converters and improve the accuracy and efficiency of fault diagnosis, and that it has good adaptability and practical significance.
Authored by Lei Yang, Zheng Li, Haiying Dong
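The feature at the center of this abstract, permutation entropy evaluated at several time scales, can be sketched as follows. The sketch uses the standard Bandt-Pompe ordinal-pattern definition with plain coarse-graining; the abstract's "improved" variant is not specified, so this is an assumed baseline, and the function names, default parameters, and sample signal are illustrative.

```python
import math
from collections import Counter

def coarse_grain(signal, scale):
    """Average consecutive non-overlapping windows of length `scale`."""
    n = len(signal) // scale
    return [sum(signal[i * scale:(i + 1) * scale]) / scale for i in range(n)]

def permutation_entropy(signal, order=3, delay=1):
    """Normalized permutation entropy in [0, 1] (0 = fully regular)."""
    span = (order - 1) * delay
    if len(signal) <= span:
        raise ValueError("signal too short for the chosen order and delay")
    patterns = Counter()
    for i in range(len(signal) - span):
        window = [signal[i + j * delay] for j in range(order)]
        # Ordinal pattern = argsort of the window (ties keep index order).
        patterns[tuple(sorted(range(order), key=window.__getitem__))] += 1
    total = sum(patterns.values())
    h = -sum((c / total) * math.log(c / total) for c in patterns.values())
    return h / math.log(math.factorial(order))

def multiscale_permutation_entropy(signal, scales=(1, 2, 3), order=3):
    """Permutation entropy of the coarse-grained signal at each scale."""
    return [permutation_entropy(coarse_grain(signal, s), order) for s in scales]

sample = [4, 7, 9, 10, 6, 11, 3]  # short illustrative series
pe = permutation_entropy(sample, order=3)
ramp_pe = permutation_entropy(list(range(20)), order=3)
```

The resulting vector of entropies, one per scale, is the kind of feature vector that would then be passed to a classifier such as the SVM mentioned in the abstract.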
Multiple Fault Diagnosis - Fault features of the aircraft landing gear hydraulic system are difficult to extract: traditional feature extraction methods rely heavily on expert knowledge, and the accuracy of fault diagnosis is difficult to guarantee. This paper combines a convolutional neural network (CNN) and a support vector machine (SVM) classification algorithm to propose a fault diagnosis model suitable for the aircraft landing gear hydraulic system. The diagnosis model adopts a one-dimensional multi-channel CNN structure, takes the original pressure signals of multiple nodes as input, adaptively extracts the feature values of the pressure signals through the CNN, and builds a multi-feature fusion layer to fuse the pressure-signal features of the nodes. Finally, the fused features are input into the SVM classifier to complete the fault classification. To verify the proposed fault diagnosis model, a typical aircraft landing gear hydraulic system simulation model was built in AMESim, several typical fault types such as hydraulic pump leakage, actuator leakage, selector valve clogging, and accumulator failure were simulated, the corresponding fault-type dataset was generated, and overlapping sample segmentation was used for data augmentation. Experiments show that the diagnosis accuracy of the proposed algorithm can reach 99.25%, that it can adaptively extract the fault features of the aircraft landing gear hydraulic system, and that the features after multidimensional fusion discriminate better, making the approach more effective and more accurate than traditional feature extraction methods.
Authored by Dongyang Feng, Chunying Jiang, Mowu Lu, Shengyu Li, Changlong Ye
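The overlapping sample segmentation mentioned in this abstract as a data augmentation step can be illustrated with a short sketch. The window length and stride below are arbitrary illustrative values, not the paper's settings, and the function name is hypothetical.

```python
def overlapping_segments(signal, window, stride):
    """Slice a 1-D sequence into windows of length `window`,
    starting a new window every `stride` samples."""
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    last_start = len(signal) - window
    return [signal[i:i + window] for i in range(0, last_start + 1, stride)]

pressure = list(range(10))  # stand-in for one recorded pressure signal
overlapped = overlapping_segments(pressure, window=4, stride=2)
disjoint = overlapping_segments(pressure, window=4, stride=4)
```

With a stride smaller than the window, the same recording yields more training samples than disjoint slicing would, at the cost of correlated samples.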
Multiple Fault Diagnosis - Traditional mechanical and electrical fault diagnosis models for high-voltage circuit breakers (HVCBs) encounter the following problems: the recognition accuracy is low, and the model overfits severely, making its generalization ability poor. To overcome these problems, this paper proposes a new diagnosis model for HVCBs based on multi-sensor information fusion and multi-depth neural networks (Multi-DNN). This approach uses fifteen typical time-domain features extracted from the exciting coil current and angular displacement signals to indicate the operational state of HVCBs, and combines multiple deep neural networks (DNNs) to improve the accuracy and standard deviation. Six operational states were simulated on the experimental platform, including the normal state, two typical mechanical faults, and four typical electrical faults, and the coil current and angular displacement signals were collected in each state to verify the effectiveness of the proposed model. The experimental results show that, compared with the traditional fault diagnosis model, the Multi-DNN based on multi-sensor information fusion finds a better balance between underfitting and overfitting.
Authored by Qinghua Ma, Ming Dong, Qing Li, Yadong Xing, Yi Li, Qianyu Li, Lemeng Zhang
Multiple Fault Diagnosis - Multiple fault diagnosis is a challenging problem, especially for complex high-risk systems such as nuclear power plants. Multilevel Flow Models (MFM) are a powerful tool for identifying functional failures of complex process systems composed of mass, energy, and information flows. Fault diagnosis based on MFM generally assumes that only a single fault occurs and, on that basis, adopts Depth First Search (DFS) to identify the abnormal functions at the lower level of an MFM. This paper presents an MFM-based method for diagnosing multiple functionally related and coupled faults. An MFM model is first transformed into a reasoning Causal Dependency Graph (CDG) model according to a group of alarm events. The CDG model is then decoupled by a DFS algorithm to generate causal trees, each of which represents an overall explanation of a cause of the alarm events. A comparative analysis of cases shows that the proposed method can give more comprehensive diagnostic results than the existing method.
Authored by Gengwu Wu, Jipu Wang, Haixia Gu, Gaojun Liu, Jixue Li, Hongyun Xie, Ming Yang
Multiple Fault Diagnosis - In this article, a fault detection (FD) method for multiple device open-circuit faults (OCFs) in modified neutral-point-clamped (NPC) inverters is introduced using the Average Current Park Vector (ACPV) algorithm. The proposed FD design is load-independent and requires only the converter's 3-phase output current. The validity of the results has been demonstrated for OCF diagnostics using a 3-level inverter with one faulty switch. This article examines ACPV techniques for diagnosing multiple faulty switches on a single phase leg of a 3-level NPC inverter. It also discusses fault tolerance for a single battery or inverter switch in a standard active 3-level NPC inverter with connected neutral points. The primary goal here is to detect and locate open circuits in inverter switches. To this end, simulations and experiments are used to investigate and validate an FD algorithm based on a current estimator and two fault localization algorithms based on online adaptation of the space vector modulation (SVM) and on the pulse pattern injection principle. The technique was investigated efficiently and provides a three-level modified NPC signature table that accounts for all possible instances of fault. The Matlab/Simulink software is used to validate the introduced signature table for the convergence of permanent magnet motors.
Authored by P Selvakumar, G Muthukumaran