Facing the urgent requirement for effective emergency management, our study introduces a groundbreaking approach leveraging the capabilities of open-source Large Language Models (LLMs), notably LLAMA2. This system is engineered to enhance public emergency assistance by swiftly processing and classifying emergencies communicated through social media and direct messaging. Our innovative model interprets user descriptions to analyze context and integrate it with existing Situation Reports, streamlining the alert process to government agencies with crucial information. Importantly, during peak emergency times when conventional systems are under stress, our LLM-based solution provides critical support by offering straightforward guidance to individuals and facilitating direct communication of their circumstances to emergency responders. This advancement significantly bolsters the efficiency and efficacy of crisis response mechanisms.
Authored by Hakan Otal, Abdullah Canbaz
While code review is central to the software development process, it can be tedious and expensive to carry out. In this paper, we investigate whether and how Large Language Models (LLMs) can aid with code reviews. Our investigation focuses on two tasks that we argue are fundamental to good reviews: (i) flagging code with security vulnerabilities and (ii) performing software functionality validation, i.e., ensuring that code meets its intended functionality. To test performance on both tasks, we use zero-shot and chain-of-thought prompting to obtain final “approve or reject” recommendations. As data, we employ seminal code generation datasets (HumanEval and MBPP) along with expert-written code snippets with security vulnerabilities from the Common Weakness Enumeration (CWE). Our experiments consider a mixture of three proprietary models from OpenAI and smaller open-source LLMs. We find that the former outperform the latter by a large margin. Motivated by promising results, we finally ask our models to provide detailed descriptions of security vulnerabilities. Results show that 36.7% of LLM-generated descriptions can be associated with true CWE vulnerabilities. CCS Concepts: Software and its engineering → Software verification and validation; Software development techniques.
Authored by Rasmus Jensen, Vali Tawosi, Salwa Alamir
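The zero-shot and chain-of-thought setup described in the abstract above can be illustrated with a minimal sketch. The prompt wording, the `build_prompt` and `parse_verdict` helpers, and the "Verdict:" answer format are all assumptions made for illustration; the abstract does not give the authors' actual prompts.

```python
import re
from typing import Optional


def build_prompt(code: str, chain_of_thought: bool = False) -> str:
    """Build a review prompt (hypothetical wording, not the paper's)."""
    instruction = (
        "You are a code reviewer. Decide whether the following code "
        "should be approved or rejected."
    )
    if chain_of_thought:
        # Chain-of-thought variant: ask for reasoning before the verdict.
        instruction += " Think step by step, then give your verdict."
    return (
        f"{instruction}\n\nCode:\n{code}\n\n"
        "End your answer with 'Verdict: APPROVE' or 'Verdict: REJECT'."
    )


def parse_verdict(response: str) -> Optional[str]:
    """Return 'approve' or 'reject' from the last verdict line, else None."""
    matches = re.findall(
        r"verdict:\s*(approve|reject)", response, flags=re.IGNORECASE
    )
    return matches[-1].lower() if matches else None
```

Taking the last match means that reasoning text which mentions a verdict earlier in the response does not override the final recommendation.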
In this survey, we delve into the integration and optimization of Large Language Models (LLMs) within edge computing environments, marking a significant shift in the artificial intelligence (AI) landscape. The paper investigates the development and application of LLMs in conjunction with edge computing, highlighting the advantages of localized data processing such as reduced latency, enhanced privacy, and improved efficiency. Key challenges discussed include the deployment of LLMs on resource-limited edge devices, focusing on computational demands, energy efficiency, and model scalability. This comprehensive analysis underscores the transformative potential and future implications of combining LLMs with edge computing, paving the way for advanced AI applications across various sectors.
Authored by Sarthak Bhardwaj, Pardeep Singh, Mohammad Pandit
With the rapid advancement of technology and the expansion of available data, AI has permeated many aspects of people's lives. Large Language Models (LLMs) such as ChatGPT are increasing the accuracy of their responses and achieving a high level of communication with humans. These AIs can be used in business to benefit, for example, customer support and documentation tasks, allowing companies to respond to customer inquiries efficiently and consistently. In addition, AI can generate digital content, including texts, images, and a wide range of digital materials based on the training data, and is expected to be used in business. However, the widespread use of AI also raises ethical concerns. The potential for unintentional bias, discrimination, and privacy and security implications must be carefully considered. Therefore, while AI can improve our lives, it has the potential to exacerbate social inequalities and injustices. This paper aims to explore the unintended outputs of AI and assess their impact on society. Developers and users can take appropriate precautions by identifying the potential for unintended output. Such experiments are essential to efforts to minimize the potential negative social impacts of AI transparency, accountability, and use. We will also discuss social and ethical aspects with the aim of finding sustainable solutions regarding AI.
Authored by Takuho Mitsunaga
The emergence of large language models (LLMs) has brought forth remarkable capabilities in various domains, yet it also poses inherent risks to trustworthiness, encompassing concerns such as toxicity, stereotype bias, adversarial robustness, ethics, privacy, and fairness. Particularly in sensitive applications like customer support chatbots, AI assistants, and digital information automation, which handle privacy-sensitive data, the adoption of generative pre-trained transformer (GPT) models is pervasive. However, ensuring robust security measures to mitigate potential security vulnerabilities is imperative. This paper advocates for a proactive approach termed "security shift-left," which emphasizes integrating security measures early in the development lifecycle to bolster the security posture of LLM-based applications. Our proposed method leverages basic machine learning (ML) techniques and retrieval-augmented generation (RAG) to effectively address security concerns. We present empirical evidence validating the efficacy of our approach with one LLM-based security application designed for the detection of malicious intent, utilizing both open-source datasets and synthesized datasets. By adopting this security shift-left methodology, developers can confidently develop LLM-based applications with robust security protection, safeguarding against potential threats and vulnerabilities.
Authored by Qianlong Lan, Anuj Kaul, Nishant Pattanaik, Piyush Pattanayak, Vinothini Pandurangan
This study investigates the performance and security indicators of mainstream large language models in Chinese generation tasks. It explores potential security risks associated with these models and offers suggestions for improvement. The study utilizes publicly available datasets to assess Chinese language generation tasks, develops datasets and multidimensional security rating standards for security task evaluations, compares the performance of three models across 5 Chinese tasks and 6 security tasks, and conducts Pearson correlation analysis using GPT-4 and questionnaire surveys. Furthermore, the study implements automatic scoring based on GPT-3.5-Turbo. The experimental findings indicate that the models excel in Chinese language generation tasks. ERNIE Bot outperforms in the evaluation of ideology and ethics, ChatGPT excels in rumor and falsehood and privacy security assessments, and Claude performs well in assessing factual fallacy and social prejudice. The fine-tuned model demonstrates high accuracy in security tasks, yet all models exhibit security vulnerabilities. Integration into the prompt project proves to be effective in mitigating security risks. It is recommended that both domestic and foreign models adhere to the legal frameworks of each country, reduce AI hallucinations, continuously expand corpora, and update iterations accordingly.
Authored by Yu Zhang, Yongbing Gao, Weihao Li, Zirong Su, Lidong Yang
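The Pearson correlation analysis mentioned in the abstract above compares two sets of scores for the same items, for example automatic ratings and questionnaire ratings. A minimal sketch of the coefficient follows; the `pearson` function name and the sample score lists are illustrative assumptions, not data from the study.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (covariance numerator).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical ratings of the same five answers by a model and by people.
model_scores = [4.0, 3.0, 5.0, 2.0, 4.0]
human_scores = [4.5, 3.0, 4.5, 2.5, 4.0]
agreement = pearson(model_scores, human_scores)
```

A value close to 1 would suggest that the automatic scores rank answers much as the human raters do; values near 0 would suggest little linear agreement.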
LLMs face content security risks such as prompt injection, insecure output processing, sensitive information leakage, and over-dependence. By constructing a firewall for LLMs with intelligent detection strategies and introducing multi-engine detection capabilities such as rule matching, semantic computing, and AI models, we can intelligently detect and handle the inputs and outputs of LLMs and provide round-the-clock online security protection for LLM applications. The system is tested on open-source LLMs and shows a significant improvement in the detection rate of insecure content.
Authored by Tianrui Huang, Lina You, Nishui Cai, Ting Huang
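The firewall idea in the abstract above, where several detection engines screen both the input and the output of an LLM, can be sketched roughly as follows. Only the rule-matching engine is shown; the `rule_engine`, `is_blocked`, and `guarded_call` names, the example pattern, and the placeholder block messages are hypothetical, and the semantic and AI-model engines the abstract mentions would be plugged in as further callables.

```python
import re
from typing import Callable, Iterable, List

# An engine is any callable that returns True when text looks unsafe.
Engine = Callable[[str], bool]


def rule_engine(patterns: Iterable[str]) -> Engine:
    """Build a rule-matching engine from regular expressions."""
    compiled = [re.compile(p, re.IGNORECASE) for p in patterns]

    def check(text: str) -> bool:
        return any(p.search(text) for p in compiled)

    return check


def is_blocked(text: str, engines: List[Engine]) -> bool:
    """Text is blocked if any engine flags it."""
    return any(engine(text) for engine in engines)


def guarded_call(
    model: Callable[[str], str], prompt: str, engines: List[Engine]
) -> str:
    """Screen the prompt, call the model, then screen its output."""
    if is_blocked(prompt, engines):
        return "[blocked input]"
    output = model(prompt)
    if is_blocked(output, engines):
        return "[blocked output]"
    return output


# Illustrative rule only; a real deployment would need a far larger rule set.
engines = [rule_engine([r"ignore (all )?previous instructions"])]
```

Treating every engine as a plain callable keeps the firewall independent of how each detector works, so a rule set, an embedding-based check, or a classifier can be combined in one list.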
Deep Learning Large Language Models (LLMs) have the potential to automate and simplify code writing tasks. One of the emerging applications of LLMs is hardware design, where natural language interaction can be used to generate, annotate, and correct code in a Hardware Description Language (HDL), such as Verilog. This work provides an overview of the current state of using LLMs to generate Verilog code, highlighting their capabilities, accuracy, and techniques to improve the design quality. It also reviews the existing benchmarks to evaluate the correctness and quality of generated HDL code, enabling a fair comparison of different models and strategies.
Authored by Erik Hollander, Ewout Danneels, Karel-Brecht Decorte, Senne Loobuyck, Arne Vanheule, Ian Van Kets, Dirk Stroobandt
AI pair programmers, such as GitHub's Copilot, have shown great success in automatic code generation. However, such large language model-based code generation techniques face the risk of introducing security vulnerabilities to codebases. In this work, we explore the direction of fine-tuning large language models for generating more secure code. We use real-world vulnerability fixes as our fine-tuning dataset. We craft a code-generation scenario dataset (C/C++) for evaluating and comparing the pre-trained and fine-tuned models. Our experiments on GPT-J show that the fine-tuned GPT-J achieved 70.4% and 64.5% ratios of non-vulnerable code generation for C and C++, respectively, which is a 10% increase for C and a slight increase for C++ compared with the pre-trained large language model.
Authored by Junjie Li, Aseem Sangalay, Cheng Cheng, Yuan Tian, Jinqiu Yang
The development of AI computing has reached a critical inflection point. The parameter count of large-scale AI neural network models has grown rapidly to the “pre-trillion-scale” level, and the computing needs of training such models have reached the “exa-scale” level. In addition, AI foundation models affect the correctness of AI applications and are becoming a new information security issue. Future AI development will be driven by progress in computing power (supercomputers), algorithms (neural network models and parameter scale), and applications (foundation models and downstream fine-tuning). In particular, the computational efficiency of AI will be a key factor in the commercialization and popularization of AI applications.
Authored by Bor-Sung Liang