"Study Finds AI Assistants Help Developers Produce Code That's More Likely to Be Buggy"
Stanford University computer scientists have found that programmers who accept help from Artificial Intelligence (AI) tools such as GitHub Copilot write less secure code than those who do not. In a paper titled "Do Users Write More Insecure Code with AI Assistants?", the Stanford researchers answer that question in the affirmative. They found that AI help tends to mislead developers about the quality of their code. Participants with access to an AI assistant often produced more security vulnerabilities than those without it, and the results were significant for the string encryption and SQL injection tasks. In addition, participants who had access to an AI assistant were more likely to believe they had written secure code than those who did not.
In an earlier study, NYU researchers demonstrated that AI-based programming suggestions are often insecure. The Stanford researchers cite that study, published in August 2021 and titled "Asleep at the Keyboard? Assessing the Security of GitHub Copilot's Code Contributions," which determined that, across 89 scenarios, around 40 percent of the programs created with help from Copilot contained potentially exploitable flaws. According to the authors of the Stanford study, the NYU study's scope was limited because it covered only a constrained set of prompts corresponding to 25 vulnerabilities and three programming languages: Python, C, and Verilog.
The Stanford user study included 47 participants with a range of expertise levels, including undergraduates, graduate students, and industry professionals. The participants were asked to write code in response to five prompts using a React-based Electron application that was monitored by the study administrator. For one specific question, those who relied on AI assistance were more likely to write incorrect and insecure code than those who worked alone: only 67 percent of the assisted group gave a correct answer, whereas 79 percent of the control group did so.
This article continues to discuss the impact of AI assistance on the quality and security of code.
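For readers unfamiliar with the SQL injection class of flaw mentioned above, the following is a minimal illustrative sketch in Python. It is not taken from the Stanford study or its tasks. The table, the function names (`build_db`, `find_student_unsafe`, `find_student_safe`), and the sample data are all hypothetical and chosen only for illustration. It contrasts a query built by string concatenation with a parameterized query using the standard-library `sqlite3` module.

```python
import sqlite3


def build_db():
    """Create a throwaway in-memory database with hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE students (name TEXT, age INTEGER)")
    conn.executemany(
        "INSERT INTO students VALUES (?, ?)",
        [("alice", 20), ("bob", 22)],
    )
    return conn


def find_student_unsafe(conn, name):
    # Vulnerable: user input is spliced directly into the SQL text,
    # so quote characters in `name` can change the query's structure.
    query = f"SELECT name, age FROM students WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_student_safe(conn, name):
    # Safer: the `?` placeholder makes the driver treat `name` purely as data.
    query = "SELECT name, age FROM students WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()


if __name__ == "__main__":
    conn = build_db()
    payload = "x' OR '1'='1"  # a classic injection string
    print("unsafe:", find_student_unsafe(conn, payload))
    print("safe:  ", find_student_safe(conn, payload))
```

With the injection string shown, the concatenated query matches every row in the table, while the parameterized version matches none, because no student is literally named `x' OR '1'='1`.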