In recent years, generative AI coding assistants such as GitHub Copilot have been widely promoted with claims that they increase developer productivity and reduce development time. However, recent research suggests that these tools may not work as well as expected, and in some cases may even reduce code quality and productivity.
Original link: https://shenisha.substack.com/p/are-ai-coding-assistants-really-saving
Author | Shenisha  Translator | 郑丽媛 (Zheng Liyuan)
Produced by | CSDN (ID: CSDNnews)
A recent study questioned the claim that AI coding tools improve developer productivity. It found that using GitHub Copilot led to a 41% increase in bugs, raising concerns about the quality of the generated code. Although both the control and experimental groups worked fewer hours, GitHub Copilot did not relieve burnout and stress for the developers using it. In addition, developers need to spend more time reviewing AI-generated code, which partially offsets the potential time savings.
All in all, the impact of AI coding tools on developer productivity seems to be minimal.
After studying 800 developers, Uplevel says GitHub Copilot makes little difference
In the developer community, the debate continues over whether AI coding assistants truly boost productivity. While some companies report significant productivity gains from AI tools, others find that these tools introduce more bugs and complicate debugging. Junior developers, in particular, struggle to be as efficient as senior developers, even with the help of AI tools.
Coding tools were among the earliest applications of generative AI, but a recent study by analytics firm Uplevel suggests that the expected productivity gains may be overestimated or even non-existent. Uplevel's analysis of coding and collaboration data found that using GitHub Copilot resulted in a 41% increase in the bug rate.
In the report, Can Generative AI Boost Developer Productivity, Uplevel writes: "This suggests that Copilot may be negatively impacting code quality. Engineering leads may need to investigate pull requests with bugs and implement safeguards for the responsible use of generative AI."
The study measured pull request (PR) cycle time (the time it takes to merge code into a repository) and PR throughput (the number of pull requests merged), and found that developers using GitHub Copilot saw no significant improvement in either metric. Uplevel's report set out to answer three key questions:
- Does using GitHub Copilot help developers write code faster?
- Does GitHub Copilot help developers produce higher-quality code?
- Does GitHub Copilot help mitigate developer burnout?
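To make the two metrics mentioned above concrete, here is a minimal sketch of how PR cycle time and PR throughput might be computed from pull request timestamps. The record layout, the function names, and the sample data are all hypothetical, and throughput is assumed to mean the count of merged PRs; this is not Uplevel's actual methodology.

```python
from datetime import datetime
from statistics import median

def pr_cycle_hours(opened_at, merged_at):
    """Hours between opening a PR and merging it (ISO 8601 timestamps)."""
    opened = datetime.fromisoformat(opened_at)
    merged = datetime.fromisoformat(merged_at)
    return (merged - opened).total_seconds() / 3600

def summarize(prs):
    """Return (throughput, median cycle time in hours) over merged PRs.

    Assumption: throughput is the number of merged PRs in the period.
    """
    merged = [(o, m) for o, m in prs if m is not None]
    cycle_times = [pr_cycle_hours(o, m) for o, m in merged]
    return len(merged), (median(cycle_times) if cycle_times else None)

# Made-up sample: (opened_at, merged_at); merged_at is None while a PR is open.
sample_prs = [
    ("2024-03-01T09:00:00", "2024-03-01T15:00:00"),  # merged after 6 h
    ("2024-03-02T10:00:00", "2024-03-03T10:00:00"),  # merged after 24 h
    ("2024-03-04T08:00:00", None),                   # still open, not counted
    ("2024-03-05T12:00:00", "2024-03-05T14:00:00"),  # merged after 2 h
]

throughput, median_hours = summarize(sample_prs)
print(throughput, median_hours)  # prints: 3 6.0
```

The median is used here rather than the mean because a few long-running PRs can otherwise dominate the figure; that choice is also an assumption of this sketch.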
Uplevel analyzed data from its customers and compared the performance of about 800 developers in the three months before and after adopting GitHub Copilot, and came up with the following two main findings:
(1) No significant change in efficiency metrics
"When comparing PR cycle time, throughput, and complexity for PRs with and without tests, GitHub Copilot neither helped nor hurt developers, and it did not improve coding speed. While some of these metrics were statistically significant, the changes had no real impact on engineering outcomes; for example, PR cycle time decreased by only 1.7 minutes," Uplevel noted in the report.
(2) Burnout indicator declined, but less with Copilot
Uplevel's "Always Online" metric, which tracks hours worked outside of regular working hours and is a leading indicator of burnout, declined in both groups. However, it dropped by 17% for developers using GitHub Copilot, compared with nearly 28% for developers who did not use the tool.
Research published by GitHub reaches different conclusions
"Uplevel's research was motivated by curiosity about the idea that AI coding assistants would become mainstream," says Matt Hoffman, product manager and data analyst at Uplevel. In contrast, a study published by GitHub in August 2024 found that 97% of software engineers, developers, and programmers reported using AI coding assistants, and other studies found similar results.
GitHub's research notes that more than 97% of respondents have used AI coding tools at work. However, a smaller share said their companies actively encourage or allow the use of these tools, and that share varies by region. Here are some of the key findings from the survey:
● The wave of generative AI in software development continues to grow. The survey covered 2,000 participants, almost all of whom (over 97%) had used these tools in some form, both at work and outside of work (although not all companies officially endorse their use).
● While many respondents said their companies are open to AI technology, there is still room for improvement. Survey data shows that across different markets, between 59% and 88% of respondents said that "their companies actively encourage or allow the use of these tools."
● Software development teams are increasingly recognizing the additional benefits of AI coding tools, including creating more secure software, improving code quality, generating better test cases, and speeding up the learning process for new programming languages. These changes save developers a lot of time and allow them to focus on more strategic tasks.
Researchers at GitClear also found that AI tools like GitHub Copilot mostly suggest adding new code and rarely suggest updating or removing existing code, which often leads to redundant code. In addition, they observed a sharp increase in "code churn", meaning that code is modified frequently, which is often a sign of poor code quality.
"The AI code generated with each iteration is becoming more and more inconsistent because different parts are generated based on different prompts. As a result, the code is becoming more and more difficult to understand and debug, making troubleshooting so resource-intensive that it is sometimes easier to rewrite the code," one user wrote, adding that AI has not yet been able to improve productivity.
Adopt AI coding assistants with caution
The introduction of AI tools like GitHub Copilot raises several important questions: Can AI help developers work faster? Does it improve code quality and prevent burnout? Uplevel answers in the report: "For the current group, the answer is no. However, innovation is growing rapidly, and GitHub has found that Copilot has increased developer satisfaction."
In general, engineering leaders may need to take a cautious approach to adopting Copilot:
- Set specific goals: Define the desired outcomes for integrating GitHub Copilot into your team's workflow. For example, what specific improvements would you like to achieve?
- Provide team training: Conduct initial training to explain when GitHub Copilot should or shouldn't be used, and establish safeguards to ensure proper implementation.
- Continue experimenting with generative AI: Identify specific use cases where Copilot performs well and refine the prompts that deliver the best results. Share successful strategies across the organization so you can replicate successes.
- Monitor technical efficiency metrics: Conduct A/B testing to collect objectively quantifiable data to assess whether AI is truly improving developer productivity and helping to achieve operational goals.
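As one way to act on the last recommendation, the sketch below compares PR cycle times between a group working without an AI assistant and a group working with one, using a simple two-sided permutation test on the difference in means. The data is made up for illustration, the function and variable names are hypothetical, and the choice of a permutation test is an assumption of this sketch, not the method used in Uplevel's report.

```python
import random

def mean(values):
    return sum(values) / len(values)

def permutation_p_value(control, treatment, rounds=10_000, seed=0):
    """Two-sided permutation test on the difference in group means.

    Repeatedly shuffles the pooled observations into two groups of the
    original sizes and counts how often the shuffled difference is at
    least as large as the observed one.
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = abs(mean(treatment) - mean(control))
    pooled = list(control) + list(treatment)
    n = len(control)
    hits = 0
    for _ in range(rounds):
        rng.shuffle(pooled)
        if abs(mean(pooled[n:]) - mean(pooled[:n])) >= observed:
            hits += 1
    return hits / rounds

# Made-up PR cycle times in hours, for illustration only.
without_assistant = [20.0, 26.5, 18.0, 31.0, 24.0, 22.5, 28.0, 19.5]
with_assistant = [21.0, 25.0, 17.5, 30.0, 23.5, 23.0, 27.0, 20.0]

difference = mean(with_assistant) - mean(without_assistant)
p_value = permutation_p_value(without_assistant, with_assistant)
print(f"difference in mean cycle time: {difference:+.2f} h, p = {p_value:.3f}")
```

As the report's 1.7-minute example suggests, a small p-value alone is not enough: the size of the difference also has to matter in practice before it justifies a change in tooling.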