By Anna Tong
SAN FRANCISCO (Reuters) - Contrary to popular belief, using advanced artificial intelligence tools slowed down experienced software developers when they worked in codebases familiar to them, a new study found.
AI research nonprofit METR conducted the in-depth study on a group of seasoned developers earlier this year while they used Cursor, a popular AI coding assistant, to help them complete tasks in open-source projects they were familiar with.
Before the study, the open-source developers believed that using AI would speed them up, estimating that it would reduce task completion time by 24%. Even after completing the tasks with AI, the developers believed that they had reduced task times by 20%. But the study found that using AI did the opposite: it increased task completion time by 19%.
The study's lead authors, Joel Becker and Nate Rush, said they were shocked by the results: prior to the study, Rush had written down that he expected “a 2x speed up, somewhat obviously.”
The findings challenge the belief that AI always makes expensive human engineers much more productive, a factor that has attracted substantial investment in companies that sell AI products to support software development.
AI is also expected to replace entry-level coding positions. Dario Amodei, CEO of Anthropic, recently told Axios that AI could wipe out half of all entry-level white-collar jobs in the next one to five years.
Previous literature on productivity improvements has found significant gains: one study found that using AI sped up coders by 56%, and another found that developers could complete 26% more tasks in a given time.
But the new METR study shows that those gains do not apply to all software development scenarios. In particular, this study showed that experienced developers who are intimately familiar with the quirks and requirements of large, established open-source codebases experienced a slowdown.
Other studies often rely on software development benchmarks for AI, which sometimes misrepresent real-world tasks, according to the study's authors.
The slowdown stemmed from developers needing to spend time going over and correcting what the AI models suggested.
“When we watched the videos, we found that the AIs made some suggestions about their work, and the suggestions were often directionally correct, but not exactly what's needed,” Becker said.
The authors cautioned that they do not expect the slowdown to apply in other scenarios, such as for junior engineers or engineers working in codebases they are not familiar with.
Nevertheless, the majority of the study's participants, as well as the study's authors, continue to use Cursor today. The authors believe that this is because AI makes the development experience easier and, in turn, more pleasant, akin to editing an essay instead of staring at a blank page.
“Developers have goals other than completing the task as quickly as possible,” Becker said. “So they go with this less strenuous route.”
(Reporting by Anna Tong in San Francisco; Editing by Sonali Paul)