A recent randomized controlled trial by METR, a non-profit AI research organization, found that experienced open-source developers took 19% longer to complete tasks when using AI coding tools, even though they believed the tools had made them 20% faster. This surprising result challenges widespread assumptions about the immediate productivity benefits of AI in software development.
The study involved 16 experienced open-source developers working on 246 real tasks in their own established repositories, which averaged more than 22,000 GitHub stars and over one million lines of code. Tasks were randomly assigned to either allow or disallow AI assistance, with developers typically using Cursor Pro backed by Claude 3.5/3.7 Sonnet models. Before starting, the developers forecast a 24% speedup from AI tools.
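For readers curious how such per-task randomization might look in practice, here is a minimal Python sketch; the function name, task labels, and fixed seed are illustrative assumptions, not details from METR's implementation.

```python
import random

# Minimal sketch of the randomization scheme as described: each task is
# independently assigned to an AI-allowed or AI-disallowed condition.
# Names and the seed below are hypothetical, not taken from the study.
def assign_conditions(task_ids, seed=0):
    rng = random.Random(seed)
    return {task: rng.choice(["ai_allowed", "ai_disallowed"])
            for task in task_ids}

assignments = assign_conditions([f"issue-{i}" for i in range(246)])
print(sum(1 for c in assignments.values() if c == "ai_allowed"),
      "of 246 tasks allow AI")
```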
However, the measured results contradicted these perceptions. Lead authors Joel Becker and Nate Rush said the slowdown surprised them: developers spent less time actively coding and searching for information, and more time prompting the AI, waiting on or reviewing its output, and sitting idle. The AI's suggestions, while often directionally correct, frequently required significant correction.
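To make the perception gap concrete, a quick back-of-the-envelope calculation shows how far apart the measured and perceived outcomes are; the 60-minute baseline below is a hypothetical figure for illustration, not a number from the study.

```python
# Hypothetical baseline: assume a task takes 60 minutes without AI.
baseline_minutes = 60

# Measured outcome: tasks took 19% longer with AI assistance.
measured_with_ai = baseline_minutes * 1.19      # ~71.4 minutes

# Perceived outcome: developers believed they were 20% faster.
perceived_with_ai = baseline_minutes / 1.20     # 50.0 minutes

print(f"measured: {measured_with_ai:.1f} min, "
      f"perceived: {perceived_with_ai:.1f} min, "
      f"gap: {measured_with_ai - perceived_with_ai:.1f} min")
```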
This outcome contrasts with other studies reporting significant productivity gains from AI-assisted coding, including speedups of up to 56% and a 26% increase in completed tasks. METR's research suggests those gains may not apply universally, particularly for seasoned developers who are intimately familiar with large, complex open-source codebases.
METR, which focuses on empirically testing AI systems for capabilities and risks, emphasized the value of "in the wild" experiments over self-contained benchmarks for understanding AI's real-world impact. Despite the measured slowdown, many participants, as well as the study's authors, continue to use AI tools, suggesting that factors such as reduced effort and a more pleasant development experience may matter more than raw speed in some contexts.