Michael Webster, Principal Engineer at CircleCI, presented “AI Works, Pull Requests Don’t: How AI Is Breaking the SDLC and What To Do About It” at the inaugural QCon AI New York 2025 conference.
Webster kicked off his presentation with a timeline of AI adoption and where the industry stands today:
More and more organizations are integrating AI into their development workflows, resulting in an increase in code generation.
An AI-Assisted Coding study led by Hao He, Ph.D. student in Software Engineering at Carnegie Mellon University, examined the long-term impact of AI-assisted coding on more than 2,000 open-source software projects, comparing those that incorporated Cursor, an agentic AI coding assistant, with a matched set of projects that didn’t. The study concluded that Cursor provided an increase in development velocity, but only temporarily, lasting about one month before an observed drop in velocity; meanwhile, Cursor generated a persistent increase in static analysis warnings and code complexity.
Similarly, the State of AI-Assisted Software Development study conducted by DORA concluded that AI-assisted coding is an amplifier, providing an increase in development velocity along with an observed increase in instability.
Regarding code reviews, the number of code additions can typically be 25 times larger than the number of code deletions, creating a challenge for any organization. According to the State of Code Review report conducted by Graphite, small organizations take approximately four hours to merge a pull request, compared to approximately 13 hours for larger organizations. However, this discrepancy was attributed to small organizations being more than four times as likely to skip a formal code review altogether.
Webster discussed the impact of AI on the Software Development Lifecycle (SDLC) and Continuous Integration/Continuous Delivery (CI/CD) processes at CircleCI.
Using AI in their SDLC provided a three-to-five times increase in development velocity for about a month before a persistent accumulation of technical debt was observed.
Queuing Theory is the mathematical study of waiting lines, analyzing arrival rates, service times, and system capacity to predict the length and wait times of queues. An example of this is Little’s Law, defined as L = λW, such that:
- L = average number of items in a system
- λ = average arrival rate = exit rate = throughput
- W = average wait time in a system for an item (duration inside)
Applying queuing theory, Webster noted that delays occur whenever the arrival rate exceeds the processing rate, as illustrated in the sketch below.
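To make these quantities concrete, the following is a minimal TypeScript sketch applying Little’s Law and the delay condition to a hypothetical pull-request queue; the metric values and variable names are illustrative assumptions, not figures from the talk.

```typescript
// Hypothetical CI metrics (illustrative values, not from the talk).
const arrivalRate = 12;      // λ: pull requests arriving per hour
const avgTimeInSystem = 1.5; // W: average hours a PR spends in the system

// Little's Law, L = λW: the average number of PRs in flight at any moment.
const avgInFlight = arrivalRate * avgTimeInSystem; // 12 * 1.5 = 18 PRs

console.log(`Average PRs in flight: ${avgInFlight}`);

// Webster's delay condition: a backlog builds whenever the arrival rate
// exceeds the processing rate.
const processingRate = 10; // PRs the team can review and merge per hour

if (arrivalRate > processingRate) {
  // The queue grows by (arrival - processing) PRs every hour, without bound.
  console.log(`Backlog growing by ${arrivalRate - processingRate} PRs/hour`);
}
```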
The traditional testing approach is to run the full suite of tests on every change. Using code coverage can decrease testing time by: building a map from the code to the tests that exercise it; running only the tests whose dependencies have changed; and periodically rebuilding the map.
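For illustration, such a map might look like the following sketch; the shape and file paths are assumptions, not CircleCI’s actual format.

```typescript
// Hypothetical code-to-test map derived from coverage data: each source
// file points to the test files that exercise it.
type CoverageMap = Record<string, string[]>;

const coverageMap: CoverageMap = {
  "src/auth/login.ts": ["test/auth/login.test.ts", "test/e2e/session.test.ts"],
  "src/billing/invoice.ts": ["test/billing/invoice.test.ts"],
};

// Coverage data drifts as code evolves, so the map is periodically
// rebuilt from a full-suite run with coverage instrumentation enabled.
console.log(coverageMap["src/auth/login.ts"]);
```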
Test Impact Analysis (TIA), a testing strategy that identifies and runs only the tests affected by recent code changes, eliminates the need to run the entire test suite, improving CI/CD throughput and ultimately reducing costs. Tests are mapped to the code they cover using code coverage data.
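A minimal sketch of how TIA selection could work on top of such a coverage map follows; the helper name and fallback behavior are assumptions, not CircleCI’s implementation.

```typescript
// Hypothetical TIA selection: given a coverage map and the files changed
// in a commit, run only the tests known to exercise those files.
type CoverageMap = Record<string, string[]>;

function selectAffectedTests(map: CoverageMap, changedFiles: string[]): string[] {
  const tests = new Set<string>();
  for (const file of changedFiles) {
    const covering = map[file];
    // A changed file missing from the map (e.g., a brand-new file) has an
    // unknown impact, so fall back to the full suite to stay safe.
    if (!covering) return [...new Set(Object.values(map).flat())];
    covering.forEach((test) => tests.add(test));
  }
  return [...tests];
}

const map: CoverageMap = {
  "src/billing/invoice.ts": ["test/billing/invoice.test.ts"],
  "src/auth/login.ts": ["test/auth/login.test.ts"],
};

// A change to the billing module triggers only the billing tests.
console.log(selectAffectedTests(map, ["src/billing/invoice.ts"]));
// -> ["test/billing/invoice.test.ts"]
```

The conservative fallback is the key design choice in any such scheme: skipping a relevant test trades correctness for speed, so unknown changes default to the full suite.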
Webster displayed the TIA strategy used by CircleCI, featuring a map of defined endpoints to relevant TypeScript test files. In a recent study of 7,500 tests, testing time using this TIA strategy was reduced from 30 minutes to 1.5 minutes. He maintained that the same principle may be applied to code reviews, since not all code requires the same level of scrutiny.
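Extending that principle, a review pipeline could triage changed files by risk so that human scrutiny concentrates where it matters most. The sketch below is purely illustrative: the risk signals and thresholds are assumptions, not a CircleCI feature.

```typescript
// Hypothetical review triage: route each changed file to a review depth
// based on simple risk signals, applying TIA's "not all code needs the
// same scrutiny" idea to code review.
type ReviewDepth = "auto-approve" | "standard" | "deep-review";

function triage(file: string, linesChanged: number): ReviewDepth {
  // Assumption: lockfiles and generated code are low risk; security- and
  // money-adjacent code, or very large diffs, deserve a closer look.
  if (file.endsWith(".lock") || file.includes("/generated/")) {
    return "auto-approve";
  }
  if (file.includes("/auth/") || file.includes("/billing/") || linesChanged > 400) {
    return "deep-review";
  }
  return "standard";
}

console.log(triage("src/auth/token.ts", 42)); // -> "deep-review"
console.log(triage("yarn.lock", 1200));       // -> "auto-approve"
console.log(triage("src/ui/button.tsx", 15)); // -> "standard"
```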
Webster concluded his presentation by introducing Chunk, an AI agent developed by CircleCI, with the claim that it “validates code at AI speed.” The process is to: validate first; use the pipelines and environments that developers already use; keep software products production-ready by fixing flaky tests and providing the right tests and code coverage; and learn from merges, reverts, rollbacks, and comments.
