Martin Fowler’s blog recently examined the role of humans in AI-assisted software engineering, arguing that developers are unlikely to move entirely “out of the loop”. Kief Morris, who authored the article, suggests many teams may increasingly work “on the loop”, designing the specifications, tests, and feedback mechanisms that guide AI agents rather than reviewing every generated artifact directly. His analysis appears alongside broader industry discussions about how such agent-driven development workflows should be verified, monitored, and governed in practice.
In his article, Morris outlines three ways humans can interact with AI systems: in the loop, where developers review each AI output; out of the loop, where systems operate largely autonomously; and on the loop, where humans design and maintain the mechanisms that guide and validate the system’s behavior. Morris suggests the third model may prove particularly useful for software engineering, with developers building the testing frameworks, constraints, and evaluation pipelines that shape how AI agents operate rather than inspecting every generated line of code.
Source MartinFowler.com
Organizations across the industry are experimenting with coding agents in software development workflows. In a technical deep dive into its Codex system, OpenAI described the “agent loop” that coordinates interactions between the user, the model, and external tools. In this architecture, the output of the system is often not simply a chat response but code written or modified directly on the machine, produced through iterative tool use and feedback cycles.
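The general shape of such an agent loop can be sketched in a few lines. The example below is a hedged, illustrative toy, not OpenAI's actual Codex architecture or API: the model stub (`fake_model`), the tool registry, and the message format are all hypothetical stand-ins for the real components.

```python
# Hypothetical sketch of an "agent loop": an orchestrator repeatedly
# calls a model, executes any tool call the model requests, feeds the
# result back into the conversation, and stops when the model is done.

def write_file(path, content, files):
    """Toy tool: 'writes' a file into an in-memory dict instead of disk."""
    files[path] = content
    return f"wrote {len(content)} bytes to {path}"

TOOLS = {"write_file": write_file}

def fake_model(history):
    """Stand-in for a real model call: requests a tool on the first
    turn, then finishes once it has seen the tool's result."""
    if not any(msg["role"] == "tool" for msg in history):
        return {"tool": "write_file",
                "args": {"path": "hello.py", "content": "print('hi')"}}
    return {"done": True, "text": "Created hello.py"}

def agent_loop(task, model, files):
    history = [{"role": "user", "content": task}]
    while True:
        action = model(history)
        if action.get("done"):          # model signals completion
            return action["text"], files
        tool = TOOLS[action["tool"]]    # execute the requested tool
        result = tool(files=files, **action["args"])
        history.append({"role": "tool", "content": result})

files = {}
answer, files = agent_loop("create hello.py", fake_model, files)
```

The key point the architecture makes is that the loop's end product is the side effect on the workspace (`files` here, the developer's machine in Codex), not the final chat message.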

Source OpenAI
At the same time, developer sentiment around AI-generated code remains mixed. Stack Overflow discussions and commentary have highlighted concerns that productivity gains from generative AI tools can come with trade-offs in maintainability and technical debt, particularly when generated code requires substantial review or refactoring before it can be safely integrated into production systems.
Survey data from Stack Overflow also reflects this tension. In the 2025 Developer Survey, 84% of developers reported that they are using or planning to use AI tools in their development workflows, but significantly fewer said they trust AI-generated output. Many respondents reported that debugging AI-generated code or verifying its correctness can require additional effort.
Some engineering teams have begun exploring ways to address these challenges through stronger verification and control mechanisms. In a recent engineering post, Datadog argued that human review alone does not scale for AI-generated artifacts, particularly when agents can produce large volumes of code. Instead, teams may need to invest in automated verification pipelines that combine specifications, simulation testing, bounded verification, and runtime telemetry to validate system behavior.

Source Datadog
Datadog describes what it calls a “harness-first” approach to agent development, in which automated verification systems evaluate agent behavior through specifications, simulation testing, and runtime telemetry. In this model, developers focus on building the harness that validates agent outputs rather than relying on manual inspection of each generated artifact.
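A harness of this kind can be as simple as a set of automated checks that every generated artifact must pass before it is accepted. The sketch below is an assumption-laden toy, not Datadog's implementation: the check names and the sample agent output are invented for illustration.

```python
# Hedged sketch of a "harness-first" workflow: expectations about
# agent-generated code are encoded as automated checks, and every
# output is run through the harness instead of being hand-reviewed.
import ast

def parses_cleanly(code):
    """Check that the generated code is syntactically valid Python."""
    try:
        ast.parse(code)
        return True
    except SyntaxError:
        return False

def has_no_bare_except(code):
    """Example policy check: forbid bare `except:` handlers."""
    if not parses_cleanly(code):
        return False
    return not any(isinstance(node, ast.ExceptHandler) and node.type is None
                   for node in ast.walk(ast.parse(code)))

# The harness is just an ordered list of named checks.
HARNESS = [("syntax", parses_cleanly),
           ("no bare except", has_no_bare_except)]

def verify(code):
    """Run an agent-generated artifact through every check."""
    return {name: check(code) for name, check in HARNESS}

agent_output = "def add(a, b):\n    return a + b\n"
report = verify(agent_output)
```

In a real pipeline the checks would extend well beyond static analysis, to the simulation tests, bounded verification, and runtime telemetry the post describes, but the developer's job stays the same: maintain the harness, not the individual review queue.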
These discussions highlight an emerging focus on the systems surrounding AI development tools. As coding agents become more capable, several organizations have emphasized the importance of testing harnesses, evaluation frameworks, and observability systems that help developers monitor and guide AI-generated output.
Morris’ “on the loop” framing reflects this broader theme. Rather than removing humans from the development process entirely, many teams appear to be exploring how developers can design and maintain the guardrails that shape how increasingly autonomous software systems operate.
