AI-assisted development promised revolutionary gains. Yet 66% of developers report code that’s “almost right, but not quite,” and AI-generated code has been found to contain 322% more vulnerabilities. Before AWS introduced AI-DLC and GitHub Spec Kit, Boiko’s teams were already running multi-agent processes that separated business, technical, and coding stages, a lesson learned from the hype phase when developers overestimated AI’s capabilities.
“This article isn’t about AI in general,” Boiko explains. “It’s about how to use it effectively right now. Early on, the hype led developers to overestimate what AI models could do, which increased development time instead of optimizing it.”
When Implementation Beats Announcements
With 18 years building products across SaaS, fintech, and video-on-demand, Boiko knows what breaks at scale. At Field Complete, where he manages distributed teams across three continents, he implemented what’s now called agentic architecture: separate AI agents handle distinct roles, moving to the next stage only after the previous one passes human review. It’s the same principle AWS later described as AI-DLC and GitHub as Spec Kit—proof that discipline beats hype.
“Before the agentic approach was announced, we had already implemented it,” Boiko says. “Separate chats performed distinct roles. We moved to the next stage only after the previous one passed review. Later, big players like Microsoft and Amazon described their approaches using agents that convey the same essence.”
An AI assistant first works through ideas to create detailed business documentation. A separate agent then writes technical documentation based on the business requirements. Finally, another agent handles coding from the technical specs, following vertical slice principles: software is written in pieces, each of which completes an entire business process.
“You can launch it, test it, cover it with tests, and freeze it as ready before moving to the next module,” Boiko explains. Current industry data validates his approach: companies using clear processes report 20-30% of code being AI-generated with 55.8% faster task completion. Without structure, those numbers invert into productivity loss.
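To make the staged handoff concrete, here is a minimal sketch of such a gated pipeline in Python. The stage prompts and the placeholder ask_model helper are illustrative assumptions, not Field Complete’s actual implementation; what matters is the shape of the logic, where each approved artifact becomes the next stage’s context.

```python
# A minimal sketch of a staged, human-gated AI pipeline.
# Stage names, prompts, and `ask_model` are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class Stage:
    name: str
    prompt_template: str

STAGES = [
    Stage("business", "Turn this idea into detailed business documentation:\n{context}"),
    Stage("technical", "Write technical documentation from these business requirements:\n{context}"),
    Stage("coding", "Implement one vertical slice from this technical spec:\n{context}"),
]

def ask_model(prompt: str) -> str:
    """Placeholder for a call to whichever LLM backs this stage."""
    raise NotImplementedError("wire up your model client here")

def human_approves(stage: Stage, artifact: str) -> bool:
    """Gate: a reviewer must sign off before the next stage starts."""
    print(f"--- {stage.name} output ---\n{artifact}\n")
    return input(f"Approve {stage.name} output? [y/N] ").strip().lower() == "y"

def run_pipeline(idea: str) -> str:
    context = idea
    for stage in STAGES:
        artifact = ask_model(stage.prompt_template.format(context=context))
        if not human_approves(stage, artifact):
            raise RuntimeError(f"{stage.name} stage rejected; rework before proceeding")
        context = artifact  # the approved artifact becomes the next stage's context
    return context  # code for one complete, independently testable slice
```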
The Second Pilot That Never Sleeps
Boiko calls AI a “second pilot”—a 24/7 expert that blurs hierarchy. Juniors gain senior patterns; seniors explore new domains. Refactoring, once delayed due to context overload, becomes manageable. Test coverage—rare in traditional teams—becomes affordable and continuous.
“AI performs well in code refactoring when the tasks themselves aren’t complex but require rich context from the developer and many hours of focusing, which isn’t always possible,” Boiko notes. “Such tasks are most often put off, increasing technical debt, complicating code, and increasing costs and time to maintain it.”
Test coverage transforms economics. “You can cover existing code with tests quite cheaply, which is almost never done in companies without AI,” he observes.
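As an illustration of how cheap that coverage can be, here is a characterization-test sketch in pytest. The legacy_price function and its expected values are hypothetical stand-ins for untested production code; the point is that an AI assistant can enumerate cases like these from the code itself, leaving the human reviewer only to confirm the captured behavior is intended.

```python
# A minimal sketch of characterization tests that freeze legacy behavior.
# `legacy_price` and its cases are hypothetical examples.
import pytest

def legacy_price(quantity: int, unit_cost: float, loyal: bool) -> float:
    """Stand-in for untested production code being frozen under tests."""
    total = quantity * unit_cost
    if loyal:
        total *= 0.9   # 10% loyalty discount
    if quantity >= 100:
        total *= 0.95  # bulk discount on top
    return round(total, 2)

@pytest.mark.parametrize("qty, cost, loyal, expected", [
    (1, 10.0, False, 10.0),    # base case
    (1, 10.0, True, 9.0),      # loyalty discount
    (100, 1.0, False, 95.0),   # bulk discount boundary
    (100, 1.0, True, 85.5),    # both discounts stack
    (0, 10.0, True, 0.0),      # zero-quantity edge case
])
def test_legacy_price_characterization(qty, cost, loyal, expected):
    assert legacy_price(qty, cost, loyal) == expected
```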
What the Data Shows and Hides
By 2025, top models like Grok 4, GPT-5, and Claude 4 hit over 90% accuracy on coding benchmarks. But here’s the catch: developers feel faster while actually slowing down; a 2025 METR study found experienced developers completed complex tasks 19% slower with AI assistance. The challenge lies in the final 30%, the production readiness and edge cases, where AI often creates more work than it saves.
Boiko addresses this with multi-model routing: light models for routine work, heavy ones for reasoning, each with token and budget limits enforced through monitoring and spending alerts. The 2025 market makes this practical, with GPT-5 facing strong open-source rivals like DeepSeek R1 and Llama 4 Scout, so teams can match each task to the cheapest capable model.
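A minimal sketch of what such routing might look like follows. The model names, prices, budget figure, and the crude complexity heuristic are all assumptions for illustration, not Boiko’s production configuration.

```python
# A minimal sketch of multi-model routing with a spending guardrail.
# Model names, prices, and the heuristic are illustrative assumptions.
ROUTES = {
    "light": {"model": "small-fast-model", "usd_per_1k_tokens": 0.0005, "max_tokens": 2_000},
    "heavy": {"model": "large-reasoning-model", "usd_per_1k_tokens": 0.01, "max_tokens": 8_000},
}
MONTHLY_BUDGET_USD = 500.0
spent_usd = 0.0

def classify(task: str) -> str:
    """Crude heuristic: design- or debugging-flavored tasks go to the heavy model."""
    heavy_markers = ("architecture", "refactor", "design", "debug")
    return "heavy" if any(m in task.lower() for m in heavy_markers) else "light"

def route(task: str, estimated_tokens: int) -> str:
    global spent_usd
    tier = ROUTES[classify(task)]
    if estimated_tokens > tier["max_tokens"]:
        raise ValueError("task exceeds per-call token limit; split it first")
    cost = estimated_tokens / 1_000 * tier["usd_per_1k_tokens"]
    if spent_usd + cost > MONTHLY_BUDGET_USD:
        raise RuntimeError("budget exhausted: alert the team instead of silently degrading")
    spent_usd += cost
    return tier["model"]  # hand this model id to the actual API client
```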
“The industry is very young, and there aren’t many how-to guides, let alone how-not-to guides,” Boiko admits when asked about common mistakes. Rather than overselling certainty, he admits what’s still unknown.
Documentation as Constraint, Not Record
For Boiko, documentation isn’t bureaucracy; it’s control. His teams move from business intent to technical requirements before any implementation, and each layer serves as context for the next while creating checkpoints for human review. Separating stable requirements from flexible implementation details reduces AI hallucination and keeps scope creep in check.
Testing strategies evolved in parallel. Organizations using AI for test generation demonstrate 61% confidence versus 27% without AI assistance. Tools like Qodo generate comprehensive unit tests with edge case detection, while platforms like Checksum auto-detect user flows and generate self-healing end-to-end tests.
The Constraint-Context Framework
Boiko uses a simple rule: AI handles high-constraint, low-context work like removing feature flags and generating boilerplate; humans own high-context, low-constraint tasks like architecture decisions and business logic. That boundary prevents the “almost right” syndrome that drives up debugging time. One way to encode it is a triage rule like the sketch below.
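The scores and thresholds here are illustrative assumptions, not a published framework; in practice the labels would come from a team’s own task taxonomy.

```python
# A minimal sketch of the constraint-context boundary as a triage rule.
# Scores and thresholds are illustrative assumptions.
def assign(task_constraint: float, task_context: float) -> str:
    """constraint/context are 0-1 scores: how rule-bound the task is,
    and how much tacit business knowledge it needs."""
    if task_constraint >= 0.7 and task_context <= 0.3:
        return "AI"      # e.g. removing a feature flag, generating boilerplate
    if task_context >= 0.7 and task_constraint <= 0.3:
        return "human"   # e.g. architecture decisions, core business logic
    return "pair"        # ambiguous middle: AI drafts, a human owns the result

assert assign(0.9, 0.1) == "AI"     # boilerplate generation
assert assign(0.2, 0.9) == "human"  # architecture decision
```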
Teams adopting AI juggle 47% more pull requests and spend more time debugging AI-generated code. The gap between perceived and actual efficiency demands better tracking.
Warning signs: developers spending more time reviewing AI code than writing original code, multiple failed CI/CD builds from AI-generated changes, team members unable to explain AI-generated solutions, and loss of architectural consistency.
From Two Roles to One Philosophy
At Field Complete, Boiko leads the AI-powered field service platform connecting contractors and property managers. At Plat.ai, he keeps coding. He applies the same principle, a clear framework plus measurable execution, as a juror at UAtech Venture Night at Web Summit Vancouver, where he evaluates Ukrainian-founded startups. His criteria mirror his philosophy: can teams execute and reach measurable results within a quarter? That’s the difference between shipping and promising.
Organizations successful in 2025 treat model selection as core competency. With Grok 4, GPT-5, Claude 4, and open-source alternatives all delivering competitive performance, teams must develop expertise in cost-performance optimization, multi-model routing strategies, and vendor risk management.
Boiko’s teams implement workflow orchestration layers that route decisions to specialized AI agents with bounded contexts. Circuit breakers and fallback mechanisms prevent runaway AI behavior. Explicit handoff protocols between AI and human processes maintain controllability.
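A minimal sketch of a circuit breaker around an agent call, in the spirit of those safeguards, is shown below. The failure threshold, cooldown, and fallback hook are assumptions; the fallback could be a human queue or a simpler deterministic routine.

```python
# A minimal sketch of a circuit breaker for AI agent calls.
# Thresholds and the fallback mechanism are illustrative assumptions.
import time

class AgentCircuitBreaker:
    def __init__(self, max_failures: int = 3, cooldown_s: float = 60.0):
        self.max_failures = max_failures
        self.cooldown_s = cooldown_s
        self.failures = 0
        self.opened_at: float | None = None

    def call(self, agent_fn, fallback_fn, *args):
        # While open, skip the agent entirely and hand off to the fallback
        # (a human review queue, a simpler deterministic routine, etc.).
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.cooldown_s:
                return fallback_fn(*args)
            self.opened_at = None  # cooldown over: try the agent again
            self.failures = 0
        try:
            result = agent_fn(*args)
            self.failures = 0  # success resets the failure count
            return result
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()  # open: stop runaway retries
            return fallback_fn(*args)
```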
What’s Working Now
Microsoft and Google both report that roughly 30% of their new code is AI-generated, but within processes that keep humans in control. Companies that can swap AI models easily and follow clear workflows are winning.
Boiko’s approach shows results. His teams at Field Complete and Plat.ai ship faster without breaking things. The vertical slice principle—building complete business processes that can be tested and deployed independently—turns AI from a code generator into a productivity multiplier.
“Use AI to scale discipline, not replace it,” Boiko says.
For CTOs building their AI strategy, his advice is direct: start small, measure everything, and don’t let AI make architectural decisions. Document your requirements before you code. Test continuously. And remember that the final 30% of any feature—the production-ready part—still needs human judgment.
The approach Boiko built before the giants formalized theirs proves one thing: implementation beats promises. His teams didn’t wait for industry consensus. They built systems, measured results, and iterated based on what worked.
Where’s this heading? Field Complete continues expanding its AI-powered field service platform, applying the same principles that worked in development to contractor operations at scale. At Plat.ai, Boiko’s still coding—proof that the best CTOs stay technical.
The competitive edge in 2025 isn’t about having the perfect AI model. It’s about knowing what to automate and what to keep human.
About Oleksandr Boiko
Oleksandr Boiko serves as CTO at Field Complete (USA) and Principal Backend Engineer at Plat.ai, bringing over 18 years of experience building products for millions across SaaS, fintech, and video-on-demand. His career spans from early engineering roles to Head of Engineering and CTO positions, with consistent focus on aligning technology with business growth and long-term scalability. He manages globally distributed teams across three continents, designs scalable architectures, and fosters high-performing engineering cultures. As a juror for UAtech Venture Night at Web Summit Vancouver, Boiko evaluates Ukrainian-founded startups with emphasis on execution discipline and long-term impact.
