Leapwork recently released new research showing that while confidence in AI-driven software testing is growing rapidly, accuracy, stability, and ongoing manual effort remain decisive factors in how far teams are willing to trust automation. The study, based on responses from more than 300 software engineers, QA leaders, and IT decision-makers worldwide, finds that organizations see AI as central to the future of testing, but only when it delivers reliable, maintainable results.
According to the survey, 88% of respondents say AI is now a priority for their organization’s testing strategy, with nearly half rating it as a critical or high priority. Optimism is also high: 80% believe AI will have a positive impact on testing over the next two years. Yet adoption is still uneven. While 65% say they are already using or exploring AI across some testing activities, only 12.6% currently apply AI across key test workflows, reflecting a cautious, incremental approach.
Concerns about accuracy and test stability largely drive the gap between enthusiasm and confidence. More than half of respondents (54%) said worries about quality and reliability are holding back broader AI use. Teams cited fragile tests, difficulty automating end-to-end flows across systems, and the time required to update tests as their biggest challenges. In fact, 45% reported that updating tests after changes in a critical system takes three days or more, slowing release cycles and eroding trust in automation.
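To make the "fragile test" complaint concrete, the minimal sketch below shows a common pattern (it is illustrative only, not drawn from the Leapwork study): a UI test that depends on an auto-generated CSS selector and a fixed sleep, next to a more resilient version that targets a stable test hook and polls with an explicit wait. The page URL, the selectors, and the data-testid attribute are all hypothetical.

```python
# Hypothetical illustration of a fragile UI test vs. a more stable one.
# Assumes: pip install selenium, a Chrome driver available on PATH, and a
# page at https://example.test/checkout exposing data-testid="submit-order".

import time

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait


def fragile_submit(driver):
    # Brittle: auto-generated class names change on every front-end build,
    # and a fixed sleep fails on slow runs while wasting time on fast ones.
    driver.get("https://example.test/checkout")
    time.sleep(5)
    driver.find_element(By.CSS_SELECTOR, "div.sc-1x9f2 > button:nth-child(3)").click()


def stable_submit(driver):
    # More resilient: a dedicated test hook plus an explicit wait that
    # polls until the element is actually clickable, up to 15 seconds.
    driver.get("https://example.test/checkout")
    button = WebDriverWait(driver, timeout=15).until(
        EC.element_to_be_clickable((By.CSS_SELECTOR, "[data-testid='submit-order']"))
    )
    button.click()


if __name__ == "__main__":
    driver = webdriver.Chrome()
    try:
        stable_submit(driver)
    finally:
        driver.quit()
```

Tests written in the fragile style break whenever the front end is rebuilt, which is exactly the kind of churn respondents described when they reported spending three or more days updating tests after a change.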
Manual effort also continues to limit progress. On average, only 41% of testing is automated today. Test creation was identified as the biggest bottleneck by 71% of respondents, followed by test maintenance at 56%. More than half (54%) said lack of time is a major barrier to adopting or improving test automation, underscoring why many teams remain selective in how they deploy AI.
“It is no longer a question of whether testing teams will leverage agentic capabilities in their work. The question is how confidently and predictably they can rely on it,” said Kenneth Ziegler, CEO of Leapwork. “Our research shows teams want AI to help them move faster, expand coverage, and reduce effort, but accuracy remains table stakes. The real opportunity lies in applying and integrating AI alongside stable automation, so teams gain speed and scale without sacrificing trust in outcomes.”
The findings suggest that organizations will achieve the greatest impact by pairing AI with mature, resilient automation foundations rather than treating it as a standalone solution. As systems grow more complex and change more frequently, teams that balance innovation with reliability are likely to be best positioned to scale AI-driven testing with confidence.
Leapwork’s survey is one of several industry studies that report similar findings about the state of testing and AI:
Puppet’s influential DevOps survey consistently shows that high-performing teams invest significantly more in test automation, stability, and fast feedback loops, and that teams with unstable CI/CD pipelines suffer slower delivery and lower confidence in automation. In its 2024 State of DevOps Report, Puppet notes that teams with strong automated testing practices perform better across reliability, lead time, and deployment frequency, but only if tests are reliable and easily maintainable. Unreliable or flaky tests were cited as one of the top blockers to automated delivery workflows.
GitLab’s annual survey, which draws responses from thousands of developers and DevOps practitioners, found that over 70% of respondents believe AI will transform software development workflows, including testing and security. However, similar to Leapwork’s findings, only a minority currently uses AI tools deeply in production workflows. Many respondents expressed concerns about trust, explainability, and integration with existing toolchains, particularly in regulated or enterprise contexts.
The Tricentis World Quality Report surveyed organizations globally and found that automation coverage across test types (unit, functional, performance, etc.) averages between 30% and 50%, consistent with Leapwork’s finding of roughly 41% automation. Respondents again cited maintenance effort, unstable tests, and lack of skilled resources as major constraints to progressing further. The report also notes an emerging trend: AI-assisted test generation tools are gaining interest, but many teams are hesitant to remove human validation entirely because of risk and accuracy concerns.
While not focused solely on AI, DORA’s research, published annually by Google Cloud, underscores that teams with strong test automation, observability, and failure recovery practices outperform peers on key metrics like deployment frequency and change lead time. More recent DORA surveys include questions about AI tooling, and responses indicate that teams adopting AI features in DevOps tools also invest heavily in observability and automated validation, suggesting that AI works best when layered on a strong automation foundation.
A broader enterprise AI survey by IDC found that while 60–70% of companies are piloting AI use cases across business functions, only 20–30% have deployed AI in robust, production-grade applications. When asked why, respondents pointed to governance risk, lack of talent, and operational complexity, similar to the reasons Leapwork’s respondents gave for qualifying their adoption of AI testing tools.
