Datadog recently announced that its LLM Observability platform now provides automatic instrumentation for applications built with Google’s Agent Development Kit (ADK), offering deeper visibility into the behavior, performance, cost, and safety of AI-driven agentic systems. The integration, highlighted on the Google Cloud Blog, aims to make it easier for developers and SRE teams to monitor and troubleshoot complex multi-step AI agent workflows without extensive manual setup or custom instrumentation.
As enterprises increasingly adopt autonomous AI agents built with frameworks like ADK, the non-deterministic nature of these systems can make it difficult to predict outputs, diagnose errors, or control costs. Datadog’s new integration feeds telemetry from ADK applications into its observability platform, allowing teams to visualize agent decision paths, trace tool calls, measure token usage and latency, and highlight unexpected loops or misrouted steps that may degrade performance or inflate API costs. By correlating this telemetry with other system metrics, Datadog helps teams improve agent reliability and operational confidence.
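For Python-based ADK applications, enabling this kind of instrumentation is meant to involve little code. The sketch below is illustrative rather than taken from Datadog’s announcement: it combines the documented ddtrace LLM Observability SDK setup with a minimal agent from the google-adk quickstart, and the exact spans emitted will depend on the ddtrace version and integration support in use.

```python
# Minimal sketch: enabling Datadog LLM Observability alongside a Google ADK agent.
# The agent definition follows the google-adk quickstart; the Datadog setup uses
# the ddtrace LLM Observability SDK. Span coverage for ADK depends on the
# installed ddtrace version.
from ddtrace.llmobs import LLMObs
from google.adk.agents import Agent

# Enable LLM Observability; DD_API_KEY / DD_SITE are read from the environment
# when not passed explicitly. agentless_enabled=True sends data directly to
# Datadog without requiring a local Datadog Agent.
LLMObs.enable(
    ml_app="adk-demo-agent",   # logical application name shown in Datadog
    agentless_enabled=True,
)

def get_weather(city: str) -> dict:
    """Illustrative tool: return a canned weather report for the given city."""
    return {"city": city, "forecast": "sunny", "temperature_c": 22}

# A simple ADK agent with one tool; once instrumentation is active, its model
# invocations and tool calls show up as spans in the LLM Observability UI.
root_agent = Agent(
    name="weather_agent",
    model="gemini-2.0-flash",
    instruction="Answer questions about the weather using the get_weather tool.",
    tools=[get_weather],
)
```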
The integration also bridges a common gap in agent deployment: while ADK provides a flexible framework for building AI agents across use cases, it does not inherently include monitoring and governance tools tailored for production environments. Datadog’s instrumentation fills that gap by automatically tracing each agent’s operations and presenting them in a unified timeline, making it easier to pinpoint issues like incorrect tool selection or inefficient retry loops that can sharply increase latency or token expenses.
Datadog’s LLM Observability platform now enables users to see token usage and latency per tool and workflow branch, helping identify where agents may be misbehaving or costing more than expected. This is particularly relevant in enterprise contexts where complex agentic orchestration may involve multiple models, workflows, and external integrations, and where traditional application performance monitoring falls short for AI-centric logic.
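Where automatic instrumentation does not capture everything a team wants to attribute, ddtrace also exposes decorators and an annotation API for custom spans. The following sketch is an assumption about how per-tool and per-branch attribution might be wired up; the search_catalog tool, the workflow, and the token counts are hypothetical, while the decorators and LLMObs.annotate() are part of the documented ddtrace LLM Observability SDK.

```python
# Hedged sketch: custom spans to attribute token usage and latency to a specific
# tool and workflow branch. Tool name, workflow, and token counts are placeholders.
from ddtrace.llmobs import LLMObs
from ddtrace.llmobs.decorators import tool, workflow

@tool
def search_catalog(query: str) -> list[str]:
    """Illustrative external lookup, traced as a 'tool' span in Datadog."""
    results = [f"result for {query}"]
    # Attach inputs, outputs, and token metrics to the active span so cost can
    # be broken down per tool call in the LLM Observability UI.
    LLMObs.annotate(
        input_data=query,
        output_data=results,
        metrics={"input_tokens": 12, "output_tokens": 48},  # placeholder counts
    )
    return results

@workflow
def answer_question(question: str) -> str:
    """Illustrative workflow branch; nests the tool span under a workflow span."""
    hits = search_catalog(question)
    return f"Based on {len(hits)} result(s): {hits[0]}"

if __name__ == "__main__":
    LLMObs.enable(ml_app="adk-demo-agent", agentless_enabled=True)
    print(answer_question("latest pricing tiers"))
```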
With this integration, Datadog extends its broad observability platform, already covering infrastructure, security, and distributed systems, to encompass the emerging class of agentic AI applications, helping bridge the gap between AI experimentation and robust production deployments.
Other observability tools are also working on similar integrations, as the industry looks to help organizations make better sense of how LLMs are used in production:
New Relic offers full-stack observability and APM with strong distributed tracing and performance insights, and is evolving toward AI observability by expanding its telemetry correlation and AI-aware monitoring features. While it doesn’t yet have the same level of dedicated LLM-centric tooling as Datadog’s ADK integration, it provides solid end-to-end visibility across applications and infrastructure that can help teams understand how AI and agent workloads interact with the rest of the stack. New Relic’s pricing model, based on data ingested rather than hosts, can be more predictable for teams concerned about cost.
Splunk’s observability offerings (including Splunk Observability Cloud) excel at high-volume log ingestion and querying, making them very powerful for detailed forensic analysis across datasets of all types. However, they may require more effort to correlate AI-specific signals (e.g., token usage or model decision paths) out of the box, compared with Datadog’s more integrated agent observability features. Splunk remains strong where large unstructured telemetry and security-centric monitoring are priorities, but may lag in built-in AI/agent workflows without custom instrumentation or add-ons.
Emerging needs around AI and agent observability are driving all vendors to evolve their tooling, focusing on runtime tracking, sequence and path visualization, and cost/latency insights for AI workloads, but each takes a slightly different approach depending on its core strengths.
