A new benchmark study from observability platform Coroot has shed light on the performance costs of implementing OpenTelemetry in high-throughput Go applications. The findings show that while OpenTelemetry delivers valuable trace-level insights, it introduces notable overhead, raising CPU usage by approximately 35% and adding network traffic and latency under load.
Using a simple HTTP service backed by an in-memory database, the study compared baseline performance with full OpenTelemetry instrumentation under identical load conditions (10,000 requests per second). The runs, executed in Docker containers across four Linux hosts, revealed several findings: enabling tracing increased CPU usage from 2 to 2.7 cores (roughly 35%), memory usage rose by 5–8 MB, and 99th-percentile latency climbed from approximately 10 ms to 15 ms. Additionally, trace data generated approximately 4 MB/s of outbound network traffic, highlighting the resource implications of full request-level telemetry.
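For context, "full instrumentation" of a Go HTTP service with the OpenTelemetry SDK typically means wiring up a tracer provider, an OTLP exporter, and per-request middleware. The sketch below shows that shape using the standard otel, sdk/trace, otlptracegrpc, and otelhttp packages; the endpoint, handler, and span names are illustrative assumptions, not Coroot's actual benchmark code.

```go
package main

import (
	"context"
	"log"
	"net/http"

	"go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp"
	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc"
	sdktrace "go.opentelemetry.io/otel/sdk/trace"
)

func main() {
	ctx := context.Background()

	// Export spans over OTLP/gRPC to a local collector (endpoint is an assumption).
	exporter, err := otlptracegrpc.New(ctx,
		otlptracegrpc.WithEndpoint("localhost:4317"),
		otlptracegrpc.WithInsecure(),
	)
	if err != nil {
		log.Fatal(err)
	}

	// Batch and export a span for every request (no sampling), mirroring the
	// full request-level telemetry scenario described in the benchmark.
	tp := sdktrace.NewTracerProvider(
		sdktrace.WithBatcher(exporter),
		sdktrace.WithSampler(sdktrace.AlwaysSample()),
	)
	defer func() { _ = tp.Shutdown(ctx) }()
	otel.SetTracerProvider(tp)

	// Wrap the handler so every inbound request starts and records a server span.
	handler := otelhttp.NewHandler(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// A real service would read from its in-memory store here.
		_, _ = w.Write([]byte("ok"))
	}), "GET /item")

	log.Fatal(http.ListenAndServe(":8080", handler))
}
```

With an always-on sampler and a batching exporter, each of the 10,000 requests per second produces a span that must be created, batched, serialized, and sent to the collector, which is where the extra CPU cycles and the roughly 4 MB/s of outbound trace traffic come from.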
The study also contrasted SDK-based tracing with eBPF-based approaches, though it is worth noting that Coroot sells an eBPF-based observability product. The eBPF approach, which avoids modifying application code, consumed far fewer resources (under 0.3 cores) even under heavy load when only metrics were collected. Coroot concluded that while OpenTelemetry’s SDK offers detailed trace visibility, it comes with measurable overhead that must be weighed against observability needs. They argue that for use cases prioritizing low latency and running with capped resources, an eBPF-based implementation may be a more suitable compromise.
This evaluation sparked conversations in the Go community. A discussion on Hacker News suggested performance gains could be achieved by optimizing SDK internals, such as using faster time functions, replacing mutexes with atomics, or making marshaling more efficient. On Reddit, users noted that even with the sampling rate set to zero, significant overhead remains due to context propagation and span management. These perspectives underscore a broader recognition that while OpenTelemetry brings essential insights, it also introduces resource tradeoffs that require careful implementation and tuning.
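The Reddit point about overhead persisting at a zero sampling rate comes down to what the SDK still does on every request: the propagator parses incoming trace headers, and a span is still started and ended, even though it is non-recording and never exported. The sketch below, using the standard otel, propagation, and sdk/trace packages, illustrates that path; it is an illustrative assumption, not code from the benchmark or the linked discussions.

```go
package main

import (
	"log"
	"net/http"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/propagation"
	sdktrace "go.opentelemetry.io/otel/sdk/trace"
	"go.opentelemetry.io/otel/trace"
)

var tracer trace.Tracer

func init() {
	// NeverSample drops every span, the closest equivalent to a "zero" sampling rate.
	tp := sdktrace.NewTracerProvider(
		sdktrace.WithSampler(sdktrace.NeverSample()),
	)
	otel.SetTracerProvider(tp)
	otel.SetTextMapPropagator(propagation.TraceContext{})
	tracer = tp.Tracer("example")
}

func handle(w http.ResponseWriter, r *http.Request) {
	// Context propagation: the traceparent/tracestate headers are still parsed
	// on every request, regardless of the sampling decision.
	ctx := otel.GetTextMapPropagator().Extract(r.Context(), propagation.HeaderCarrier(r.Header))

	// Span management: a span object is still created and ended; it is simply
	// non-recording (span.IsRecording() == false) and never exported.
	_, span := tracer.Start(ctx, "handle")
	defer span.End()

	w.WriteHeader(http.StatusOK)
}

func main() {
	http.HandleFunc("/", handle)
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```

This is the per-request work the commenters point to: it is far cheaper than recording and exporting spans, but it does not disappear when sampling is turned off.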
One user, FZambia, stated the following:
“I was initially skeptical about tracing with its overhead (both resource-wise and instrumentation process-wise) vs properly instrumented app using metrics. As time goes, I have more and more examples when tracing helped to diagnose the issue, investigating incidents. The visualization helps a lot here – cross-team communication simplifies a lot when you have a trace. Still, I see how spans contains so much unnecessary data in tags, and collecting them on every request seems so much work to do while you are not using 99.999% of those spans. Turning on sampling is again controversial – you won’t find span when it’s needed (sometimes it’s required even if the request was successful). So reading such detailed investigations of tracing overhead is really useful, thanks!”
Coroot’s benchmarks provide valuable data showing that OpenTelemetry in Go delivers powerful observability at a measurable cost: approximately 35% CPU overhead, additional memory and network usage, and increased latency under load. The community response suggests that optimizations are underway, yet teams should still balance the need for trace-level visibility against performance constraints and explore lighter-weight options such as eBPF-based metrics when appropriate.