Adding to the excitement around what in-kernel eBPF makes possible on Linux, Meta has shared that Strobelight, the fleet-wide profiling software it is working on open-sourcing, has yielded a 20% reduction in CPU cycles and in turn a 10-20% reduction in the number of servers required for Meta’s top services.
Strobelight is a fleet-wide profiler framework developed at Meta. Meta has a GitHub repository for Strobelight, but it hasn’t been updated since October of last year. Back in January the Meta/Facebook engineering team announced this profiling service built atop eBPF and mentioned that they are working on open-sourcing all of Strobelight’s profilers and libraries.
Today the eBPF Foundation blog has a look at how this profiling orchestrator is yielding some very significant benefits for Meta and its massive fleet of servers.
One anecdote is that Strobelight’s eBPF profiling capabilities led them to a one-character code change that saves 15,000 servers’ worth of annual capacity. By leveraging Strobelight they are seeing a 20% reduction in CPU cycles and a 10-20% reduction in the number of required servers. Strobelight also makes their profiling and debugging much faster. Overall this is very exciting, and Meta is working on making greater use of eBPF for AI/ML workloads and other areas moving forward. Stay tuned for more information when the rest of Strobelight is open-sourced.