GitHub engineers recently traced user reports of unexpected “Too Many Requests” errors to abuse-mitigation rules that had accidentally remained active long after the incidents that prompted them.
According to GitHub, the affected users were not generating high-volume traffic; they were “making a handful of normal requests” that still tripped protections. The investigation found that older incident rules were based on traffic patterns strongly associated with abuse at the time, but that later began matching some legitimate, logged-out requests. GitHub described these detections as “combinations of industry-standard fingerprinting techniques alongside platform-specific business logic”, noting that “composite signals can occasionally produce false positives.”
GitHub also quantified how the layered signals behaved in practice. Among requests that matched suspicious fingerprints, only a small subset, those that also triggered business-logic rules, were blocked: roughly 0.5–0.9% of fingerprint matches, while false positives were a tiny fraction of total traffic (on the order of a few requests per 100,000). Even so, the post argues that the user impact was unacceptable, and uses the episode to highlight a broader operational pattern: emergency controls are often correct during an active incident, but “don’t age well as threat patterns evolve and legitimate tools and usage change”.
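To make the composite-signal idea concrete, here is a minimal sketch in Go of how such a decision could be layered: a fingerprint match alone never blocks, only the combination with a platform-specific business-logic rule does. The struct, field names, and rules are illustrative assumptions, not GitHub’s actual implementation.

```go
package main

import "fmt"

// Request captures only the attributes this sketch needs; real systems carry
// far richer signals (these fields are assumptions for illustration).
type Request struct {
	Fingerprint string
	LoggedIn    bool
	Path        string
}

// matchesSuspiciousFingerprint stands in for "industry-standard fingerprinting
// techniques" — here reduced to a placeholder signature comparison.
func matchesSuspiciousFingerprint(r Request) bool {
	return r.Fingerprint == "suspicious-client-signature"
}

// violatesBusinessLogic stands in for "platform-specific business logic",
// e.g. logged-out access to an endpoint that was being abused at the time.
func violatesBusinessLogic(r Request) bool {
	return !r.LoggedIn && r.Path == "/search"
}

// shouldBlock blocks only when both layered signals fire, which is why only a
// small subset of fingerprint matches resulted in blocks.
func shouldBlock(r Request) bool {
	return matchesSuspiciousFingerprint(r) && violatesBusinessLogic(r)
}

func main() {
	r := Request{Fingerprint: "suspicious-client-signature", LoggedIn: false, Path: "/search"}
	fmt.Println(shouldBlock(r)) // true: both signals fired, so the request is blocked
}
```

The false-positive risk GitHub describes arises exactly at this intersection: each signal is reasonable on its own, but the combination can drift out of sync with legitimate usage over time.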
A key takeaway from GitHub’s write-up is that layered defenses can make attribution harder when something goes wrong. GitHub says it traced requests across multiple layers of infrastructure to determine where blocks occurred and summarizes the practical difficulty: each layer can legitimately rate-limit or block, and isolating which layer made the decision requires correlating logs across multiple systems with different schemas.
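The correlation problem can be sketched as a small join across heterogeneous log schemas, keyed on a shared request identifier. The following Go example is an assumption-laden illustration of that idea (the layer names, field names, and schemas are invented for the sketch), not a description of GitHub’s tooling.

```go
package main

import "fmt"

// Edge-layer log entry (hypothetical schema).
type edgeLog struct {
	ReqID  string
	Action string // "pass", "rate_limit", "block"
}

// Application-layer log entry (a different hypothetical schema).
type appLog struct {
	RequestID string
	Decision  string // "allowed", "denied"
	RuleName  string
}

// whichLayerBlocked normalizes both schemas and reports the first layer that
// rejected the request, if any — the cross-layer attribution step the post
// describes as requiring correlation across multiple systems.
func whichLayerBlocked(reqID string, edge []edgeLog, app []appLog) string {
	for _, e := range edge {
		if e.ReqID == reqID && e.Action != "pass" {
			return "edge layer: " + e.Action
		}
	}
	for _, a := range app {
		if a.RequestID == reqID && a.Decision == "denied" {
			return "app layer: rule " + a.RuleName
		}
	}
	return "not blocked"
}

func main() {
	edge := []edgeLog{{ReqID: "r-42", Action: "pass"}}
	app := []appLog{{RequestID: "r-42", Decision: "denied", RuleName: "incident-scraper-rule"}}
	fmt.Println(whichLayerBlocked("r-42", edge, app)) // app layer: rule incident-scraper-rule
}
```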
To resolve the immediate issue, GitHub reviewed mitigations by comparing what each rule blocks today versus what it was meant to block when created, removing rules that no longer served their purpose while retaining protections against ongoing threats. Longer term, GitHub says it is investing in lifecycle management for defensive controls: better cross-layer visibility to trace the source of rate limits and blocks, treating incident mitigations as temporary by default, and adding post-incident practices to evolve emergency controls into sustainable, targeted solutions.
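One way to read “temporary by default” is that every emergency rule carries its intent and an expiry, and a periodic review flags rules that have outlived the incident that created them. The sketch below, in Go, illustrates that policy under those assumptions; the struct and review logic are hypothetical, not GitHub’s internal tooling.

```go
package main

import (
	"fmt"
	"time"
)

// mitigationRule records why a rule exists and when it should be revisited.
type mitigationRule struct {
	Name      string
	Intent    string    // what the rule was meant to block when created
	CreatedAt time.Time
	ExpiresAt time.Time // emergency rules expire unless explicitly renewed
}

// needsReview returns rules past their expiry: candidates to be removed or
// evolved into a sustainable, targeted control.
func needsReview(rules []mitigationRule, now time.Time) []mitigationRule {
	var stale []mitigationRule
	for _, r := range rules {
		if now.After(r.ExpiresAt) {
			stale = append(stale, r)
		}
	}
	return stale
}

func main() {
	now := time.Now()
	rules := []mitigationRule{
		{
			Name:      "incident-scraper-block",
			Intent:    "stop high-volume logged-out scraping seen during an incident",
			CreatedAt: now.AddDate(-1, 0, 0),
			ExpiresAt: now.AddDate(0, -10, 0), // expired months ago, never renewed
		},
	}
	for _, r := range needsReview(rules, now) {
		fmt.Printf("stale rule %q: %s\n", r.Name, r.Intent)
	}
}
```

The review step mirrors GitHub’s remediation: comparing what a rule blocks today against the intent recorded when it was created.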
While GitHub’s post focuses on rule lifecycle and observability across layers, comparable “defense-in-depth” request pipelines appear in other large platforms that handle internet traffic. Vercel’s published request lifecycle, for example, describes requests encountering “multiple stages” of its firewall protections spanning network (L3), transport (L4), and application (L7), followed by an additional WAF stage for project-level policies. Vercel also notes feedback loops across layers: if a WAF rule triggers a persistent action, upstream stages can intercept future requests earlier.
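The feedback loop Vercel describes can be sketched as a pipeline where a persistent application-layer (L7) action feeds a deny list consulted by an earlier stage. The Go example below is a simplified illustration of that pattern; the stage names, paths, and data structures are assumptions, not Vercel’s implementation.

```go
package main

import "fmt"

// pipeline models two stages: an upstream deny list checked first, and a
// hypothetical project-level WAF rule evaluated afterwards.
type pipeline struct {
	upstreamDeny map[string]bool
}

func newPipeline() *pipeline {
	return &pipeline{upstreamDeny: map[string]bool{}}
}

// handle evaluates stages in order; a persistent WAF action feeds back into
// the upstream stage so future requests are intercepted earlier.
func (p *pipeline) handle(clientIP, path string) string {
	if p.upstreamDeny[clientIP] {
		return "dropped at upstream stage"
	}
	if path == "/wp-admin" { // placeholder WAF rule with a persistent action
		p.upstreamDeny[clientIP] = true
		return "blocked by WAF; client added to upstream deny list"
	}
	return "allowed"
}

func main() {
	p := newPipeline()
	fmt.Println(p.handle("203.0.113.7", "/wp-admin")) // blocked at the WAF, fed back upstream
	fmt.Println(p.handle("203.0.113.7", "/"))         // now dropped before the WAF runs
}
```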
Layering also shows up outside edge traffic management. Kubernetes’ API server security model is explicitly staged: admission controllers intercept requests after authentication and authorization but before persistence, providing a structured chain where additional policy and safety checks can accumulate over time.
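A compressed sketch of that staged flow, written in Go for consistency with the examples above, shows how checks accumulate between authentication/authorization and persistence. The handlers here are placeholders rather than real Kubernetes admission controllers; only the ordering reflects the documented model.

```go
package main

import (
	"errors"
	"fmt"
)

type apiRequest struct {
	User      string
	Verb      string
	Resource  string
	Namespace string
}

type admissionCheck func(apiRequest) error

// handle runs authentication, then authorization, then the admission chain;
// only after all stages pass would the object be persisted.
func handle(req apiRequest, authn, authz func(apiRequest) error, admission []admissionCheck) error {
	if err := authn(req); err != nil {
		return fmt.Errorf("authentication: %w", err)
	}
	if err := authz(req); err != nil {
		return fmt.Errorf("authorization: %w", err)
	}
	// Admission checks run after authn/authz but before persistence, and more
	// can accumulate in this chain over time.
	for _, check := range admission {
		if err := check(req); err != nil {
			return fmt.Errorf("admission: %w", err)
		}
	}
	// persistence would happen here
	return nil
}

func main() {
	denyKubeSystem := func(r apiRequest) error {
		if r.Namespace == "kube-system" {
			return errors.New("writes to kube-system are not allowed by policy")
		}
		return nil
	}
	req := apiRequest{User: "dev", Verb: "create", Resource: "pods", Namespace: "kube-system"}
	err := handle(req,
		func(apiRequest) error { return nil }, // assume authenticated
		func(apiRequest) error { return nil }, // assume authorized
		[]admissionCheck{denyKubeSystem},
	)
	fmt.Println(err) // admission: writes to kube-system are not allowed by policy
}
```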
Taken together, these examples highlight a shared trade-off in large systems: layering defensive controls improves resilience and flexibility, but also increases the risk that protections outlive the context in which they were introduced. GitHub’s experience shows that the long-term effectiveness of defense-in-depth depends not only on where controls are placed, but on how clearly their intent, impact, and lifespan are understood as systems and usage patterns evolve.
