Copyright © All Rights Reserved. World of Software.

Yelp Publishes Blueprint for Managing S3 Server-Access Logs at Massive Scale

News Room
Published 13 December 2025
Last updated: 2025/12/13 at 8:12 AM

In a detailed engineering post, Yelp shared how it built a scalable and cost-efficient pipeline for processing Amazon S3 server-access logs (SAL) across its infrastructure, overcoming traditional limitations of raw log storage and querying at high volume. The article outlines both the challenges the company faced, such as log volume, storage cost, and query performance, and the technical strategies it used to make object-level logging at scale practical.

In essence, Yelp now writes terabytes of access logs every day but converts them into compact, Parquet-formatted archives that are easy to query with tools like Amazon Athena. Through a process of periodic “compaction,” raw plaintext log objects are merged into fewer, larger Parquet files, reducing storage usage by about 85% and cutting the number of objects by more than 99.99%. This transformation makes analytics efficient and cost-effective, enabling quick lookups for permission debugging, cost attribution, incident investigation, and data retention analysis.
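To make the compaction idea concrete, here is a minimal Python sketch of the step that would come before any Parquet write: parsing raw log lines and grouping them into per-bucket, per-day batches. It is not Yelp's code. The regular expression covers only the leading fields of the S3 server-access log format, the function names and the sample line are made up for illustration, and the columnar write itself is left out because it needs a third-party library.

```python
import re
from collections import defaultdict
from datetime import datetime

# Leading fields of an S3 server-access log record. Trailing fields
# (bytes sent, referrer, user agent, ...) are ignored in this sketch.
LOG_PATTERN = re.compile(
    r'(?P<bucket_owner>\S+) (?P<bucket>\S+) \[(?P<time>[^\]]+)\] '
    r'(?P<remote_ip>\S+) (?P<requester>\S+) (?P<request_id>\S+) '
    r'(?P<operation>\S+) (?P<key>\S+) "(?P<request_uri>[^"]*)" '
    r'(?P<http_status>\S+) (?P<error_code>\S+)'
)

# Fabricated example record; every value here is illustrative.
SAMPLE_LINE = (
    '79a59df900b949e5 amzn-s3-demo-bucket1 [06/Feb/2019:00:00:38 +0000] '
    '192.0.2.3 arn:aws:iam::123456789012:role/example-role 3E57427F3EXAMPLE '
    'REST.GET.OBJECT photos/cat.jpg "GET /photos/cat.jpg HTTP/1.1" 200 - '
    '2662992 2662992 70 10 "-" "curl/7.15.1" -'
)


def parse_line(line):
    """Return a dict of the leading fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    # Timestamps look like 06/Feb/2019:00:00:38 +0000.
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    return record


def group_for_compaction(lines):
    """Group parsed records by (bucket, day) so each group can become one large file."""
    batches = defaultdict(list)
    for line in lines:
        record = parse_line(line)
        if record is not None:
            day = record["time"].date().isoformat()
            batches[(record["bucket"], day)].append(record)
    return dict(batches)
```

Each batch returned by `group_for_compaction` stands in for what a compaction job would write out as a single larger file. Grouping by bucket and day is an assumption chosen for the example, not a detail taken from Yelp's post.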

Behind the scenes, the architecture leverages AWS Glue Data Catalog for managing schemas across multiple AWS accounts, and a mix of scheduled batch jobs, Lambda functions, and partition-projection-based tables for robust, automated log ingestion. The system is designed to tolerate delayed or duplicate log delivery, which SAL inherently allows, by making inserts idempotent. It also tags old log objects for lifecycle expiration once their contents are safely archived.
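The idempotency property can be illustrated with a small Python sketch. It is a toy model rather than Yelp's implementation: the `CompactedTable` class is hypothetical, rows live in memory, and the choice of `(bucket, request_id)` as a deduplication key is an assumption made for the example.

```python
class CompactedTable:
    """In-memory stand-in for a compacted log table (hypothetical)."""

    def __init__(self):
        self._rows = {}

    def insert(self, records):
        """Insert records, skipping any already present. Returns the number added."""
        added = 0
        for record in records:
            # Assumed dedup key; a real pipeline may need a different one.
            row_key = (record["bucket"], record["request_id"])
            if row_key not in self._rows:
                self._rows[row_key] = record
                added += 1
        return added

    def contains_all(self, records):
        """True when every record is already stored, so its source object could be tagged."""
        return all(
            (record["bucket"], record["request_id"]) in self._rows
            for record in records
        )

    def __len__(self):
        return len(self._rows)
```

Replaying the same batch leaves the table unchanged, which is the behaviour that makes late or repeated deliveries harmless. A check like `contains_all` is one way to decide when a raw log object is safe to tag for expiration.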

Yelp’s system also supports key operational use-cases. For debugging, engineers can query whether a particular object was accessed (or denied) at a given time. For cost analysis, it is possible to aggregate API usage by IAM role to understand which services or teams generate the most traffic. For data hygiene, combining access logs with S3 inventory allows the team to identify and safely delete objects that haven’t been accessed for defined periods.
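The three use cases above map onto simple filters and aggregations. The Python sketch below runs them over already-parsed records, where each record is a dict with `key`, `requester`, `operation`, `http_status`, and a timezone-aware `time`. In practice these would be SQL queries against the compacted tables. The function names and record shape are assumptions made for illustration.

```python
from collections import Counter


def access_attempts(records, key, start, end):
    """Debugging: who touched `key` in [start, end], and with what HTTP status?"""
    return [
        (r["time"], r["requester"], r["http_status"])
        for r in records
        if r["key"] == key and start <= r["time"] <= end
    ]


def requests_by_requester(records):
    """Cost analysis: count requests per (requester, operation) pair."""
    return Counter((r["requester"], r["operation"]) for r in records)


def stale_keys(inventory_keys, records, now, max_age):
    """Data hygiene: inventory keys with no logged access since now - max_age."""
    cutoff = now - max_age
    recently_used = {r["key"] for r in records if r["time"] >= cutoff}
    return sorted(k for k in inventory_keys if k not in recently_used)
```

A status of `403` in the output of `access_attempts` would indicate a denied request. The `stale_keys` check is only as reliable as the logs it reads, so it assumes the records cover the whole look-back window.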

The significance of Yelp’s work is two-fold: it demonstrates that object-level logging on S3, long considered too expensive or unwieldy at scale, can in fact be made efficient and operationally manageable, and it provides a reference architecture for other companies seeking similar visibility or compliance posture. As demand grows for tighter data governance, auditing, and cost visibility in cloud storage environments, Yelp’s lessons offer a practical approach to scaling access-logging without blowing up storage costs or compromising queryability.

Yelp is not alone here: several other tools and reference architectures follow design patterns similar to the one described in its “S3 server-access logs at scale” post.

Upsolver is a data-lake/ETL platform that offers built-in support for ingesting S3 access logs, converting them into analytics-ready formats, and optimizing them for query engines. Their S3 Access Logs processing workflow mimics what Yelp did: ingest logs, transform them, and make them queryable by SQL engines like Amazon Athena. This allows teams to skip writing custom log-processing pipelines and still get the benefit of scalable log analytics.

AWS itself published an example architecture for processing S3 server-access logs using a Glue job (particularly interesting when paired with Ray for scalable Python-based processing). The pipeline partitions, formats (into Parquet), catalogs the result, and then uses Athena (or, in some cases, visualization tools like QuickSight) to query or analyze access patterns at scale. This essentially matches the “compaction + table + catalog + query” pattern that Yelp adopted, but as a managed recipe from AWS.

Additionally, projects like Druid (for analytical workloads / time-series or event data) and Presto/Trino (for SQL querying over large datasets, including S3 object stores) are often used as the underlying query engines for large-scale log or event data lakes. With logs converted to columnar formats (e.g., Parquet, ORC, or managed via lake-table formats like Apache Iceberg), these engines can serve as scalable, low-latency query layers, making them useful backbones for access-log, audit-log, or event-log architectures.

And for organizations that want near-real-time search/alerting (e.g., for security or anomaly detection), the AWS blog also describes a pattern to ingest server-access logs from S3 into OpenSearch (using Lambda + ingestion pipelines) and visualize them with Kibana. Though this trades off some of the long-term storage efficiency that Parquet + Athena offers, it delivers more immediacy and real-time investigative capability – useful in security, compliance, or operational monitoring contexts.
