The 5-Second Data Delay That Nearly Broke Us (And How We Fixed It in Milliseconds) | HackerNoon

News Room · Published 30 March 2025 (last updated 2025/03/30 at 7:55 PM)

Less than a month ago, our team came upon a challenge that sounded deceptively simple: build a dashboard that monitored transactions in real time for a financial platform we maintained. In principle, it was straightforward – event data would be written to DynamoDB, and OpenSearch would handle the real-time analytics.

The expectation was simple: every event recorded in DynamoDB should be available for analysis in OpenSearch instantly. No delays. No needless waiting.

And this is where we completely miscalculated.

When “Real-Time” Wasn’t Really Real-Time

I vividly remember our CTO’s reaction during the first demo. “Why is there such a lag?” he asked, pointing at the dashboard, which was displaying metrics from almost 5 seconds earlier.

We had promised sub-second latency. What we delivered was a system that, during traffic surges, fell 3-5 seconds behind. Sometimes worse. In financial monitoring, that kind of lag might as well be hours: it is the difference between detecting a system failure immediately and finding out from customer complaints.

Drastic changes were required, and fast.

Our First Attempt: The Traditional Batch Approach

Like many teams before us, we leaned on familiar territory:

  1. Schedule an AWS Lambda job for execution every 5 seconds.
  2. Make it retrieve new records stored in DynamoDB.
  3. Collect these updates and group them into batches.
  4. Push the batches to OpenSearch for indexing.
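
The batching half of that loop can be sketched in two small helpers – split pending records into fixed-size batches, then serialize each batch into OpenSearch’s `_bulk` request format. This is a minimal illustration, not our production code; it assumes the records have already been fetched from DynamoDB, and the index name is a placeholder:

```python
import json

def chunk(records, size):
    """Split a list of records into batches of at most `size` items."""
    return [records[i:i + size] for i in range(0, len(records), size)]

def build_bulk_body(batch, index_name):
    """Serialize one batch into an OpenSearch _bulk body (newline-delimited JSON).

    Each document becomes two lines: an action line naming the index and _id,
    followed by the document itself. The body must end with a newline.
    """
    lines = []
    for doc in batch:
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc["id"]}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"
```

Every 5 seconds the Lambda would run its fetched rows through `chunk` and POST each `_bulk` body to OpenSearch – which is exactly where the hard 5-second floor on data freshness came from.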

It worked. Kind of. But “working” and “meeting our requirements” turned out to be two very different things.

The problems piled up quickly:

  • Built-in delay: New data sat in DynamoDB for up to 5 seconds, waiting for the next batch to pick it up.

  • Indexing latency: Large bulk batches in OpenSearch slowed queries down further.

  • Reliability issues: A crash mid-batch meant every update in that batch was lost.

In one particularly painful incident, our system missed critical error events because they were part of a failed batch. By the time the issue was diagnosed, thousands of transactions had already gone through the system.

“For the love of god, this isn’t sustainable,” our CTO said after yet another incident report. “We need to change the system fundamentally and stream these updates live instead of batching them.”

He was right.

The Solution: Live Streaming Updates As They Occur

Late one night, deep in the AWS documentation, the solution hit me: DynamoDB Streams.

What if we could capture and process every single change made to the DynamoDB table the second it happened instead of having to pull updates in batches on a schedule?

This completely changed how we operated, for the better:

  • Set up DynamoDB Streams to capture every insert, modification, and removal of records

  • Add an AWS Lambda function to process these changes the moment they arrive

  • Do some light processing on the data and push the updates to OpenSearch
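
One detail worth knowing up front: a stream record wraps every attribute in a type descriptor, e.g. `{"timestamp": {"N": "1711800000"}}`, so most of the “light processing” is just unwrapping that envelope. A minimal, self-contained converter – the `BOOL` branch and the numeric coercion are my additions for illustration, not taken from our pipeline:

```python
def unwrap_stream_image(image):
    """Convert a DynamoDB Streams NEW_IMAGE dict into plain Python values.

    Handles the common type descriptors: S (string), N (number), BOOL.
    DynamoDB sends numbers as strings, so N values are parsed here.
    """
    def unwrap(attr):
        if "S" in attr:
            return attr["S"]
        if "N" in attr:
            n = attr["N"]
            return float(n) if "." in n else int(n)
        if "BOOL" in attr:
            return attr["BOOL"]
        raise ValueError(f"unsupported type descriptor: {attr}")

    return {key: unwrap(value) for key, value in image.items()}
```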

In my initial tests, the results were incredible. Latency dropped from 3-5 seconds to under 500ms. I will never forget the 3 AM Slack message I sent to the team: “I think we’ve cracked it.”

Making it Work in Practice

This was not just a proof-of-concept exercise; we had to get a production pipeline working. Over one of many sleepless nights and mountains of coffee, we decomposed the problem into three actionable steps. The first was getting notified about changes in DynamoDB.

Getting Notified About Changes in DynamoDB

Challenge number one: how do we know that something changed in DynamoDB? After some Googling, I discovered we needed to enable DynamoDB Streams. As it turned out, it was one CLI command away – albeit a painful one for me at first.

aws dynamodb update-table \
    --table-name RealTimeMetrics \
    --stream-specification StreamEnabled=true,StreamViewType=NEW_IMAGE

I still recall telling my coworker at 11 PM how excited I was when I got this to work: “It’s working! The table is streaming all changes it goes through!”
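
If you want to verify that the stream really is live, `describe-table` reports the ARN of the new stream (same table name as above) – a quick sanity check before wiring anything else up:

```shell
aws dynamodb describe-table \
    --table-name RealTimeMetrics \
    --query "Table.LatestStreamArn"
```

You will need that ARN in the next step anyway, when the stream gets connected to a consumer.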

Creating a Change Event Listener

Now, with streams enabled, we needed something to catch those events. So we built a Lambda function I decided to call “The Watchdog.” It waits for DynamoDB to report a change, and as soon as one arrives, it springs into action.

import json
import boto3
import requests

OPENSEARCH_URL = "https://your-opensearch-domain.com"
INDEX_NAME = "metrics"
HEADERS = {"Content-Type": "application/json"}

def lambda_handler(event, context):
    records = event.get("Records", [])

    for record in records:
        if record["eventName"] in ["INSERT", "MODIFY"]:
            new_data = record["dynamodb"]["NewImage"]
            doc_id = new_data["id"]["S"]  # Primary key
              
            # Convert DynamoDB format to JSON
            doc_body = {
                "id": doc_id,
                "timestamp": new_data["timestamp"]["N"],
                "metric": new_data["metric"]["S"],
                "value": float(new_data["value"]["N"]),
            }

            # Send the update to OpenSearch (auth/signing omitted for brevity)
            response = requests.put(
                f"{OPENSEARCH_URL}/{INDEX_NAME}/_doc/{doc_id}",
                headers=HEADERS,
                json=doc_body,
                timeout=5,  # don't let a slow cluster hang the Lambda
            )
            print(f"Indexed {doc_id}: {response.status_code}")

    return {"statusCode": 200, "body": json.dumps("Processed Successfully")}

The code looks simple now, but it took me three attempts to get it working. Our first version kept crashing with timeouts because we mishandled the record format that DynamoDB Streams sends.
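
The function alone isn’t enough: Lambda has to be subscribed to the stream via an event source mapping. Roughly like this – the function name, stream ARN, and batch size below are placeholders for illustration, not our actual values:

```shell
aws lambda create-event-source-mapping \
    --function-name watchdog-indexer \
    --event-source-arn arn:aws:dynamodb:us-east-1:123456789012:table/RealTimeMetrics/stream/2025-03-01T00:00:00.000 \
    --starting-position LATEST \
    --batch-size 100
```

`LATEST` means the function only sees changes made after the mapping is created; a smaller `--batch-size` trades throughput for lower per-record latency.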

Teaching OpenSearch to Keep Up

The last issue was the hardest to solve and caught us off guard. Even though we were updating OpenSearch immediately, the updates were not visible in real time. It turns out OpenSearch only makes newly indexed documents searchable on its periodic refresh cycle.

“This makes no sense,” my teammate moaned. “We are sending real-time data and it does not show in real-time!”

curl -X PUT "https://your-opensearch-domain.com/metrics/_settings" \
  -H 'Content-Type: application/json' \
  -d '{
    "index": {
      "refresh_interval": "500ms",
      "number_of_replicas": 1
    }
  }'

After a bit of research and trial and error, we found the setting we needed, and changing it made a huge difference. It instructed OpenSearch to make new data searchable within half a second instead of waiting for the default refresh cycle. I nearly leapt out of my seat when I watched the first event show up on our dashboard milliseconds after it was written to DynamoDB.

Consequences: From Seconds to Milliseconds

The first week with the new system in production taught us a great deal. The dashboard was no longer a lagged view of reality but a fully functioning real-time service.

We achieved:

  • Average latency of < 500ms (was 3-5 seconds)

  • No more batch delays – propagation of changes was instantaneous

  • Zero indexing bottlenecks – smaller, more frequent updates were more efficient

  • Improved overall system resilience – no more ‘all or nothing’ batch failures

When we shared the updated dashboard with our leadership, they noticed the change immediately. The difference was clear; our CTO mentioned, “This is what we needed from the beginning.”

Further Scaling: Managing Traffic Peaks

As well as the new approach worked under normal traffic, it struggled during extreme usage spikes. During end-of-day reconciliation, the event rate would jump from hundreds to thousands per second.

To mitigate this, we added Amazon Kinesis Data Firehose as a buffer. Instead of Lambda sending every single update directly to OpenSearch, we changed it to stream data out to Firehose:

firehose_client = boto3.client("firehose")

firehose_client.put_record(
    DeliveryStreamName="MetricsStream",
    Record={"Data": json.dumps(doc_body)}
)

Firehose took care of delivery to OpenSearch, scaling automatically in response to throughput requirements without compromising the pipeline’s real-time characteristics.
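
Folding that into the handler is a small change: instead of a PUT per document, each converted record goes to the delivery stream. Here is a sketch with the client passed in explicitly (the helper name is mine, and the stream name matches the snippet above); in a real Lambda you would create `boto3.client("firehose")` once, outside the handler:

```python
import json

def forward_to_firehose(client, stream_name, doc):
    """Send one JSON document to a Kinesis Data Firehose delivery stream.

    Firehose handles buffering, retries, and delivery to OpenSearch,
    so the Lambda no longer talks to the cluster directly.
    A trailing newline keeps delivered records line-delimited.
    """
    return client.put_record(
        DeliveryStreamName=stream_name,
        Record={"Data": (json.dumps(doc) + "\n").encode("utf-8")},
    )
```

Passing the client in also makes the function trivial to exercise with a stub in unit tests, which mattered to us after the batch-failure incidents.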

Lessons Learned: The Quest for Speed Goes On

We learned that with real-time data systems, reducing lag is a never-ending battle. Even with our metrics lag now under 500ms, we are still pushing further:

  • Running transformation steps in OpenSearch Pipelines instead of in Lambda

  • Using AWS ElastiCache for faster retrieval of frequent queries

  • Considering edge computing for users spread around the globe
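
For the first item, one option is OpenSearch’s ingest pipelines, which run per-document transforms on the cluster at index time instead of in Lambda. A hedged sketch of the idea – the pipeline name is made up, and the trivial `set` processor merely stands in for real transforms:

```shell
curl -X PUT "https://your-opensearch-domain.com/_ingest/pipeline/metrics-enrich" \
  -H 'Content-Type: application/json' \
  -d '{
    "description": "Tag each metric document at ingest time",
    "processors": [
      { "set": { "field": "ingested_by", "value": "pipeline" } }
    ]
  }'
```

Indexing requests then reference the pipeline with `?pipeline=metrics-enrich`, which keeps the Lambda (or Firehose) path thin.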

In financial monitoring, every microsecond counts. As one of our senior engineers puts it, “In monitoring, you’re either ahead of a problem or behind it. There’s no in-between.”

What Have You Tried?

Have you tackled real-time data problems of your own? What made the difference? I’m eager to hear about your adventures on the AWS edge.
