Working With LLMs for a Whole Year: The Lessons I Picked Up Along the Way | HackerNoon

News Room · Published 7 April 2025

How Not to Mess Up Your AI Build (and Actually Save Time)

We’re in the era of AI. Every company wants to add some kind of AI feature to its product. LLMs (large language models) are popping up everywhere; some companies are even building their own. But while everyone’s jumping on the hype, I want to share a few things I’ve learned from working with LLMs across different projects over the past year.

1. Pick the Right Model for the Right Job

One big mistake I’ve seen (and made) is assuming all LLMs are equal. They’re not.

Some models have far more knowledge about certain locations or domains (Gemini, for instance, performs well in some areas where OpenAI models don’t). Some are good at reasoning but slow. Others are fast but weak at critical thinking.

Each model has its own strengths. Use OpenAI’s GPT-4 for deep reasoning tasks. Use Claude or Gemini for other areas, depending on how they’ve been trained. Models like Gemini Flash are optimized for speed but tend to skip deeper reasoning.

The bottom line: Don’t use one model for everything. Be intentional. Try, test, and pick the best one for your use case.
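One way to make that choice intentional is to put it in a routing table instead of hardcoding a single model everywhere. The task categories and model names below are illustrative stand-ins, not benchmark-backed recommendations; a minimal sketch:

```python
# Hypothetical task-to-model routing table. The categories and model
# choices here are examples; fill in whatever your own testing shows.
MODEL_FOR_TASK = {
    "deep_reasoning": "gpt-4",
    "fast_summary": "gemini-1.5-flash",
    "long_context": "claude-3-sonnet",
}

def pick_model(task_type: str, default: str = "gpt-4") -> str:
    """Return the model configured for a task type, falling back to a default."""
    return MODEL_FOR_TASK.get(task_type, default)
```

Swapping a model for one use case then becomes a one-line config change rather than a hunt through the codebase.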

2. Don’t Expect LLMs to Do All the Thinking

I used to believe you could just throw a prompt at an LLM, and it would do all the heavy lifting.

For example, I was working on a project where users selected their favorite teams, and the app had to generate a match-based travel itinerary. At first, I thought I could just send the whole match list to the LLM and expect it to pick the best ones and build the itinerary. It didn’t work.

It was slow, messy, and unreliable.

So I changed the approach: the system picks the right matches first, then passes only the relevant ones to the LLM to generate the itinerary. That worked much better.
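The split looks roughly like this. The match fields (`team`, `date`, `city`) are hypothetical placeholders for whatever your data actually contains; the point is that deterministic filtering happens in code, and only the short, relevant list reaches the model:

```python
from datetime import date

def select_matches(matches, favorite_teams, today, limit=5):
    """App-side logic: keep only upcoming matches for the user's teams,
    so the LLM receives a short, relevant list instead of the full schedule."""
    relevant = [
        m for m in matches
        if m["team"] in favorite_teams and m["date"] >= today
    ]
    return sorted(relevant, key=lambda m: m["date"])[:limit]

def build_itinerary_prompt(selected):
    """The LLM only generates; the selection was already made above."""
    lines = [f"- {m['team']} on {m['date'].isoformat()} in {m['city']}"
             for m in selected]
    return "Plan a travel itinerary around these matches:\n" + "\n".join(lines)
```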

Lesson? Let your app handle the logic. Use LLMs to generate things, not to decide everything. They’re great at language but not always at logic, at least for now.

3. Give Each Agent One Responsibility

Trying to make a single LLM do multiple jobs is a recipe for confusion.

In one of my projects, I had a supervisor agent that routed messages to different specialized agents based on user input. Initially, I piled too much logic into it: handling context, figuring out follow-ups, deciding thread continuity, and so on. It eventually got confused and made the wrong calls.

So, I split it up. I moved some logic (like thread continuity) outside and kept the supervisor focused only on routing. After that, things became much more stable.
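Stripped down to one responsibility, the supervisor becomes easy to reason about and test. The keyword rules and agent names below are hypothetical stand-ins for whatever classifier or LLM call you actually use to route:

```python
def route(message: str) -> str:
    """Supervisor's single job: decide which specialist agent handles a message.
    The keyword rules here are a toy stand-in for a real classifier."""
    text = message.lower()
    if any(word in text for word in ("refund", "invoice", "charge")):
        return "billing_agent"
    if any(word in text for word in ("error", "crash", "bug")):
        return "support_agent"
    return "general_agent"
```

Everything else (thread continuity, context assembly) lives outside this function, so a routing bug can’t corrupt conversation state and vice versa.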

Lesson: Don’t overload your agents. Keep one responsibility per agent. This helps reduce hallucinations and improves reliability.

4. Latency Is Inevitable – Use Streaming

LLMs that are good at reasoning are usually slow. That’s just the reality right now. Some models like GPT-4 or Claude 2 take their time, especially with complex prompts. You can’t fully eliminate the delay, but you can make it feel better for the user.

One way to do that? Stream the output as it’s generated. Most LLM APIs support text streaming, allowing you to start sending partial responses - even sentence by sentence - while the rest is still being processed.

In my apps, I stream whatever’s ready to the client as soon as it’s available. It gives users a sense of progress, even if the full result takes a bit longer.
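The shape of that pipeline is simple. Here the LLM stream is simulated with a generator so the sketch is self-contained; in a real app the `for chunk in ...` loop would iterate over your provider’s streaming response (e.g. chunks from an API call with `stream=True`), and `send` would write to a websocket or SSE connection:

```python
def fake_llm_stream(answer):
    """Stand-in for a streaming LLM API that yields partial text chunks."""
    for word in answer.split(" "):
        yield word + " "

def stream_to_client(chunks, send):
    """Forward each chunk the moment it arrives, instead of buffering the full reply."""
    received = []
    for chunk in chunks:
        send(chunk)            # in a real app: write to a websocket / SSE stream
        received.append(chunk)
    return "".join(received)   # keep the full text for logging or caching
```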

Lesson: You can’t avoid latency, but you can hide it. Stream early, stream often - even partial output makes a big difference in perceived speed.

5. Fine-Tuning Can Save You Time (and Tokens)

People often avoid fine-tuning because it seems complex or expensive. But in some cases, it actually saves a lot.

If your prompts must include the same structure or context every time and caching doesn’t help, you’re spending extra tokens and time on every call. Instead, fine-tune the model on that structure. After that, you don’t need to pass the same example each time; the model just knows what to do.
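As a concrete sketch: OpenAI’s chat fine-tuning data is JSONL, one example per line, each with a `messages` list. The itinerary-style system message below is a hypothetical example of a “stable structure” you might bake in:

```python
import json

# Example of a stable structure worth baking into the model via fine-tuning,
# instead of resending it in every prompt. The wording is illustrative.
SYSTEM_STYLE = ("You are a travel assistant. Always answer as a 3-section "
                "itinerary: Morning, Afternoon, Evening.")

def to_finetune_record(user_prompt, ideal_answer):
    """Build one JSONL line in OpenAI's chat fine-tuning format."""
    record = {"messages": [
        {"role": "system", "content": SYSTEM_STYLE},
        {"role": "user", "content": user_prompt},
        {"role": "assistant", "content": ideal_answer},
    ]}
    return json.dumps(record)
```

Write a few hundred such lines to a `.jsonl` file and upload it as a fine-tuning dataset; the structure then travels with the model instead of with every request.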

But be careful: don’t fine-tune on data that changes frequently, like flight times or prices. You’ll end up teaching the model outdated information. Fine-tuning works best when the logic and format are stable.

Lesson: Fine-tune when things are consistent - not when they’re constantly changing. It saves effort in the long run and leads to faster, cheaper prompts.

🎯 Final Thoughts

Working with LLMs is not just about prompts and APIs. It’s about architecture, performance, clarity, and most importantly, knowing what to expect (and what not to expect) from these models.

Hope this helps someone out there building their next AI feature.

Copyright © All Rights Reserved. World of Software.