RAG Systems Are Breaking the Barriers of Language Models: Here’s How | HackerNoon

News Room · Published 23 July 2025 (last updated 6:39 PM)

Large language models (LLMs) are hugely popular in the software world these days. New articles, blog posts, courses, and models appear constantly from leading companies in our industry, such as Meta, Hugging Face, and Microsoft, which requires us to follow these technologies closely.

We’ve decided to write a few short, informative articles to introduce these topics and help you stay up to date with the latest technology. The first topic we will cover is RAG (Retrieval-Augmented Generation).

This will be a series of three complementary articles on the subject. In this one, we begin with the definition and fundamentals of RAG models.

Large language models have entered every aspect of our lives; you could say they’ve revolutionized the field. However, they are not as seamless a tool as we like to claim. They have a major drawback: they remain faithful to the data they were trained on and cannot deviate from it. A model that finished training in November 2022 cannot know about news, laws, or technological developments that emerged in January 2023. For example, an LLM that completed training and entered service in 2021 cannot answer a question about the Russia-Ukraine war that began on February 24, 2022.

This is because its development was completed before that date. Of course, the problem was not left unsolved: a new system, RAG (Retrieval-Augmented Generation), emerged to provide up-to-date information whenever it is needed. In the rest of this article, let’s take a closer look at pure LLM systems and RAG systems, one by one.

[Figure: Comparison of LLM and RAG systems]

Access to Information in LLM Models

The working principle of large language models rests on the data taught during training, in other words, on static knowledge. They have no means of pulling in external data. As an analogy, consider a child: if we cut off the child’s contact with the outside world and teach them only English, we will never hear a single word of Chinese from them.

That is because we’ve raised a child fluent in English, not Chinese, and by severing their connection to the outside world we’ve also restricted their ability to learn from external sources. Just like this child, LLM models are instilled with a base of knowledge but remain closed to external data.

[Figure: LLM architecture with encoder and decoder structures]

Another characteristic of LLM models is that they are black boxes. These models are not really aware of why they perform the operations they perform; they base everything purely on mathematical calculation. Ask any LLM “Why did you give that answer?” and you are probably pushing it too hard: it does not answer questions through reasoning or research. The keyword here is “why?”, and we can regard this as a built-in flaw of LLM models.

To better understand this structure, let’s consider an example from healthcare. When a user asks, “I have pain in my face and eyes, and persistent postnasal drip. What should I do?”, the LLM might respond, “Pain in the face and eyes together with postnasal drip can be a sign of sinusitis. Please consult a doctor. Acute sinusitis is treated with antibiotics, and alongside medication you can use nasal sprays such as seawater or saline to soothe the sinuses.”

Everything seems normal up to this point. But if we then ask the model, “Why did you give that answer?”, things get complicated. The model gave this answer because the words “facial pain” and “nasal drip” frequently appeared together with the word “sinusitis” in its training data. These models store information in memory as statistical patterns; for an LLM, everything comes down to mathematical expressions.
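To make that idea concrete, here is a toy sketch in Python. It is nothing like a real transformer, but it shows how plain co-occurrence statistics over a corpus produce exactly this kind of “sinusitis” association; the corpus sentences are invented for illustration.

```python
from collections import Counter
from itertools import combinations

# Toy corpus standing in for training data (hypothetical sentences).
corpus = [
    "facial pain and postnasal drip are classic signs of sinusitis",
    "sinusitis often causes facial pain and pressure around the eyes",
    "postnasal drip with facial pain usually points to sinusitis",
    "a sprained ankle causes swelling and pain in the foot",
]

# Count how often word pairs co-occur within the same sentence.
pair_counts = Counter()
for sentence in corpus:
    words = set(sentence.split())
    for a, b in combinations(sorted(words), 2):
        pair_counts[(a, b)] += 1

# The "knowledge" is just a pattern: 'pain' co-occurs with 'sinusitis'
# far more often than unrelated terms do, so the two get associated.
print(pair_counts[("pain", "sinusitis")])   # 3
print(pair_counts[("ankle", "sinusitis")])  # 0
```

The counter can tell you that the pair is frequent, but it has no record of which source said so, which is precisely the black-box problem described above.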

Because the model does not store information tied to its sources, it answers the question instantly and satisfies most people; but when someone of a more investigative nature asks, “Why did you give this answer?”, it fails to produce any explanatory response. With that, we have a sufficient picture of LLM models, and we can move on to the RAG system and how it solves these problems.

RAG: Combining LLMs With Retrieval Systems

RAG systems offer real innovations over systems built on pure LLM models. One is that they work with dynamic rather than static information: they are not limited to the data they were trained on but also scan external sources. The “R” in RAG stands for retrieval, and the retrieval component’s role is to perform the search.

The generator is the second main component, and its role is to produce the correct answer from the data retrieval returns. Briefly, RAG’s operating principle is this: retrieval scans external sources and, by breaking documents into small pieces called “chunks,” fetches the ones relevant to the user’s question. It vectorizes both the chunks and the question, then selects the best-matching chunks by measuring the similarity between the vectors; producing the final answer from that context is the job of “generation,” which we will discuss in detail in the next article. A minimal code sketch of the retrieval step follows below.

These core components make RAG a stronger structure: they keep it from being stuck with static information like pure LLM models. RAG systems offer significant advantages especially in fields that require up-to-date information. Suppose, for example, you want to build a doctor’s model for the medical field. Because the model will serve a vital area, it must not carry outdated or incomplete information. Medicine, just like the IT sector, advances every day, with new studies constantly appearing, so your model is expected to keep up with even the latest research; otherwise, you risk endangering human life with a misleading model. In such cases, RAG-supported systems eliminate the outdated-information problem by connecting to external databases.
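Here is a minimal sketch of that chunk-vectorize-match flow, assuming scikit-learn is available. TF-IDF vectors stand in for the learned embeddings a production system would use, and the example chunks are invented for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# 1. "Chunk" the external knowledge base (here, one sentence per chunk).
chunks = [
    "US inflation for December 2024 was announced as 2.9 percent.",
    "The Russia-Ukraine war began on February 24, 2022.",
    "Acute sinusitis is commonly treated with antibiotics.",
]

question = "What is the United States' December 2024 inflation rate?"

# 2. Vectorize the chunks and the question in the same vector space.
vectorizer = TfidfVectorizer()
chunk_vectors = vectorizer.fit_transform(chunks)
question_vector = vectorizer.transform([question])

# 3. Retrieve the chunk most similar to the question.
scores = cosine_similarity(question_vector, chunk_vectors)[0]
best_chunk = chunks[scores.argmax()]

# 4. The generator (an LLM) would receive the question plus this context.
prompt = f"Context: {best_chunk}\n\nQuestion: {question}\nAnswer:"
print(prompt)
```

A real deployment would swap TF-IDF for a neural embedding model and the final print for an actual LLM call, but the retrieve-then-generate structure is the same.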

[Figure: Base form of the RAG architecture]

The fundamental difference between RAG systems and LLM models is RAG’s core philosophy: “Don’t store information, access it when you need it!” While pure large language models store information in their memories and produce answers after being trained, RAG systems access information by searching and scanning outside whenever they need it, in line with their philosophy.

Much as a human would search the internet, this approach overcomes one of the most significant disadvantages of pure LLM models: reliance on memorized knowledge. To further illustrate the point, let’s compare the two systems.

Scenario:

User: “What is the United States’ December 2024 inflation rate?”

LLM: “According to December 2022 data, it was 6.5%.” (Not an up-to-date answer)

RAG:

  1. Retrieves December 2024 data from a reliable source or database (World Bank, Trading Economics, etc.).
  2. The LLM uses this data and responds, “According to Trading Economics, US inflation for December 2024 was announced as 2.9%.”
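In code, that two-step flow might look like the following sketch. `retrieve_latest` and `llm_complete` are hypothetical stand-ins for a real retrieval layer and a real LLM API, and the returned figures simply mirror the scenario above.

```python
# Hedged sketch of the RAG flow: retrieve first, then generate with context.

def retrieve_latest(query: str) -> str:
    # Hypothetical placeholder: a real system would query a live source
    # such as Trading Economics or a vector database here.
    return "Trading Economics: US inflation, December 2024: 2.9%"

def llm_complete(prompt: str) -> str:
    # Hypothetical placeholder for a call to an actual LLM API.
    return ("According to Trading Economics, US inflation for "
            "December 2024 was announced as 2.9%.")

query = "What is the United States' December 2024 inflation rate?"
context = retrieve_latest(query)  # step 1: retrieval

# Step 2: ground the generator in the retrieved context.
prompt = (
    "Answer using only the context below and cite the source.\n"
    f"Context: {context}\n"
    f"Question: {query}"
)
print(llm_complete(prompt))
```

Note that the prompt instructs the model to cite its source, which is what gives RAG the transparency advantage listed in the table below.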

Let’s briefly summarize the comparison in the table below.

Feature         | LLM (static model)                                   | RAG (retrieval-augmented generation)
Information     | Limited to training data                             | Pulls real-time information from external sources
Up-to-dateness  | Low                                                  | High
Transparency    | Source of the answer cannot be disclosed (black box) | Sources can be cited

In conclusion: while LLM models are limited to the data they were trained on, RAG systems are built on an LLM’s base knowledge and can additionally pull real-time information from external sources. That advantage keeps them permanently up to date. This concludes the first article in our series; the next one will dig into the more technical details. Readers who want to examine a working implementation can find links to the relevant repos, built with Python and related libraries, in my GitHub account via the links at the end of this article.

Hope to see you in the next article of the series.

Metin YURDUSEVEN.

Further Reading

We would also like to acknowledge the foundational contribution of Facebook AI’s 2020 RAG paper, which significantly informed this article’s perspective.

  • metinyurdev (GitHub)
  • Multi-Model RAG Chatbot Project
  • PDF RAG Chatbot Project
