World of Software

Hugging Face Introduces RTEB, a New Benchmark for Evaluating Retrieval Models

News Room
Last updated: 2025/10/16 at 2:58 AM
News Room Published 16 October 2025

Hugging Face has introduced the Retrieval Embedding Benchmark (RTEB), a new evaluation framework designed to measure more accurately how well embedding models generalize in real-world retrieval tasks. The benchmark, currently in beta, aims to establish a community standard for evaluating retrieval accuracy on both open and private datasets.

Retrieval quality is crucial for many AI systems, including RAG, intelligent agents, enterprise search, and recommendation engines. However, existing benchmarks often fail to reflect real-world performance. Models may score well on public benchmarks yet fall short in production because they were indirectly trained on the evaluation data, which produces a “generalization gap.” This makes it difficult for developers to predict how their models will handle unseen data.

RTEB tackles this problem with a hybrid evaluation strategy. It combines open datasets, which are public and reproducible, with private datasets that remain accessible only to the MTEB maintainers, ensuring that results reflect genuine generalization rather than memorization. For each private dataset, only descriptive statistics and sample examples are released, maintaining transparency while preventing data leakage.
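To make the idea of a generalization gap concrete, here is a minimal sketch of how open and private scores could be compared. The function name, the dataset names, and all of the numbers are invented for illustration. They are not RTEB results, and this is not how the leaderboard computes anything.

```python
from statistics import mean


def generalization_gap(open_scores, private_scores):
    """Return (open_mean, private_mean, gap), where gap = open_mean - private_mean.

    A large positive gap suggests the model does better on data it may
    have seen (open sets) than on data it cannot have seen (private sets).
    """
    open_mean = mean(open_scores.values())
    private_mean = mean(private_scores.values())
    return open_mean, private_mean, open_mean - private_mean


# Placeholder scores, made up for this example only.
open_scores = {"open_legal": 0.80, "open_code": 0.70}
private_scores = {"private_legal": 0.60, "private_code": 0.50}

open_mean, private_mean, gap = generalization_gap(open_scores, private_scores)
print(f"open={open_mean:.2f} private={private_mean:.2f} gap={gap:.2f}")
```

Reporting the two averages side by side, rather than a single blended score, is one way to keep the difference between memorization and generalization visible.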

In addition to its methodological improvements, RTEB focuses on real-world applicability. It includes datasets across critical domains such as law, healthcare, finance, and code, and covers 20 languages, from English and Japanese to Bengali and Finnish. The benchmark’s simplicity is also deliberate: datasets are large enough to be meaningful but small enough to enable efficient evaluation.

The launch of RTEB has already sparked discussion among AI researchers and practitioners. On LinkedIn, Shai Nisan, Ph.D., Head of AI at Copyleaks, commented:

Beautiful work! Thank you for this. Anyways, it’s highly important to have your own private benchmark on your specific task. That’s the best way to predict success.

Tom Aarsen, one of the benchmark’s co-authors and a maintainer of Sentence Transformers at Hugging Face, replied:

That’s the be-all-end-all, but not everyone has that data ready. If you can, though: use your own tests. E.g. Sentence Transformers allows for easily swapping out models.
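The "use your own tests" advice can be sketched as a small private benchmark harness. Everything below is an assumption made for illustration: the toy bag-of-words embedding stands in for a real model, the three-document corpus is invented, and recall@k is just one simple metric choice. To test a real model, for example one loaded through Sentence Transformers, both the `embed` function and the similarity function would need to be replaced with versions that work on dense vectors.

```python
import math
from collections import Counter


def toy_embed(text):
    """Stand-in embedding: bag-of-words counts (hypothetical, not a real model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recall_at_k(queries, corpus, relevant, embed, k=1):
    """Fraction of queries whose relevant document appears in the top-k results."""
    doc_vecs = {doc_id: embed(text) for doc_id, text in corpus.items()}
    hits = 0
    for query_id, query_text in queries.items():
        query_vec = embed(query_text)
        ranked = sorted(
            doc_vecs,
            key=lambda doc_id: cosine(query_vec, doc_vecs[doc_id]),
            reverse=True,
        )
        if relevant[query_id] in ranked[:k]:
            hits += 1
    return hits / len(queries)


# Tiny invented test set; a real private benchmark would use your own task data.
corpus = {
    "d1": "refund policy for annual subscriptions",
    "d2": "how to reset your account password",
    "d3": "supported payment methods and invoices",
}
queries = {"q1": "reset password", "q2": "refund for subscriptions"}
relevant = {"q1": "d2", "q2": "d1"}

score = recall_at_k(queries, corpus, relevant, toy_embed, k=1)
print(f"recall@1 = {score:.2f}")
```

Because the embedding function is passed in as an argument, comparing candidate models on the same private data comes down to calling `recall_at_k` once per model.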

The team also notes several limitations and future directions for RTEB. The benchmark currently focuses on text-only retrieval and may later expand to include multimodal tasks such as text-to-image search. The maintainers are also working to extend language coverage, particularly for Chinese, Arabic, and low-resource languages, and are encouraging community contributions of new datasets.

With RTEB now live on Hugging Face’s MTEB leaderboard under the new Retrieval section, developers and researchers can already submit their models for evaluation. The project’s maintainers emphasize that this is only the beginning: RTEB will evolve through open collaboration, with the long-term goal of becoming the community’s trusted standard for measuring retrieval performance in AI.
