By using this site, you agree to the Privacy Policy and Terms of Use.
Accept
World of SoftwareWorld of SoftwareWorld of Software
  • News
  • Software
  • Mobile
  • Computing
  • Gaming
  • Videos
  • More
    • Gadget
    • Web Stories
    • Trending
    • Press Release
Search
  • Privacy
  • Terms
  • Advertise
  • Contact
Copyright © All Rights Reserved. World of Software.
Reading: Google DeepMind Launches EmbeddingGemma, an Open Model for On-Device Embeddings
Share
Sign In
Notification Show More
Font ResizerAa
World of SoftwareWorld of Software
Font ResizerAa
  • Software
  • Mobile
  • Computing
  • Gadget
  • Gaming
  • Videos
Search
  • News
  • Software
  • Mobile
  • Computing
  • Gaming
  • Videos
  • More
    • Gadget
    • Web Stories
    • Trending
    • Press Release
Have an existing account? Sign In
Follow US
  • Privacy
  • Terms
  • Advertise
  • Contact
Copyright © All Rights Reserved. World of Software.
World of Software > News > Google DeepMind Launches EmbeddingGemma, an Open Model for On-Device Embeddings
News

Google DeepMind Launches EmbeddingGemma, an Open Model for On-Device Embeddings

News Room
Last updated: 2025/09/11 at 3:20 PM
News Room Published 11 September 2025
Share
SHARE

Google DeepMind has introduced EmbeddingGemma, a 308M parameter open embedding model designed to run efficiently on-device. The model aims to make applications like retrieval-augmented generation (RAG), semantic search, and text classification accessible without the need for a server or internet connection.

The model is built with Matryoshka representation learning, allowing embeddings to be truncated to smaller vectors, and uses Quantization-Aware Training for efficiency. According to Google, inference can be done in under 15ms for short inputs on EdgeTPU hardware.

Despite its compact size, EmbeddingGemma ranks as the highest-performing open multilingual embedding model under 500M parameters on the Massive Text Embedding Benchmark (MTEB). It supports over 100 languages and can operate in less than 200MB of RAM when quantized. Developers can also adjust output dimensions (from 768 to 128) for different speed and storage trade-offs, while maintaining quality.

Source: Google Blog

EmbeddingGemma is intended for offline and privacy-sensitive scenarios, such as searching across personal files locally, running mobile RAG pipelines with Gemma 3n, or building domain-specific chatbots. Developers can fine-tune the model for specialized tasks, and it is already integrated with tools like transformers.js, llama.cpp, MLX, Ollama, LiteRT, and LMStudio.

On Reddit, users discussed the role of embeddings in practice:

What do people actually use embedding models for? I knew the applications, but how does it specifically help with them?

User igorwarzocha replied: 

Apart from obvious search engines, you can place it between a larger model and your database as a helper model. A few coding apps have this functionality. unsure if this actually helps or confuses the LLM even more.

I tried using it as a “matcher” for description vs keywords (or the other way round, can’t remember) to match an image from the generic assets library to the entry, without having to do it manually. It kind of worked, but I went with bespoke generated imagery instead.

Beyond search, Google suggests EmbeddingGemma could be used for offline assistants, personal file search, or industry-specific chatbots where privacy is a concern. Since the model processes data locally, sensitive information such as emails or business documents does not need to leave the device. Developers can also fine-tune the model for domain-specific tasks or languages.

With this release, Google positions EmbeddingGemma as a complement to its larger server-side Gemini Embedding model, offering developers a choice between offline, efficient embeddings for local applications and higher-capacity embeddings served through the Gemini API for large-scale deployments.

Sign Up For Daily Newsletter

Be keep up! Get the latest breaking news delivered straight to your inbox.
By signing up, you agree to our Terms of Use and acknowledge the data practices in our Privacy Policy. You may unsubscribe at any time.
Share This Article
Facebook Twitter Email Print
Share
What do you think?
Love0
Sad0
Happy0
Sleepy0
Angry0
Dead0
Wink0
Previous Article Apple Watch Series 11 vs. Ultra 3: Which Apple Wearable Should You Pick?
Next Article Complete Guide to Google Ads Budgets (+Free Template!) | WordStream
Leave a comment

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

Stay Connected

248.1k Like
69.1k Follow
134k Pin
54.3k Follow

Latest News

Honor announces Alpha Strategy at MWC 2025, pledging $10 billion for AI ecosystem development · TechNode
Computing
M&S parts ways with CTO after cyber attack | Computer Weekly
News
These 3 “optimizer” apps made my PC worse
Computing
Smarter Security, Smaller Price: Google’s Nest Cam Duo Is 25% Off Right Now
News

You Might also Like

News

M&S parts ways with CTO after cyber attack | Computer Weekly

4 Min Read
News

Smarter Security, Smaller Price: Google’s Nest Cam Duo Is 25% Off Right Now

4 Min Read
News

iPhone 17 vs. Pixel 10: Which phone has the better display, price, and specs?

8 Min Read

The Viral Hearing Aid That Makes Conversations Effortless

0 Min Read
//

World of Software is your one-stop website for the latest tech news and updates, follow us now to get the news that matters to you.

Quick Link

  • Privacy Policy
  • Terms of use
  • Advertise
  • Contact

Topics

  • Computing
  • Software
  • Press Release
  • Trending

Sign Up for Our Newsletter

Subscribe to our newsletter to get our newest articles instantly!

World of SoftwareWorld of Software
Follow US
Copyright © All Rights Reserved. World of Software.
Welcome Back!

Sign in to your account

Lost your password?