Google LLC’s DeepMind artificial intelligence lab has released one of its smallest models yet: Gemma 3 270M, with just 270 million parameters.
That means it’s much smaller than many of the most powerful frontier large language models, which generally have billions of parameters, or internal settings that govern their behavior.
A model’s parameter count is generally a rough measure of its power, but with Gemma 3 270M, Google has opted for something far more streamlined, intended to run directly on low-power devices such as smartphones without an internet connection. Despite its size, Google says Gemma 3 270M is still more than capable of handling a narrow range of complex, domain-specific tasks, because developers can quickly fine-tune it to meet their needs.
Google DeepMind Staff AI Developer Relations Engineer Omar Sanseviero said in a post on X that Gemma 3 270M is open-source and small enough to run “in your toaster,” or alternatively on a device such as the palm-sized Raspberry Pi computer.
This can run in your toaster or directly in your browser
Try it in https://t.co/KAfiH3hUnf
— Omar Sanseviero (@osanseviero) August 14, 2025
In a blog post announcing Gemma 3 270M, Google’s DeepMind team explained that the model combines 170 million “embedding parameters” with 100 million “transformer block parameters.” It’s able to handle very specific and rare tokens too, making it a “strong base model” that can be fine-tuned on specific tasks and languages.
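Back-of-the-envelope arithmetic shows how a 270-million-parameter budget can skew so heavily toward embeddings: the embedding table holds one vector per vocabulary token, so a large tokenizer vocabulary dominates the count. The vocabulary size and hidden dimension below are illustrative assumptions, not figures from Google’s announcement:

```python
# Sketch of the parameter split Google describes:
# ~170M embedding parameters + ~100M transformer block parameters.
# vocab_size and hidden_dim are ASSUMED values for illustration only.
vocab_size = 262_144   # assumed large tokenizer vocabulary
hidden_dim = 640       # assumed embedding width

embedding_params = vocab_size * hidden_dim        # one vector per token
transformer_params = 270_000_000 - 170_000_000    # remainder of the budget

print(f"embedding:   ~{embedding_params / 1e6:.0f}M parameters")
print(f"transformer: ~{transformer_params / 1e6:.0f}M parameters")
```

Under these assumed numbers the embedding table alone comes to roughly 168 million parameters, close to the 170 million Google cites, which also explains why the model can represent “very specific and rare tokens”: a bigger vocabulary means fewer words get split into awkward sub-token pieces.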
The company added that Gemma 3 270M’s architecture delivers “strong performance” on instruction-following tasks while remaining small enough to be fine-tuned rapidly and deployed on devices with limited power. It is based on the larger Gemma 3 models, which are designed to run on a single graphics processing unit, and ships with fine-tuning recipes, documentation and deployment guides for developer tools including Hugging Face, JAX and Unsloth to help users start building applications for the model quickly.
Strong performance in instruction following
Gemma 3 270M’s benchmark results look fairly impressive. On the IFEval benchmark, which aims to measure AI models’ ability to follow instructions properly, an instruction-tuned version of the model achieved a 51.2% score, according to results shared on X. That surpasses the score of similarly sized small models such as Qwen 2.5 0.5B Instruct and SmolLM2 135M Instruct by a large margin. It’s also not far behind some of the smaller billion-parameter models, Google noted.
That said, Gemma 3 270M may not be the best in its class. One of Google’s rivals, a startup called Liquid AI Inc., posted in response that the company neglected to include its LFM2-350M model, which was launched last month and achieved a 65.12% score on the same benchmark, despite having only slightly more parameters.
great release, tho you forgot to include the SoTA in the chart: LFM2-350M @LiquidAI_ pic.twitter.com/n0SQWPmyWV
— Ramin Hasani (@ramin_m_h) August 14, 2025
Nonetheless, Google stressed that Gemma 3 270M is all about energy efficiency, pointing to internal tests using the INT4-quantized version of the model on a Pixel 9 Pro smartphone. It said that in 25 conversations, the model only used up 0.75% of the Pixel’s battery power.
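For scale, Google’s figures work out to a tiny per-conversation cost. The extrapolation below is naive (it ignores screen, radio and idle drain), but it conveys why the INT4-quantized number is notable:

```python
# Rough arithmetic from Google's reported Pixel 9 Pro test:
# 25 conversations consumed 0.75% of the battery (INT4-quantized model).
conversations = 25
battery_used_pct = 0.75

per_conversation_pct = battery_used_pct / conversations  # cost per conversation
naive_full_charge = 100 / per_conversation_pct           # ignores all other drain

print(f"{per_conversation_pct:.2f}% of battery per conversation")
print(f"~{naive_full_charge:,.0f} conversations per charge (model inference only)")
```

That comes to about 0.03% of battery per conversation, or on the order of a few thousand conversations per charge if inference were the only draw on the battery.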
As such, Google says Gemma 3 270M is an excellent option for developers looking to deploy on-device AI, which is often preferable for applications where privacy and offline functionality are necessary.
Accelerating offline and on-device AI
Google emphasized that AI developers need to choose the right tool for the job, rather than simply assuming a bigger model means better application performance. For workloads such as creative writing, compliance checks, entity extraction, query routing, sentiment analysis and structured text generation, it believes that Gemma 3 270M can be fine-tuned to do an effective job with much greater cost efficiency than a multibillion-parameter large language model.
In a demo video posted on YouTube, Google showed how one developer built a Bedtime Story Generator app powered by Gemma 3 270M. It’s capable of running offline in a web browser and creating original stories for kids based on the parent’s prompts:
The video demonstrates Gemma 3 270M’s ability to synthesize multiple inputs at once: the user can specify a main character (such as a magic cat), a setting (an enchanted forest), a theme for the story, a plot twist (the character finds a mysterious box with something inside) and the length of the story. Once the user sets these parameters, Gemma 3 270M quickly generates a coherent, original story based on the inputs.
It’s a great example of how quickly on-device AI is progressing, creating possibilities for new kinds of applications that don’t even need an internet connection.
Google said Gemma 3 270M can be found on Hugging Face, Docker, Kaggle, Ollama and LM Studio, with both pretrained and instruction-tuned versions available to download.
Image: Google