By using this site, you agree to the Privacy Policy and Terms of Use.
Accept
World of SoftwareWorld of SoftwareWorld of Software
  • News
  • Software
  • Mobile
  • Computing
  • Gaming
  • Videos
  • More
    • Gadget
    • Web Stories
    • Trending
    • Press Release
Search
  • Privacy
  • Terms
  • Advertise
  • Contact
Copyright © All Rights Reserved. World of Software.
Reading: Pruna AI open sources its AI model optimization framework | News
Share
Sign In
Notification Show More
Font ResizerAa
World of SoftwareWorld of Software
Font ResizerAa
  • Software
  • Mobile
  • Computing
  • Gadget
  • Gaming
  • Videos
Search
  • News
  • Software
  • Mobile
  • Computing
  • Gaming
  • Videos
  • More
    • Gadget
    • Web Stories
    • Trending
    • Press Release
Have an existing account? Sign In
Follow US
  • Privacy
  • Terms
  • Advertise
  • Contact
Copyright © All Rights Reserved. World of Software.
World of Software > News > Pruna AI open sources its AI model optimization framework | News
News

Pruna AI open sources its AI model optimization framework | News

News Room
Last updated: 2025/03/20 at 4:15 AM
News Room Published 20 March 2025
Share
SHARE

Pruna AI, a European startup that has been working on compression algorithms for AI models, is making its optimization framework open source on Thursday.

Pruna AI has been creating a framework that applies several efficiency methods, such as caching, pruning, quantization and distillation, to a given AI model.

“We also standardize saving and loading the compressed models, applying combinations of these compression methods, and also evaluating your compressed model after you compress it,” Pruna AI co-fonder and CTO John Rachwan told News.

In particular, Pruna AI’s framework can evaluate if there’s significant quality loss after compressing a model and the performance gains that you get.

“If I were to use a metaphor, we are similar to how Hugging Face standardized transformers and diffusers — how to call them, how to save them, load them, etc. We are doing the same, but for efficiency methods,” he added.

Big AI labs have already been using various compression methods already. For instance, OpenAI has been relying on distillation to create faster versions of its flagship models.

This is likely how OpenAI developed GPT-4 Turbo, a faster version of GPT-4. Similarly, the Flux.1-schnell image generation model is a distilled version of the Flux.1 model from Black Forest Labs.

Distillation is a technique used to extract knowledge from a large AI model with a “teacher-student” model. Developers send requests to a teacher model and record the outputs. Answers are sometimes compared with a dataset to see how accurate they are. These outputs are then used to train the student model, which is trained to approximate the teacher’s behavior.

“For big companies, what they usually do is that they build this stuff in-house. And what you can find in the open source world is usually based on single methods. For example, let’s say one quantization method for LLMs, or one caching method for diffusion models,” Rachwan said. “But you cannot find a tool that aggregates all of them, makes them all easy to use and combine together. And this is the big value that Pruna is bringing right now.”

Left to right: Rayan Nait Mazi, Bertrand Charpentier, John Rachwan, Stephan GünnemannImage Credits:Pruna AI

While Pruna AI supports any kind of models, from large language models to diffusion models, speech-to-text models and computer vision models, the company is focusing more specifically on image and video generation models right now.

Some of Pruna AI’s existing users include Scenario and PhotoRoom. In addition to the open source edition, Pruna AI has an enterprise offering with advanced optimization features including an optimization agent.

“The most exciting feature that we are releasing soon will be a compression agent,” Rachwan said. “Basically, you give it your model, you say: ‘I want more speed but don’t drop my accuracy by more than 2%.’ And then, the agent will just do its magic. It will find the best combination for you, return it for you. You don’t have to do anything as a developer.”

Pruna AI charges by the hour for its pro version. “It’s similar to how you would think of a GPU when you rent a GPU on AWS or any cloud service,” Rachwan said.

And if your model is a critical part of your AI infrastructure, you’ll end up saving a lot of money on inference with the optimized model. For example, Pruna AI has made a Llama model eight times smaller without too much loss using its compression framework. Pruna AI hopes its customers will think about its compression framework as an investment that pays for itself.

Pruna AI raised a $6.5 million seed funding round a few months ago. Investors in the startup include EQT Ventures, Daphni, Motier Ventures and Kima Ventures.

Sign Up For Daily Newsletter

Be keep up! Get the latest breaking news delivered straight to your inbox.
By signing up, you agree to our Terms of Use and acknowledge the data practices in our Privacy Policy. You may unsubscribe at any time.
Share This Article
Facebook Twitter Email Print
Share
What do you think?
Love0
Sad0
Happy0
Sleepy0
Angry0
Dead0
Wink0
Previous Article Early Tesla investor calls for Elon Musk to resign amid stock slump
Next Article Insta360 Link 2 Review
Leave a comment

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

Stay Connected

248.1k Like
69.1k Follow
134k Pin
54.3k Follow

Latest News

Related Work: vAttention in LLM Inference Optimization Landscape | HackerNoon
Computing
What is ‘Grow a Garden?’ The Roblox farming simulator exploding in popularity
News
Sam Altman says Meta tried and failed to poach OpenAI’s talent with $100M offers | News
News
China’s cyberspace chief makes Time100 AI list, along with Baichuan AI founder · TechNode
Computing

You Might also Like

News

What is ‘Grow a Garden?’ The Roblox farming simulator exploding in popularity

2 Min Read
News

Sam Altman says Meta tried and failed to poach OpenAI’s talent with $100M offers | News

4 Min Read
News

The 5 best TV shows of 2025 so far, now that we’re halfway through the year

7 Min Read
News

Senate passes stablecoin framework in major crypto milestone

3 Min Read
//

World of Software is your one-stop website for the latest tech news and updates, follow us now to get the news that matters to you.

Quick Link

  • Privacy Policy
  • Terms of use
  • Advertise
  • Contact

Topics

  • Computing
  • Software
  • Press Release
  • Trending

Sign Up for Our Newsletter

Subscribe to our newsletter to get our newest articles instantly!

World of SoftwareWorld of Software
Follow US
Copyright © All Rights Reserved. World of Software.
Welcome Back!

Sign in to your account

Lost your password?