Red Hat AI Inference Server democratizes generative AI

News Room
Published 21 May 2025 (last updated 4:45 AM)

Red Hat AI Inference Server is the new enterprise inference server that advances Red Hat's vision of running any generative AI model, on any AI accelerator, in any cloud environment. It was presented at the Red Hat Summit the company is holding this week, alongside the unveiling of Red Hat Enterprise Linux 10, the star of the event.

Red Hat AI Inference Server joins the Red Hat AI ecosystem as a new offering. Born from the powerful vLLM community project and optimized through Red Hat's integration of Neural Magic technologies, it delivers greater speed, more efficient use of accelerators, and better cost-effectiveness in the hybrid cloud. The server can be deployed standalone or as an integrated component of Red Hat Enterprise Linux AI (RHEL AI) and Red Hat OpenShift AI.
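
To give a sense of what a standalone deployment looks like from an application's point of view, here is a minimal client sketch. It assumes a vLLM-based server is already running locally on port 8000 and serving a model; the endpoint, key, and model name are placeholders for illustration, not details from the announcement. Because vLLM exposes an OpenAI-compatible API, the standard openai Python client works against it:

```python
# Minimal client sketch against a vLLM-based inference server.
# Assumptions: the server runs at localhost:8000 and serves the model named
# below; neither detail comes from the announcement.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # vLLM's OpenAI-compatible endpoint
    api_key="EMPTY",                      # vLLM accepts a dummy key by default
)

response = client.chat.completions.create(
    model="mistralai/Mistral-7B-Instruct-v0.3",  # placeholder model id
    messages=[{"role": "user", "content": "What does an inference server do?"}],
    max_tokens=128,
)
print(response.choices[0].message.content)
```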

Red Hat AI Inference Server: generative AI in the hybrid cloud

Red Hat explains that inference is the critical execution engine of AI, where pre-trained models turn data into practical applications. It is the key point of user interaction, demanding fast and precise answers, as Joe Fernandes, vice president and general manager of the AI Business Unit at Red Hat, puts it:

«Inference is where the true promise of generative AI is realized, where user interactions are answered quickly and accurately by a given model, but it must be done effectively and cost-efficiently. Red Hat AI Inference Server is designed to meet the demand for high-performance, scalable inference while keeping resource needs low, providing a common inference layer that supports any model, running on any accelerator, in any environment».

As generative models grow more complex and production deployments multiply, inference can become a major bottleneck, rapidly consuming hardware resources and threatening to cripple responsiveness and drive up operating costs.

Robust inference servers are no longer a luxury but a necessity for unlocking the true potential of AI at scale and taming the underlying complexities. The new Red Hat server addresses these challenges directly, proposing an open inference solution designed for high performance and equipped with leading model compression and optimization tools.

This allows organizations to make the most of the transformative power of generative AI, offering far more responsive user experiences and unprecedented freedom in their choice of AI accelerators, models, and IT environments.

In any deployment environment, Red Hat AI Inference Server provides users with a hardened and supported distribution of vLLM, along with:

  • Intelligent LLM compression tools that drastically reduce the size of both foundational and fine-tuned AI models, minimizing compute consumption while preserving, and potentially improving, model accuracy.
  • An optimized model repository, hosted in the Red Hat AI organization on Hugging Face, which offers instant access to a validated, optimized collection of leading models ready for inference deployment, helping improve efficiency by 2x to 4x without compromising model accuracy (see the sketch after this list).
  • Red Hat's enterprise support, backed by decades of experience bringing community projects to production environments.
  • Third-party support for greater deployment flexibility, allowing Red Hat AI Inference Server to run on non-Red Hat Linux and Kubernetes platforms under Red Hat's third-party support policy.
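
As a minimal sketch of serving one of those optimized models, the snippet below loads a quantized checkpoint with vLLM's offline API. It assumes vLLM is installed and a GPU is available, and the model id is a hypothetical stand-in for an entry in the Red Hat AI Hugging Face organization, not a name confirmed by the announcement:

```python
# Sketch: loading a compressed model with vLLM's offline API.
# Assumptions: vLLM installed (pip install vllm), a CUDA GPU available, and
# the model id below standing in for a real Red Hat AI Hugging Face entry.
from vllm import LLM, SamplingParams

llm = LLM(model="RedHatAI/Llama-3.1-8B-Instruct-quantized.w4a16")  # hypothetical id
params = SamplingParams(temperature=0.7, max_tokens=128)

outputs = llm.generate(["Summarize the benefits of quantized inference."], params)
print(outputs[0].outputs[0].text)
```

vLLM detects the quantization scheme from the model's configuration, so no extra flags are needed beyond the model id.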

vLLM: the key to inference innovation

Red Hat AI Inference Server is built on the industry-leading vLLM project, started by the University of California, Berkeley in mid-2023. This community project offers high-performance generative inference, support for long input contexts, multi-GPU model acceleration, and continuous batching, among other features.
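
A short sketch of two of those capabilities: a batch of prompts (which vLLM schedules internally with continuous batching) and multi-GPU tensor parallelism. The model id, GPU count, and context length are illustrative assumptions:

```python
# Sketch of vLLM features named above: a batch of prompts handled by the
# continuous-batching scheduler, sharded across GPUs via tensor parallelism.
# Model id, GPU count, and context length are illustrative assumptions.
from vllm import LLM, SamplingParams

llm = LLM(
    model="meta-llama/Llama-3.1-8B-Instruct",  # placeholder model id
    tensor_parallel_size=2,   # shard the model across two GPUs
    max_model_len=32768,      # allow long input contexts
)
prompts = [
    "Explain continuous batching in one paragraph.",
    "Why does tensor parallelism speed up large models?",
]
outputs = llm.generate(prompts, SamplingParams(max_tokens=96))
for out in outputs:
    print(out.outputs[0].text)
```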

vLLM's broad support for publicly available models, along with its day-0 integration of leading models such as DeepSeek, Gemma, Llama, Llama Nemotron, Mistral, and Phi, as well as open, enterprise-grade reasoning models like Llama Nemotron, positions it as a de facto standard for future AI inference innovation. The growing adoption of vLLM by the main model providers consolidates its key role in shaping the future of generative AI.

Red Hat's vision

The future of AI «must be defined by unlimited opportunities, not by the limitations imposed by infrastructure silos», says the open source giant, which sees a future where organizations can deploy any model, on any accelerator, across any cloud, delivering an exceptional, more consistent user experience without exorbitant costs.

To unlock the true potential of their investments in generative AI, companies need a universal inference platform: «a standard for smoother, higher-performance AI innovation, both now and in the future».

Just as Red Hat pioneered the open enterprise by turning Linux into the foundation of modern IT, it is now prepared to shape the future of AI inference. vLLM has the potential to become the linchpin of standardized generative AI inference, and Red Hat is committed to building a thriving ecosystem not only around the vLLM community but also around llm-d for distributed inference at scale.

Red Hat's vision is clear: regardless of the AI model, the underlying accelerator, or the deployment environment, the company intends to turn vLLM into the definitive open standard for inference in the new hybrid cloud.

