Red Hat has announced the acquisition of Neural Magic, a startup founded in 2018 as an MIT spinoff dedicated to software and algorithms for optimizing generative AI processes. Specifically, its systems accelerate generative AI inference workloads.
The company therefore brings experience in inference performance engineering, and since its inception it has shown a commitment to open source, aligning it with Red Hat’s vision of high-performance AI workloads that adapt directly to each customer’s specific use cases and data anywhere in the hybrid cloud.
The large language models behind generative AI systems keep growing in size, which drives up the computing power needed to build cost-effective and reliable LLMs, along with the energy they consume and the specialized operational skills required to run them. This makes customized, security-aware, ready-to-deploy AI increasingly hard for companies to attain.
Against this backdrop, one of Red Hat’s goals is to make generative AI more accessible to companies through the open innovation of the vLLM project. Started at the University of California, Berkeley, vLLM is a community-driven project for open model serving; that is, it directly shapes how generative AI models run inference and solve problems.
The vLLM project offers support for all major model families, advanced research in inference acceleration, and a variety of hardware backends, including AMD and NVIDIA GPUs, AWS Neuron, Google TPUs, Intel Gaudi, and x86 CPUs. Neural Magic has been one of the most prominent contributors to the vLLM project so far, which caught Red Hat’s attention and led to the acquisition.
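To give a sense of what “model serving” means in practice, below is a minimal sketch of vLLM’s offline inference API in Python. The model name is purely illustrative; consult the project’s documentation for current usage.

```python
from vllm import LLM, SamplingParams

# Load a model into vLLM's inference engine (model name is illustrative;
# any Hugging Face-compatible model supported by vLLM can be used).
llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")

# Sampling settings for generation.
params = SamplingParams(temperature=0.7, max_tokens=128)

# Batched generation: vLLM schedules and batches requests internally,
# which is where its inference acceleration comes from.
outputs = llm.generate(["What is open source?"], params)
for output in outputs:
    print(output.outputs[0].text)
```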
The deal will add Neural Magic’s expertise to Red Hat’s portfolio of hybrid cloud AI technologies, allowing it to offer customers solutions for developing AI strategies tailored to their needs, regardless of where their data is hosted.
Drawing on its experience and knowledge of the vLLM project, Neural Magic develops an enterprise inference stack for optimizing, deploying, and scaling LLM workloads in hybrid cloud environments, with full control over the choice of infrastructure, security policies, and the model lifecycle.
The company also conducts research in model optimization, develops the LLM Compressor library for optimizing LLMs with sparsity and quantization algorithms, and maintains a repository of pre-optimized models ready to deploy with vLLM. All of this will enhance Red Hat AI’s ability to support large language model deployments anywhere in the hybrid cloud with an open, streamlined inference stack.
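As a rough sketch of how that optimization pipeline fits together, the snippet below applies one-shot quantization with LLM Compressor, adapted from the project’s published examples. The specific model, dataset, module paths, and arguments are assumptions that may vary between library versions.

```python
from llmcompressor.modifiers.quantization import GPTQModifier
from llmcompressor.transformers import oneshot  # module path may differ by version

# Recipe: quantize all Linear layers to 4-bit weights (W4A16) with GPTQ,
# leaving the output head in full precision.
recipe = GPTQModifier(targets="Linear", scheme="W4A16", ignore=["lm_head"])

# One-shot compression over a calibration dataset; the resulting
# checkpoint can then be served directly by vLLM.
oneshot(
    model="TinyLlama/TinyLlama-1.1B-Chat-v1.0",  # illustrative model
    dataset="open_platypus",                     # illustrative calibration data
    recipe=recipe,
    output_dir="TinyLlama-1.1B-Chat-W4A16",
    max_seq_length=2048,
    num_calibration_samples=512,
)
```

This is the kind of sparsity- and quantization-based workflow the article refers to: compressing a model once, offline, so it can be deployed with a smaller memory and compute footprint.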
According to Matt Hicks, President and CEO of Red Hat: “AI workloads should run wherever customer data is in the hybrid cloud; this makes flexible, standardized, and open platforms and tools a necessity, allowing organizations to select the environments, resources, and architectures that best fit their unique operational and data needs. We are delighted to complement our hybrid cloud-centric AI portfolio with Neural Magic’s disruptive AI innovation, furthering our goal of not only being the ‘Red Hat’ of open source, but also the ‘Red Hat’ of AI.”
For his part, Brian Stevens, CEO of Neural Magic, highlighted: “Open source has proven time and time again to drive innovation through the power of community collaboration. At Neural Magic, we’ve brought together some of the industry’s top talent in AI performance engineering with the singular mission of building open, cross-platform, and ultra-efficient LLM serving capabilities. Joining Red Hat is not only a cultural fit, but will benefit companies large and small on their AI transformation journeys.”