GPU as a service (GPUaaS) is an emerging business model in which specialized companies rent out graphics processing power on demand, a response to the huge resource consumption of large artificial intelligence models.
Although they are not the only option, graphics accelerators have become the component of choice when designing servers and data centers for AI, thanks to their ability to handle many operations simultaneously, a fundamental requirement for training deep learning models.
The problem is that not every AI startup has the budget to invest in hardware capable of running cutting-edge models, so they need (or prefer) to outsource it. In response, companies such as Hyperbolic, Kinesis, Runpod and Vast.ai have emerged in recent years to offer their clients the necessary processing power remotely.
GPU as a service, an emerging business
The largest technology giants offering cloud computing services, such as Amazon and Microsoft, own their infrastructure. Newer, smaller but innovative companies have instead developed techniques to make the most of existing idle computing. "Businesses need computing. They need their models to be trained or their applications to run; they don't necessarily need to own or manage servers," says Bina Khimani, co-founder of Kinesis, explaining the basis of the company's services.
Some studies show that more than half of existing GPUs sit idle at any given time. From personal computers to huge server farms, a great deal of processing power goes unused. What these specialized companies do is identify idle processing capacity (both GPU and CPU) on servers around the world and pool it into a single source of compute for companies to use.
To do this, companies like Kinesis partner with universities, data centers, businesses and individuals willing to sell their unused processing capacity. Through special software installed on their servers, the company detects idle processing units, prepares them and offers them to its customers for temporary use. As the co-founder explains:
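The pooling step described above can be sketched in a few lines of code. This is an illustrative assumption, not the actual software these companies run: the idle threshold, host names and utilization figures are all made up for the example.

```python
from dataclasses import dataclass

# Assumed threshold: treat a GPU as idle below 10% utilization.
IDLE_THRESHOLD = 0.10

@dataclass
class GpuNode:
    host: str          # hypothetical host identifier
    gpu_model: str
    utilization: float # current load as a fraction, 0.0-1.0

def build_pool(nodes):
    """Collect idle GPUs from heterogeneous hosts into one rentable pool."""
    return [n for n in nodes if n.utilization < IDLE_THRESHOLD]

# Simulated fleet: a university lab, a busy data center, a home PC.
nodes = [
    GpuNode("univ-lab-3", "RTX 4090", 0.02),
    GpuNode("dc-eu-17", "A100", 0.85),
    GpuNode("home-pc-41", "RTX 3080", 0.00),
]
pool = build_pool(nodes)
print([n.host for n in pool])  # → ['univ-lab-3', 'home-pc-41']
```

In a real broker, the utilization figures would come from agents reporting hardware telemetry rather than hard-coded values, and the pool would be matched against customer requests for location and GPU model.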
"We have developed a technology to bring together fragmented and idle computing power and reuse it in a self-managed, serverless computing platform... Customers even have the ability to choose where they want their GPUs or CPUs to be located," the co-founder says of this novel technique.
When servers can’t keep up with AI
Multimillion-dollar investments in AI infrastructure, such as Project Stargate, announced yesterday in the United States, or those unveiled by technology giants such as Microsoft, Google and Amazon, confirm that large language models and machine learning consume an enormous amount of resources.
In case there was any doubt, OpenAI CEO Sam Altman admitted last fall that his company wasn't releasing products as frequently as it wanted because it faced "many limitations" in processing power. Likewise, Microsoft CFO Amy Hood told the company's investors on a conference call that "demand for AI is still greater than available capacity."
Techniques like GPUaaS aim to help fill that gap. As learning models become more sophisticated, they require more power and infrastructure capable of processing information ever faster. In other words, without enough accelerators, big AI models cannot run, let alone improve.
The biggest advantage of GPUaaS, as with other on-demand service models, is economic. By eliminating the need to purchase and maintain physical infrastructure, it lets companies avoid investing in servers and IT management and instead put those resources into improving their own deep learning, large language, and large vision models. It also allows customers to pay only for the GPU time they actually use, sparing them the cost of the inevitable downtime that comes with owning servers.
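The cost argument can be made concrete with some back-of-the-envelope arithmetic. Every figure below is an assumption chosen for illustration (hardware price, lifetime, utilization rate, rental price), not real market data:

```python
# Owning: amortize an assumed $25,000 single-GPU server over three years,
# but only a fraction of those hours do useful work.
server_cost = 25_000.0          # assumed upfront cost (USD)
lifetime_hours = 3 * 365 * 24   # three-year amortization window
utilization = 0.40              # assumed share of time the GPU is busy

owned_cost_per_useful_hour = server_cost / (lifetime_hours * utilization)

# Renting: pay an assumed on-demand rate only for hours actually used.
gpuaas_rate = 2.0               # assumed price per GPU-hour (USD)
useful_hours = 1_000            # compute the team actually needs

print(f"owned:  ${owned_cost_per_useful_hour * useful_hours:,.0f}")
print(f"rented: ${gpuaas_rate * useful_hours:,.0f}")
```

Under these assumptions the owned GPU costs about $2.38 per useful hour versus $2.00 rented; the gap widens as utilization drops, which is exactly the idle capacity these brokers target.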
It is also more sustainable. These specialized serverless providers claim to be more environmentally friendly than traditional cloud computing companies: by leveraging existing, unused processing units instead of powering additional servers, they say they significantly reduce power consumption. The AI industry is rapidly moving towards a stage in which the focus shifts from simply building and training models to optimizing efficiency and reducing consumption and carbon emissions.
The growing demand for machine learning and the colossal data consumption that comes with it are making GPUaaS a highly profitable sector. In 2023, the industry's market size was valued at $3.23 billion; in 2024, it grew to $4.31 billion; and it is projected to reach $49.84 billion by 2032.
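The growth rate implied by those figures is worth spelling out. Taking the article's 2024 and 2032 values, the compound annual growth rate works out to roughly 36% per year:

```python
# Implied compound annual growth rate (CAGR) from the cited market figures.
v_2024 = 4.31    # market size in billions USD (from the article)
v_2032 = 49.84   # projected market size in billions USD
years = 2032 - 2024

cagr = (v_2032 / v_2024) ** (1 / years) - 1
print(f"implied CAGR 2024-2032: {cagr:.1%}")  # roughly 36% per year
```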