News

Why AI Is Reshaping Computer System Design, And Just About Everything Else

News Room
Published 5 February 2026 (last updated 8:42 AM)

For the past few years, AI has dominated any discussion of technology trends. Sure, PCs are faster and longer-lasting, phones can snap eye-popping photos, and TVs look better than ever. But these changes have been mostly incremental. The biggest change is the emergence of large language models, chatbots, and, though still early, AI agents.

This has had a big impact on system design. The latest PCs and mobile phones use neural processing units (NPUs) to run AI models locally. But the change in servers, and in the data centers that house them, has been even more dramatic. While we’re likely to need traditional CPU-based servers for a very long time, all of the new applications require large numbers of graphics processing units (GPUs), even when they aren’t doing graphics.

Of course, this has resulted in a variety of new and enhanced chips to run these applications. Nvidia, which started the trend toward general-purpose use of GPUs and on whose chips most important AI training still takes place, has released new data center chips every year. Longtime rival AMD has joined the competition, particularly in the past year, with credible entries in its Instinct line. And all the big cloud computing vendors—the hyperscalers—now have proprietary chips for their own AI applications, most of them focused on inference, including Google’s TPUs, Amazon’s Trainium, and Microsoft’s Maia, all of which have seen new versions in the past few months.

This has left traditional CPU architectures behind, forcing a redesign of AI applications, systems, and data centers. Indeed, the anticipated demand for new data centers full of such chips is so great that there’s good reason to think they will require their own power sources; hyperscalers are even talking about building or restarting nuclear power plants.

One thing that has stood out to me recently is how this has changed the design of systems for large data centers. Rather than the big cloud providers or a few big computer companies designing the systems, the GPU makers—Nvidia first, now joined by AMD—are designing not only new chips and boards but full systems containing all the components you need to build a rack of servers.

This was all very clear at CES 2026, where what caught my eye most wasn’t consumer electronics, the field that originally gave the show its name, but large-scale servers meant for huge data centers: AMD’s Helios, powered by its newest Instinct accelerators, and Nvidia’s Vera Rubin, built on its next-generation architecture. What stood out was how much both companies now take a systems approach, promising not only improved performance but much greater efficiency. This may be the most consequential change in computer design in years.

Nvidia’s Vera Rubin

Nvidia CEO Jensen Huang (Photo by Patrick T. Fallon / AFP via Getty Images)

At CES, Nvidia CEO Jensen Huang focused on the Vera Rubin “AI Factory,” the next generation of its data center server platform. It is a complete system design that includes not only the Rubin GPU and the Arm-based Vera CPU but four other chips: the NVLink 6 Switch for scale-up networking (within the rack), Spectrum-X Ethernet photonics for scale-out networking (among racks), CX9 network interface cards, and BlueField 4 data processing units. A full Vera Rubin NVL72 has 72 GPUs, which Nvidia says deliver 50 petaflops at FP4 (four-bit precision inference), along with 36 CPUs. Huang says the system is now in production.

Nvidia Vera Rubin (Credit: Michael J. Miller)

“The amount of computation necessary for AI is skyrocketing,” Huang explained, noting that the top models are increasing their parameter counts by 10 times each year and that test-time scaling, or “reasoning,” is increasing token use by five times per year. Nvidia says codesigning all six chips in the system has allowed it to deliver a 10x reduction in inference cost per token compared with the previous generation of servers based on the current Blackwell GPU. In a later Q&A session, Huang said it would also allow five times the throughput per watt.
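Those two growth rates compound. A back-of-envelope sketch in Python, under my own simplifying assumption (not Nvidia’s) that the two factors multiply independently:

```python
# Back-of-envelope: how the two growth rates Huang cited compound.
# Assumption (mine, not Nvidia's): the factors multiply independently.
PARAM_GROWTH_PER_YEAR = 10  # top models: ~10x parameters per year
TOKEN_GROWTH_PER_YEAR = 5   # test-time "reasoning": ~5x tokens per year

def relative_demand(years):
    """Naive relative compute demand after `years`, versus today."""
    return (PARAM_GROWTH_PER_YEAR * TOKEN_GROWTH_PER_YEAR) ** years

print(relative_demand(1))  # 50 -> one year compounds to 50x
print(relative_demand(2))  # 2500
```

By this crude math, demand would grow 50-fold per year, which is why a 10x cost-per-token reduction per generation still leaves supply straining to keep up.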

Lisa Su holds an AMD EPYC ‘Venice’ CPU (Credit: Bridget Bennett/Bloomberg via Getty Images)

AMD CEO Lisa Su was just as forceful in her CES keynote. She noted that AI has moved from using 1 zettaflop (a sextillion floating point operations) in 2022 to more than 100 zettaflops in 2025. “We don’t have nearly enough compute for everything that we can possibly do,” she said.


“To enable AI everywhere, we need to increase the world’s compute capacity, another 100 times over the next few years to more than 10 yottaflops (septillion floating point operations) over the next five years,” she said, noting that that would be 10,000 times more compute than we had in 2022, and a larger jump than we’ve ever seen in the history of computing.
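Those figures are internally consistent. Using the SI prefixes (zetta = 10^21, yotta = 10^24), a quick check with exact integers:

```python
# Sanity-check of the keynote's scaling figures, using exact integers.
ZETTA = 10**21
YOTTA = 10**24

compute_2022 = 1 * ZETTA     # ~1 zettaflop of AI compute in 2022
compute_2025 = 100 * ZETTA   # >100 zettaflops in 2025
target = 10 * YOTTA          # stated goal: more than 10 yottaflops

print(compute_2025 // compute_2022)  # 100   -> 100x from 2022 to 2025
print(target // compute_2025)        # 100   -> "another 100 times"
print(target // compute_2022)        # 10000 -> "10,000 times more than 2022"
```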

AMD Helios (Credit: Michael J. Miller)

AMD’s answer is a new version of its Instinct GPU, the MI455X, and its own system-level product, Helios. It combines the Instinct MI455X accelerator and its ultra-fast HBM4 memory with a new Epyc CPU, codenamed Venice, and two Pensando data processing units (DPUs): the existing Salina DPU plus a new one, Vulcano, which delivers faster Ultra Ethernet for scale-out computing. Again, this puts 72 GPUs in a rack, which Su said will produce 2.9 exaflops of AI compute, with 4,600 CPU cores, 18,000 GPU compute units, and 31 TB of HBM4 memory. It is due out later this year. Next year’s chip, the MI500 series, will be 10 times faster still, and Su promised the combination would deliver a 1,000-times increase in AI performance over four years.
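Dividing the quoted rack totals gives a rough sense of the per-GPU figures (this is my arithmetic from the keynote numbers, not an official AMD spec):

```python
# Per-GPU figures implied by dividing the quoted Helios rack totals.
# (My arithmetic from the keynote numbers, not an official AMD spec.)
GPUS_PER_RACK = 72
RACK_AI_EXAFLOPS = 2.9
RACK_HBM4_TB = 31

pflops_per_gpu = RACK_AI_EXAFLOPS * 1000 / GPUS_PER_RACK  # exa -> peta
hbm_gb_per_gpu = RACK_HBM4_TB * 1000 / GPUS_PER_RACK      # TB -> GB

print(round(pflops_per_gpu, 1))  # 40.3
print(round(hbm_gb_per_gpu))     # 431
```

So each accelerator would contribute roughly 40 petaflops of AI compute and about 430 GB of HBM4.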

Why a Systems Focus Is Important

A 1,000x improvement is pretty amazing—far faster than we would expect from Moore’s Law scaling, which in any case has been slowing. It comes from improvements both at the chip level, including faster connections to memory, and at the system level, where all the parts are designed to work together. And of course, the designers of AI software are also working to make their systems more efficient.
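For scale, here is a rough comparison of that 1,000x figure against classic Moore’s Law scaling, assuming the textbook doubling-every-two-years rate:

```python
# Rough comparison of the promised 1,000x with Moore's Law scaling.
# Assumption: the textbook doubling-every-two-years rate.
years = 4
moores_gain = 2 ** (years / 2)  # two doublings in four years -> 4.0
promised_gain = 1000            # the keynote's four-year figure

print(promised_gain / moores_gain)  # 250.0 -> ~250x beyond Moore's pace
```

Transistor doubling alone would deliver only about 4x over the same window, which is why the gains have to come from codesign across the whole rack rather than from process shrinks.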

All this improvement is crucial, because if AI and agent usage continues to grow, we’re going to need new systems and new data centers to run these applications. And you would expect lower costs only to increase demand. But there are only so many data centers that can be built quickly, so making them more efficient—getting more transactions out of each server while using less power per transaction—is critical.

While all the big AI players and hyperscalers have promised new data centers, these are all limited by the amount of power that is available. New power generation is not growing nearly as fast as data center demand, so it’s important to see the emphasis on improving power efficiency.

Hybrid AI

Another big trend I noticed at CES was a widening focus on “hybrid AI.” AMD and Nvidia, as well as Intel, Lenovo, and Qualcomm, all used the phrase in their keynotes and press events. But exactly what it means varies.

Lenovo rack (Credit: Michael J. Miller)

Nvidia talked about three kinds of AI: training, inference, and simulation. AMD, Intel, and Lenovo all talked about both data center servers and servers on “the edge” providing inference. AMD, Intel, and Qualcomm all talked about running local AI models on PCs, something Microsoft has also been talking about in recent months. And Qualcomm and Lenovo (with its Motorola line) talked about running AI on phones.

The focus on AI everywhere was a clear indication of just how much AI has come to dominate the tech discussion. It seemed like every consumer electronics product I saw at the show had some kind of AI angle, though most were just new versions of things that had been out for years (such as improved image processing on televisions) or just new names for existing software.

But there’s no question that devices, the applications we run, traditional personal software, and business applications are all undergoing major shifts, driven by the AI emphasis. A lot has changed already, but the bigger changes are yet to come.

About Our Expert

Michael J. Miller

Former Editor in Chief


Experience

Michael J. Miller is chief information officer at Ziff Brothers Investments, a private investment firm. From 1991 to 2005, Miller was editor-in-chief of PC Magazine, responsible for the editorial direction, quality, and presentation of the world’s largest computer publication. No investment advice is offered in this column. All duties are disclaimed. Miller works separately for a private investment firm which may at any time invest in companies whose products are discussed, and no disclosure of securities transactions will be made.

Until late 2006, Miller was the Chief Content Officer for Ziff Davis Media, responsible for overseeing the editorial positions of Ziff Davis’s magazines, websites, and events. As Editorial Director for Ziff Davis Publishing since 1997, Miller took an active role in helping to identify new editorial needs in the marketplace and in shaping the editorial positioning of every Ziff Davis title. Under Miller’s supervision, PC Magazine grew to have the largest readership of any technology publication in the world. PC Magazine evolved from its successful PCMagNet service on CompuServe to become one of the earliest and most successful web sites.

As an accomplished journalist, well versed in product testing and evaluation and in writing about software issues, and as an experienced public speaker, Miller has become a leading commentator on the computer industry. He has participated as a speaker and panelist in industry conferences, has appeared on numerous business television and radio programs discussing technology issues, and is frequently quoted in major newspapers. His areas of special expertise include the Internet and its applications, desktop productivity tools, and the use of PCs in business applications. Prior to joining PC Magazine, Miller was editor-in-chief of InfoWorld, which he joined as executive editor in 1985. At InfoWorld, he was responsible for development of the magazine’s comparative reviews and oversaw the establishment of the InfoWorld Test Center. Previously, he was the West Coast bureau chief for Popular Computing and senior editor for Building Design & Construction. Miller earned a BS in computer science from Rensselaer Polytechnic Institute in Troy, New York, and an MS in journalism from the Medill School of Journalism at Northwestern University in Evanston, Illinois. He has received several awards for his writing and editing, including being named to Medill’s Alumni Hall of Achievement.
