Microsoft has unveiled the Maia 200, its new in-house chip for artificial intelligence systems, with which it aims to reduce its dependence on NVIDIA and AMD and to compete with Amazon's and Google's silicon in the data center.
In the midst of the race to dominate generative AI services, all the major technology companies are developing their own chips. Microsoft launched the Azure Maia AI platform in 2023, developed the Cobalt CPU and announced the Maia 100 chip. Now comes the second generation, which puts performance per dollar and efficiency front and center.
Maia 200 is an AI accelerator designed for inference workloads. While the Maia 100 was built on TSMC's 5 nm process, the second generation moves to TSMC's 3 nm node and adds native FP8/FP4 tensor cores. It pairs 216 GB of HBM3e memory delivering 7 TB/s of bandwidth with 272 MB of on-chip SRAM.
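To make the FP8 point concrete, the minimal sketch below uses PyTorch's existing float8 dtype to show what the data format native to Maia 200's tensor cores looks like in practice; it is generic PyTorch code, not anything from Microsoft's stack, and the per-tensor scaling scheme is just one common convention.

```python
import torch

# Minimal sketch of FP8 quantization using PyTorch's built-in float8 dtype
# (torch.float8_e4m3fn). Illustrates the data format FP8 tensor cores consume;
# this is not Maia-specific code.

x = torch.randn(1024, 1024, dtype=torch.float32)

# Scale values into the representable FP8 range before casting, as is common
# for FP8 inference (per-tensor scaling shown for simplicity).
scale = x.abs().max() / 448.0            # 448 is the largest normal e4m3 value
x_fp8 = (x / scale).to(torch.float8_e4m3fn)

# FP8 halves memory traffic versus FP16 and quarters it versus FP32,
# which is the main lever for inference throughput and efficiency.
print(x.element_size(), x_fp8.element_size())   # 4 bytes vs 1 byte
```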
Microsoft claims it is the highest-performance in-house silicon designed by the Redmond firm, and indeed by any hyperscaler, including Amazon and Google. Unusually, Microsoft published a comparison table pitting the Maia 200 against the equivalent chips from the other two giants. According to that table, the Maia 200 offers almost double the FP8 performance of Amazon's third-generation Trainium and about 10% more than Google's seventh-generation TPU.
The Maia 200 is also meant to reduce dependence on the sector's clear leader, NVIDIA, and on solutions such as the Blackwell B300 Ultra, although direct comparisons here are of limited value: NVIDIA's accelerator is sold to third-party customers, is optimized for much higher-power use cases than the Microsoft chip, and sits on a software stack that has been on the market far longer than any contemporary alternative.
Maia 200, a bet on efficiency
Where the Microsoft chip does stand out is in energy efficiency and performance per dollar. Microsoft claims 30% higher performance per dollar than the latest-generation hardware currently deployed in Azure. Maia 200 is also designed for scale-up deployments, featuring an on-die network interface (NIC) with 2.8 TB/s of bidirectional bandwidth for communication across a cluster of 6,144 accelerators.
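As a back-of-envelope illustration of what "30% better performance per dollar" means, the short sketch below works through the arithmetic. The baseline throughput and cost figures are hypothetical placeholders; only the 1.3x ratio comes from Microsoft's claim.

```python
# Hypothetical baseline accelerator (placeholder numbers, not Microsoft's).
baseline_tokens_per_sec = 10_000      # inference throughput
baseline_cost_per_hour = 10.0         # $/hour

baseline_perf_per_dollar = baseline_tokens_per_sec / baseline_cost_per_hour

# A 30% perf/$ improvement can come from more throughput at the same price,
# a lower price at the same throughput, or any mix of the two.
maia_perf_per_dollar = 1.3 * baseline_perf_per_dollar

print(f"baseline: {baseline_perf_per_dollar:.0f} tokens/s per $/h")
print(f"claimed:  {maia_perf_per_dollar:.0f} tokens/s per $/h")
```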
The Maia 200 operates at roughly half the TDP of NVIDIA's B300 (750 W vs. 1,400 W) and, if it behaves like the first generation, it will typically run below its theoretical maximum. Microsoft's efficiency-first message fits its recent habit of emphasizing the company's concern for the communities near its data centers, in an effort to blunt the backlash against the rise of AI. Microsoft CEO Satya Nadella recently spoke at the World Economic Forum in Davos about the need for AI to demonstrate real usefulness so as not to lose what he called "social permission" and feed the AI bubble that many are warning about.
Unlike the Maia 100, which was announced long before it went into service, Maia 200 is already deployed in Microsoft's main data centers in the United States. The chip can run a variety of AI models, including OpenAI's GPT-5.2 models, allowing the company to deliver AI capabilities in Microsoft 365 and other services. Microsoft's Superintelligence team will also use it for synthetic data generation and reinforcement learning to develop future in-house models.
To help developers and startups optimize their tools and models for Maia 200, Microsoft has released a preview version of its SDK. It includes PyTorch integration, a Triton compiler, optimized kernel libraries, and access to Maia's low-level programming language.
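Triton is OpenAI's open-source kernel language, which today compiles to GPU backends; a Maia-targeted Triton compiler would presumably accept the same kind of portable kernel code. The snippet below is a standard Triton elementwise-add kernel, shown only as a hedged example of that style of programming; it is not taken from Microsoft's SDK.

```python
import torch
import triton
import triton.language as tl

# A standard Triton kernel (elementwise add). Illustrates the portable kernel
# code a Triton compiler backend targets; not Maia SDK code.

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements          # guard the tail block
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    n = x.numel()
    grid = (triton.cdiv(n, 1024),)       # one program instance per block
    add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
    return out
```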
