Microsoft announces powerful new chip for AI inference

by Alan North


Microsoft has announced the launch of its latest chip, the Maia 200, which the company describes as a silicon workhorse designed for scaling AI inference.

The Maia 200, which follows the company’s Maia 100 released in 2023, is built to run powerful AI models faster and more efficiently, the company says. The chip packs more than 100 billion transistors and delivers over 10 petaflops at 4-bit (FP4) precision and roughly 5 petaflops at 8-bit (FP8) precision, a substantial increase over its predecessor.
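The two throughput figures above follow a common pattern in AI accelerators: halving numeric precision roughly doubles throughput on the same silicon. A minimal sketch of that arithmetic, using only the numbers Microsoft reported:

```python
# Reported Maia 200 throughput figures, from Microsoft's announcement.
fp4_petaflops = 10.0  # ~10 petaflops at 4-bit (FP4) precision
fp8_petaflops = 5.0   # ~5 petaflops at 8-bit (FP8) precision

# Dropping from 8-bit to 4-bit arithmetic roughly doubles throughput,
# at the cost of lower numeric precision per operation.
ratio = fp4_petaflops / fp8_petaflops
print(f"FP4 delivers {ratio:.0f}x the FP8 throughput")
```

This is why vendors increasingly quote FP4 numbers for inference, where models can often tolerate the reduced precision.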

Inference refers to the computing process of running a model, in contrast with the compute required to train it. As AI companies mature, inference costs have become an increasingly important part of their overall operating cost, leading to renewed interest in ways to optimize the process.

Microsoft is hoping that the Maia 200 can be part of that optimization, helping AI workloads run with fewer disruptions and lower power use. “In practical terms, one Maia 200 node can effortlessly run today’s largest models, with plenty of headroom for even bigger models in the future,” the company said.

Microsoft’s new chip is also part of a growing trend of tech giants designing their own silicon to lessen their dependence on NVIDIA, whose cutting-edge GPUs have become pivotal to AI companies’ success. Google, for instance, has its tensor processing units (TPUs), which aren’t sold as chips but offered as compute power through its cloud. Amazon has Trainium, the e-commerce giant’s own AI accelerator chip, whose latest version, the Trainium3, launched in December. In each case, the in-house chips can offload some of the compute that would otherwise be assigned to NVIDIA GPUs, lowering overall hardware costs.

With Maia, Microsoft is positioning itself to compete with those alternatives. In its press release Monday, the company said Maia delivers 3x the FP4 performance of Amazon’s third-generation Trainium chips, and FP8 performance above Google’s seventh-generation TPU.

Microsoft says that Maia is already hard at work fueling the company’s AI models from its Superintelligence team. It has also been supporting the operations of Copilot, its chatbot. As of Monday, the company said it has invited a variety of parties — including developers, academics, and frontier AI labs — to use its Maia 200 software development kit in their workloads.


