Microsoft has finally presented its first two generative AI models designed and trained in-house. The first, MAI-Voice-1, is a natural voice generation model, while the second, MAI-1-preview, is a text model and the first foundation model that the company has taken through a complete training process.
The Redmond company is already using MAI-Voice-1 in Copilot Daily and Podcasts, while MAI-1-preview is now publicly available for testing on LMArena. In the coming weeks it will also begin appearing in certain scenarios within Copilot.
As noted by Mustafa Suleyman, head of Microsoft's AI division, both models were developed with efficiency and cost-effectiveness in mind. As a result, MAI-Voice-1 can run on a single GPU. Training was also economical: MAI-1-preview was trained on roughly 15,000 NVIDIA H100 GPUs.
That number of GPUs may seem very high, but considering that other models have been trained on more than 100,000 GPUs, the perceived scale of its consumption drops markedly. It also points to a trend highlighted by Suleyman himself, who stressed that the art of training models increasingly involves not only selecting the most appropriate data, but also avoiding wasting tokens or FLOPs unnecessarily.
The decision to launch its own models, in contrast to Copilot, which is essentially built on OpenAI technology, reflects Microsoft's intention to become independent of OpenAI in this area. Of course, since other tech companies have a considerable head start in developing their own models (estimated at around five years), Redmond will have to roll up its sleeves and work hard to catch up.