According to forecasts from the consultancy Gartner, by 2030 80% of the applications and software developed and used in companies will be multimodal, an enormous advance compared with 2024, when only 10% of them were. Gartner also expects that, over the next one to three years, multimodal generative models will increasingly be used to enrich more applications.
Technologies regarded as high-impact, such as generative artificial intelligence models, will become far more important in companies. Product managers will have to make critical decisions about investments in these emerging generative technologies, with the aim of helping customers reach new levels of value for their businesses.
Multimodal generative AI can work with different types of data inputs and outputs, including images, video, audio (sounds and speech), text and numerical data, all within a single generative model.
In fact, multimodality makes generative artificial intelligence more usable, since it allows the development of models that can interact with users and generate outputs from different types of data across different modalities.
There are already a good number of multimodal models that can process two or three data modalities, for example text-to-video or speech-to-speech. Over the next few years, that range will grow to include a larger and more diverse set of modalities.
According to Roberta Cozza, Senior Director Analyst at Gartner, "The shift to multimodal enterprise software is a fundamental transformation in how companies operate and innovate. Multimodal generative AI will revolutionize enterprise applications by adding functions they could not offer until now. By enhancing domain-specific language models, it will improve precision, automate operations and give a boost to contextual decision intelligence. With this, AI will be able to proactively take actions across multiple tasks."
Cozza also reminded companies that want to take advantage of the growth of multimodal AI in enterprise applications and software that they should "focus on integrating multimodal capabilities into their software to improve user experiences and operational efficiency. By taking advantage of the diverse data inputs and outputs offered by multimodal generative AI, companies can achieve new levels of productivity and innovation."