More than a decade ago, ‘Her’ arrived in theaters, a film that immersed us in a future where artificial intelligence (AI) assistants not only interacted with users by voice but could also “see” their surroundings through a camera. At that time, Siri had been available for only two years, and what the film proposed seemed a still very distant future. OpenAI is shortening the path to that reality.
ChatGPT can finally see what’s happening around us. This means we will be able to give it access to our camera when we use the advanced voice mode. The famous chatbot will then be able to process images in real time thanks to the multimodal GPT-4o model. An option to share our screen will also be available, so that it can obtain real-time information from the applications we are using.
ChatGPT can now process video in real time
Once it is available on our device, using this new capability will be very simple. We just have to open the ChatGPT application and press the button in the upper right corner to start advanced voice mode. The next step is to tap the camera button. The interface also includes a button for choosing between the front and rear cameras when we are using a phone.
OpenAI’s AI chatbot can now be much more useful. For example, we can ask it to help us perform certain tasks. A member of the team gave a demonstration this Thursday: he asked ChatGPT to show him, step by step, how to make filter coffee. The model was able to recognize each of the objects on the table in real time and guide him through the whole process. The latency seemed negligible.
When the company announced this functionality in May of this year, it presented many other usage scenarios. Among them were a father solving math problems with his son, people playing rock, paper, scissors, and even an excited ChatGPT meeting a dog. One of the most interesting examples came from the accessibility options, which allowed the environment to be described in simple language.
If we want to share our screen with the chatbot, we simply have to press the three-dot menu and then select Share screen. We will have to wait and see whether the chatbot’s vision capabilities meet expectations, and it is worth remembering that, like any AI model, it can make mistakes. In any case, OpenAI shows that it remains at the forefront of the artificial intelligence race.
OpenAI says the new video mode will be available in the coming days “in most countries” for ChatGPT Plus ($20/month) and ChatGPT Pro ($200/month) users. If you are reading this from Spain, the company has indicated that it hopes to offer this new feature in the European Union “soon”. That means we do not yet have an arrival date for the countries of the bloc, apparently because of regulatory issues.
Sam Altman, OpenAI’s CEO, said this week in a message on X that some of the company’s products may arrive later in Europe. He added that the company will likely be unable to offer some of them there at all. “We want to offer our products in Europe and we believe that a strong Europe is important for the world. We also have to comply with the regulations,” the executive said in the same message.