AI announcements keep coming. After OpenAI, which used its Advent calendar of releases to present Sora, it is Google's turn to dive into the deep end of image generation. Google Labs, the division dedicated to testing and experimenting with the company's latest technologies, launched Whisk this week, a new generation tool capable of creating images not from text prompts, as is usually the case, but from other images. In an official blog post, the company explains: “Instead of generating images using long, detailed text, Whisk lets you create prompts using images”.
Gemini takes care of the rest
The concept of Whisk is simple: you drag images into the tool to generate a new one. Three reference visuals are needed: one for the subject, one for the scene, and one for the graphic style of the image. Once these are supplied, Gemini takes over: it writes a detailed description of each reference image, then feeds those descriptions into Imagen 3, the latest image generation model from the Mountain View firm.
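To make the flow concrete, here is a minimal Python sketch of the caption-then-generate pipeline described above. It is not Google's code and calls no real API: `describe` stands in for the Gemini captioning step and `generate` for Imagen 3, and both are hypothetical callables supplied by the caller. The `ReferenceImages` class, the function names, and the prompt layout are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ReferenceImages:
    """The three reference visuals (hypothetical container, paths or IDs)."""
    subject: str
    scene: str
    style: str


def build_prompt(refs: ReferenceImages,
                 describe: Callable[[str], str],
                 extra_text: str = "") -> str:
    """Caption each reference image and merge the captions into one prompt.

    `describe` is a placeholder for the captioning model; the labelled
    layout below is an assumption, not Whisk's actual prompt format.
    """
    parts = [
        f"Subject: {describe(refs.subject)}",
        f"Scene: {describe(refs.scene)}",
        f"Style: {describe(refs.style)}",
    ]
    if extra_text:
        # Optional free-text refinement typed by the user.
        parts.append(f"Details: {extra_text}")
    return "\n".join(parts)


def generate_from_images(refs: ReferenceImages,
                         describe: Callable[[str], str],
                         generate: Callable[[str], object],
                         extra_text: str = "") -> object:
    """Run the two stages: images to text, then text to a new image."""
    return generate(build_prompt(refs, describe, extra_text))
```

Because the image model only ever sees the written descriptions, never the original pixels, the result can capture the gist of a reference without reproducing it exactly.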
What about copyright?
Those already worried about copyright and intellectual property can rest easy, the company promises: “This process captures the essence of your subject, not an exact replica. So you can easily remix your subjects, scenes and styles in original ways”. With this stance, Google wants to avoid the bad press that ChatGPT and other text generators have received, regularly accused of reproducing, without authorization, texts to which they hold no rights.
More concretely, “Since Whisk only extracts a few key features from your image, it can generate images that differ from your expectations. For example, the generated subject may have a different height, weight, hairstyle, or skin tone”, warns Google. To refine specific details of an image, it is nevertheless possible to add text, this time through a classic prompt.
For the moment, Whisk is only accessible in the United States, in restricted preview. Google will likely wait for initial user feedback to refine its technology before considering a wider rollout to the rest of the world.