You’ve probably seen the highly realistic AI video saturating social media.
That Stormtrooper building a snowman? Made by Google Veo 3. The surfing unicorn passing ice floes while penguins rave under the northern lights? Also AI… we assume.
If you can dream it, you can create it, which is incredibly exciting – but also incredibly unsettling, in terms of what it means for creative industries as well as misinformation and fact checking.
A quick rundown: Google now allows you to create a cinematic video clip, just from typing in what you want to see. It includes realistic voices and sound, which sets it apart from other models.
Given that I’ve never had any skill as a filmmaker, I was amazed to be able to make a clip of something that would previously have needed a Hollywood special effects team to conjure up, just by writing a couple of sentences on my computer.
Below are the three videos we made at Metro to test out the new tech, which Google launched in the UK on May 30.
1. An office daydream
First off, we thought we’d boost morale in the team by looking at what Londoners really think of Metro.
We used this prompt:
Starts off with a wide shot. A glorious sunny day. Quiet roads. Tracks into an excited crowd gathered on High Street Kensington in London, with majestic buildings all around, including a Whole Foods. The crowd is young, cool and full of anticipation. A red London bus pulls into shot and a door opens. A brown Lakeland Patterdale terrier scampers off the bus, barking excitedly, and leaps into the arms of a nearby man, tall, tanned with curly hair and short tidy beard, very handsome, wearing a navy three piece suit. A statuesque woman with a chignon emerges from the bus with a stack of Metro newspapers in her arms. She distributes them to an ecstatic crowd who immediately start reading with great enthusiasm. “Long live Metro”, they all cry in unison.
At first glance, it’s pretty realistic (no?).
But you don’t need to be Sherlock Holmes to notice that buses don’t usually open through the front windscreen, that the shop sign reads ‘Whole Foobs’, or that the editor rather inconsiderately drops the dog as soon as he gets outside.
Also, the crowd we imagined would represent multicultural, cool London was instead made up of quite similar young white men in shirts. It’s well known that AI can reproduce the biases in the data it is trained on, so this may be a reflection of that.
After trying to refine the prompt by making it more detailed, we still didn’t get a diverse crowd, but we did get ‘Long Live Metro’ pronounced with ‘live’ rhyming with ‘dive’.
2. Vandalism spree outside Primark

I wanted to test how easy it would be to create something realistic enough to be spread as false information and incite tensions. So I asked for the video to look like it was shot on a phone, as most witness footage of public incidents is.
It should look as though it is shot with a phone camera, with slightly shaky footage. The scene is a typical British high street, with shops including Boots, Primark and Tesco. A young man wearing a balaclava runs into view holding a hammer, and begins smashing all the shop windows, shouting ‘you’re going to pay for this’. A woman with shopping bags tries to stop him but he pushes her aside.
Thinking of the recent riots which affected towns across the UK, I wanted to produce something with the potential to go viral on social media and incite some angry reactions about law and order.
On this occasion, I don’t think anyone would be fooled.
The video came back without sound (an issue that has affected quite a few videos, which I’ll come to later), and the assailant’s balaclava vanished from his face mid hammer swipe, quite a giveaway that AI had a hand in it.
Shot with a glossy, high-definition feel, it definitely didn’t look like grainy user-generated content either.
3. Entering the Hellmouth

After writing about the fiery ‘Gate of Hell’ crater in Turkmenistan finally starting to burn itself out, this came to mind as a potentially cinematic backdrop.
Night is falling near the ‘Gate of Hell’ Darvaza Crater in Turkmenistan. The light from the fire within makes the dark sky glow. A woman, in her forties, wearing a protective suit but with her hair down, looks into the depths, seeing flames flickering inside. She says: ‘They say this pit will burn itself out soon. Before that happens, I will take the fire home.’ Then she clambers over the edge.
This one was my favourite and the most successful prompt, even though I didn’t go into too much detail. I didn’t see any immediately obvious AI flaws (maybe because with just one person, it was less complicated to create) and I think the special effects could even belong in a blockbuster film.
I suppose I shouldn’t be too pleased with myself, as I literally did nothing requiring talent to create it.
But it opens up new pathways to explore whatever you can imagine, so I’m not surprised the feature has gone viral.
Where does Google see the tech going?
We asked Google where they see this tech heading in the future, given that AI is already accelerating at unnerving speed (mocking it for not being able to count fingers already feels hopelessly out of date).
Matthieu Lorrain, Creative Lead at Google DeepMind, told Metro: ‘We’re already seeing Veo 3 used for everything from making a quick clip for socials, to turning an inside joke into a moving meme, or visualising a cool concept quickly. These are some of the main use cases that we’ve seen since the feature launched on Gemini.’
Some of the clips they produced to showcase the feature are below:
For now, one of the annoying parts of making a video is that you can’t edit it; I can’t ask it to refine the clip so that, for example, the animal-loving editor doesn’t drop his dog. It would just come up with an entirely new clip.
Mr Lorrain said: ‘Adding the ability to more easily refine and finesse a prompt or generated video is definitely something we’re working on. For now, it’s a case of experimenting with the wording to try and get the video to generate as you’d like, which is trial and error, but it’s also part of the fun!’
Google is currently testing the ability to generate video from an image, one of the most in-demand, and potentially most concerning, possibilities of AI video.
If you could upload an image of a real person, you could make a convincing deepfake with the potential to spread misinformation. But there are also legitimate reasons you might want to do this.
Reddit founder Alexis Ohanian recently shared a video generated from a photo of his mother hugging him, made using another AI tool, Midjourney. He explained that he lost his mother 20 years ago and that, as the family could not afford a camcorder, he had no moving images to remember her by, so he created the short animation to better imagine what happened either side of the shot.
People might also understandably want to imagine themselves in James Bond-style situations or, more prosaically, to create more polished content for their socials.
For now, you also cannot specify a famous person in the written prompt and make a video of them using publicly available images, even though this would technically be possible; there are both legal and ethical reasons for the block.
I asked Gemini for a video of Keir Starmer giving a speech outside Downing Street to warn of an invasion of glowing, radioactive hamsters just to see, but sadly was blocked from bringing this into technicolour.
How can I make a video with Google Veo 3?
Veo 3 is currently only available to those with a paid subscription, which costs £18.99 a month.
Once you have access, you can simply type your prompt into Gemini, the company’s rival to ChatGPT, or use Flow, which is designed for more serious AI filmmaking, and allows the use of consistent elements such as a particular character across clips.
Users can make three clips a day, to prevent servers being overloaded.
To make the film, you simply write a paragraph about what you want it to show, detailing the style and camera work as well as the subject and script. Google has published a list of tips for writing a successful prompt.
What’s up with the sound?
Google warns users on Flow that audio is still an experimental feature and so videos ‘might not always have sound’ (so if this happens to you, it’s not a problem with your speakers).
Google says speech generation works better with slightly longer scripts, is muted for minors, and can trigger subtitles.
‘We’re working on it,’ they said.
Will I be seeing this in cinemas soon?
It’s a safe bet that AI will be shaking up filmmaking, just as it is every other industry.
You can already generate a realistic-sounding ‘podcast’ on any topic just from uploading information about it, and I wouldn’t be surprised if you could generate your own feature films soonish on any topic you like too, without having to log into Disney Plus or Netflix at all. Admittedly, the quality would probably be mixed, and there could be copyright issues if you just uploaded a manuscript of the latest bestseller.
Mr Lorrain said: ‘With regards to the future, as with any groundbreaking technology, we’re still understanding the full potential of AI in filmmaking. We see the emergence of these tools as an enabler, helping a new wave of filmmakers more easily tell their stories. By offering filmmakers early access to Flow, we were able to better understand how our technology could best support and integrate into their creative workflows — and we’ve woven their insights into Flow.
‘Veo 3 represents a huge step forward in quality, with greater realism, 4K output, and incredibly lifelike physics and audio. Like any powerful creative tool, it rewards practice—the more descriptive your prompts, the better your video. When it comes to getting the most out of Veo 3, think of prompting as learning to speak Veo’s language—the more fluently and descriptively you articulate your vision, the better the video will be.’