Since its original launch at Google I/O 2024, Project Astra has become a testing ground for Google’s AI assistant ambitions. The multimodal, all-seeing bot is not a consumer product, really, and it won’t soon be available to anyone outside of a small group of testers. What Astra represents instead is a collection of Google’s biggest, wildest, most ambitious dreams about what AI might be able to do for people in the future. Greg Wayne, a research director at Google DeepMind, says he sees Astra as “kind of the concept car of a universal AI assistant.”
Eventually, the stuff that works in Astra ships to Gemini and other apps. That has already included some of the team’s work on voice output, memory, and basic computer-use features. As those features go mainstream, the Astra team finds something new to work on.
This year, at its I/O developer conference, Google announced some new Astra features that signal how the company has come to view its assistant — and just how smart it thinks that assistant can be. In addition to answering questions, and using your phone’s camera to remember where you left your glasses, Astra can now accomplish tasks on your behalf. And it can do it without you even asking.
Astra’s most impressive new feature is its newfound proactivity. “Astra can choose when to talk based on events it sees,” Wayne says. “It’s actually, in an ongoing sense, observing, and then it can comment.” This is a big change: instead of pointing your phone at something and asking your AI assistant about it, the plan is for that assistant to be constantly watching, listening, and waiting for its moment to step in. (The team is thinking about lots of devices on which Astra-like products might work, but it’s focused on phones and smart glasses. You can imagine how glasses in particular might be useful for an all-seeing and all-hearing assistant.)
If Astra is watching while you do your homework, Wayne offers by way of example, it might notice you made a mistake and point out where you went wrong, rather than waiting for you to finish and specifically ask the bot to check your work. If you’re intermittent fasting, Astra might remind you to eat just before your designated time is up — or gently wonder if you should really be eating right now, given your diet plan.
Teaching Astra to act of its own volition has been part of the plan all along, says DeepMind CEO Demis Hassabis. He calls it “reading the room,” and says that however hard you think it is to teach a computer to do that, it’s actually much harder. Knowing when to barge in, what tone to take, how to help, and when to just shut up is something humans do relatively well but is hard to either quantify or study. And if the product doesn’t work well, and starts piping up unprompted and unwanted? “Well, no one would use it if it did that,” Hassabis says. Those are the stakes.
A truly great, proactive assistant is still a ways off, but one thing it will definitely require is a huge amount of information about you. That’s another new thing coming to Astra: the assistant can now access information from the web and from other Google products. It can see what’s on your calendar, in order to tell you when to leave; it can see what’s in your email to dig up your confirmation number as you’re walking up to the front desk to check in. At least, that’s the idea. Making it work at all – and then consistently and reliably – will take a while.
The last piece of the puzzle, though, is actually coming together: Astra is learning how to use your Android phone. Bibo Xiu, a product manager on the DeepMind team, showed me a demo in which she pointed her phone camera at a pair of Sony headphones, and asked which ones they were. Astra said it was either the WH-1000XM4 or the WH-1000XM3 (and honestly, how could anyone or anything be expected to know the difference), and Xiu asked Astra to find the manual, then to explain how to pair them with her phone. After Astra explained, Xiu interrupted: “Can you go ahead and open Settings and just pair the headphones for me, please?” All by itself, Astra did just that.
The process wasn’t perfectly seamless — Xiu had to manually turn on a feature that allowed Astra to see her phone’s screen. The team is still working on making that happen automatically, she says, “but that’s the goal, that it can understand what it can and cannot see at the moment.” This kind of automated device use is the same thing Apple is working toward with its next-generation Siri, and both companies imagine an assistant that can navigate apps, tweak settings, respond to messages, and even play games without you needing to touch the screen. It’s an incredibly hard thing to build, of course: Xiu’s demo was impressive, but it was also about as simple a task as you can imagine. Still, Astra is making progress.
Right now, most so-called “agentic AI” doesn’t work very well, or at all. Even in the best-case scenario, it still requires you to do a lot of the heavy lifting: you have to prompt the system at every turn, supply all the additional context and information the app needs, and make sure everything’s going smoothly. Google’s goal is to remove all that work, step by step. It wants Astra to know when it’s needed, to know what to do, to know how to do it, and to know where to find what it needs to get it done. Every part of that will require technological breakthroughs, most of which nobody has made yet. Then there will be complicated user interface problems, privacy questions, and more issues besides.
If Google or anyone is going to build a truly universal AI assistant, though, it will have to get this stuff right. “It’s another level of intelligence required to be able to achieve it,” Hassabis says. “But if you can, it will feel categorically different to today’s systems. I think a universal assistant has to have it to be really useful.”