I’ve been spending some time with the Apple Vision Pro, and I’m impressed. Apple has created the best user interface for AR/VR, and it should serve as a template for all future consumer headset displays.
The Vision Pro runs on the new visionOS, an amalgamation of iOS, iPadOS, and macOS designed for mixed reality. It uses the headset’s many internal and external cameras to track eye and hand movements, which are the primary way of interacting with the system. You can pair it with devices like keyboards, gamepads, and mice, but most of the time you’ll be controlling the Vision Pro with just your eyes and your hands.
Selecting each piece with my eyes, then pinching to move it on the board (Credit: Will Greenwald)
Whatever your eyes focus on — whether it’s a menu item, an icon, or any other interface element — will brighten very slightly to indicate that it’s selected. Tapping your index finger and thumb together will then click on the selected item, like clicking a mouse or tapping on a touch screen. Even the gesture feels like a mouse click.
That’s all you have to do to activate anything: just look and tap your fingers. It’s instinctively easy.
You need a bit more than a single click to interact with a functional computer, though. That’s why you can pinch and hold your fingers together, then move your hand as if you’re physically moving the item you’re pinching. Look at a neutral spot on a website, then pinch and flick your wrist up and down to scroll. Look at the small bar on the bottom of almost every window in visionOS, then pinch to move it anywhere around you. Do the same to the corner of the window and you can resize it.
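For developers, this interaction model surfaces through standard SwiftUI gestures: on visionOS, an ordinary tap gesture fires when the user looks at a view and pinches, and a drag gesture tracks the pinch-and-move motion, so existing gesture code picks up eye-and-hand input. A minimal sketch (the view and state names here are illustrative, not from Apple’s sample code):

```swift
import SwiftUI

struct MovablePieceView: View {
    // Offset accumulated from pinch-and-drag moves.
    @State private var offset: CGSize = .zero
    @State private var selected = false

    var body: some View {
        Image(systemName: "crown.fill")
            .padding()
            .offset(offset)
            // On visionOS this fires when the user looks at the
            // view and taps index finger and thumb together.
            .onTapGesture {
                selected.toggle()
            }
            // Pinch, hold, and move the hand to drag the view.
            .gesture(
                DragGesture()
                    .onChanged { value in
                        offset = value.translation
                    }
            )
    }
}
```

The point is that the look-and-pinch model rides on the same gesture APIs as touch input, which is part of why it feels so consistent across apps.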
These two simple gestures provide incredible control over the Vision Pro, and once you get used to them, they can be faster than a mouse or a touch screen because you only have to focus your eyes to select anything around you. Then tap your fingers together and, unless your hands are behind your head, the gesture will be detected.
It even works flawlessly with my cat Pixel in my lap (Credit: Will Greenwald)
It’s not an exaggeration to say that this is a generational leap in mixed reality interface design, and a much-needed one. Most VR headsets and AR glasses require an additional controller to interact with anything. The Vision Pro takes any such device out of the equation by making eye and hand control reliable and universal.
I’ve seen eye tracking used in headset displays before, and I’ve been impressed by the technology every time. The Meta Quest Pro incorporates it well into many apps and features, but it’s not consistently implemented. The PlayStation VR2 uses eye tracking even less as a control system, working only with certain games that make the effort to incorporate it. Both seem to use eye tracking more for foveation than anything else, improving performance by rendering what you’re looking directly at in more detail while spending fewer resources on your peripheral vision. The Vision Pro also uses foveation, which in my early testing seems great for performance, but it means the screenshots I’ve been taking for our coverage have had some very strange focus issues.
Meta and Sony’s headsets rely almost exclusively on their respective physical controllers. You need them in your hand to do almost anything. Even in the rare cases where you can select something with just your eyes, it requires staring at a menu element for several seconds while a progress bar fills to confirm that it’s indeed what you want to activate.
As for hand tracking, the Quest Pro (and the Quest 3, which doesn’t have eye tracking) offers the feature, but it’s extremely unreliable. You need to keep your hands in a sweet spot in front of the headset’s cameras and learn fairly specific gestures and motions for the tracking to hold. It’s a nice extra on those headsets, but it doesn’t come close to replacing the controllers.
Jumping between scrolling a web page and playing chess (Credit: Will Greenwald)
VisionOS polishes and unifies eye and hand tracking as a fundamental part of the operating system; its hand tracking in particular is far more advanced and reliable than anything else I’ve seen. When I say you just tap your fingers together, I mean you can tap them together almost any time, in almost any position. The setup process for the Vision Pro involves scanning the fronts and backs of your hands with the headset, and after that it can follow them anywhere in a nearly 180-degree dome in front of you.
I’ve sat back and lazily tapped my fingers in my lap and the gestures registered on the Vision Pro. I’ve held my hands over my head and to the side of my head and both taps and drags have translated. It doesn’t work as well when your hands are completely out of view of the headset’s many cameras, like if they’re behind you or under a blanket, but both tapping and dragging have consistently worked for me in far more natural and relaxed positions than I expected. It’s an incredible jump over the awkward and unreliable hand tracking on the Meta Quest headsets.
Your hands being out of view might not always prevent you from tapping, either. Apple hasn’t announced any such feature, but I could see future visionOS and watchOS updates pairing an Apple Watch Series 9 or later with the headset, using the same motion sensing that already powers the watch’s double-tap gesture. It’s purely hypothetical, but I could imagine such a connection working.
I’ve used every major consumer VR headset and several AR devices, and I’m not exaggerating when I say that the Vision Pro has by far the best user interface I’ve ever seen. It’s the kind of jump in usability that we saw with iOS making smartphones and tablets really come together for a wider user base than technophiles and professionals. Its underlying technology and individual elements aren’t new, but they’re assembled and polished so much that almost anyone could easily pick up and enjoy using the headset.
This doesn’t mean you should buy the Apple Vision Pro (and before you consider it, be sure to read our upcoming review). The interface is incredible and accessible, but the device isn’t. This is a $3,500 headset in a market where $1,500 is considered very expensive. That’s well beyond the typical Apple premium.
It can be easier to type with your eyes than your fingers (Credit: Will Greenwald)
Also, I said that the user interface is incredible. I didn’t say the operating system is incredible. VisionOS is built on a solid foundation, but it’s still a new platform, and that shows in its software library and some design elements.
You can easily rearrange apps around you by pinching and dragging, but you can only arrange them around you. Moving a window automatically rotates it so it always faces you directly. It’s clean, but it’s not as flexible as I would like.
Besides the tap gestures, you have the option of reaching out and directly tapping objects, like keys on a virtual keyboard, with your fingers. This isn’t nearly as responsive or reliable as selecting items with your eyes, and the lack of any physical feedback makes it awkward. This is another case where pairing the Vision Pro with an Apple Watch could hypothetically improve the experience, simply by making the watch vibrate slightly when you succeed in pressing a virtual key.
Note the lack of Google apps (Credit: Will Greenwald)
App selection is also frustratingly limited. I don’t expect a new mixed reality platform to launch with dozens of apps and games that take advantage of the technology, but visionOS is built on iOS, iPadOS, and macOS, and it can actually run iPhone and iPad apps. It just can’t run all of them. There are no Google apps at all, including YouTube. No Twitch, either. Considering the operating system and hardware (the same M2 chip as the iPad Pro) are so close to devices that run those apps with no problem, there’s no technical reason for their absence.
The Vision Pro doesn’t tread new ground, but it does pave over ground that’s already well-trodden. The interface turns the rough dirt road of past VR into a smooth street to stroll down, and I hope to see its concepts iterated on and emulated by Apple and other developers alike.