Microsoft started adding generative Copilot AI features to Windows in September 2023, well before its OS competitors had anything similar. And although most of the company’s recent, spiffy AI tools are exclusive to Copilot+ PCs, Copilot Vision works on every Windows 11 (and, shockingly, Windows 10) machine. This feature, which lives inside the Copilot app, lets the AI view whatever is on your screen and provide natural verbal assistance. The latest versions of ChromeOS and macOS can’t match this capability, despite adding some AI features piecemeal.
Copilot Vision first appeared in the Microsoft Edge web browser, and you can read my mixed impressions of that iteration. While it’s indeed helpful to converse with an AI about a website, Copilot Vision in Windows lets you do the same for any open app. Copilot Vision is also available within the Copilot app for Android and iOS, where it can chat about whatever you point your phone’s camera at, but I’m focusing on the Windows experience here. As of publication, the Copilot app for macOS lacks the Vision capabilities I describe here.
Get Ready for Copilot Vision
To use Copilot Vision, you first need to make sure that Windows 11 is up to date. Go to Settings > Windows Update and click the Check for Updates button. To speed things along, you can toggle the “Get the latest updates as soon as they’re available” option. You, of course, also need the Copilot app. If you don’t have it, head to the Microsoft Store to install it.
(Credit: Microsoft/PCMag)
You can use Copilot without signing in to a Microsoft account for limited interactions, but you miss out on several features, including Copilot Vision. Signing in also enables AI image creation, Copilot Voice, interaction history, longer conversations, and settings syncing.
One other requirement: Copilot Vision is available only in the US. A Microsoft blog post says it’s coming soon to more countries outside of Europe. Copilot is still not available in Europe because of Microsoft’s adherence to the region’s Digital Markets Act (DMA).
How to Use Copilot Vision
Start by opening Copilot: Press Windows Key-C or Alt-space bar (which opens the compact Copilot window), or click the Copilot icon in your taskbar. If you have a Copilot+ PC, you can simply press the dedicated Copilot key on your keyboard. In the Copilot app’s window, you should see an icon that looks like a pair of eyeglasses to the left of the text-entry box at the bottom of the window.
Once you click the eyeglasses, you see a list of all app windows currently open on your PC. Programs with non-minimized windows take priority here, but if you have more than four open, you can scroll down to find the rest.
When you toggle one of the options for an app, a new element pops up at the bottom of Copilot’s window with highlighted eyeglasses and microphone icons. For tasks that involve more than one app, you have to click the initial eyeglasses icon again to add a second window for viewing, because Copilot doesn’t let you enable two at the start. Your AI pal will then start talking with you, describing what’s in view on the screen. You can end the conversation at any time by clicking Stop or the X.
From then on, you can simply converse verbally with Copilot, asking what you need to know about the app in question. In testing, I asked Copilot how to get photos from File Explorer into Photoshop and then how to improve them in the photo app. Copilot has good knowledge of the process and the app (I found the same for Lightroom). Here’s a video showing my experience (pardon my weak webcam mic; Copilot’s voice is clear, loud, and better-spoken than mine).
If you tell Copilot “Show me how,” you see a large pointer in the Copilot panel, which flies up and draws a box or circle around the relevant interface element. Microsoft calls these Highlights. In my experience, this feature didn’t always highlight the correct object, but here’s a case in which it got it right:
After you stop your Copilot Vision session, you can see a transcript of the conversation in the Copilot app:
As with most AI tools, your results may vary, even with the same question or prompt. I got different responses when I asked for the same information multiple times. For example, it occasionally gave me instructions for Lightroom Classic rather than the newer version of Lightroom. Sometimes it paused, keeping silent for a few seconds, but this issue wasn’t severe enough to ruin the experience.
I like how generative AI tools (and Copilot in particular) let you tell them when they get something wrong. In such cases, they will recheck their information and correct themselves. So, when I responded that the instructions were for Lightroom Classic and I was using the newer Lightroom, Copilot apologized and gave the correct instructions.
Neither ChromeOS Nor macOS Has an Answer
Windows’ two big competitors don’t offer anything that competes with Copilot Vision. Google gets somewhat closer than Apple with its Select to Search With Lens and Text Capture features, the latter of which lets you find information about or take limited actions on selected text in images. But those ChromeOS AI features don’t let you converse verbally with an AI about what you’re looking at on the screen to get interactive help.
MacOS’s AI capabilities are limited to creating cartoon-like images, rewriting text, and summarizing emails and web pages; macOS Tahoe at least promises some improvements. Siri has become more conversational, but it can’t help you with what’s on your screen.
As I’ve concluded after testing other Copilot features in Windows, Microsoft’s desktop OS comfortably leads competitors in AI features. And in the case of Copilot Vision, you don’t even need the latest hardware and software to take advantage of it. The same isn’t true for the AI tools that ChromeOS and macOS do have. Of course, with the vast resources these companies are investing in AI, Microsoft’s lead is far from safe. I eagerly await the competition ramping up.
About Michael Muchmore
Lead Software Analyst
