Apple explores UX trends for the era of AI agents
A team of Apple researchers set out to understand what real users expect from AI agents, and how they'd prefer to interact with them. Here's what they found.
In the study, titled "Mapping the Design Space of User Experience for Computer Use Agents," a team of four Apple researchers argues that while the industry has been investing heavily in the development and evaluation of AI agents, key aspects of the user experience have been overlooked: how users might want to interact with these agents, and what their interfaces should look like.
To explore that, they divided the study into two phases: first, they identified the main UX patterns and design considerations that AI labs have been building into existing AI agents. Then, they tested and refined those ideas through hands-on user studies with an interesting method called Wizard of Oz.

By observing how those design patterns hold up in real-world user interactions, they were able to identify which current AI agent designs align with user expectations, and which fall short.
Phase 1: The taxonomy
The researchers looked into nine desktop, mobile, and web-based agents, including:
- Claude Computer Use Tool
- Adept
- OpenAI Operator
- AIlice
- Magentic-UI
- UI-TARS
- Project Mariner
- TaxyAI
- AutoGLM
Then, they consulted with “8 practitioners who are designers, engineers, or researchers working in the domains of UX or AI at a large technology company,” which helped them map out a comprehensive taxonomy with four categories, 21 subcategories, and 55 example features covering the key UX considerations behind computer-using AI agents.
The four main categories included:
- User Query: how users input commands
- Explainability of Agent Activities: what information to present to the user about agent actions
- User Control: how users can intervene
- Mental Model & Expectations: how to help users understand the agent’s capabilities
In essence, that framework spanned everything from how agents present their plans to users, to how they communicate their capabilities, surface errors, and let users step in when something goes wrong.
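For developers, here's a rough sketch of how that framework could be modeled in code. This is our own illustration in Swift, not something from the paper; the type and case names are hypothetical, and a real model would need to cover all 21 subcategories and 55 example features.

```swift
// Hypothetical sketch of the paper's four UX categories as Swift types.
// Names are illustrative, not taken from the study.

enum UserQuery {
    case naturalLanguageText(String)   // free-form typed commands
    case voice(transcript: String)     // spoken input
}

enum AgentExplainability {
    case plannedSteps([String])        // show the plan before acting
    case liveNarration(String)         // describe each action as it happens
    case errorSurface(String)          // flag failures explicitly
}

enum UserControl {
    case interrupt                     // stop the agent mid-task
    case confirmBeforeAction           // pause for approval on risky steps
    case takeover                      // hand control back to the user
}

struct MentalModelCue {
    let capabilityDescription: String  // what the agent can and can't do
}
```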
With all of that at hand, they moved on to phase 2.
Phase 2: The Wizard-of-Oz study
The researchers recruited 20 participants with prior experience using AI agents, and asked them to interact with an AI agent via a chat interface to perform either a vacation rental task or an online shopping task.

From the study:
Participants were provided with a mock user chat interface through which they could interact with an “agent” played by the researcher. Meanwhile, the participants were also presented with the agent’s execution interface, where the researcher acted as the agent and interacted with the UI on screen based on the participant’s command. On the user chat interface, participants could enter textual queries in natural language, which then appeared in the chat thread. Then, the “agent” began execution, where the researcher controlled the mouse and keyboard on their end to simulate the agent’s actions on the web page. When the researcher completed the task, they entered a shortcut key that posted a “task completed” message in the chat thread. During execution, participants could use an interrupt button to stop the agent, and a message “agent interrupted” would appear in the chat.
In other words, unbeknownst to the users, the AI agent was, in reality, a researcher sitting in the next room, who would read the text instructions and perform the requested task.
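For the curious, the message flow described above maps to a pretty simple event model. Here's a hedged sketch in Swift of the chat-thread events the study describes; the event names are our own, and in the study a human played the agent behind the scenes:

```swift
// Sketch of the chat-thread events from the study's Wizard-of-Oz setup.
// Event names are our own invention; the flow mirrors the quoted protocol.

enum ChatEvent {
    case userQuery(String)       // participant types a natural-language command
    case executionStarted        // "agent" begins acting on the web page
    case taskCompleted           // posted via a shortcut key in the study
    case agentInterrupted        // posted when the participant hits interrupt
}

func render(_ event: ChatEvent) -> String {
    switch event {
    case .userQuery(let text): return "You: \(text)"
    case .executionStarted:    return "Agent is working…"
    case .taskCompleted:       return "task completed"
    case .agentInterrupted:    return "agent interrupted"
    }
}
```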
For each task (vacation rental or online shopping), participants were asked to perform six functions with the agent’s help, some of which the agent would purposely fail at (such as getting stuck in a navigation loop) or complete with intentional mistakes (such as selecting something different from what the user asked for).
At the end of each session, the researchers asked participants to reflect on their experience and propose features or changes to improve the interaction.
They also analyzed video recordings and chat logs from each session to identify recurring themes in user behavior, expectations, and pain points when interacting with the agent.
Main findings
Once all was said and done, the researchers found that users want visibility into what AI agents are doing, but don’t want to micromanage every step; otherwise, they could just perform the tasks themselves.
They also concluded that users want different agent behaviors depending on whether they’re exploring options or executing a familiar task. Likewise, user expectations change based on how familiar they are with the interface: the more unfamiliar it was, the more they wanted transparency, intermediate steps, explanations, and confirmation pauses (even in low-risk scenarios).
They also found that people want more control when actions carry real consequences (such as making purchases, changing account or payment details, or contacting other people on their behalf), and that trust breaks down quickly when agents make silent assumptions or errors.
For instance, when the agent encountered ambiguous choices on a page, or deviated from the original plan without clearly flagging it, participants instructed the system to pause and ask for clarification, rather than just pick something seemingly at random and move on.
In that same vein, participants reported discomfort when the agent wasn’t transparent about a choice it had made, especially when that choice could lead to the wrong product being selected.
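One way an app could encode those findings is a simple gate that pauses before consequential or ambiguous steps instead of guessing. Here's a minimal sketch in Swift; the risk levels, fields, and decision logic are our own assumptions, not the paper's:

```swift
// Minimal sketch of a confirmation gate reflecting the study's findings:
// pause and ask when an action is consequential or ambiguous, instead of
// silently picking an option. All names here are hypothetical.

enum Risk { case low, consequential }    // e.g. purchases, account changes

struct AgentAction {
    let description: String
    let risk: Risk
    let isAmbiguous: Bool                // multiple plausible choices on the page
}

func shouldPauseForUser(_ action: AgentAction, userIsFamiliarWithUI: Bool) -> Bool {
    // Consequential actions always need explicit approval.
    if action.risk == .consequential { return true }
    // Ambiguous choices should be clarified rather than guessed at.
    if action.isAmbiguous { return true }
    // Unfamiliar interfaces warrant extra confirmation, even for low-risk steps.
    return !userIsFamiliarWithUI
}
```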
All in all, this is an interesting study for app developers looking to adopt agentic capabilities in their apps, and you can read it in full here.