Researchers at UC San Diego have released Orca, an open-source system that demonstrates how large language models (LLMs) can assist users on the web, not by taking control, but by guiding interaction. In the peer-reviewed white paper, the research team demonstrated a significant improvement in task speed and accuracy during evaluation, offering early evidence of the potential of human-in-the-loop agents in real-world workflows.
Orca is built to help users extract meaningful insights from the web, acting not as an autonomous browser agent but as a decision-making ‘co-pilot’.
The system offers a suite of capabilities that include summarizing long web pages, extracting structured data from unstructured content, tracking changes across browsing sessions, and comparing claims across multiple sources. It can search, scroll, click, and interact with websites on command, allowing users to delegate repetitive or context-rich tasks while staying in control of the process.
In a lab study with eight participants, the researchers found that Orca accelerated web exploration, encouraged broader information foraging, and strengthened user trust in results.
Participants appreciated being able to visually organize pages, selectively delegate tasks to the AI, and maintain control over information sources. For example, one participant used Orca to compare Yelp options side-by-side, while another preferred to filter Reddit posts for product research. The spatial layout and batch interactions were particularly praised for reducing context-switching costs and making complex workflows easier to manage.
Notably, the researchers emphasized shared control as a core design principle: users initiate actions and remain in command throughout. They identified this transparency and preserved sense of agency as critical to building user confidence in AI-assisted workflows, and it contributed directly to participants’ trust in and adoption of the tool.
The Orca system is implemented as an Electron application with a React-based frontend. Each web page is loaded into its own isolated webview, while the ‘Web Canvas’ interface, used for organizing and interacting with multiple pages, is built using the open-source tldraw library.
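As a rough illustration of that layout, the sketch below shows how a tldraw canvas and per-page Electron `<webview>` elements might be composed in a React component. The `WebCanvas` and `PageFrame` names, props, and styling are hypothetical and only approximate the idea; they are not taken from the Orca codebase, and the tldraw import path may differ by version.

```tsx
// Hypothetical sketch: a tldraw canvas with isolated per-page webviews layered on top.
// Component names, props, and layout are illustrative, not from the Orca source.
import * as React from 'react'
import { Tldraw } from 'tldraw'
import 'tldraw/tldraw.css'

// Electron's <webview> tag is not in React's default JSX typings, so declare it for this sketch.
declare global {
  namespace JSX {
    interface IntrinsicElements {
      webview: React.DetailedHTMLProps<React.HTMLAttributes<HTMLElement>, HTMLElement> & {
        src?: string
      }
    }
  }
}

// Each page loads in its own <webview>, keeping pages isolated from one another and from
// the canvas UI (Electron requires webviewTag: true in the BrowserWindow's webPreferences).
function PageFrame({ url }: { url: string }) {
  return <webview src={url} style={{ width: 480, height: 360 }} />
}

// The "Web Canvas": a spatial tldraw surface with open pages arranged on top,
// which is what lets users group, compare, and reorganize pages side by side.
export function WebCanvas({ urls }: { urls: string[] }) {
  return (
    <div style={{ position: 'relative', width: '100vw', height: '100vh' }}>
      <Tldraw />
      <div style={{ position: 'absolute', top: 16, left: 16, display: 'flex', gap: 16 }}>
        {urls.map((url) => (
          <PageFrame key={url} url={url} />
        ))}
      </div>
    </div>
  )
}
```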
All language-based capabilities, such as summarization, extraction, and automation, are powered by the Claude 3.7 Sonnet model. Behind the scenes, Orca employs a custom HTML distillation and agentic pipeline architecture that transforms raw web content into structured representations usable by the LLM. These pipelines are shared across features and are designed to allow user intervention during execution.
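To make that flow concrete, here is a minimal, hypothetical sketch of the two-step idea in TypeScript: strip a raw page down to its text, then ask Claude to summarize it via Anthropic's SDK. The `distillHtml` helper and the model identifier are assumptions for illustration; Orca's actual distillation pipeline produces richer structured representations and supports user intervention mid-run.

```ts
// A minimal sketch (not Orca's actual pipeline) of the two-step idea:
// distill raw HTML down to the text that matters, then hand that text to the LLM.
import Anthropic from '@anthropic-ai/sdk'

// Very rough HTML "distillation": drop scripts, styles, and tags, then collapse whitespace.
// Orca's real pipeline builds structured representations; this only illustrates the shape.
function distillHtml(rawHtml: string): string {
  return rawHtml
    .replace(/<(script|style)[\s\S]*?<\/\1>/gi, ' ') // remove script/style blocks
    .replace(/<[^>]+>/g, ' ') // strip remaining tags
    .replace(/\s+/g, ' ') // collapse whitespace
    .trim()
}

const client = new Anthropic() // reads ANTHROPIC_API_KEY from the environment

// Summarize a distilled page; the model alias below is an assumption for illustration.
export async function summarizePage(rawHtml: string): Promise<string> {
  const distilled = distillHtml(rawHtml).slice(0, 50_000) // keep the prompt within budget
  const response = await client.messages.create({
    model: 'claude-3-7-sonnet-latest',
    max_tokens: 1024,
    messages: [
      { role: 'user', content: `Summarize the key points of this web page:\n\n${distilled}` },
    ],
  })
  const block = response.content[0]
  return block.type === 'text' ? block.text : ''
}
```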
The open-source release is positioned as a research prototype rather than a production-ready tool, aimed at helping developers explore future collaborative agent workflows. While promising, the researchers note that the prototype showed performance constraints under increasing workloads: “An M4 Max MacBook Pro with 36GB of unified memory handles up to around 80 webpages before freezing.”
Orca’s positive results give us a glimpse of the benefits of ‘human-in-the-loop’ systems and of what future collaboration between users and agents might look like: workflows where AI agents assist, but don’t replace, users in high-context, decision-heavy tasks.
At the time of writing, Orca is certainly not alone in this philosophy; it shares the space with other emerging tools such as OpenAI’s Operator and the redesigned Opera Neon browser.