World of Software

Microsoft’s new AI agent can control software and robots

News Room
Published 22 February 2025
Last updated: 2025/02/22 at 6:49 PM

On Wednesday, Microsoft Research introduced Magma, an integrated AI foundation model that combines visual and language processing to control software interfaces and robotic systems. If the results hold up outside of Microsoft’s internal testing, it could mark a meaningful step forward for an all-purpose multimodal AI that can operate interactively in both real and digital spaces.

Microsoft claims that Magma is the first AI model that not only processes multimodal data (like text, images, and video) but can also natively act upon it—whether that’s navigating a user interface or manipulating physical objects. The project is a collaboration between researchers at Microsoft, KAIST, the University of Maryland, the University of Wisconsin-Madison, and the University of Washington.

We’ve seen other large language model-based robotics projects like Google’s PaLM-E and RT-2 or Microsoft’s ChatGPT for Robotics that use LLMs as an interface. However, unlike many prior multimodal AI systems that require separate models for perception and control, Magma integrates these abilities into a single foundation model.

A combined graphic that shows off various capabilities of the Magma model. Credit: Microsoft Research

Microsoft is positioning Magma as a step toward agentic AI, meaning a system that can autonomously craft plans and perform multi-step tasks on a human’s behalf rather than just answering questions about what it sees.

“Given a described goal,” Microsoft writes in its research paper, “Magma is able to formulate plans and execute actions to achieve it. By effectively transferring knowledge from freely available visual and language data, Magma bridges verbal, spatial, and temporal intelligence to navigate complex tasks and settings.”

Microsoft is not alone in its pursuit of agentic AI. OpenAI has been experimenting with AI agents through projects like Operator that can perform UI tasks in a web browser, and Google has explored multiple agentic projects with Gemini 2.0.

Spatial intelligence

While Magma builds on Transformer-based LLM technology that feeds training tokens into a neural network, it differs from traditional vision-language models (like GPT-4V, for example) by going beyond what the researchers call “verbal intelligence” to also include “spatial intelligence” (planning and action execution). By training on a mix of images, videos, robotics data, and UI interactions, Microsoft claims that Magma is a true multimodal agent rather than just a perceptual model.
