Nvidia Corp. announced new robotics innovations at last week’s Conference on Robot Learning in South Korea, continuing to extend its product line with new capabilities and enhancements.
Nvidia announcements at CoRL included:
- The Isaac GR00T N1.6 open foundation reasoning vision language action model that provides robots with humanlike reasoning to break down complex instructions and execute tasks using prior knowledge and common sense.
- Availability of the open-source Newton Physics Engine in the company’s Isaac Lab. Co-developed with Google DeepMind and Disney Research, Newton enables the creation of more capable and adaptable robots.
- New Cosmos World Foundation models that let developers generate diverse data to accelerate the training of physical artificial intelligence models at scale.
“Humanoids are the next frontier of physical AI, requiring the ability to reason, adapt and act safely in an unpredictable world,” Rev Lebaredian, vice president of Omniverse and simulation technology at Nvidia, said in an analyst briefing before the conference. “With these latest updates, developers now have the three computers to bring robots from research into everyday life — with Isaac GR00T serving as the robot’s brains, Newton simulating their body, and Nvidia Omniverse as their training ground.”
Lebaredian said nearly half of the papers accepted at CoRL cited the use of Nvidia technology, including accelerated computing platforms and software libraries.
Nvidia’s long reach in the robotics world
Nvidia has taken an architectural approach to physical AI with a three-computer system for building robots: Blackwell DGX or HGX AI systems to train the physical AI model, RTX PRO servers and Omniverse with Cosmos to simulate and test physical AI models, and Jetson AGX Thor powered by Blackwell for on-robot inference. “We build the three computers, plus the open models, open-source simulation frameworks and data pipelines that run on these computers for physical AI developers,” explained Lebaredian.
Robots that reason
The updates to GR00T N1.6 include using Cosmos Reason as its long-thinking brain. Cosmos Reason is an open, customizable reasoning VLM, or vision language model, for physical AI. It will bring humanlike reasoning to humanoids, allowing them to break down complex instructions and execute tasks using prior knowledge and common sense. It will also give humanoids more torso and arm freedom, letting them move and handle objects at the same time to complete tough tasks such as opening heavy doors while carrying items.
“Reasoning is the best tool we have currently for extending from the set of things that we fed into the AIs for their training into novel new paradigms and environments,” Lebaredian said.
Unlike large language models, which can be trained using human knowledge from the internet, physical AI models have no such data to draw on, according to Lebaredian. “Real-world data is costly and potentially dangerous to capture, and pre-training only goes so far,” he explained. “To train models like GR00T, we need a scalable and cost-effective way to generate large, diverse and physically accurate data.”
To provide that data, Nvidia announced that new versions of the Cosmos-Predict and Cosmos-Transfer World Foundation models will be available soon.
Cosmos-Predict is a breakthrough in robot training because it generates future states from an initial state, providing the data that’s lacking today. This new release unifies three separate models into one, which reduces complexity, cuts post-training time and lowers compute cost. It delivers higher quality than both previous versions and open-source models of similar size. It now supports multiview outputs for multisensor robots and autonomous vehicles, and can produce videos of up to 30 seconds.
“Cosmos-Transfer performs world-to-world style transfer, and the latest version is 3.5 times smaller than before,” Lebaredian continued. “This smaller footprint lowers compute cost and makes it easier for developers to augment and scale training data. Together, these models enable the generation of hundreds of virtual sensor-rich environments for robot training, reducing reliance on real-world data collection.”
The role of supercomputers
As “the era of robot reasoning” begins, Lebaredian said, “robots need a supercomputer that can power the entire system from brain to body.” Jetson Thor, which Nvidia announced last month, is designed for physical AI and robotics and is powered by an Nvidia Blackwell GPU and 128 gigabytes of memory. It delivers “the AI performance to run the latest models, including the Isaac GR00T and Cosmos World Foundation models,” he said.
Leveraging open source
In his briefing, Lebaredian addressed how Nvidia’s open-source approach to Newton and Nvidia Isaac GR00T benefits the robotics community, particularly researchers and startups.
“The very nature of how research happens, how we advance the frontier of human knowledge, is about openness,” he said. “It’s about sharing information between all of the researchers so they can advance together. The only way to really do that well in the computing world is by also sharing the same software, algorithms and techniques that are both developed and the tools and pipelines behind them used to actually conduct the research.”
With robotics and physical AI, Lebaredian added, “we’re still on the frontiers. It’s critical that we advance these frontiers together. To do that, we have to contribute to open source and do this all in the open so everyone can move together. Nvidia, being in the position that it’s in, can disproportionately help by putting our vast resources behind the software development and the technology necessary to power all of this, and we’re doing so.”
Zeus Kerravala is a principal analyst at ZK Research, a division of Kerravala Consulting. He wrote this article for News.
Image: Nvidia