Fei-Fei Li, co-director of the Human-Centered AI Institute at Stanford University, during the Bloomberg Technology Summit in San Francisco, California, US, on Thursday, May 9, 2024. Bloomberg Tech is a forward-looking meeting that aims to spark conversations about cutting-edge technologies and their future applications for business. Photographer: David Paul Morris/Bloomberg
© 2024 Bloomberg Finance LP
For most people, the AI boom has unfolded on a screen. Tools like ChatGPT, Claude, and Copilot have changed the way we compose emails, summarize documents, write code, and communicate. But while generative AI has changed the way we work, it has not yet reshaped the physical world, where most economic activity actually takes place.
Fei-Fei Li believes this will change.
The Stanford professor – and one of the most influential voices in AI – argues that the next era of AI will not be defined by chat boxes. It will be defined by systems that understand and act on the real world. She calls this next generation of AI “spatial intelligence,” and her new venture, World Labs, is betting that it will unlock the next big wave of industrial and economic value.
When AI leaves the screen
Imagine a utility facing wildfire-level winds. A spatially aware AI doesn’t wait for a prompt. It predicts how conditions will evolve, reroutes power, sends a drone to inspect a transformer that is expected to fail, and alerts first responders, all before anything burns.
Or imagine a hospital where demand peaks in winter. A world model anticipates bottlenecks, simulates staffing scenarios, reorganizes beds and directs autonomous robots that deliver medicine. This isn’t just data analysis; it is real-time coordination.
These are the kinds of scenarios Li envisions – and the basis of World Labs’ mission. Together with co-founders Justin Johnson, Christoph Lassner and Ben Mildenhall, all influential figures in computer vision and graphics, she is building models that understand objects, motion, cause and effect, and physical constraints. In other words: the ingredients of reality.
The company’s first product, Marble, offers a first glimpse of how spatial intelligence works. Provide a short text description, and it creates an explorable 3D environment. It feels less like prompting an AI and more like stepping into a world built on demand.
Why the physical economy is the real prize
LLMs dominate the headlines, but most of the global economy operates in places that language models cannot see: factories, warehouses, fields, hospitals, energy grids, construction sites.
These environments are governed by physics, timing and uncertainty. People understand them intuitively because we navigate them our entire lives. Machines lack that intuition. As Li often notes, people are “embodied agents”: we learn through movement, interaction, and consequences. AI systems trained solely on text lack that foundation, creating a gap between what they can describe and what they can actually do.
World models try to close that gap by giving machines intuition about how the world works. In this new paradigm, AI is not defined by language, but by space.
As she explains in an interview with Lenny Rachitsky: “A language model reads a book and spits out the following sentence. A World Model looks at a movie, predicts the entire plot twist and lets you rewrite the ending in no time. It’s not just describing; it simulates physics, emotions, chaos and all the messy stuff of real life.”
The consequences for business are significant. Companies will be able to model decisions before acting on them, reducing risk and accelerating execution. A production line can be digitally redesigned before any equipment is moved. A logistics network can be tested virtually before trucks or containers are rerouted. Hospitals can simulate patient flows before adjusting staffing or capacity. Construction companies can explore hundreds of design variations before deploying materials.
In each case, decision-making becomes more informed and less reactive. Instead of building first and adapting later, organizations can explore possibilities in a virtual environment where mistakes are free and insights accumulate quickly.
The markets for this technology are enormous, and the scope of its capabilities rivals the early days of cloud computing, mobile telephony, and the commercial Internet.
A look inside Marble and the world of embodied AI
One of the most transformative applications of world models is “embodied AI” – the intelligence layer behind robots, drones, autonomous vehicles and industrial automation. Today, these systems learn slowly because real-world training is expensive and error-prone.
World models change the equation by giving machines a safe, rich environment in which they can accumulate thousands of hours of practice. It is the foundation that robotics has lacked for decades.
World Labs’ debut product, Marble, is already being used to generate virtual sets and 3D scenes in hours instead of weeks. But Li sees it as a precursor to much broader business opportunities: modeling facilities before construction, testing operational strategies before rollout, rehearsing safety scenarios before incidents, and designing customer experiences before physical investments. In each case, spatial intelligence becomes the bridge between digital planning and physical execution.
AI at the service of humanity
With every AI breakthrough led by Li, from ImageNet to LLMs to spatial intelligence and the launch of Marble, her work continues to return to the same principle: intelligence is only useful if it serves humanity. And as world models push AI from words to the physical world, that principle becomes the real competitive advantage.
According to Li, the next era of innovation will not be won by companies simply deploying larger models. It will be won by leaders who understand that the power of AI – and its risk – grows as its reach increases. Li is betting that spatial intelligence will redefine how industries operate, how we design and build, and how we respond to the world’s most complex challenges. But she is equally clear about something that Silicon Valley too often forgets: the future of AI is not predetermined. It depends on the choices we make.
If world models deliver on their promise, they won’t just help machines understand our world. They will help us redesign it with more foresight, more possibility and, if we choose wisely, with more humanity.
