World of Software
Copyright © All Rights Reserved. World of Software.

Meta Introduces V-JEPA 2, a Video-Based World Model for Physical Reasoning

News Room
Published 13 June 2025
Last updated: 2025/06/13 at 3:31 PM

Meta has introduced V-JEPA 2, a new video-based world model designed to improve machine understanding, prediction, and planning in physical environments. The model extends the Joint Embedding Predictive Architecture (JEPA) framework and is trained on video to predict outcomes in a learned embedding space rather than in pixel space.
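To make the idea of predicting in embedding space concrete, here is a minimal toy sketch. It is not Meta's implementation: the linear `encode` and `predict` functions, the dimensions, and the mean-absolute-error loss are all illustrative assumptions. The only point is that the training target is the embedding of the future frame, not its pixels.

```python
import random

def encode(frame, weights):
    """Toy linear encoder (hypothetical): maps a frame to a small embedding."""
    return [sum(w * x for w, x in zip(row, frame)) for row in weights]

def predict(context_embedding, predictor):
    """Toy linear predictor (hypothetical) that works only on embeddings."""
    return [sum(p * z for p, z in zip(row, context_embedding)) for row in predictor]

def embedding_loss(predicted, target):
    """Mean absolute error between two embeddings (an assumed choice of loss)."""
    return sum(abs(p - t) for p, t in zip(predicted, target)) / len(target)

rng = random.Random(0)
frame_dim, embed_dim = 16, 4
encoder_w = [[rng.uniform(-1, 1) for _ in range(frame_dim)] for _ in range(embed_dim)]
predictor_w = [[rng.uniform(-1, 1) for _ in range(embed_dim)] for _ in range(embed_dim)]

context_frame = [rng.random() for _ in range(frame_dim)]
future_frame = [rng.random() for _ in range(frame_dim)]

z_context = encode(context_frame, encoder_w)
z_target = encode(future_frame, encoder_w)    # the target is an embedding, not pixels
z_pred = predict(z_context, predictor_w)

# The loss compares 4 numbers per frame instead of 16 raw "pixel" values.
loss = embedding_loss(z_pred, z_target)
```

In a real system the encoder and predictor would be large neural networks trained by gradient descent; this sketch only evaluates the loss once.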

The model is trained in two phases. In the first, over one million hours of video and one million images are used for self-supervised pretraining without any action labels. This enables the model to learn representations of motion, object dynamics, and interaction patterns. In the second phase, it is fine-tuned on 62 hours of robot data that includes both video and action sequences. This stage allows the model to make action-conditioned predictions and support planning.

One Reddit user commented on the approach:

"Predicting in embedding space is going to be more compute efficient, and also it is closer to how humans reason… Really feeling the AGI with this approach, regardless of the current results using the system."

Others have noted the limits of the approach. Dorian Harris, who focuses on AI strategy and education, wrote:

"AGI requires broader capabilities than V-JEPA 2’s specialised focus. It is a significant yet narrow breakthrough, and the AGI milestone is overstated."

In robotic applications, V-JEPA 2 is used for short- and long-horizon manipulation tasks. For example, when given a goal in the form of an image, the robot uses the model to simulate possible actions and select those that move it closer to the goal. The system replans at each step, using a model-predictive control loop. Meta reports task success rates between 65% and 80% for pick-and-place tasks involving novel objects and settings.
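The replanning loop described above can be sketched with a toy example. Everything here is an assumption made for illustration: the "embedding" is just a 2-D position, `world_model` is a hypothetical stand-in for an action-conditioned predictor, and the planner exhaustively scores a small discrete set of action sequences. The article does not say which optimizer Meta uses, so this is not a description of it.

```python
from itertools import product

STEP_SIZES = (-0.5, 0.0, 0.5)
ACTIONS = list(product(STEP_SIZES, repeat=2))  # 9 candidate moves in a 2-D toy space

def world_model(state, action):
    """Hypothetical predictor: next state given the current state and an action."""
    return (state[0] + action[0], state[1] + action[1])

def distance(a, b):
    """L1 distance between a predicted state and the goal."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def plan_first_action(state, goal, horizon=2):
    """Simulate every action sequence and return the first action of the best one."""
    best_cost, best_action = float("inf"), (0.0, 0.0)
    for sequence in product(ACTIONS, repeat=horizon):
        predicted, cost = state, 0.0
        for action in sequence:
            predicted = world_model(predicted, action)
            cost += distance(predicted, goal)  # summed cost rewards reaching the goal early
        if cost < best_cost:
            best_cost, best_action = cost, sequence[0]
    return best_action

def run_control_loop(start, goal, steps=6):
    """Model-predictive control: plan, execute one action, then replan."""
    state, trajectory = start, [start]
    for _ in range(steps):
        action = plan_first_action(state, goal)
        state = world_model(state, action)  # toy: a real robot would act and re-observe
        trajectory.append(state)
    return trajectory

trajectory = run_control_loop(start=(0.0, 0.0), goal=(2.0, -1.0))
```

Only the first action of each plan is executed before planning again, which is what lets the loop correct itself when the model's predictions drift from what actually happens.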

The model has also been evaluated on benchmarks such as Something-Something v2, Epic-Kitchens-100, and Perception Test. When used with lightweight readouts, it performs competitively on tasks related to motion recognition and future action prediction.
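A lightweight readout generally means the pretrained encoder stays frozen and only a small head is trained on its features. The sketch below illustrates that pattern under loud assumptions: `frozen_encoder` is a hypothetical hand-written feature extractor, and the head is a simple perceptron. The article does not describe the readout architecture Meta actually uses.

```python
def frozen_encoder(clip):
    """Hypothetical frozen backbone: maps a clip (list of positions) to 2 features."""
    displacement = clip[-1] - clip[0]
    total_movement = sum(abs(b - a) for a, b in zip(clip, clip[1:]))
    return (displacement, total_movement)

def classify(x, weights, bias):
    """Linear readout: 1 if the score is positive, else 0."""
    return 1 if weights[0] * x[0] + weights[1] * x[1] + bias > 0 else 0

def train_readout(features, labels, epochs=10):
    """Perceptron updates; only the readout weights change, never the encoder."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            error = y - classify(x, weights, bias)
            weights[0] += error * x[0]
            weights[1] += error * x[1]
            bias += error
    return weights, bias

# Toy "videos": sequences of 1-D object positions.
clips = [[0.0, 1.0, 2.0], [1.0, 1.5, 3.0], [2.0, 1.0, 0.0], [3.0, 2.0, 1.0]]
labels = [1, 1, 0, 0]  # 1 = moving right, 0 = moving left

features = [frozen_encoder(clip) for clip in clips]
weights, bias = train_readout(features, labels)
predictions = [classify(f, weights, bias) for f in features]
```

Because the head is so small, benchmark scores obtained this way mostly reflect the quality of the frozen representation rather than the capacity of the classifier.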

Meta is also releasing three new benchmarks focused on physical reasoning from video: IntPhys 2, which tests for recognition of physically implausible events; MVPBench, which assesses video question answering on pairs of videos that differ only minimally; and CausalVQA, which focuses on cause-effect reasoning and planning.

David Eberle, CEO of Typewise, noted:

"The ability to anticipate and adapt to dynamic situations is exactly what is needed to make AI agents more context-aware in real-world customer interactions, too, not just in robotics."

Model weights, code, and datasets are available via GitHub and Hugging Face. A leaderboard has been launched for community benchmarking.
