Navigation World Models, CVPR 2025 (Oral)

Paper

This repo contains pretrained models for Navigation World Models, produced with the Conditional Diffusion Transformer (CDiT) model training code. See the project page for additional results.

Navigation World Models
Amir Bar, Gaoyue "Kathy" Zhou, Danny Tran, Trevor Darrell, Yann LeCun
AI at Meta, UC Berkeley, New York University

Pretrained Models

Model type | # Parameters | Training Steps | Datasets | Link
CDiT/XL | 1B | 100k | RECON, SCAND, TartanDrive, HuRoN | Link
CDiT/XL | 1B | 200k | RECON, SCAND, TartanDrive, HuRoN, +Ego4D | Link

Note: All models were retrained after face blurring was applied to the training data, so results may differ from those reported in the main paper.
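As a rough illustration of how one of the two checkpoints might be selected and fetched, the sketch below encodes the table rows as data and wraps `hf_hub_download` from the `huggingface_hub` package. The names `Checkpoint`, `select_checkpoint`, and `download_checkpoint` are hypothetical helpers, not part of this repo. The repository ID and checkpoint filename are not stated here, so they are left as arguments the caller must supply. Since access requires accepting the repository's conditions, a token for an account that has accepted them is assumed.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Checkpoint:
    """One row of the pretrained-models table (hypothetical helper type)."""
    model_type: str
    parameters: str
    training_steps: str
    datasets: Tuple[str, ...]


# Rows copied from the Pretrained Models table.
CHECKPOINTS = (
    Checkpoint("CDiT/XL", "1B", "100k",
               ("RECON", "SCAND", "TartanDrive", "HuRoN")),
    Checkpoint("CDiT/XL", "1B", "200k",
               ("RECON", "SCAND", "TartanDrive", "HuRoN", "Ego4D")),
)


def select_checkpoint(with_ego4d: bool) -> Checkpoint:
    """Return the table row trained with (or without) Ego4D."""
    for ckpt in CHECKPOINTS:
        if ("Ego4D" in ckpt.datasets) == with_ego4d:
            return ckpt
    raise LookupError("no matching checkpoint")


def download_checkpoint(repo_id: str, filename: str,
                        token: Optional[str] = None) -> str:
    """Download one file from a Hub repo and return its local path.

    repo_id and filename are assumptions supplied by the caller; this
    document does not state them. The token must belong to an account
    that has accepted the repository's access conditions.
    """
    # Imported lazily so the table helpers work without the package installed.
    from huggingface_hub import hf_hub_download
    return hf_hub_download(repo_id=repo_id, filename=filename, token=token)
```

For example, `select_checkpoint(with_ego4d=True)` returns the 200k-step row. `download_checkpoint` would then be called with the actual repository ID and checkpoint filename taken from the corresponding Link entry.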

BibTeX

@article{bar2024navigation,
  title={Navigation world models},
  author={Bar, Amir and Zhou, Gaoyue and Tran, Danny and Darrell, Trevor and LeCun, Yann},
  journal={arXiv preprint arXiv:2412.03572},
  year={2024}
}

Acknowledgments

We thank Noriaki Hirose for his help with the HuRoN dataset and for sharing his insights, and Manan Tomar, David Fan, Sonia Joseph, Angjoo Kanazawa, Ethan Weber, Nicolas Ballas, and the anonymous reviewers for their helpful discussions and feedback.

License

The code and model weights are licensed under Creative Commons Attribution-NonCommercial 4.0 International. See LICENSE.txt for details.
