---
base_model:
- deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
datasets:
- thuml/webarena-world-model-cot
license: mit
tags:
- web-agent
- webarena
- world-model
pipeline_tag: text-generation
library_name: transformers
---

See https://github.com/thuml/RLVR-World for examples of using this model.

## Citation

```
@article{wu2025rlvr,
  title={RLVR-World: Training World Models with Reinforcement Learning},
  author={Jialong Wu and Shaofeng Yin and Ningya Feng and Mingsheng Long},
  journal={arXiv preprint arXiv:2505.13934},
  year={2025},
}
```
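
## Usage sketch

The metadata lists `transformers` as the library and `text-generation` as the pipeline, so loading should follow the usual causal-LM pattern. The following is a minimal sketch under stated assumptions, not the authors' reference code: the helper names `generation_kwargs` and `generate` are hypothetical, the default generation settings are illustrative guesses, and this card does not state the checkpoint's own repository ID, so it has to be passed in. The prompt format expected by the world model is defined in the RLVR-World repository and is not reproduced here.

```python
from typing import Any

# Base model named in this card's metadata. The fine-tuned checkpoint's own
# repository ID is not stated in the card, so callers must supply it.
BASE_MODEL_ID = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"


def generation_kwargs(max_new_tokens: int = 1024, temperature: float = 0.0) -> dict[str, Any]:
    """Build keyword arguments for `model.generate`.

    The defaults are assumptions for illustration, not values from the authors.
    """
    if max_new_tokens <= 0:
        raise ValueError("max_new_tokens must be positive")
    kwargs: dict[str, Any] = {"max_new_tokens": max_new_tokens}
    if temperature > 0:
        # Sampling is only enabled when a positive temperature is requested.
        kwargs.update(do_sample=True, temperature=temperature)
    else:
        # Greedy decoding otherwise.
        kwargs["do_sample"] = False
    return kwargs


def generate(model_id: str, prompt: str, **overrides: Any) -> str:
    """Load a causal LM checkpoint and return only the newly generated text."""
    # Imported lazily so generation_kwargs works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, **generation_kwargs(**overrides))

    # Drop the prompt tokens so only the continuation is decoded.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

To try it, call `generate` with this model's repository ID and a prompt built as in the RLVR-World examples. Keeping the `transformers` import inside `generate` is a design choice that lets the settings helper be inspected or tested without downloading any weights.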