---
license: mit
datasets:
- IPEC-COMMUNITY/libero_object_no_noops_lerobot
language:
- en
base_model:
- Hume-vla/Hume-System2
pipeline_tag: robotics
library_name: transformers
tags:
- VLA
---

# Model Card for Hume-Libero_Object

A Dual-System Visual-Language-Action model with System-2 thinking, trained on Libero-Object.

- Paper: [https://arxiv.org/abs/2505.21432](https://arxiv.org/abs/2505.21432)
- Homepage: [https://hume-vla.github.io](https://hume-vla.github.io)
- Codebase: [🦾 Hume: A Dual-System VLA with System2 Thinking](https://github.com/hume-vla/hume) ![GitHub Repo stars](https://img.shields.io/github/stars/hume-vla/hume)

## Optimal TTS Args

```bash
s2_candidates_num=5
noise_temp_lower_bound=1.0
noise_temp_upper_bound=1.2
time_temp_lower_bound=1.0
time_temp_upper_bound=1.0
```

## Uses

- To reproduce the results in the paper, follow the [instructions](https://github.com/hume-vla/hume/tree/main/experiments/libero).
- To use the model directly:

```python
from hume import HumePolicy
import numpy as np

# load the policy
hume = HumePolicy.from_pretrained("/path/to/checkpoints")

# configure the test-time computing args
# note: noise_temp_upper_bound and time_temp_lower_bound here differ from
# the values listed under "Optimal TTS Args" above
hume.init_infer(
    infer_cfg=dict(
        replan_steps=8,
        s2_replan_steps=16,
        s2_candidates_num=5,
        noise_temp_lower_bound=1.0,
        noise_temp_upper_bound=1.0,
        time_temp_lower_bound=0.9,
        time_temp_upper_bound=1.0,
        post_process_action=True,
        device="cuda",
    )
)

# prepare observations (zero-filled placeholders with a batch size of 1)
observation = {
    "observation.images.image": np.zeros((1, 224, 224, 3), dtype=np.uint8),  # (B, H, W, C)
    "observation.images.wrist_image": np.zeros((1, 224, 224, 3), dtype=np.uint8),  # (B, H, W, C)
    "observation.state": np.zeros((1, 7)),  # (B, state_dim)
    "task": ["Lift the paper"],
}

# infer the action
action = hume.infer(observation)  # (B, action_dim)
```

## Training and Evaluation Details

```bash
# source ckpts
2025-05-01/19-56-05_libero_object_ck8-16-1_sh-4_gpu8_lr5e-5_1e-5_1e-5_2e-5_bs16_s1600k/0150000

# original logs
2025-06-13/00-18-26+19-56-05_libero_object_ck8-16-1_sh-4_gpu8_lr5e-5_1e-5_1e-5_2e-5_bs16_s1600k_0150000_s1-8_s2-16_s2cand-5_ntl-1.0_ntu-1.2_ttl-1.0_ttu-1.0.log
```

## Citation

```BibTeX
@article{song2025hume,
  title={Hume: Introducing System-2 Thinking in Visual-Language-Action Model},
  author={Anonymous Authors},
  journal={arXiv preprint arXiv:2505.21432},
  year={2025}
}
```
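
The values under "Optimal TTS Args" differ from those in the usage snippet for `noise_temp_upper_bound` and `time_temp_lower_bound`. As a minimal sketch, the optimal values can be merged into the `infer_cfg` dictionary before it is passed to `hume.init_infer`. The helper name `build_infer_cfg` and its bounds check are illustrative assumptions, not part of the Hume codebase; the keys and values themselves are copied from this card.

```python
# Optimal test-time args copied from the "Optimal TTS Args" section of this card.
OPTIMAL_TTS_ARGS = dict(
    s2_candidates_num=5,
    noise_temp_lower_bound=1.0,
    noise_temp_upper_bound=1.2,
    time_temp_lower_bound=1.0,
    time_temp_upper_bound=1.0,
)

# Remaining keys copied from the usage snippet in this card.
BASE_INFER_CFG = dict(
    replan_steps=8,
    s2_replan_steps=16,
    post_process_action=True,
    device="cuda",
)


def build_infer_cfg(overrides=None):
    """Hypothetical helper: merge base keys, optimal TTS args, and overrides."""
    cfg = {**BASE_INFER_CFG, **OPTIMAL_TTS_ARGS, **(overrides or {})}
    # Sanity check (an assumption of this sketch): each lower bound <= upper bound.
    for name in ("noise_temp", "time_temp"):
        lower = cfg[f"{name}_lower_bound"]
        upper = cfg[f"{name}_upper_bound"]
        if lower > upper:
            raise ValueError(f"{name}: lower bound {lower} exceeds upper bound {upper}")
    return cfg


infer_cfg = build_infer_cfg()
# With the Hume package installed, this would be passed as:
# hume.init_infer(infer_cfg=infer_cfg)
```

Passing `build_infer_cfg({"device": "cpu"})` would keep the optimal args while overriding a single key.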