SmolVLA: A vision-language-action model for affordable and efficient robotics
Designed by Hugging Face.
This model has 450M parameters in total. You can use it inside the LeRobot library.
Install the smolvla extra dependencies (run from the root of a LeRobot source checkout):
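For a rough sense of scale, the parameter count can be turned into a weight-memory estimate. This is a back-of-the-envelope sketch, not a measurement: it assumes exactly 450M parameters and counts only the weights (no activations, optimizer state, or framework overhead). The helper name `weight_memory_gb` is hypothetical.

```python
# Rough weight-memory estimate from a parameter count.
# Assumption: 450M parameters (from the model card); only weights are counted.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2}


def weight_memory_gb(num_params: int, dtype: str) -> float:
    """Return weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


NUM_PARAMS = 450_000_000
for dtype in ("float32", "bfloat16"):
    print(f"{dtype}: {weight_memory_gb(NUM_PARAMS, dtype):.1f} GB")
```

Under these assumptions the weights take about 1.8 GB in float32 and 0.9 GB in bfloat16. Memory use during training will be higher.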
pip install -e ".[smolvla]"
Example of finetuning the smolvla pretrained model (smolvla_base):
python lerobot/scripts/train.py \
--policy.path=lerobot/smolvla_base \
--dataset.repo_id=danaaubakirova/svla_so100_task1_v3 \
--batch_size=64 \
--steps=200000
Example of finetuning the smolvla neural network with a pretrained VLM and an action expert initialized from scratch:
python lerobot/scripts/train.py \
--policy.type=smolvla \
--dataset.repo_id=danaaubakirova/svla_so100_task1_v3 \
--batch_size=64 \
--steps=200000
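The two commands above differ only in how the policy is specified. `--policy.path` starts from the pretrained `smolvla_base` checkpoint, while `--policy.type` builds the architecture with a pretrained VLM and a freshly initialized action expert. A minimal sketch of assembling either command from Python follows. `build_train_command` is a hypothetical helper, not part of LeRobot, and it uses only the flags shown above.

```python
import shlex
import sys


def build_train_command(dataset_repo_id, *, policy_path=None, policy_type=None,
                        batch_size=64, steps=200_000):
    """Assemble the argv for lerobot/scripts/train.py (hypothetical helper)."""
    # Exactly one of policy_path / policy_type must be given.
    if (policy_path is None) == (policy_type is None):
        raise ValueError("pass exactly one of policy_path or policy_type")
    if policy_path is not None:
        policy_flag = f"--policy.path={policy_path}"  # start from a checkpoint
    else:
        policy_flag = f"--policy.type={policy_type}"  # build from the policy type
    return [
        sys.executable, "lerobot/scripts/train.py",
        policy_flag,
        f"--dataset.repo_id={dataset_repo_id}",
        f"--batch_size={batch_size}",
        f"--steps={steps}",
    ]


finetune_cmd = build_train_command(
    "danaaubakirova/svla_so100_task1_v3", policy_path="lerobot/smolvla_base")
scratch_expert_cmd = build_train_command(
    "danaaubakirova/svla_so100_task1_v3", policy_type="smolvla")

# Print the script and its arguments (interpreter path omitted for readability).
print(shlex.join(finetune_cmd[1:]))
print(shlex.join(scratch_expert_cmd[1:]))
```

To launch a run, pass one of the lists to `subprocess.run(cmd, check=True)` from the root of a LeRobot checkout.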
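Example of using the smolvla pretrained model outside the LeRobot training framework:
# Import path assumed from the LeRobot layout that ships lerobot/scripts/train.py; it may differ in other releases.
from lerobot.common.policies.smolvla.modeling_smolvla import SmolVLAPolicy

policy = SmolVLAPolicy.from_pretrained("lerobot/smolvla_base")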