SmolVLA SO100 Screw-Lid Model
A Vision-Language-Action (VLA) model fine-tuned on the SO100 Screw-Lid Dataset for robotic manipulation tasks.
Model Description
This model is a SmolVLA variant trained specifically on the SO100 screw-lid manipulation task. It learns to perform the complete sequence: picking up a jar, placing it on a silicone puck, seating the lid with a half-turn, and transporting the assembled jar to a goal location.
- Developed by: Tomas0413
- Model type: Vision-Language-Action (VLA)
- Base architecture: SmolVLA
- Training data: SO100 Screw-Lid Dataset (v0)
- Task domain: Robotic manipulation (screw-lid assembly)
Training Details
Training Data
The model was trained on 51 teleoperated demonstrations from the SO100 Screw-Lid Dataset, featuring:
- Dual camera views (wrist + overhead) at 1280×720 @ 30 FPS
- 6-DOF joint positions, velocities, and gripper states
- Synchronized action sequences for pick-place-assemble-transport tasks
- Total of ~45k training frames
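As a rough sanity check on these figures, the arithmetic below estimates the average episode length from the counts quoted above. It assumes the "~45k" figure counts one frame per timestep (not one per camera) and treats it as exactly 45,000; the variable names are illustrative only.

```python
# Back-of-the-envelope estimate from the numbers on this card.
NUM_EPISODES = 51
TOTAL_FRAMES = 45_000  # assumption: the rounded "~45k", counted per timestep
FPS = 30

frames_per_episode = TOTAL_FRAMES / NUM_EPISODES
seconds_per_episode = frames_per_episode / FPS
total_minutes = TOTAL_FRAMES / FPS / 60

print(f"~{frames_per_episode:.0f} frames per episode")
print(f"~{seconds_per_episode:.0f} s per episode")
print(f"~{total_minutes:.0f} min of demonstration data")
```

Under these assumptions each demonstration runs for roughly half a minute, and the whole training set amounts to about 25 minutes of teleoperation.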
Training Procedure
Training regime: Fine-tuned from the SmolVLA base model (lerobot/smolvla_base) on the 51 SO100 screw-lid demonstrations
Intended Uses
Direct Use
- Robotic manipulation: Deploy on SO100 or similar 6-DOF robotic arms for screw-lid assembly tasks
- Research: Study vision-language-action learning for fine manipulation
- Benchmarking: Evaluate VLA performance on multi-step manipulation sequences
Downstream Use
- Transfer learning to related assembly tasks
- Few-shot adaptation to different jar/lid combinations
- Integration into larger robotic task planning systems
Limitations and Bias
- Domain-specific: Trained only on screw-lid assembly with specific objects
- Robot morphology: Optimized for SO100 arm kinematics and gripper
- Environmental constraints: Single lighting condition, fixed camera positions
- Limited generalization: May not transfer well to significantly different manipulation tasks
Usage
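# Example usage with LeRobot
# Import path as of the mid-2025 LeRobot releases; newer versions may expose it as lerobot.policies.smolvla.modeling_smolvla
from lerobot.common.policies.smolvla.modeling_smolvla import SmolVLAPolicy
# Load the trained model from the Hugging Face Hub
policy = SmolVLAPolicy.from_pretrained("Tomas0413/so100_screw_lid_smolvla")
policy.eval()
# Clear any queued actions before starting a new episode
policy.reset()
# Run inference on robot observations
# `observation` is a dict of batched tensors (joint state and camera images) plus a task string, matching the training features
action = policy.select_action(observation)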
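On a real robot, `select_action` is typically called once per control tick. The sketch below shows one way such a loop might look at the 30 FPS rate the demonstrations were recorded at. The `run_episode` helper, the `get_observation` and `send_action` callbacks, the 900-step default (about 30 seconds), and the `DummyPolicy` stand-in are all hypothetical and not part of LeRobot; only the `select_action(observation)` call comes from the usage above.

```python
import time
from typing import Any, Callable, Dict, List


def run_episode(
    policy: Any,
    get_observation: Callable[[], Dict[str, Any]],
    send_action: Callable[[Any], None],
    fps: float = 30.0,
    max_steps: int = 900,
) -> int:
    """Query the policy once per control tick and forward each action."""
    period = 1.0 / fps
    for _ in range(max_steps):
        tick_start = time.perf_counter()
        observation = get_observation()  # hypothetical robot/camera read
        action = policy.select_action(observation)
        send_action(action)  # hypothetical motor command
        # Sleep off the remainder of the tick to hold the target rate.
        remaining = period - (time.perf_counter() - tick_start)
        if remaining > 0:
            time.sleep(remaining)
    return max_steps


class DummyPolicy:
    """Stand-in exposing select_action; returns a fixed placeholder action."""

    def select_action(self, observation: Dict[str, Any]) -> List[float]:
        return [0.0] * 6  # placeholder: one value per motor


# Dry run without hardware: collect the actions instead of sending them.
sent: List[Any] = []
steps = run_episode(
    DummyPolicy(),
    get_observation=lambda: {"state": [0.0] * 6},
    send_action=sent.append,
    fps=1000.0,
    max_steps=5,
)
```

Swapping `DummyPolicy()` for the loaded policy and the two callbacks for real robot I/O would give a minimal deployment loop, though a production setup would also need safety limits and error handling.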
Training Dataset
This model was trained on the SO100 Screw-Lid Dataset (v0), which contains 51 teleoperated episodes of the complete screw-lid manipulation sequence recorded during the LeRobot Worldwide Hackathon (June 15-16, 2025).
Model Card Contact
Tomas0413
Base Model
lerobot/smolvla_base