SmolVLA SO100 Screw-Lid Model

A Vision-Language-Action (VLA) model fine-tuned on the SO100 Screw-Lid Dataset for robotic manipulation tasks.

Model Description

This model is a SmolVLA variant trained specifically on the SO100 screw-lid manipulation task. It learns to perform the complete sequence: picking up a jar, placing it on a silicone puck, seating the lid with a half-turn, and transporting the assembled jar to a goal location.

  • Developed by: Tomas0413
  • Model type: Vision-Language-Action (VLA)
  • Base architecture: SmolVLA
  • Training data: SO100 Screw-Lid Dataset (v0)
  • Task domain: Robotic manipulation (screw-lid assembly)

Training Details

Training Data

The model was trained on 51 teleoperated demonstrations from the SO100 Screw-Lid Dataset, featuring:

  • Dual camera views (wrist + overhead) at 1280×720 @ 30 FPS
  • 6-DOF joint positions, velocities, and gripper states
  • Synchronized action sequences for pick-place-assemble-transport tasks
  • Total of ~45k training frames
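
As a rough sanity check, the figures above imply an average demonstration length of about half a minute. The sketch below works through that arithmetic. It assumes the ~45k figure is a per-stream frame count rather than a sum over both camera views, which the dataset description does not state.

```python
# Back-of-the-envelope episode statistics from the figures quoted above.
# Assumption: ~45k is a per-stream frame count, not summed over both cameras.
NUM_EPISODES = 51
TOTAL_FRAMES = 45_000  # approximate value from the list above
FPS = 30

frames_per_episode = TOTAL_FRAMES / NUM_EPISODES
seconds_per_episode = frames_per_episode / FPS
total_minutes = TOTAL_FRAMES / FPS / 60

print(f"~{frames_per_episode:.0f} frames per episode")
print(f"~{seconds_per_episode:.1f} s per episode")
print(f"~{total_minutes:.0f} min of demonstration data")
```

Under that assumption, the dataset amounts to roughly 25 minutes of teleoperation in total.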

Training Procedure

Training regime: Fine-tuned from the SmolVLA base model on the SO100 screw-lid demonstrations.

Intended Uses

Direct Use

  • Robotic manipulation: Deploy on SO100 or similar 6-DOF robotic arms for screw-lid assembly tasks
  • Research: Study vision-language-action learning for fine manipulation
  • Benchmarking: Evaluate VLA performance on multi-step manipulation sequences
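
For benchmarking a multi-step sequence, one option is to report how often each stage is reached, not just a single end-to-end success rate. The sketch below is a minimal illustration of that idea. The stage identifiers are hypothetical names derived from the task description, and the rollout values are placeholders, not measured results for this model.

```python
from collections import Counter

# Stages follow the task description; the identifiers themselves are hypothetical.
STAGES = ["pick_jar", "place_on_puck", "seat_lid", "transport_to_goal"]


def stage_success_rates(rollouts):
    """Return the fraction of rollouts that completed each stage.

    rollouts: list of ints, each the number of consecutive stages
    completed in one evaluation episode (0 to len(STAGES)).
    """
    if not rollouts:
        raise ValueError("need at least one rollout")
    reached = Counter()
    for completed in rollouts:
        for stage in STAGES[:completed]:
            reached[stage] += 1
    return {stage: reached[stage] / len(rollouts) for stage in STAGES}


# Placeholder rollouts for illustration only; these are not measured results.
example_rollouts = [4, 4, 3, 2, 0]
rates = stage_success_rates(example_rollouts)
for stage, rate in rates.items():
    print(f"{stage}: {rate:.0%}")
```

Reporting per-stage rates makes it easier to see where in the sequence a policy tends to fail, for example at lid seating rather than at the initial pick.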

Downstream Use

  • Transfer learning to related assembly tasks
  • Few-shot adaptation to different jar/lid combinations
  • Integration into larger robotic task planning systems

Limitations and Bias

  • Domain-specific: Trained only on screw-lid assembly with specific objects
  • Robot morphology: Optimized for SO100 arm kinematics and gripper
  • Environmental constraints: Single lighting condition, fixed camera positions
  • Limited generalization: May not transfer well to significantly different manipulation tasks

Usage

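# Example usage with LeRobot
# Assumption: this import path matches LeRobot releases from mid-2025;
# other releases may expose the class under lerobot.policies.smolvla instead.
from lerobot.common.policies.smolvla.modeling_smolvla import SmolVLAPolicy

# Load the trained model from the Hugging Face Hub
policy = SmolVLAPolicy.from_pretrained("Tomas0413/so100_screw_lid_smolvla")

# Run inference on robot observations
# `observation` is a dict of batched tensors (camera images, robot state)
# plus the task instruction, prepared by your robot interface.
action = policy.select_action(observation)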
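
In deployment, `select_action` is typically called once per control step. The sketch below shows that pattern at 30 Hz, matching the recording rate of the demonstrations. `DummyPolicy`, `read_observation`, and `send_action` are hypothetical stand-ins so the loop runs without a robot or LeRobot installed; they are not part of the LeRobot API, and the 6-value action is an assumption based on the 6-DOF description above.

```python
import time

CONTROL_HZ = 30  # matches the 30 FPS recording rate of the demonstrations


class DummyPolicy:
    """Hypothetical stand-in for the loaded policy; returns a zero action."""

    def select_action(self, observation):
        return [0.0] * 6  # assumed: one value per joint of a 6-DOF arm


def read_observation(step):
    """Hypothetical sensor read; a real version would return camera frames and joint state."""
    return {"step": step}


def send_action(action, log):
    """Hypothetical actuator write; here it only records the action."""
    log.append(action)


def run_episode(policy, num_steps, hz=CONTROL_HZ):
    """Query the policy once per control period and forward each action."""
    log = []
    period = 1.0 / hz
    for step in range(num_steps):
        start = time.perf_counter()
        action = policy.select_action(read_observation(step))
        send_action(action, log)
        # Sleep for whatever remains of this control period.
        remaining = period - (time.perf_counter() - start)
        if remaining > 0:
            time.sleep(remaining)
    return log


actions = run_episode(DummyPolicy(), num_steps=5)
print(len(actions), "actions sent")
```

On real hardware the stand-ins would be replaced by the robot's camera and motor interfaces, and the loop would also need a stop condition and safety limits.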

Training Dataset

This model was trained on the SO100 Screw-Lid Dataset (v0), which contains 51 teleoperated episodes of the complete screw-lid manipulation sequence recorded during the LeRobot Worldwide Hackathon (June 15-16, 2025).

Model Card Contact

Tomas0413
