SmolVLA Fine-Tuned for Food Stacking

Summary: This is a fine-tuned version of lerobot/smolvla_base for stacking food objects (e.g., burgers, sandwiches). It was fine-tuned on the GetSoloTech/FoodStack dataset using the LeRobot framework.

Model details

  • Base model: lerobot/smolvla_base
  • Task: Vision-Language-Action control for manipulation (stacking)
  • Domain: Food item stacking (burger, sandwich, etc.)
  • Params: ~450M (SmolVLA)
  • Library: LeRobot (lerobot)

Quick start

Install LeRobot with SmolVLA extras:

git clone https://github.com/huggingface/lerobot.git
cd lerobot
pip install -e ".[smolvla]"

Load the policy from this repo and run inference:

from lerobot.common.policies.smolvla.modeling_smolvla import SmolVLAPolicy

# Replace with your actual model ID on the Hub
model_id = "GetSoloTech/SmolVLA-FoodStack"

policy = SmolVLAPolicy.from_pretrained(model_id)

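# Placeholder observation batch. The exact keys and shapes depend on the
# features this checkpoint was trained with (see its config.json); the key
# names below follow the usual LeRobot convention and are assumptions.
batch = {
    "observation.images.front": ...,  # RGB float tensor in [0, 1], shape (1, 3, H, W); camera name is hypothetical
    "observation.state": ...,         # proprioceptive state tensor, shape (1, state_dim), if used
    "task": "Stack the burger: bun, patty, cheese, lettuce, bun.",  # language instruction
}

# Call select_action once per step of your control loop. Depending on your
# LeRobot version, inputs may first need the checkpoint's preprocessing pipeline.
action = policy.select_action(batch)

# Send the action to your robot controller
# send_action_to_robot(action)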

For end-to-end examples (policy loops, camera/robot IO), see the LeRobot docs and examples.
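
As a rough illustration of what such a policy loop looks like, here is a minimal sketch of a fixed-rate observe, infer, act cycle. It is not taken from the LeRobot examples: `run_control_loop`, `read_observation`, and `send_action` are hypothetical names, the 30 Hz default is an assumption, and passing the instruction under a `"task"` key is assumed to match what the policy expects.

```python
import time
from typing import Any, Callable, Dict


def run_control_loop(
    select_action: Callable[[Dict[str, Any]], Any],
    read_observation: Callable[[], Dict[str, Any]],
    send_action: Callable[[Any], None],
    instruction: str,
    num_steps: int,
    control_hz: float = 30.0,  # assumed control rate; use your robot's rate
) -> int:
    """Run a fixed-rate observe -> infer -> act loop and return the steps executed."""
    period = 1.0 / control_hz
    steps_done = 0
    for _ in range(num_steps):
        start = time.perf_counter()

        # Copy the observation so the caller's dict is not mutated,
        # then attach the language instruction (key name is an assumption).
        batch = dict(read_observation())
        batch["task"] = instruction

        action = select_action(batch)
        send_action(action)
        steps_done += 1

        # Sleep for whatever remains of this control period, if anything.
        remaining = period - (time.perf_counter() - start)
        if remaining > 0:
            time.sleep(remaining)
    return steps_done
```

With a real policy you would pass `policy.select_action` as `select_action`, plus your own camera and robot callbacks. If inference takes longer than one period, the loop simply runs slower than the target rate.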

Notes:

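  • If you fine-tune further, tune batch size, number of training steps, and augmentation to your hardware and dataset split.
  • Ensure observation preprocessing at inference matches what was used at training time (color order, value range, resolution, camera naming).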
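
One way to keep preprocessing consistent is to route both training data and live camera frames through the same function. The sketch below assumes the common convention of RGB, channel-first, float32 images in [0, 1] with a leading batch dimension. `preprocess_frame` is a hypothetical helper, not part of LeRobot, and it returns a NumPy array that you would still convert to a tensor for the policy.

```python
import numpy as np


def preprocess_frame(frame_hwc_uint8: np.ndarray, is_bgr: bool = False) -> np.ndarray:
    """Convert an (H, W, 3) uint8 frame to a (1, 3, H, W) float32 array in [0, 1]."""
    if frame_hwc_uint8.ndim != 3 or frame_hwc_uint8.shape[2] != 3:
        raise ValueError("expected an (H, W, 3) image")
    if frame_hwc_uint8.dtype != np.uint8:
        raise ValueError("expected uint8 pixel values")

    # OpenCV-style captures are BGR; reverse the channel axis to get RGB.
    frame = frame_hwc_uint8[..., ::-1] if is_bgr else frame_hwc_uint8

    # Channel-last -> channel-first, then scale 0..255 to 0..1.
    chw = np.transpose(frame, (2, 0, 1)).astype(np.float32) / 255.0

    # Add the batch dimension.
    return chw[np.newaxis, ...]
```

If the training images were resized or cropped, apply the same resize or crop here as well; this sketch deliberately leaves resolution unchanged.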

Limitations

  • Specializes in food stacking; may not generalize to unseen objects/layouts.
  • Sensitive to perception domain shift (lighting, textures, camera intrinsics).
  • Requires correct observation normalization consistent with training.
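
To make the normalization point concrete, the sketch below shows plain mean/std normalization and its inverse in pure Python. LeRobot typically applies normalization itself using statistics stored with the checkpoint, so this is only an illustration of what "consistent" means, not the library's implementation. The function names and the choice of mean/std scaling are assumptions.

```python
from typing import List, Sequence


def _check_lengths(values: Sequence[float], mean: Sequence[float], std: Sequence[float]) -> None:
    if not (len(values) == len(mean) == len(std)):
        raise ValueError("values, mean, and std must have the same length")


def normalize(values: Sequence[float], mean: Sequence[float],
              std: Sequence[float], eps: float = 1e-8) -> List[float]:
    """Map raw values to zero-mean, unit-variance space using training statistics."""
    _check_lengths(values, mean, std)
    return [(v - m) / (s + eps) for v, m, s in zip(values, mean, std)]


def unnormalize(values: Sequence[float], mean: Sequence[float],
                std: Sequence[float], eps: float = 1e-8) -> List[float]:
    """Invert normalize(), e.g. to turn predicted actions back into robot units."""
    _check_lengths(values, mean, std)
    return [v * (s + eps) + m for v, m, s in zip(values, mean, std)]
```

The key requirement is that `mean` and `std` come from the training dataset. Statistics recomputed from a different robot setup or recording session would shift both the inputs the policy sees and the actions it outputs.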

Dataset

  • Training data: GetSoloTech/FoodStack

Resources and references

  • SmolVLA base: https://huggingface.co/lerobot/smolvla_base
  • SmolVLA overview: https://smolvla.net/index_en.html
  • LeRobot: https://github.com/huggingface/lerobot