SmolVLA Fine-Tuned for Food Stacking

Summary: This is a fine-tuned version of lerobot/smolvla_base for stacking food objects (e.g., burgers, sandwiches). It was fine-tuned on the GetSoloTech/FoodStack dataset using the LeRobot framework.

Model details

Base model: lerobot/smolvla_base
Task: Vision-Language-Action control for manipulation (stacking)
Domain: Food item stacking (burger, sandwich, etc.)
Params: ~450M (SmolVLA)
Library: LeRobot (lerobot)

Quick start

Install LeRobot with SmolVLA extras:

git clone https://github.com/huggingface/lerobot.git
cd lerobot
pip install -e ".[smolvla]"

Load the policy from this repo and run inference:

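import torch

# Newer LeRobot releases may expose this module as lerobot.policies.smolvla.modeling_smolvla
from lerobot.common.policies.smolvla.modeling_smolvla import SmolVLAPolicy

# Replace with your actual model ID on the Hub
model_id = "GetSoloTech/SmolVLA-FoodStack"

policy = SmolVLAPolicy.from_pretrained(model_id)
policy.eval()
policy.reset()  # clear any queued actions at the start of an episode

# Placeholder observation. The camera key name and shapes are assumptions;
# they must match the features this checkpoint was trained with
# (inspect policy.config.input_features). Tensors go on the policy's device.
batch = {
    "observation.images.front": ...,  # float32 RGB tensor, shape (1, 3, H, W), values in [0, 1]
    "observation.state": ...,  # float32 proprioceptive state, shape (1, state_dim)
    "task": ["Stack the burger: bun, patty, cheese, lettuce, bun."],
}

# forward() computes a training loss; select_action() is the inference entry point.
# Depending on your LeRobot version, normalization may live in separate
# pre/post-processors rather than inside the policy; check the docs for your release.
with torch.no_grad():
    action = policy.select_action(batch)

# Send the action to your robot controller
# send_action_to_robot(action)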

For end-to-end examples (policy loops, camera/robot IO), see the LeRobot docs and examples.
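
A control loop around the policy typically reads an observation, asks the policy for an action, and forwards that action to the robot at a fixed rate. The following is a minimal sketch of that structure and not LeRobot's own loop. The names `run_episode`, `get_observation`, `select_action`, and `send_action` are hypothetical, the 30 Hz default is arbitrary, and the callables are placeholders you would back with your camera, the loaded policy, and your robot driver.

```python
import time
from typing import Any, Callable


def run_episode(
    get_observation: Callable[[], Any],
    select_action: Callable[[Any], Any],
    send_action: Callable[[Any], None],
    max_steps: int = 100,
    control_hz: float = 30.0,  # arbitrary default, set this to your robot's control rate
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run a fixed-rate observe -> act loop and return the number of steps executed.

    All callables are hypothetical hooks supplied by the caller.
    """
    period = 1.0 / control_hz
    steps = 0
    for _ in range(max_steps):
        start = time.perf_counter()
        observation = get_observation()
        if observation is None:  # convention in this sketch: None ends the episode
            break
        action = select_action(observation)
        send_action(action)
        steps += 1
        # Sleep for whatever is left of the control period, if anything.
        remaining = period - (time.perf_counter() - start)
        if remaining > 0:
            sleep(remaining)
    return steps
```

Passing the hooks in as callables keeps the loop independent of any particular camera or robot API, so it can be exercised with stand-in functions before connecting real hardware.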

Notes:

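If you fine-tune further, tune batch size, training steps, and augmentation to your hardware and dataset split.
Ensure observation preprocessing at inference matches what was used during training (image resolution, channel order, value range, state units).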
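
One way to catch a train/inference mismatch early is to compare each incoming observation against the shapes the policy expects before calling it. The sketch below is framework-agnostic and hypothetical: `check_observation`, `expected_shapes`, and `required_keys` are illustrative names, and the `"task"` default is an assumption about how the language instruction is passed.

```python
from typing import Any, Mapping, Sequence


def check_observation(
    batch: Mapping[str, Any],
    expected_shapes: Mapping[str, Sequence[int]],
    required_keys: Sequence[str] = ("task",),  # assumed key for the language instruction
) -> list[str]:
    """Return a list of human-readable problems; an empty list means the batch looks consistent.

    Shapes are compared without the leading batch dimension.
    """
    problems = []
    for key, expected in expected_shapes.items():
        if key not in batch:
            problems.append(f"missing key: {key}")
            continue
        shape = getattr(batch[key], "shape", None)
        if shape is None:
            problems.append(f"{key}: value has no .shape attribute")
            continue
        actual = tuple(shape)[1:]  # drop the batch dimension
        if actual != tuple(expected):
            problems.append(f"{key}: expected shape {tuple(expected)}, got {actual}")
    for key in required_keys:
        if key not in batch:
            problems.append(f"missing key: {key}")
    return problems
```

The function only relies on a `.shape` attribute, so it works with PyTorch tensors or NumPy arrays alike. In LeRobot, the expected shapes can likely be read from the policy configuration's input features, but verify this against your installed version.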

Limitations

Specializes in food stacking; may not generalize to unseen objects/layouts.
Sensitive to perception domain shift (lighting, textures, camera intrinsics).
Requires correct observation normalization consistent with training.

Dataset

Training data: GetSoloTech/FoodStack

Resources and references

SmolVLA base: https://huggingface.co/lerobot/smolvla_base
SmolVLA overview: https://smolvla.net/index_en.html
LeRobot: https://github.com/huggingface/lerobot
