LeRobotDataset v3.0
LeRobotDataset v3.0 is a standardized format for robot learning data. It provides unified access to multi-modal time-series data, sensorimotor signals and multi‑camera video, as well as rich metadata for indexing, search, and visualization on the Hugging Face Hub.
This guide will show you how to:

- Understand the v3.0 design and directory layout
- Record a dataset and push it to the Hub
- Load datasets for training with `LeRobotDataset`
- Stream datasets without downloading using `StreamingLeRobotDataset`
- Migrate existing v2.1 datasets to v3.0
What’s new in v3
- File-based storage: Many episodes per Parquet/MP4 file (v2 used one file per episode).
- Relational metadata: Episode boundaries and lookups are resolved through metadata, not filenames.
- Hub-native streaming: Consume datasets directly from the Hub with `StreamingLeRobotDataset`.
- Lower file-system pressure: Fewer, larger files ⇒ faster initialization and fewer issues at scale.
- Unified organization: Clean directory layout with consistent path templates across data and videos.
Installation
LeRobotDataset v3.0 will be included in `lerobot >= 0.4.0`. Until that stable release, you can use the `main` branch by following the build-from-source instructions.
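A from-source install typically looks like the sketch below; the repository's installation guide is the canonical reference:

```bash
# Clone the repository and install it in editable mode
git clone https://github.com/huggingface/lerobot.git
cd lerobot
pip install -e .
```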
Record a dataset
Run the command below to record a dataset with the SO-101 and push it to the Hub:

```bash
lerobot-record \
    --robot.type=so101_follower \
    --robot.port=/dev/tty.usbmodem585A0076841 \
    --robot.id=my_awesome_follower_arm \
    --robot.cameras="{ front: {type: opencv, index_or_path: 0, width: 1920, height: 1080, fps: 30}}" \
    --teleop.type=so101_leader \
    --teleop.port=/dev/tty.usbmodem58760431551 \
    --teleop.id=my_awesome_leader_arm \
    --display_data=true \
    --dataset.repo_id=${HF_USER}/record-test \
    --dataset.num_episodes=5 \
    --dataset.single_task="Grab the black cube"
```
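The command assumes `HF_USER` holds your Hub username. One way to set it, assuming the `huggingface-cli` tool that ships with `huggingface_hub`:

```bash
# Log in once, then read your username back from the CLI
huggingface-cli login
HF_USER=$(huggingface-cli whoami | head -n 1)
echo $HF_USER
```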
See the recording guide for more details.
Format design
A core v3 principle is decoupling storage from the user API: data is stored efficiently (few large files), while the public API exposes intuitive episode-level access.
v3 has three pillars:
- Tabular data: Low‑dimensional, high‑frequency signals (states, actions, timestamps) stored in Apache Parquet. Access is memory‑mapped or streamed via the `datasets` stack.
- Visual data: Camera frames concatenated and encoded into MP4. Frames from the same episode are grouped; videos are sharded per camera for practical sizes.
- Metadata: JSON/Parquet records describing schema (feature names, dtypes, shapes), frame rates, normalization stats, and episode segmentation (start/end offsets into shared Parquet/MP4 files).
To scale to millions of episodes, tabular rows and video frames from multiple episodes are concatenated into larger files. Episode‑specific views are reconstructed via metadata, not file boundaries.
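To make the offset-based lookup concrete, here is a conceptual sketch. The file layout and column names (`file_path`, `from_index`, `to_index`) are hypothetical, chosen only to illustrate the idea, not the library's internal schema:

```python
import pandas as pd

# Hypothetical: per-episode records stored as chunked Parquet under meta/episodes/
episodes = pd.read_parquet("meta/episodes/file-0000.parquet")
record = episodes.iloc[42]  # metadata record for episode 42

# Hypothetical columns: the shared shard holding the episode, plus its row offsets
shard = pd.read_parquet(record["file_path"])
episode_frames = shard.iloc[record["from_index"]:record["to_index"]]
```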

Directory layout (simplified)
- `meta/info.json`: canonical schema (features, shapes/dtypes), FPS, codebase version, and path templates to locate data/video shards.
- `meta/stats.json`: global feature statistics (mean/std/min/max) used for normalization; exposed as `dataset.meta.stats`.
- `meta/tasks.jsonl`: natural‑language task descriptions mapped to integer IDs for task‑conditioned policies.
- `meta/episodes/`: per‑episode records (lengths, tasks, offsets) stored as chunked Parquet for scalability.
- `data/`: frame‑by‑frame Parquet shards; each file typically contains many episodes.
- `videos/`: MP4 shards per camera; each file typically contains many episodes.
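Putting it together, the tree looks roughly like this (file and camera names are illustrative; actual locations come from the path templates in `meta/info.json`):

```
meta/
├── info.json
├── stats.json
├── tasks.jsonl
└── episodes/
    └── file-0000.parquet
data/
└── file-0000.parquet
videos/
└── front_left/
    └── file-0000.mp4
```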
Load a dataset for training
`LeRobotDataset` returns Python dictionaries of PyTorch tensors and integrates with `torch.utils.data.DataLoader`. Here is a code example showing its use:
```python
import torch
from lerobot.datasets.lerobot_dataset import LeRobotDataset

repo_id = "yaak-ai/L2D-v3"

# 1) Load from the Hub (cached locally)
dataset = LeRobotDataset(repo_id)

# 2) Random access by index
sample = dataset[100]
print(sample)
# {
#     'observation.state': tensor([...]),
#     'action': tensor([...]),
#     'observation.images.front_left': tensor([C, H, W]),
#     'timestamp': tensor(1.234),
#     ...
# }

# 3) Temporal windows via delta_timestamps (seconds relative to t)
delta_timestamps = {
    # 0.2s and 0.1s before the current frame, plus the current frame itself
    "observation.images.front_left": [-0.2, -0.1, 0.0]
}
dataset = LeRobotDataset(repo_id, delta_timestamps=delta_timestamps)

# Accessing an index now returns a stack for the specified key(s)
sample = dataset[100]
print(sample["observation.images.front_left"].shape)  # [T, C, H, W], where T=3

# 4) Wrap with a DataLoader for training
batch_size = 16
data_loader = torch.utils.data.DataLoader(dataset, batch_size=batch_size)

device = "cuda" if torch.cuda.is_available() else "cpu"
for batch in data_loader:
    observations = batch["observation.state"].to(device)
    actions = batch["action"].to(device)
    images = batch["observation.images.front_left"].to(device)
    # model.forward(batch)
```
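For a real training run you would typically also shuffle samples and parallelize loading; this is standard PyTorch `DataLoader` usage rather than anything LeRobot-specific:

```python
data_loader = torch.utils.data.DataLoader(
    dataset,
    batch_size=16,
    shuffle=True,     # randomize sample order between epochs
    num_workers=4,    # worker processes for parallel loading/decoding
    pin_memory=True,  # faster host-to-GPU copies when training on CUDA
)
```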
Stream a dataset (no downloads)
Use `StreamingLeRobotDataset` to iterate directly from the Hub without local copies. This lets you stream large datasets without downloading them to disk or loading them fully into memory, and is a key feature of the new dataset format.
```python
from lerobot.datasets.streaming_dataset import StreamingLeRobotDataset

repo_id = "yaak-ai/L2D-v3"
dataset = StreamingLeRobotDataset(repo_id)  # streams directly from the Hub
```
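Because the dataset streams, access is by iteration rather than random indexing. A minimal sketch, assuming it yields the same per-frame dictionaries as `LeRobotDataset`:

```python
# Stream the first few frames directly from the Hub
for i, frame in enumerate(dataset):
    print(frame["observation.state"].shape)
    if i == 4:
        break
```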

Migrate v2.1 → v3.0
A converter aggregates per‑episode files into larger shards and writes episode offsets/metadata. Convert your dataset using the instructions below.
```bash
# Pre-release build with v3 support:
pip install "https://github.com/huggingface/lerobot/archive/33cad37054c2b594ceba57463e8f11ee374fa93c.zip"

# Convert an existing v2.1 dataset hosted on the Hub:
python -m lerobot.datasets.v30.convert_dataset_v21_to_v30 --repo-id=<HF_USER/DATASET_ID>
```
What it does
- Aggregates Parquet files: `episode-0000.parquet`, `episode-0001.parquet`, … → `file-0000.parquet`, …
- Aggregates MP4 files: `episode-0000.mp4`, `episode-0001.mp4`, … → `file-0000.mp4`, …
- Updates `meta/episodes/*` (chunked Parquet) with per‑episode lengths, tasks, and byte/frame offsets.
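After converting, you can sanity-check the result by loading it with the v3 API (a quick sketch reusing the loading pattern from above):

```python
from lerobot.datasets.lerobot_dataset import LeRobotDataset

# The converted repo should now load with the v3 loader
dataset = LeRobotDataset("<HF_USER/DATASET_ID>")
print(dataset.meta.stats)  # normalization stats resolved from the new metadata
print(dataset[0].keys())   # feature keys of the first frame
```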