Post-Training Isaac GR00T N1.5 for LeRobot SO-101 Arm
Introduction
NVIDIA Isaac GR00T (Generalist Robot 00 Technology) is a research and development platform for building robot foundation models and data pipelines, designed to accelerate the creation of intelligent, adaptable robots.
Today, we announced the availability of Isaac GR00T N1.5, the first major update to Isaac GR00T N1, the world's first open foundation model for generalized humanoid robot reasoning and skills. This cross-embodiment model processes multimodal inputs, including language and images, to perform manipulation tasks across diverse environments. It is adaptable through post-training for specific embodiments, tasks, and environments.
In this blog, we'll demonstrate how to post-train (fine-tune) GR00T N1.5 using teleoperation data from a single SO-101 robot arm.
Step-by-step tutorial
Now accessible to developers working with a wide range of robot form factors, GR00T N1.5 can be easily fine-tuned and adapted using the affordable, open-source LeRobot SO-101 arm.
This flexibility is enabled by the EmbodimentTag system, which allows seamless customization of the model for different robotic platforms, empowering hobbyists, researchers, and engineers to tailor advanced humanoid reasoning and manipulation capabilities to their own hardware.
Step 1: Dataset Preparation
Users can fine-tune GR00T N1.5 with any LeRobot dataset. For this tutorial, we will use the table cleanup task as the fine-tuning example.
Note that SO-100 and SO-101 datasets were not part of GR00T N1.5's initial pre-training. As a result, we will train the arm as a new_embodiment.
1.1 Create or Download Your Dataset
For this tutorial, you can either create your own custom dataset by following these instructions (recommended) or download the so101-table-cleanup dataset from Hugging Face. The --local-dir argument specifies where the dataset will be saved on your machine.
huggingface-cli download \
--repo-type dataset youliangtan/so101-table-cleanup \
--local-dir ./demo_data/so101-table-cleanup
1.2 Configure Modality File
The modality.json file provides additional information about the state and action modalities to make the dataset "GR00T-compatible". Copy examples/so100__modality.json to <DATASET_PATH>/meta/modality.json using this command:
cp examples/so100__modality.json ./demo_data/so101-table-cleanup/meta/modality.json
NOTE: For a dual-camera setup like the so101-table-cleanup dataset, run this instead:
cp examples/so100__dualcam_modality.json ./demo_data/so101-table-cleanup/meta/modality.json
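The copy step can also be scripted so that a wrong destination path fails loudly instead of silently creating a stray file. The sketch below is a minimal illustration under the paths used in this tutorial; install_modality is a hypothetical helper, not part of the GR00T repository.

```python
import json
import shutil
from pathlib import Path


def install_modality(example_file, dataset_path):
    """Copy a modality example into <dataset_path>/meta/modality.json.

    Hypothetical helper for illustration; not part of the GR00T repo.
    Returns the sorted top-level keys of the copied file.
    """
    src = Path(example_file)
    meta_dir = Path(dataset_path) / "meta"
    if not src.is_file():
        raise FileNotFoundError(f"modality example not found: {src}")
    if not meta_dir.is_dir():
        # Refuse to create meta/ ourselves: its absence suggests a wrong path.
        raise FileNotFoundError(f"dataset meta directory not found: {meta_dir}")
    dst = meta_dir / "modality.json"
    shutil.copyfile(src, dst)
    with dst.open() as f:
        modality = json.load(f)  # raises json.JSONDecodeError if malformed
    return sorted(modality)


# Example call with the paths from this tutorial (run from the repo root):
# install_modality("examples/so100__dualcam_modality.json",
#                  "./demo_data/so101-table-cleanup")
```

Checking for an existing meta directory is a deliberate choice here: a typo in the dataset folder name then raises an error rather than producing a modality.json in a place the loader never reads.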
After these steps, the dataset can be loaded using the GR00T LeRobotSingleDataset class. An example script for loading the dataset is shown here:
python scripts/load_dataset.py --dataset-path ./demo_data/so101-table-cleanup/ --plot-state-action --video-backend torchvision_av
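To give a feel for what "information about the state and action modalities" means, the sketch below slices a flat state vector into named parts using start/end index ranges. The layout shown (single_arm and gripper with these ranges) is an assumption for illustration, as are the numbers; check the modality.json you copied for the actual keys and ranges, and note that split_by_modality is a hypothetical helper rather than GR00T code.

```python
def split_by_modality(vector, layout):
    """Slice a flat vector into named parts using {"start", "end"} ranges."""
    parts = {}
    for name, span in layout.items():
        start, end = span["start"], span["end"]
        if not 0 <= start < end <= len(vector):
            raise ValueError(
                f"range for {name!r} does not fit a vector of length {len(vector)}"
            )
        parts[name] = vector[start:end]
    return parts


# Assumed layout for illustration: five arm joints followed by one gripper value.
state_layout = {
    "single_arm": {"start": 0, "end": 5},
    "gripper": {"start": 5, "end": 6},
}
flat_state = [0.1, 0.2, 0.3, 0.4, 0.5, 0.9]  # made-up numbers
print(split_by_modality(flat_state, state_layout))
```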
Step 2: Fine-tuning the Model
Fine-tuning GR00T N1.5 can be executed using the Python script scripts/gr00t_finetune.py. To begin fine-tuning, execute the following command from your terminal:
python scripts/gr00t_finetune.py \
--dataset-path ./demo_data/so101-table-cleanup/ \
--num-gpus 1 \
--output-dir ~/so101-checkpoints \
--max-steps 10000 \
--data-config so100_dualcam \
--video-backend torchvision_av
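Because the dataset path and data config reappear in the evaluation step, it can help to assemble the command in one place. The sketch below only builds and prints the argument list; it does not start training. build_finetune_command is a hypothetical helper, the flags are limited to those shown in this tutorial (not a complete list of what the script accepts), and the dataset path is the --local-dir used in Step 1.1.

```python
import shlex
from os.path import expanduser


def build_finetune_command(dataset_path, output_dir, max_steps=10000,
                           num_gpus=1, data_config="so100_dualcam",
                           video_backend="torchvision_av"):
    """Return the fine-tuning command from this tutorial as an argv list.

    Hypothetical helper; it only mirrors the flags shown in the tutorial.
    """
    if max_steps <= 0 or num_gpus <= 0:
        raise ValueError("max_steps and num_gpus must be positive")
    return [
        "python", "scripts/gr00t_finetune.py",
        "--dataset-path", dataset_path,
        "--num-gpus", str(num_gpus),
        "--output-dir", expanduser(output_dir),  # expand "~" explicitly
        "--max-steps", str(max_steps),
        "--data-config", data_config,
        "--video-backend", video_backend,
    ]


cmd = build_finetune_command("./demo_data/so101-table-cleanup/",
                             "~/so101-checkpoints")
print(shlex.join(cmd))  # paste into a shell, or pass cmd to subprocess.run
```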
Step 3: Open-loop Evaluation
Once the training is complete and your fine-tuned policy is generated, you can visualize its performance in an open-loop setting by running the following command:
python scripts/eval_policy.py --plot \
--embodiment_tag new_embodiment \
--model_path <YOUR_CHECKPOINT_PATH> \
--data_config so100_dualcam \
--dataset_path ./demo_data/so101-table-cleanup/ \
--video_backend torchvision_av \
--modality_keys single_arm gripper
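In an open-loop evaluation, the policy is fed recorded observations and its predicted actions are compared with the recorded ones, without the robot ever moving. As a rough illustration of that comparison, the sketch below computes a mean squared error between two action trajectories. This is an assumed metric chosen for illustration, not a description of what scripts/eval_policy.py computes internally, and the sample values are toy numbers.

```python
def mean_squared_error(predicted, recorded):
    """Mean squared error between two action trajectories.

    Each trajectory is a list of per-timestep action vectors (lists of floats).
    """
    if len(predicted) != len(recorded):
        raise ValueError("trajectories must have the same number of timesteps")
    total, count = 0.0, 0
    for pred_step, rec_step in zip(predicted, recorded):
        if len(pred_step) != len(rec_step):
            raise ValueError("action vectors must have the same dimension")
        for p, r in zip(pred_step, rec_step):
            total += (p - r) ** 2
            count += 1
    if count == 0:
        raise ValueError("trajectories are empty")
    return total / count


# Toy values for illustration only: two timesteps of a 2-D action.
recorded = [[0.0, 1.0], [0.5, 1.0]]
predicted = [[0.0, 1.0], [1.0, 0.0]]
print(mean_squared_error(predicted, recorded))  # (0 + 0 + 0.25 + 1.0) / 4 = 0.3125
```

A low open-loop error shows that the policy reproduces the demonstrations, but it does not guarantee good behavior once the robot's own actions start to shape what it sees.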
Congratulations! You have successfully fine-tuned GR00T N1.5 on a new embodiment.
Step 4: Deployment
After successfully fine-tuning and evaluating your policy, the final step is to deploy it onto your physical robot for real-world execution.
To connect your SO-101 robot and begin the evaluation, execute the following command in your terminal:
python eval_lerobot.py \
--robot.type=so100_follower \
--robot.port=/dev/ttyACM0 \
--robot.id=lil_guy \
--robot.cameras="{ wrist: {type: opencv, index_or_path: 9, width: 640, height: 480, fps: 30}, front: {type: opencv, index_or_path: 15, width: 640, height: 480, fps: 30}}" \
--policy_host=10.112.209.136 \
--lang_instruction="Grab pens and place into pen holder."
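The --lang_instruction string tells the policy which task to perform, so it helps to know which prompts the dataset actually contains. The sketch below lists them, assuming the dataset follows a LeRobot v2-style layout in which meta/tasks.jsonl holds one JSON object per line with a "task" field. That layout is an assumption, not something this tutorial specifies, and list_task_prompts is a hypothetical helper; inspect your dataset's meta folder if the file is missing or shaped differently.

```python
import json
from pathlib import Path


def list_task_prompts(dataset_path):
    """Return the task prompts stored in <dataset_path>/meta/tasks.jsonl.

    Assumes one JSON object per line with a "task" field (LeRobot v2-style
    layout); this is an assumption and may not hold for every dataset.
    """
    tasks_file = Path(dataset_path) / "meta" / "tasks.jsonl"
    prompts = []
    with tasks_file.open() as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                prompts.append(json.loads(line)["task"])
    return prompts


# Example call with the dataset path from this tutorial:
# for prompt in list_task_prompts("./demo_data/so101-table-cleanup"):
#     print(prompt)
```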
Since we fine-tuned GR00T N1.5 with different language instructions, you can steer the policy by using one of the task prompts in the dataset, such as "Grab tapes and place into pen holder".
Happy Hacking! 💻🛠️
Get Started Today
Ready to elevate your robotics projects with NVIDIA's GR00T N1.5? Dive in with these essential resources:
- GR00T N1.5 Model: Download the latest models directly from Hugging Face.
- Fine-Tuning Resources: Find sample datasets and PyTorch scripts for fine-tuning on our GitHub.
- Contribute Datasets: Empower the robotics community by contributing your own datasets to Hugging Face.
- LeRobot Hackathon: Join the global community and participate in the upcoming LeRobot hackathon to apply your skills.
Stay up-to-date with the latest developments by following NVIDIA on Hugging Face.