Pi-0 Bolt Nut Sort Model

This is a Pi-0 (Pi-Zero) model trained for bolt and nut sorting tasks using the OpenPI framework.

Model Description

  • Architecture: Pi-0 (diffusion-based vision-language-action model)
  • Base Model: PaliGemma 3B with SigLIP vision encoder
  • Task: Sorting bolts and nuts into separate baskets
  • Robot: Dual-arm ALOHA setup
  • Action Space: 14-DoF (7 per arm: 6 joints + 1 gripper)
  • Training Steps: 29,999
  • Action Horizon: 50 steps
  • Image Resolution: 224x224
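
The 14-dimensional action vector can be unpacked per arm. The sketch below assumes the common ALOHA ordering (left arm first, then right arm, with each arm's gripper as its last entry). That ordering is an assumption, not something this card states, so check it against your robot interface. The names split_action and ArmAction are hypothetical helpers.

```python
from typing import List, NamedTuple, Sequence, Tuple

ACTION_DIM = 14      # from the card: 7 values per arm, two arms
ARM_DIM = 7          # 6 joints + 1 gripper
JOINTS_PER_ARM = 6


class ArmAction(NamedTuple):
    joints: List[float]  # 6 joint targets
    gripper: float       # 1 gripper target


def split_action(action: Sequence[float]) -> Tuple[ArmAction, ArmAction]:
    """Split one 14-DoF action into (left, right) arm commands.

    ASSUMPTION: layout is [left joints(6), left gripper, right joints(6), right gripper].
    """
    if len(action) != ACTION_DIM:
        raise ValueError(f"expected {ACTION_DIM} values, got {len(action)}")
    values = [float(v) for v in action]
    left, right = values[:ARM_DIM], values[ARM_DIM:]
    return (
        ArmAction(left[:JOINTS_PER_ARM], left[JOINTS_PER_ARM]),
        ArmAction(right[:JOINTS_PER_ARM], right[JOINTS_PER_ARM]),
    )
```

Each row of a [50, 14] action chunk can be passed through split_action in turn.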

Dataset

The model was trained on the naungth/pi0_bolt_nut_sort dataset with the task instruction "sort the bolts and the nuts into separate baskets".

Usage

With OpenPI

from openpi.policies import policy_config
from openpi.training import config

# Load the model configuration (assumes a "pi0_bns" config is registered in your OpenPI checkout)
config_name = "pi0_bns"
train_config = config.get_config(config_name)

# Create policy from your local checkpoint
policy = policy_config.create_trained_policy(
    train_config, 
    "path/to/checkpoint",
    default_prompt="sort the bolts and the nuts into separate baskets"
)

# Use for inference
observation = {
    "images": {
        "cam_high": image_array,  # [H, W, 3] uint8
        "cam_left_wrist": left_wrist_image,  # [H, W, 3] uint8  
        "cam_right_wrist": right_wrist_image,  # [H, W, 3] uint8
    },
    "state": joint_positions,  # [14] float32
    "prompt": "sort the bolts and the nuts into separate baskets"
}

actions = policy.infer(observation)["actions"]  # [50, 14]
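
Because each call returns a chunk of 50 future actions, a control loop usually executes part of the chunk and then queries the policy again. The following is a minimal sketch of that pattern, not part of OpenPI: run_chunked, get_observation, send_action, and the replan_every value are hypothetical names and settings you would adapt to your robot stack.

```python
from typing import Any, Callable, Dict


def run_chunked(
    policy: Any,
    get_observation: Callable[[], Dict[str, Any]],
    send_action: Callable[[Any], None],
    total_steps: int,
    replan_every: int = 25,
) -> int:
    """Execute actions open-loop within a chunk, re-querying the policy periodically.

    `policy` is anything with an `infer(obs)` method returning {"actions": chunk}.
    Returns the number of actions sent.
    """
    if replan_every < 1:
        raise ValueError("replan_every must be at least 1")
    executed = 0
    while executed < total_steps:
        chunk = policy.infer(get_observation())["actions"]
        # Never run past the chunk, the replan interval, or the step budget.
        n = min(replan_every, len(chunk), total_steps - executed)
        if n == 0:
            raise RuntimeError("policy returned an empty action chunk")
        for action in chunk[:n]:
            send_action(action)
        executed += n
    return executed
```

A smaller replan_every reacts faster to new observations at the cost of more inference calls; the right value depends on control rate and inference latency.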

With Policy Server

# Start the policy server
uv run scripts/serve_policy.py policy:checkpoint \
    --policy.config=pi0_bns \
    --policy.dir=path/to/checkpoint

# Use with client (Python, run in a separate process from the server)
from openpi_client import websocket_client_policy
client = websocket_client_policy.WebsocketClientPolicy("localhost", 8000)
actions = client.infer(observation)["actions"]  # [50, 14]
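
Before sending an observation to the server, it can help to check that it matches the layout shown in the usage example above. This is a hedged sketch: check_observation is a hypothetical helper, and the expected keys and shapes are taken from the comments in that example rather than from the OpenPI source.

```python
from typing import Any, Dict, List

EXPECTED_CAMERAS = ("cam_high", "cam_left_wrist", "cam_right_wrist")
STATE_DIM = 14


def check_observation(obs: Dict[str, Any]) -> List[str]:
    """Return a list of layout problems; an empty list means the layout matches."""
    problems = []
    images = obs.get("images", {})
    for cam in EXPECTED_CAMERAS:
        if cam not in images:
            problems.append(f"missing image: {cam}")
            continue
        # Works for NumPy arrays or any object exposing a `.shape` attribute.
        shape = tuple(getattr(images[cam], "shape", ()))
        if len(shape) != 3 or shape[-1] != 3:
            problems.append(f"{cam}: expected [H, W, 3], got {shape}")
    state_shape = tuple(getattr(obs.get("state"), "shape", ()))
    if state_shape != (STATE_DIM,):
        problems.append(f"state: expected [{STATE_DIM}], got {state_shape}")
    if not isinstance(obs.get("prompt"), str):
        problems.append("prompt: expected a string")
    return problems
```

Calling check_observation(observation) and logging any returned problems before client.infer makes shape mistakes easier to spot than a server-side error.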

Model Architecture

  • Vision Encoder: SigLIP-So400m/14
  • Language Model: Gemma 2B + Gemma 300M (action expert)
  • Training: Diffusion-based action prediction
  • Input: Multi-camera RGB + proprioception + language instruction
  • Output: Future action sequence (50 timesteps)
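
The Pi-0 paper describes action generation as flow matching, a diffusion-style method: sampling starts from Gaussian noise and integrates a learned velocity field toward an action chunk. The sketch below shows only that Euler integration loop, in plain Python, with a toy velocity field standing in for the network. The names euler_sample and toy_velocity and the step count of 10 are illustrative assumptions, not the OpenPI implementation.

```python
import random
from typing import Callable, List

VelocityFn = Callable[[List[float], float], List[float]]


def euler_sample(velocity_fn: VelocityFn, dim: int, num_steps: int = 10, seed: int = 0) -> List[float]:
    """Integrate dx/dt = v(x, t) from t=0 (noise) to t=1 with fixed Euler steps."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # start from Gaussian noise
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt
        v = velocity_fn(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


def toy_velocity(target: List[float]) -> VelocityFn:
    """Toy stand-in for the learned network: points straight at a fixed target."""
    def fn(x: List[float], t: float) -> List[float]:
        # t never reaches 1 inside euler_sample, so the division is safe.
        return [(ti - xi) / (1.0 - t) for xi, ti in zip(x, target)]
    return fn
```

In the real model the velocity is predicted by the action expert conditioned on images, state, and the prompt, and x is a full [50, 14] chunk rather than a short list.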

Training Details

  • Framework: JAX/Flax with OpenPI
  • Optimizer: AdamW
  • Base Checkpoint: Pi-0 base model from Physical Intelligence
  • Fine-tuning: Task-specific fine-tuning on bolt nut sort data
  • Normalization: Dataset-specific state/action normalization
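
State/action normalization of this kind is commonly a per-dimension z-score that is inverted on the policy's outputs. The following is a minimal sketch under that assumption; the statistics in the usage note are made up, and the format and exact formula of OpenPI's own normalization stats may differ.

```python
from typing import List, Sequence

EPS = 1e-6  # guards against division by zero for constant dimensions


def normalize(x: Sequence[float], mean: Sequence[float], std: Sequence[float]) -> List[float]:
    """Map raw values to roughly zero mean and unit variance, per dimension."""
    return [(xi - m) / (s + EPS) for xi, m, s in zip(x, mean, std)]


def unnormalize(z: Sequence[float], mean: Sequence[float], std: Sequence[float]) -> List[float]:
    """Invert normalize(), e.g. to turn model outputs back into joint commands."""
    return [zi * (s + EPS) + m for zi, m, s in zip(z, mean, std)]
```

For example, unnormalize(normalize(x, mean, std), mean, std) recovers x up to floating-point error, provided the same dataset statistics are used at training and inference time.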

License

MIT License

Citation

If you use this model, please cite:

@article{pi0,
  title={$\pi_0$: A Vision-Language-Action Flow Model for General Robot Control},
  author={TODO: Add authors},
  year={2024}
}

Acknowledgments

  • Built using the OpenPI framework
  • Based on the Pi-0 architecture
  • Training data from bolt nut sorting demonstrations