jeongeun980906/Lerobot-MujoCo-VLA-Tutorial

πŸ€– LeRobot MuJoCo VLA Tutorial

A comprehensive tutorial for training and evaluating custom robotic manipulation policies using LeRobot and MuJoCo simulation.

πŸš€ Installation

pip install -r requirements.txt
pip install flash-attn==2.7.3 --no-build-isolation

Note: installing flash-attn builds from source and can take a long time.

πŸ“ Dataset: Teleoperation and Visualization

⌨️ Keyboard Teleoperation Demo

File: 0.teleop.ipynb

Interactive keyboard teleoperation for manual robot control and data collection.

Controls:

  • WASD - XY plane movement
  • R/F - Z-axis movement
  • Q/E - Tilt adjustment
  • Arrow Keys - Rotation control
  • Spacebar - Toggle gripper state
  • Z - Reset environment (discard episode data)
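
The bindings above can be sketched as a lookup from key to a Cartesian delta. The step sizes, axis conventions, and key set below are illustrative assumptions, not the notebook's actual handler (see `0.teleop.ipynb` for the real implementation).

```python
# Illustrative teleop key map; step sizes and axis conventions are
# assumptions, not the values used in 0.teleop.ipynb.
POS_STEP = 0.01  # meters per keypress (assumed)
ROT_STEP = 0.05  # radians per keypress (assumed)

# (dx, dy, dz, droll, dpitch, dyaw)
KEY_BINDINGS = {
    "w": ( POS_STEP, 0, 0, 0, 0, 0),
    "s": (-POS_STEP, 0, 0, 0, 0, 0),
    "a": (0,  POS_STEP, 0, 0, 0, 0),
    "d": (0, -POS_STEP, 0, 0, 0, 0),
    "r": (0, 0,  POS_STEP, 0, 0, 0),
    "f": (0, 0, -POS_STEP, 0, 0, 0),
    "q": (0, 0, 0,  ROT_STEP, 0, 0),
    "e": (0, 0, 0, -ROT_STEP, 0, 0),
}

def apply_key(pose, key):
    """Apply the delta bound to `key` to a 6-DoF pose tuple.

    Unbound keys (e.g. gripper toggle or reset) leave the pose unchanged."""
    delta = KEY_BINDINGS.get(key, (0,) * 6)
    return tuple(p + d for p, d in zip(pose, delta))
```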

πŸ“Š Dataset Visualization

File: 1.Visualize.ipynb

Download and visualize datasets from Hugging Face Hub.

Quick Start:

python download_data.py

Python Usage:

from lerobot.datasets.lerobot_dataset import LeRobotDataset

root = './dataset/leader_data'
dataset = LeRobotDataset('Jeongeun/tutorial_v2', root=root)

Running this code will automatically download the dataset.

πŸ‹οΈ VLA Model Training

πŸ“Œ Pi-0.5 Training

File: 2.train.ipynb (First Section)

Train Pi-0.5 model on your dataset using the LeRobot training pipeline.

Prerequisites:

pip uninstall -y transformers
pip install git+https://github.com/huggingface/transformers.git@fix/lerobot_openpi

Training Command:

lerobot-train \
    --dataset.repo_id=Jeongeun/tutorial_v2 \
    --dataset.root=dataset/leader_data \
    --policy.type=pi05 \
    --policy.push_to_hub=true \
    --policy.repo_id={YOUR REPO} \
    --output_dir=./ckpt/tutorial_v2_pi05 \
    --job_name=tutorial_v2_pi05 \
    --policy.pretrained_path=lerobot/pi05_base \
    --policy.compile_model=true \
    --policy.gradient_checkpointing=true \
    --wandb.enable=false \
    --policy.dtype=bfloat16 \
    --policy.freeze_vision_encoder=false \
    --policy.train_expert_only=false \
    --steps=10000 \
    --log_freq=50 \
    --eval_freq=-1 \
    --policy.device=cuda \
    --policy.chunk_size=20 \
    --policy.n_action_steps=20 \
    --batch_size=32

Key Parameters:

| Parameter | Value | Description |
| --- | --- | --- |
| `policy.type` | `pi05` | Policy architecture type |
| `policy.pretrained_path` | `lerobot/pi05_base` | Pre-trained model checkpoint |
| `policy.compile_model` | `true` | Compile the model for faster training and inference |
| `policy.dtype` | `bfloat16` | Use bfloat16 for memory efficiency |
| `steps` | `10000` | Total training steps |
| `batch_size` | `32` | Batch size for training |
| `chunk_size` | `20` | Action chunk size |
| `n_action_steps` | `20` | Number of action prediction steps |
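
With `chunk_size=20` and `n_action_steps=20`, each policy query predicts a 20-step action chunk and all 20 steps are executed before the next query; setting `n_action_steps` below `chunk_size` would replan earlier. A rough sketch of that chunked execution pattern (pure illustration; `predict` stands in for the policy forward pass):

```python
def run_chunked(predict, n_env_steps, chunk_size=20, n_action_steps=20):
    """Chunked action execution: query `predict` for `chunk_size` actions,
    step through the first `n_action_steps` of them, then re-query.
    Returns the executed action sequence and the number of policy calls."""
    executed, policy_calls = [], 0
    while len(executed) < n_env_steps:
        chunk = predict(chunk_size)  # one (expensive) policy forward pass
        policy_calls += 1
        executed.extend(chunk[:n_action_steps])
    return executed[:n_env_steps], policy_calls
```

Fewer action steps per chunk means tighter closed-loop control at the cost of more policy calls per episode.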

Output:

  • Trained model saved to ./ckpt/tutorial_v2_pi05
  • Optional: Pushed to Hugging Face Hub if push_to_hub=true

⏱️ Training Time: ~2-4 hours on single GPU


πŸš€ GR00T N1.5 Training

File: 2.train.ipynb (Second Section)

Train the GR00T N1.5 model on your dataset.

Prerequisites:

pip install ninja "packaging>=24.2,<26.0"
pip install peft
pip install dm-tree==0.1.9
pip install -U transformers
pip install flash-attn==2.7.3 --no-build-isolation

Training Command:

lerobot-train \
    --dataset.repo_id=Jeongeun/tutorial_v2 \
    --dataset.root=dataset/leader_data \
    --policy.type=groot \
    --policy.push_to_hub=true \
    --policy.repo_id={YOUR REPO} \
    --policy.tune_diffusion_model=true \
    --policy.tune_llm=false \
    --policy.tune_visual=false \
    --policy.tune_projector=true \
    --output_dir=ckpt/tutorial_v2_groot \
    --job_name=tutorial_v2_groot \
    --wandb.enable=false \
    --steps=3000 \
    --policy.chunk_size=20 \
    --policy.n_action_steps=20 \
    --batch_size=32

Key Parameters:

| Parameter | Value | Description |
| --- | --- | --- |
| `policy.type` | `groot` | Policy architecture type (Generalist Robot Transformer) |
| `policy.tune_diffusion_model` | `true` | Enable diffusion model fine-tuning |
| `policy.tune_projector` | `true` | Enable projector for new embodiment training |
| `policy.tune_visual` | `false` | Disable vision encoder fine-tuning |
| `policy.tune_llm` | `false` | Disable LLM backbone fine-tuning |
| `steps` | `3000` | Total training steps |
| `batch_size` | `32` | Batch size for training |
| `chunk_size` | `20` | Action chunk size |
| `n_action_steps` | `20` | Number of action prediction steps |

Output:

  • Trained model saved to ckpt/tutorial_v2_groot
  • Optional: Pushed to Hugging Face Hub if configured

⏱️ Training Time: ~1-2 hours on single GPU

πŸ“ˆ Baseline Model Evaluation

| Model | Pi-0.5 | GR00T N1.5 | Pi-0 |
| --- | --- | --- | --- |
| Success Rate | 80% | 75% | 100% |
| Repository | πŸ€— [Jeongeun/tutorial_v2_pi05](https://huggingface.co/Jeongeun/tutorial_v2_pi05) | πŸ€— [Jeongeun/tutorial_v2_grootn15](https://huggingface.co/Jeongeun/tutorial_v2_grootn15) | πŸ€— [Jeongeun/tutorial_v2_pi0](https://huggingface.co/Jeongeun/tutorial_v2_pi0) |

πŸ“Š Pi-0.5 Evaluation

File: 3.eval_pi05.ipynb

Evaluate the trained Pi-0.5 model on your environment.

Quick Start

python download_data.py --type pi05
python download_data.py --type groot
# python download_data.py --type pi0

Prerequisites:

pip uninstall -y transformers
pip install git+https://github.com/huggingface/transformers.git@fix/lerobot_openpi

Key Setup:

from lerobot.policies.pi05.modeling_pi05 import PI05Policy
from lerobot.processor import PolicyProcessorPipeline
from src.env.env import RILAB_OMY_ENV

# Load model
repo_id_or_path = 'Jeongeun/tutorial_v2_pi05'
policy = PI05Policy.from_pretrained(repo_id_or_path)
policy.to('cuda')

# Load preprocessor/postprocessor
preprocessor = PolicyProcessorPipeline.from_pretrained(repo_id_or_path, ...)
postprocessor = PolicyProcessorPipeline.from_pretrained(repo_id_or_path, ...)

# Load environment
env_conf = json.load(open('./configs/train.json'))
omy_env = RILAB_OMY_ENV(cfg=env_conf, action_type='joint', obs_type='joint_pos')

Evaluation Configuration:

TEST_EPISODES = 20
MAX_EPISODE_STEPS = 10_000

Run Evaluation:

  • Loops through episodes
  • Captures agent and wrist camera images (256Γ—256)
  • Preprocesses observations
  • Selects actions via policy
  • Postprocesses actions and steps environment
  • Reports success rate

Output: Average success rate over 20 episodes

πŸ“Š GR00T N1.5 Evaluation

File: 4.eval_groot.ipynb

Prerequisites:

pip install ninja "packaging>=24.2,<26.0" peft dm-tree==0.1.9
pip install -U transformers
pip install flash-attn==2.7.3 --no-build-isolation

Minimal setup:

from lerobot.policies.groot.modeling_groot import GrootPolicy
from lerobot.processor import PolicyProcessorPipeline
from lerobot.utils.constants import POLICY_PREPROCESSOR_DEFAULT_NAME, POLICY_POSTPROCESSOR_DEFAULT_NAME
from src.env.env import RILAB_OMY_ENV
from lerobot.processor.converters import (
    batch_to_transition,
    transition_to_batch,
    policy_action_to_transition,
    transition_to_policy_action,
)
import json, torch

repo_id_or_path = "Jeongeun/tutorial_v2_groot"
device = "cuda"

policy = GrootPolicy.from_pretrained(repo_id_or_path).to(device)

# overrides to normalize/unnormalize with dataset stats and slice to env action dim
pre = PolicyProcessorPipeline.from_pretrained(
    repo_id_or_path,
    config_filename=f"{POLICY_PREPROCESSOR_DEFAULT_NAME}.json",
    overrides={"groot_pack_inputs_v3": {"stats": None, "normalize_min_max": True}},
    to_transition=batch_to_transition,
    to_output=transition_to_batch,
)
post = PolicyProcessorPipeline.from_pretrained(
    repo_id_or_path,
    config_filename=f"{POLICY_POSTPROCESSOR_DEFAULT_NAME}.json",
    overrides={"groot_action_unpack_unnormalize_v1": {"stats": None, "normalize_min_max": True, "env_action_dim": policy.config.output_features["action"].shape[0]}},
    to_transition=policy_action_to_transition,
    to_output=transition_to_policy_action,
)

env_conf = json.load(open("./configs/train.json"))
omy_env = RILAB_OMY_ENV(cfg=env_conf, action_type="joint", obs_type="joint_pos", vis_mode="teleop")

Run: Same loop as Pi-0.5 β€” 20 episodes, max 10k steps, report average success.

πŸ“Š Pi-0 Evaluation

File: 5.eval_pi0.ipynb

Custom Policy Training and Evaluation

πŸ”„ Data Transformation

File: 10.transform.ipynb

Define action and observation spaces, then transform your dataset for training.

Configuration:

action_type = 'delta_joint'      # 'joint' | 'delta_joint' | 'delta_eef_pose' | 'eef_pose'
proprio_type = 'eef_pose'        # 'joint' | 'eef_pose'
observation_type = 'image'       # 'image' | 'object_pose'
image_aug_num = 2                # Number of augmented images per original image
transformed_dataset_path = './dataset/transformed_data'

Configuration Details:

| Parameter | Description | Options |
| --- | --- | --- |
| `action_type` | Action representation format | `joint`, `delta_joint`, `eef_pose`, `delta_eef_pose` |
| `proprio_type` | Proprioceptive state representation | `joint`, `eef_pose` |
| `observation_type` | Input modality | `image`, `object_pose` |
| `image_aug_num` | Number of augmented trajectories per original (image features only) | integer |
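
The `delta_*` action types can be read as storing each action relative to the current state rather than as an absolute target. A minimal sketch of that transform for `delta_joint`, assuming a trajectory of absolute joint targets (the handling of the final step is an illustrative choice, not necessarily what `transform.py` does):

```python
def to_delta_actions(joint_positions):
    """Convert absolute joint targets into per-step deltas:
    delta[t] = q[t+1] - q[t]. The terminal step holds with a zero delta."""
    deltas = []
    for t in range(len(joint_positions) - 1):
        q_t, q_next = joint_positions[t], joint_positions[t + 1]
        deltas.append([b - a for a, b in zip(q_t, q_next)])
    deltas.append([0.0] * len(joint_positions[-1]))  # hold at the last pose
    return deltas
```

Delta representations keep action magnitudes small and centered near zero, which often makes normalization and policy learning easier than with absolute targets.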

Command Line Usage:

python transform.py \
  --action_type delta_eef_pose \
  --proprio_type eef_pose \
  --observation_type image \
  --image_aug_num 2

πŸŽ“ Custom Policy Training

File: 11.train_custom.ipynb

Train MLP or Transformer models with your transformed dataset.

Configuration Example:

@PreTrainedConfig.register_subclass("omy_baseline")
@dataclass
class BaselineConfig(PreTrainedConfig):
    # Input / output structure
    n_obs_steps: int = 1
    chunk_size: int = 5
    n_action_steps: int = 5

    # Architecture
    backbone: str = 'mlp'  # 'mlp' or 'transformer'
    vision_backbone: str = "facebook/dinov3-vitb16-pretrain-lvd1689m"
    projection_dim: int = 128
    freeze_backbone: bool = True

    # Model dimensions
    n_hidden_layers: int = 5
    hidden_dim: int = 512

    # Transformer-specific parameters
    n_heads: int = 4
    dim_feedforward: int = 2048
    feedforward_activation: str = "gelu"
    dropout: float = 0.1
    pre_norm: bool = True
    n_encoder_layers: int = 6

    # Training parameters
    optimizer_lr: float = 1e-3
    optimizer_weight_decay: float = 1e-6
    lr_warmup_steps: int = 1000
    total_training_steps: int = 500000

# Initialize policy configuration
cfg = BaselineConfig(
    chunk_size=10,
    n_action_steps=10,
    backbone='mlp',
    optimizer_lr=5e-4,
    n_hidden_layers=10,
    hidden_dim=512,
    vision_backbone='facebook/dinov3-vitb16-pretrain-lvd1689m',
    projection_dim=128,
    freeze_backbone=True,
)

Command Line Training:

python train_custom.py \
  --dataset_path DATASET_PATH \
  --batch_size BATCH_SIZE \
  --num_epochs NUM_EPOCHS \
  --ckpt_path CKPT_PATH \
  --chunk_size CHUNK_SIZE \
  --n_action_steps N_ACTION_STEPS \
  --learning_rate LEARNING_RATE \
  --backbone BACKBONE \
  --n_hidden_layers N_HIDDEN_LAYERS \
  --hidden_dim HIDDEN_DIM \
  --vision_backbone {facebook/dinov3-vitb16-pretrain-lvd1689m,facebook/dinov2-base} \
  --projection_dim PROJECTION_DIM \
  --freeze_backbone FREEZE_BACKBONE

βœ… Custom Policy Evaluation

File: 12.eval_custom.ipynb

Evaluate your trained policies in the simulation environment.


πŸ“Š Model Performance

| Model | Clean Image | Noisy Color Image |
| --- | --- | --- |
| 🎯 MLP with GT Object Pose | 65% βœ… | 65% βœ… |
| πŸ–ΌοΈ MLP with Image (DINOv3) | 50% | 40% |
| πŸš€ SmolVLA with Image | 65% βœ… | 10% ⚠️ |

Note: action = target joint position; state = current joint position.

⚠️ Color augmentation was not applied during vision model training.


πŸ”§ Custom Policy Implementation

πŸ‘‰ Refer to src/policies/README.md for detailed instructions.

πŸ“ Training Your Custom Policy

In 11.train_custom.ipynb, update the first cell:

from src.policies.your_policy.configuration import YourPolicyConfig
from src.policies.baseline.processor import make_baseline_pre_post_processors
from src.policies.your_policy.modeling import YourPolicy

Update the third cell to instantiate your configuration:

cfg = YourPolicyConfig(
    chunk_size=10,
    n_action_steps=10,
    # Your custom parameters
)

Update the fifth cell to build the preprocessor and postprocessor:

preprocessor, postprocessor = make_baseline_pre_post_processors(
    config=cfg,
    dataset_stats=ds_meta.stats,
)

Update the sixth cell to instantiate your policy:

policy = YourPolicy(**kwargs)

πŸ“ Evaluating Your Custom Policy

In 12.eval_custom.ipynb, update the first cell:

from src.policies.your_policy.modeling import YourPolicy

Update the third cell to load your trained model:

policy = YourPolicy.from_pretrained(CKPT, **kwargs)

πŸ“‘ Data Collection with Leader Arm

βœ‹ Prerequisites

  • βœ… ROS2 installed on your system
  • βœ… ROBOTIS Open Manipulator hardware
  • βœ… Leader arm setup complete

πŸ”§ Procedure

Terminal 1: Launch ROS2 hardware driver

ros2 launch open_manipulator_bringup hardware_y_leader.launch.py

Terminal 2: Run leader arm interface

python leader.py

Terminal 3: Start data collection

python collect_data.py

πŸ’Ύ Your collected data will be saved in the dataset directory!


πŸ’¬ Contact

πŸ‘€ Jeongeun Park
πŸ“§ Email: [email protected]


Made with ❀️ for robot learning research
