🤖 LeRobot MuJoCo VLA Tutorial
A comprehensive tutorial for training and evaluating custom robotic manipulation policies using LeRobot and MuJoCo simulation.
Installation:
```bash
pip install -r requirements.txt
pip install flash-attn==2.7.3 --no-build-isolation
```
Note: Installing flash-attn takes a long time!
File: 0.teleop.ipynb
Interactive keyboard teleoperation for manual robot control and data collection.
Controls:
- WASD - XY plane movement
- R/F - Z-axis movement
- Q/E - Tilt adjustment
- Arrow Keys - Rotation control
- Spacebar - Toggle gripper state
- Z - Reset environment (discard episode data)
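As a rough sketch, each key press maps to an incremental motion command along one axis. Something like the following (axis assignments, step size, and command format are illustrative assumptions, not the notebook's actual bindings):

```python
STEP = 0.01  # assumed step size per key press (e.g., meters for translation)

def key_to_delta(key: str) -> dict:
    """Map a pressed key to an incremental XY/Z/tilt command (illustrative only)."""
    return {
        "dx": {"w": STEP, "s": -STEP}.get(key, 0.0),
        "dy": {"a": -STEP, "d": STEP}.get(key, 0.0),
        "dz": {"r": STEP, "f": -STEP}.get(key, 0.0),
        "dtilt": {"q": STEP, "e": -STEP}.get(key, 0.0),
    }
```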
File: 1.Visualize.ipynb
Download and visualize datasets from Hugging Face Hub.
Quick Start:
```bash
python download_data.py
```
Python Usage:
```python
from lerobot.datasets.lerobot_dataset import LeRobotDataset

root = './dataset/leader_data'
dataset = LeRobotDataset('Jeongeun/tutorial_v2', root=root)
```
Running this code will automatically download the dataset.
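Once loaded, a quick way to inspect the dataset (attribute names follow LeRobot's dataset API; the exact feature keys depend on how the data was recorded):

```python
# Print dataset size and the shape of each feature in one frame.
print(f"{dataset.num_episodes} episodes, {dataset.num_frames} frames")
frame = dataset[0]  # one timestep as a dict of tensors
for key, value in frame.items():
    shape = tuple(value.shape) if hasattr(value, "shape") else type(value).__name__
    print(key, shape)
```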
File: 2.train.ipynb (First Section)
Train the Pi-0.5 model on your dataset using the LeRobot training pipeline.
Prerequisites:
```bash
pip uninstall -y transformers
pip install git+https://github.com/huggingface/transformers.git@fix/lerobot_openpi
```
Training Command:
```bash
lerobot-train \
  --dataset.repo_id=Jeongeun/tutorial_v2 \
  --dataset.root=dataset/leader_data \
  --policy.type=pi05 \
  --policy.push_to_hub=true \
  --policy.repo_id={YOUR REPO} \
  --output_dir=./ckpt/tutorial_v2_pi05 \
  --job_name=tutorial_v2_pi05 \
  --policy.pretrained_path=lerobot/pi05_base \
  --policy.compile_model=true \
  --policy.gradient_checkpointing=true \
  --wandb.enable=false \
  --policy.dtype=bfloat16 \
  --policy.freeze_vision_encoder=false \
  --policy.train_expert_only=false \
  --steps=10000 \
  --log_freq=50 \
  --eval_freq=-1 \
  --policy.device=cuda \
  --policy.chunk_size=20 \
  --policy.n_action_steps=20 \
  --batch_size=32
```
Key Parameters:
| Parameter | Value | Description |
|---|---|---|
| `policy.type` | `pi05` | Policy architecture type |
| `policy.pretrained_path` | `lerobot/pi05_base` | Pre-trained model checkpoint |
| `policy.compile_model` | `true` | Compile the model for faster training and inference |
| `policy.dtype` | `bfloat16` | Use bfloat16 for memory efficiency |
| `steps` | `10000` | Total training steps |
| `batch_size` | `32` | Batch size for training |
| `chunk_size` | `20` | Action chunk size |
| `n_action_steps` | `20` | Number of actions executed per predicted chunk |
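For intuition on the last two rows: with chunk_size=20 and n_action_steps=20, each forward pass predicts a chunk of 20 actions, and all 20 are executed before the next inference call. A toy illustration of that pattern (not LeRobot code; in practice the queueing happens inside the policy's select_action, and model_predict/env_step are hypothetical stand-ins):

```python
# Toy open-loop chunking: one model call per 20 environment steps.
action_queue = []
for t in range(100):
    if not action_queue:
        action_queue = list(model_predict(obs))  # hypothetical: returns 20 actions
    action = action_queue.pop(0)
    obs = env_step(action)                       # hypothetical env step
```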
Output:
- Trained model saved to `./ckpt/tutorial_v2_pi05`
- Optional: pushed to the Hugging Face Hub if `push_to_hub=true`
⏱️ Training Time: ~2-4 hours on a single GPU
File: 2.train.ipynb (Second Section)
Train the GR00T N1.5 model on your dataset.
Prerequisites:
```bash
pip install ninja "packaging>=24.2,<26.0"
pip install peft
pip install dm-tree==0.1.9
pip install -U transformers
pip install flash-attn==2.7.3 --no-build-isolation
```
Training Command:
```bash
lerobot-train \
  --dataset.repo_id=Jeongeun/tutorial_v2 \
  --dataset.root=dataset/leader_data \
  --policy.type=groot \
  --policy.push_to_hub=true \
  --policy.repo_id={YOUR REPO} \
  --policy.tune_diffusion_model=true \
  --policy.tune_llm=false \
  --policy.tune_visual=false \
  --policy.tune_projector=true \
  --output_dir=ckpt/tutorial_v2_groot \
  --job_name=tutorial_v2_groot \
  --wandb.enable=false \
  --steps=3000 \
  --policy.chunk_size=20 \
  --policy.n_action_steps=20 \
  --batch_size=32
```
Key Parameters:
| Parameter | Value | Description |
|---|---|---|
| `policy.type` | `groot` | Policy architecture type (Generalist Robot Transformer) |
| `policy.tune_diffusion_model` | `true` | Enable diffusion model fine-tuning |
| `policy.tune_projector` | `true` | Enable projector fine-tuning for the new embodiment |
| `policy.tune_visual` | `false` | Disable vision encoder fine-tuning |
| `policy.tune_llm` | `false` | Disable LLM backbone fine-tuning |
| `steps` | `3000` | Total training steps |
| `batch_size` | `32` | Batch size for training |
| `chunk_size` | `20` | Action chunk size |
| `n_action_steps` | `20` | Number of action prediction steps |
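To sanity-check which modules the tune_* flags actually unfroze, a generic PyTorch parameter count works once the policy is loaded in Python (as in the eval notebooks; `policy` here is assumed to be a loaded GrootPolicy):

```python
# Compare trainable vs. total parameters to verify the freezing configuration.
trainable = sum(p.numel() for p in policy.parameters() if p.requires_grad)
total = sum(p.numel() for p in policy.parameters())
print(f"Trainable: {trainable / 1e6:.1f}M of {total / 1e6:.1f}M parameters")
```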
Output:
- Trained model saved to `ckpt/tutorial_v2_groot`
- Optional: pushed to the Hugging Face Hub if configured
⏱️ Training Time: ~1-2 hours on a single GPU
Results:
| Model | Pi0.5 | GR00T N1.5 | Pi0 |
|---|---|---|---|
| Success Rate | 80% | 75% | 100% |
| Repository 🤗 | [Jeongeun/tutorial_v2_pi05](https://huggingface.co/Jeongeun/tutorial_v2_pi05) | [Jeongeun/tutorial_v2_grootn15](https://huggingface.co/Jeongeun/tutorial_v2_grootn15) | [Jeongeun/tutorial_v2_pi0](https://huggingface.co/Jeongeun/tutorial_v2_pi0) |
File: 3.eval_pi05.ipynb
Evaluate the trained Pi-0.5 model on your environment.
Quick Start:
```bash
python download_data.py --type pi05
python download_data.py --type groot
# python download_data.py --type pi0
```
Prerequisites:
```bash
pip uninstall -y transformers
pip install git+https://github.com/huggingface/transformers.git@fix/lerobot_openpi
```
Key Setup:
```python
import json

from lerobot.policies.pi05.modeling_pi05 import PI05Policy
from lerobot.processor import PolicyProcessorPipeline
from src.env.env import RILAB_OMY_ENV

# Load model
repo_id_or_path = 'Jeongeun/tutorial_v2_pi05'
policy = PI05Policy.from_pretrained(repo_id_or_path)
policy.to('cuda')

# Load preprocessor/postprocessor
preprocessor = PolicyProcessorPipeline.from_pretrained(repo_id_or_path, ...)
postprocessor = PolicyProcessorPipeline.from_pretrained(repo_id_or_path, ...)

# Load environment
env_conf = json.load(open('./configs/train.json'))
omy_env = RILAB_OMY_ENV(cfg=env_conf, action_type='joint', obs_type='joint_pos')
```
Evaluation Configuration:
```python
TEST_EPISODES = 20
MAX_EPISODE_STEPS = 10_000
```
Run Evaluation:
- Loops through episodes
- Captures agent and wrist camera images (256×256)
- Preprocesses observations
- Selects actions via policy
- Postprocesses actions and steps environment
- Reports success rate
Output: Average success rate over 20 episodes
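A minimal sketch of that loop, using the names from the setup above (the step/reset signatures of RILAB_OMY_ENV and the success bookkeeping are assumptions, so treat this as a sketch rather than the notebook's code):

```python
success_count = 0
for ep in range(TEST_EPISODES):
    obs = omy_env.reset()    # assumed to return the initial observation
    policy.reset()           # clear any queued actions between episodes
    for _ in range(MAX_EPISODE_STEPS):
        batch = preprocessor(obs)             # normalize / batch the observation
        action = policy.select_action(batch)  # draws from the predicted action chunk
        action = postprocessor(action)        # unnormalize to the env's action scale
        obs, done = omy_env.step(action)      # step signature is an assumption
        if done:                              # treating termination as success here
            success_count += 1
            break
print(f"Average success rate: {success_count / TEST_EPISODES:.0%}")
```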
File: 4.eval_groot.ipynb
Prerequisites:
```bash
pip install ninja "packaging>=24.2,<26.0" peft dm-tree==0.1.9 -U transformers
pip install flash-attn==2.7.3 --no-build-isolation
```
Minimal setup:
```python
import json
import torch

from lerobot.policies.groot.modeling_groot import GrootPolicy
from lerobot.processor import PolicyProcessorPipeline
from lerobot.processor.converters import (
    batch_to_transition,
    transition_to_batch,
    policy_action_to_transition,
    transition_to_policy_action,
)
from lerobot.utils.constants import POLICY_PREPROCESSOR_DEFAULT_NAME, POLICY_POSTPROCESSOR_DEFAULT_NAME
from src.env.env import RILAB_OMY_ENV

repo_id_or_path = "Jeongeun/tutorial_v2_groot"
device = "cuda"
policy = GrootPolicy.from_pretrained(repo_id_or_path).to(device)

# Overrides to normalize/unnormalize with dataset stats and slice to the env action dim
pre = PolicyProcessorPipeline.from_pretrained(
    repo_id_or_path,
    config_filename=f"{POLICY_PREPROCESSOR_DEFAULT_NAME}.json",
    overrides={"groot_pack_inputs_v3": {"stats": None, "normalize_min_max": True}},
    to_transition=batch_to_transition,
    to_output=transition_to_batch,
)
post = PolicyProcessorPipeline.from_pretrained(
    repo_id_or_path,
    config_filename=f"{POLICY_POSTPROCESSOR_DEFAULT_NAME}.json",
    overrides={
        "groot_action_unpack_unnormalize_v1": {
            "stats": None,
            "normalize_min_max": True,
            "env_action_dim": policy.config.output_features["action"].shape[0],
        }
    },
    to_transition=policy_action_to_transition,
    to_output=transition_to_policy_action,
)

env_conf = json.load(open("./configs/train.json"))
omy_env = RILAB_OMY_ENV(cfg=env_conf, action_type="joint", obs_type="joint_pos", vis_mode="teleop")
```
Run: same loop as Pi-0.5 (20 episodes, max 10,000 steps, report average success).
File: 5.eval_pi0.ipynb
Evaluate the trained Pi-0 model; the workflow mirrors 3.eval_pi05.ipynb, using the Jeongeun/tutorial_v2_pi0 checkpoint.
File: 10.transform.ipynb
Define action and observation spaces, then transform your dataset for training.
Configuration:
```python
action_type = 'delta_joint'   # 'joint' | 'delta_joint' | 'delta_eef_pose' | 'eef_pose'
proprio_type = 'eef_pose'     # 'joint' | 'eef_pose'
observation_type = 'image'    # 'image' | 'object_pose'
image_aug_num = 2             # Number of augmented images per original image
transformed_dataset_path = './dataset/transformed_data'
```
Configuration Details:
| Parameter | Description | Options |
|---|---|---|
| `action_type` | Action representation format | `joint`, `delta_joint`, `eef_pose`, `delta_eef_pose` |
| `proprio_type` | Proprioceptive state representation | `joint`, `eef_pose` |
| `observation_type` | Input modality | `image`, `object_pose` |
| `image_aug_num` | Number of augmented trajectories for image features | Integer |
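For intuition, delta_joint (and delta_eef_pose) encode each action as the change from the previous target rather than an absolute value. A minimal sketch of the idea (a hypothetical helper, not the actual transform.py implementation):

```python
import numpy as np

def to_delta_actions(abs_actions: np.ndarray) -> np.ndarray:
    """Convert absolute targets of shape (T, D) into per-step deltas.

    Illustrative only: transform.py may treat the first step, angle
    wrapping, or the gripper dimension differently.
    """
    return np.diff(abs_actions, axis=0, prepend=abs_actions[:1])

# Round trip: the absolute trajectory is recovered by accumulating deltas.
abs_actions = np.random.rand(50, 7)
deltas = to_delta_actions(abs_actions)
recovered = abs_actions[0] + np.cumsum(deltas, axis=0)
assert np.allclose(recovered, abs_actions)
```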
Command Line Usage:
```bash
python transform.py \
  --action_type delta_eef_pose \
  --proprio_type eef_pose \
  --observation_type image \
  --image_aug_num 2
```
File: 11.train_custom.ipynb
Train MLP or Transformer models with your transformed dataset.
Configuration Example:
```python
from dataclasses import dataclass

from lerobot.configs.policies import PreTrainedConfig  # adjust to your LeRobot version if needed

@PreTrainedConfig.register_subclass("omy_baseline")
@dataclass
class BaselineConfig(PreTrainedConfig):
    # Input / output structure
    n_obs_steps: int = 1
    chunk_size: int = 5
    n_action_steps: int = 5

    # Architecture
    backbone: str = 'mlp'  # 'mlp' or 'transformer'
    vision_backbone: str = "facebook/dinov3-vitb16-pretrain-lvd1689m"
    projection_dim: int = 128
    freeze_backbone: bool = True

    # Model dimensions
    n_hidden_layers: int = 5
    hidden_dim: int = 512

    # Transformer-specific parameters
    n_heads: int = 4
    dim_feedforward: int = 2048
    feedforward_activation: str = "gelu"
    dropout: float = 0.1
    pre_norm: bool = True
    n_encoder_layers: int = 6

    # Training parameters
    optimizer_lr: float = 1e-3
    optimizer_weight_decay: float = 1e-6
    lr_warmup_steps: int = 1000
    total_training_steps: int = 500000

# Initialize policy configuration
cfg = BaselineConfig(
    chunk_size=10,
    n_action_steps=10,
    backbone='mlp',
    optimizer_lr=5e-4,
    n_hidden_layers=10,
    hidden_dim=512,
    vision_backbone='facebook/dinov3-vitb16-pretrain-lvd1689m',
    projection_dim=128,
    freeze_backbone=True,
)
```
Command Line Training:
```bash
python train_custom.py \
  --dataset_path DATASET_PATH \
  --batch_size BATCH_SIZE \
  --num_epochs NUM_EPOCHS \
  --ckpt_path CKPT_PATH \
  --chunk_size CHUNK_SIZE \
  --n_action_steps N_ACTION_STEPS \
  --learning_rate LEARNING_RATE \
  --backbone BACKBONE \
  --n_hidden_layers N_HIDDEN_LAYERS \
  --hidden_dim HIDDEN_DIM \
  --vision_backbone {facebook/dinov3-vitb16-pretrain-lvd1689m,facebook/dinov2-base} \
  --projection_dim PROJECTION_DIM \
  --freeze_backbone FREEZE_BACKBONE
```
File: 12.eval_custom.ipynb
Evaluate your trained policies in the simulation environment.
| Model | Clean Image | Noisy Color Image |
|---|---|---|
| 🎯 MLP with GT Object Pose | 65% ✅ | 65% ✅ |
| 🖼️ MLP with Image (DINOv3) | 50% | 40% |
| 🚀 SmolVLA with Image | 65% ✅ | 10% |
Note: action = target joint position; state = current joint position.
⚠️ Color augmentation was not applied during vision model training.
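One way to close that gap would be to add color jitter during training; a sketch using torchvision (the integration point into this repo's training transforms is an assumption):

```python
from torchvision import transforms

# Hypothetical augmentation: expose the vision backbone to the kind of
# color noise applied at evaluation time. Not part of the current pipeline.
color_aug = transforms.ColorJitter(
    brightness=0.3, contrast=0.3, saturation=0.3, hue=0.05
)
augmented = color_aug(image)  # image: a PIL.Image or (C, H, W) image tensor
```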
📖 Refer to `src/policies/README.md` for detailed instructions.
In 11.train_custom.ipynb, update the first cell:
```python
from src.policies.your_policy.configuration import YourPolicyConfig
from src.policies.baseline.processor import make_baseline_pre_post_processors
from src.policies.your_policy.modeling import YourPolicy
```
Update the third cell to instantiate your configuration:
```python
cfg = YourPolicyConfig(
    chunk_size=10,
    n_action_steps=10,
    # Your custom parameters
)
```
Update the fifth cell to build the preprocessor and postprocessor:
```python
preprocessor, postprocessor = make_baseline_pre_post_processors(
    config=cfg,
    dataset_stats=ds_meta.stats,
)
```
Update the sixth cell to instantiate your policy:
```python
policy = YourPolicy(**kwargs)
```
In 12.eval_custom.ipynb, update the first cell:
```python
from src.policies.your_policy.modeling import YourPolicy
```
Update the third cell to load your trained model:
```python
policy = YourPolicy.from_pretrained(CKPT, **kwargs)
```
Data Collection on Real Hardware
Prerequisites:
- ✅ ROS2 installed on your system
- ✅ ROBOTIS Open Manipulator hardware
- ✅ Leader arm setup complete
Terminal 1: Launch the ROS2 hardware driver
```bash
ros2 launch open_manipulator_bringup hardware_y_leader.launch.py
```
Terminal 2: Run the leader arm interface
```bash
python leader.py
```
Terminal 3: Start data collection
```bash
python collect_data.py
```
💾 Your collected data will be saved in the dataset directory!
👤 Jeongeun Park
📧 Email: [email protected]
Made with ❤️ for robot learning research

