Issue with G1 + Inspire Dexterous Hands #21
Description
Environment
- Robot: Unitree G1 with Inspire Dexterous Hands
- Hardware: 1x RTX 4090
Problem Summary
I'm attempting to adapt VideoMimic for a G1 robot equipped with Inspire Dexterous Hands, but I'm encountering issues both with the pretrained model and when retraining from scratch.
Issue 1: Pretrained Sim2Real Model Unstable with Dexterous Hands
Observation: When loading the pretrained sim2real policy on the modified URDF (G1 + Inspire hands), the robot leans forward significantly and cannot maintain balance.
Setup: g1_with_inspire_hands.urdf, with the wrist and hand joints fixed.
Question: Are the pretrained models sensitive to mass distribution changes? Should the models be retrained even for minor changes like adding hands?
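To put a number on the mass change, here is a minimal sketch that compares per-link masses between two URDFs using only the Python standard library. It assumes each link stores its mass in the standard `<inertial><mass value="..."/>` element. The toy URDF strings and the helper names `link_masses` and `mass_diff` are hypothetical, made up for illustration; they are not part of VideoMimic. It only reports mass totals and added links, not the shift in center of mass, which would also need the joint transforms.

```python
import xml.etree.ElementTree as ET


def link_masses(root):
    """Return {link_name: mass} for every link that declares <inertial><mass value=...>."""
    masses = {}
    for link in root.findall("link"):
        mass = link.find("inertial/mass")
        if mass is not None:
            masses[link.get("name")] = float(mass.get("value"))
    return masses


def mass_diff(base_root, modified_root):
    """Summarize how the modified URDF differs in mass from the base URDF."""
    base = link_masses(base_root)
    mod = link_masses(modified_root)
    return {
        "base_total": sum(base.values()),
        "modified_total": sum(mod.values()),
        # Links that only exist in the modified robot (e.g. the hand links).
        "added": {n: m for n, m in mod.items() if n not in base},
        # Links present in both whose mass changed: name -> (old, new).
        "changed": {
            n: (base[n], m)
            for n, m in mod.items()
            if n in base and abs(base[n] - m) > 1e-9
        },
    }


# Toy URDFs for illustration only; the masses are invented, not the real G1 values.
BASE_URDF = """<robot name="toy">
  <link name="torso"><inertial><mass value="10.0"/></inertial></link>
  <link name="left_wrist"><inertial><mass value="0.25"/></inertial></link>
</robot>"""

MODIFIED_URDF = """<robot name="toy">
  <link name="torso"><inertial><mass value="10.0"/></inertial></link>
  <link name="left_wrist"><inertial><mass value="0.25"/></inertial></link>
  <link name="left_hand"><inertial><mass value="0.5"/></inertial></link>
</robot>"""

report = mass_diff(ET.fromstring(BASE_URDF), ET.fromstring(MODIFIED_URDF))
print(report)
```

For real files, `ET.parse(path).getroot()` can be passed in place of `ET.fromstring(...)`.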
Issue 2: Stage 1 Retraining Stagnates at Low Performance
Training attempts (both failed similarly):
Attempt 1: Original train_stage_1.sh script
Used default training script with modified URDF and only walking clips (lafan_replay_data/*walk*.pkl)
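As a sanity check on the clip filter, the following sketch lists which files a `*walk*.pkl` pattern actually selects. The helper name `matching_clips` is hypothetical, and the directory is assumed to be a flat folder of `.pkl` files as the path above suggests; it does not reproduce how the training script itself loads clips.

```python
from pathlib import Path


def matching_clips(data_dir, pattern="*walk*.pkl"):
    """Return the sorted file names in data_dir that match the glob pattern."""
    return sorted(p.name for p in Path(data_dir).glob(pattern))


if __name__ == "__main__":
    # Directory name taken from the path above; adjust to the local checkout.
    clips = matching_clips("lafan_replay_data")
    print(f"{len(clips)} clips matched")
    for name in clips:
        print(name)
```

Note that glob matching is case-sensitive on Linux, so a clip whose name contains "Walk" would not be picked up by this pattern.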
Attempt 2: Modified config using g1_deepmimic_mocap task and paper's rewards
- Task: g1_deepmimic_mocap (noticed this task exists but isn't used in original scripts)
- Config changes based on paper recommendations:
REWARD_ACTION_RATE = -8.0
REW_POS_ERROR_THRESHOLD = 0.3
CONTACT_NO_VEL = -100.0
ALIVE_REWARD = 300.0
USE_ALT_FILES = False
AMASS_TERRAIN_DIFFICULTY = 1
...
- Walking clips only
- Added hand links to penalize_contacts_on
Both attempts show identical stagnation:
- Success rate: Plateaus at ~0.02 (2%)
- Episode length: Stuck around 30 steps
- Training duration: ~50,000 iterations over 3 days each
Questions for Authors
- URDF Compatibility: Should Stage 1 policies be retrained when modifying the robot URDF (adding hands)? Or should the pretrained model transfer better?
- Task Selection: I notice the g1_deepmimic_mocap task exists but isn't used in train_stage_1.sh. What's the intended use case for this task? Should it be used for retraining with modified morphologies?
- What should I do to resolve this stagnation? Any guidance on debugging approach, configuration changes, or expected training behavior?