Megatron-HF Bridge Backend

AReaL currently supports two bridge backends for MegatronEngine:

  • mbridge (default)
  • megatron-bridge

Set the backend with:

```yaml
actor:
  megatron:
    bridge_type: mbridge
```

  • Use `bridge_type: megatron-bridge` to enable the new path.
  • If this field is not set, mbridge is used by default.
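For example, a sketch of the same config switched to the new backend (same keys as above; surrounding fields unchanged):

```yaml
actor:
  megatron:
    bridge_type: megatron-bridge
```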

Why this feature exists

  • mbridge is being deprecated and does not provide PEFT/LoRA support.
  • megatron-bridge supports more and newer model architectures.
  • megatron-bridge provides built-in PEFT/LoRA implementations.

Recommendation

  • For new GPU training workflows, prefer megatron-bridge.
  • Keep mbridge for backward compatibility and environments that still depend on it.
  • Prefer mbridge when using disk-based weight broadcast, as it has an optimized HF load/save path.
  • If you use XCCL for weight broadcast, load/save time is less important.

Current limitation

  • Tree-attention training in MegatronEngine currently supports only mbridge; the megatron-bridge backend is not yet supported in the tree-attention path.
  • megatron-bridge does not yet provide the optimized HF model load/save implementations that mbridge has.