Pull requests: NVIDIA/Megatron-LM
Fix CLI arg name mismatch for inference max sequence length
community-request
#4836
opened May 17, 2026 by
Liamu-Lin
5 tasks
varlendataset for thd e2e and benchmark
#4832
opened May 16, 2026 by
xiaoyao0115
Contributor
5 tasks
chore: nightly sync main into dev (16_05_2026)
Run functional tests
Run MBridge tests
Attach this for testing this PR against MBridge main
#4831
opened May 16, 2026 by
svcnvidia-nemo-ci
Draft
fix(mtp): bypass save_for_backward for non-tensor kwargs in _checkpointed_forward (Fixes #3643)
community-request
#4830
opened May 16, 2026 by
PavelPaha
4 of 5 tasks
[1/4] Hyperparameter Transfer: add scaling policy infrastructure
community-request
#4829
opened May 16, 2026 by
plugyawn
Contributor
3 of 5 tasks
Fix CP RNG tracker semantics across dropout and checkpoint loading
community-request
#4828
opened May 16, 2026 by
ewan0x79
1 of 5 tasks
Avoid GPU0 CUDA context growth from checkpoint common-step broadcast
community-request
Final Review
PR is in the "final review" stage
#4827
opened May 16, 2026 by
yhgalaxy
2 of 5 tasks
Add opt-in MXFP8 LM-head output projection
complexity: medium
#4825
opened May 15, 2026 by
gdengk
Contributor
3 of 5 tasks
Allow YAML MoE configs to use model specs
community-request
#4822
opened May 15, 2026 by
chawkins-nvidia
Move bert and t5 pretrain files
complexity: low
#4820
opened May 15, 2026 by
Phlip79
Member
Move *_builders.py into megatron/training/builders
complexity: medium
#4819
opened May 15, 2026 by
Phlip79
Member
[MXFP8] Mirror fixes in Mbridge for mxfp8 param gather
dev branch
Dev branch related issues and development
#4818
opened May 15, 2026 by
zhongbozhu
Contributor
5 tasks
[Dev][opt] Optimize e_proj and h_proj TP communication for MTP with mHC
community-request
#4817
opened May 15, 2026 by
Baibaifan
[Dev] add support for deepep/hybridep dispatcher under thd format training
dev branch
Dev branch related issues and development
Expert Review
[deprecated] Apply this label to indicate that your PR is ready for expert review.
module: moe
#4816
opened May 15, 2026 by
HaochenYuan
Contributor
5 tasks done
chore: Bump nvrx to 0.6.0
Run functional tests
#4814
opened May 15, 2026 by
chtruong814
Contributor
5 tasks
ci: declare workflow-level contents: read on 7 read-only workflows
community-request
#4811
opened May 14, 2026 by
arpitjain099
Draft
3 tasks done
Fix CUDA IMA in fsdp_double_buffer when an FSDP unit's bucket doesn't fit the pool
complexity: low
Run tests
#4810
opened May 14, 2026 by
wujingyue
Contributor
1 of 2 tasks
Modernize post-training modelopt example scripts
complexity: high
#4807
opened May 14, 2026 by
kevalmorabia97
Contributor
5 tasks done
test: add inference performance test harness for GPT 583M, hybrid 2B,…
complexity: low
Run functional tests
#4806
opened May 14, 2026 by
shanmugamr1992
Contributor
5 tasks
Fix bug with Megatron-FSDP zero counter not working with decoupled gradients.
complexity: low
Final Review
PR is in the "final review" stage
nemotron
#4802
opened May 14, 2026 by
cspades
Member
5 tasks
Enhance MimoOptimizer and tests for distributed checkpointing support
#4801
opened May 14, 2026 by
kamran-nvidia
Draft
2 of 5 tasks