Pull requests: NVIDIA/Megatron-LM

Pull requests list

Add minimal DBuffer prototype
#4835 opened May 17, 2026 by wujingyue Contributor Draft
Fix tokenizers bug in nightly
#4833 opened May 16, 2026 by Phlip79 Member Draft
varlendataset for thd e2e and benchmark
#4832 opened May 16, 2026 by xiaoyao0115 Contributor
5 tasks
chore: nightly sync main into dev (16_05_2026) [labels: Run functional tests, Run MBridge tests]
#4831 opened May 16, 2026 by svcnvidia-nemo-ci Draft
[1/4] Hyperparameter Transfer: add scaling policy infrastructure [labels: community-request]
#4829 opened May 16, 2026 by plugyawn Contributor
3 of 5 tasks
Avoid GPU0 CUDA context growth from checkpoint common-step broadcast [labels: community-request, Final Review]
#4827 opened May 16, 2026 by yhgalaxy
2 of 5 tasks
Add opt-in MXFP8 LM-head output projection [labels: complexity: medium]
#4825 opened May 15, 2026 by gdengk Contributor
3 of 5 tasks
Add Auto Quantize in ModelOpt quantize example
#4821 opened May 15, 2026 by jenchen13 Contributor Draft
5 tasks
Move bert and t5 pretrain files [labels: complexity: low]
#4820 opened May 15, 2026 by Phlip79 Member
[MXFP8] Mirror fixes in Mbridge for mxfp8 param gather [labels: dev branch]
#4818 opened May 15, 2026 by zhongbozhu Contributor
5 tasks
[Dev] add support for deepep/hybridep dispatcher under thd format training [labels: dev branch, Expert Review (deprecated), module: moe]
#4816 opened May 15, 2026 by HaochenYuan Contributor
5 tasks done
chore: Bump nvrx to 0.6.0 [labels: Run functional tests]
#4814 opened May 15, 2026 by chtruong814 Contributor
5 tasks
Modernize post-training modelopt example scripts [labels: complexity: high]
#4807 opened May 14, 2026 by kevalmorabia97 Contributor
5 tasks done
Fix bug with Megatron-FSDP zero counter not working with decoupled gradients. [labels: complexity: low, Final Review, nemotron]
#4802 opened May 14, 2026 by cspades Member
5 tasks