Pull requests: alibaba/Pai-Megatron-Patch

Pull requests list

Fix DeepSeek related bugs (#724, #548, #633)
#731 opened Mar 14, 2026 by Nyquist24
Fix qwen_vl key error in dist converter
#706 opened Nov 12, 2025 by ShareLer
fix the sequence packing
#663 opened Sep 9, 2025 by Raymondssssss
fix error when convert qwen2.5_vl
#660 opened Sep 3, 2025 by chenhaiq
Fix MEGATRON_PATCH_PATH in qwen2_5 run_8xH20.sh
#646 opened Aug 11, 2025 by wyjBot
fix broken path in run_build_idxmap_sft_dataset.sh
#627 opened Jun 26, 2025 by Opdoop
Fix MEGATRON_PATCH_PATH
#622 opened Jun 23, 2025 by xu-song
fix typo
#582 opened May 5, 2025 by 1195343015
fix typo for hidden_size
#580 opened Apr 30, 2025 by ShareLer
Update hf2mcore_deepseek_v3_moe.py
#495 opened Mar 7, 2025 by lmc8133
Fix typo
#478 opened Feb 24, 2025 by xu-song
support moe_layer_freq@qwen_moe
#440 opened Jan 20, 2025 by laohur
Update utils.py
#373 opened Oct 31, 2024 by enze5088
Mixtral ggemm to hf format
#230 opened May 15, 2024 by vlad-karpuhin