fix the sequence packing #663

Open
Raymondssssss wants to merge 2 commits into alibaba:main from Raymondssssss:pai-lzh

Conversation

@Raymondssssss

When sequence packing is enabled for SFT, edge cases such as a sub-sequence in a pack whose length exactly equals the context window length cause the original code to raise the following error during training:
[rank1]: tokens, labels, loss_mask, attention_mask, position_ids, num_seqs, packed_seq_params = get_batch(data_iterator)
[rank1]: ^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]: File "/volume/data/lzh/Pai-Megatron-Patch-main/megatron_patch/template/helper.py", line 103, in get_batch
[rank1]: max_seqlen = torch.max(seqlens.max(), position_ids.max() + 1)
[rank1]: ^^^^^^^^^^^^^
[rank1]: RuntimeError: max(): Expected reduction dim to be specified for input.numel() == 0. Specify the reduction dim with the 'dim' argument.
The modified code fixes this boundary case and is a more robust implementation.
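A minimal sketch of the kind of guard this fix implies (the helper name `compute_max_seqlen` is hypothetical, not the function in `helper.py`): when a packed sub-sequence exactly fills the context window, `seqlens` can end up empty, and `seqlens.max()` on an empty tensor raises the "Expected reduction dim" error quoted above. Checking `numel()` first avoids the empty reduction.

```python
import torch

def compute_max_seqlen(seqlens: torch.Tensor, position_ids: torch.Tensor) -> torch.Tensor:
    # Hypothetical sketch of the guarded computation. The original expression
    #   torch.max(seqlens.max(), position_ids.max() + 1)
    # raises "max(): Expected reduction dim to be specified for input.numel() == 0"
    # when seqlens is empty, which can happen in boundary cases such as a
    # sub-sequence that exactly fills the context window.
    pos_max = position_ids.max() + 1
    if seqlens.numel() == 0:
        # No per-sequence lengths recorded; fall back to position ids alone.
        return pos_max
    return torch.max(seqlens.max(), pos_max)
```

Under these assumptions, the empty-tensor case falls back to the position-id bound instead of crashing, while the normal case is unchanged.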

@CLAassistant

CLAassistant commented Sep 9, 2025

CLA assistant check
All committers have signed the CLA.

