Hi,
I was looking to run some experiments and was curious what data was used to train AngelSlim/Qwen3-8B_eagle3 - was it just UltraChat-200k, just ShareGPT, or a mix of the two?
Is this information documented/available somewhere? Thanks in advance.
Edit: I saw in other issues that both UltraChat-200k and ShareGPT were used. I am curious what the exact splits were,
i.e. was it the full concatenation of https://huggingface.co/datasets/HuggingFaceH4/ultrachat_200k and https://huggingface.co/datasets/Aeala/ShareGPT_Vicuna_unfiltered (i.e. the 200k samples from UltraChat plus the 121k+ from ShareGPT, or only the 68k ShareGPT split)? See the sketch below for what I mean by this.
More details on this would be really helpful, thanks!
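For concreteness, here is a minimal sketch (using the Hugging Face `datasets` library) of what I mean by "full concatenation". The split and file names are my guesses, not anything confirmed for AngelSlim/Qwen3-8B_eagle3, and the ShareGPT repo ships several cleaned JSON variants of different sizes, which is exactly the ambiguity I'm asking about:

```python
# Sketch only: the split/file names below are my assumptions, not
# anything documented in the AngelSlim repo.
from datasets import concatenate_datasets, load_dataset

# UltraChat-200k SFT training split (~208k dialogues).
ultrachat = load_dataset("HuggingFaceH4/ultrachat_200k", split="train_sft")

# ShareGPT Vicuna unfiltered; the exact JSON file is a guess, since the
# repo contains multiple cleaned variants of different sizes.
sharegpt = load_dataset(
    "json",
    data_files=(
        "hf://datasets/Aeala/ShareGPT_Vicuna_unfiltered/"
        "ShareGPT_V4.3_unfiltered_cleaned_split.json"
    ),
    split="train",
)
print(len(ultrachat), len(sharegpt))  # these are the counts I'm asking about

# The schemas differ (UltraChat uses "messages", ShareGPT uses
# "conversations"), so a real mix would first normalize to one format:
def sharegpt_to_messages(example):
    role_map = {"human": "user", "gpt": "assistant"}
    return {
        "messages": [
            {"role": role_map.get(t["from"], t["from"]), "content": t["value"]}
            for t in example["conversations"]
        ]
    }

sharegpt = sharegpt.map(sharegpt_to_messages, remove_columns=sharegpt.column_names)
ultrachat = ultrachat.select_columns(["messages"])
sharegpt = sharegpt.cast(ultrachat.features)  # align Arrow features before mixing

mixed = concatenate_datasets([ultrachat, sharegpt])
print(len(mixed))  # ~330k if the full concatenation was used
```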