Hi,
I was looking to run some experiments and was curious what data was used to train AngelSlim/Qwen3-8B_eagle3 - was it just UltraChat-200k, just ShareGPT, or a mix of the two?
Is this information documented/available somewhere? Thanks in advance.
Edit: I saw in other issues that both UltraChat-200k and ShareGPT were used. I am curious what the exact splits were,
i.e. was it the full concatenation of https://huggingface.co/datasets/HuggingFaceH4/ultrachat_200k and https://huggingface.co/datasets/Aeala/ShareGPT_Vicuna_unfiltered (i.e. the 200k samples from UltraChat plus the 121k+ from ShareGPT, or only the 68k ShareGPT split)? See the sketch below for what I mean by this.
More details on this would be really helpful, thanks!
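For concreteness, here is a minimal sketch (using the Hugging Face `datasets` library) of what I mean by "full concatenation". The split and file names are my guesses, not anything confirmed for AngelSlim/Qwen3-8B_eagle3, and the ShareGPT repo ships several cleaned JSON variants of different sizes, which is exactly the ambiguity I'm asking about:

```python
# Sketch only: the split/file names below are my assumptions, not
# anything documented in the AngelSlim repo.
from datasets import concatenate_datasets, load_dataset

# UltraChat-200k SFT training split (~208k dialogues).
ultrachat = load_dataset("HuggingFaceH4/ultrachat_200k", split="train_sft")

# ShareGPT Vicuna unfiltered; the exact JSON file is a guess, since the
# repo contains multiple cleaned variants of different sizes.
sharegpt = load_dataset(
    "json",
    data_files=(
        "hf://datasets/Aeala/ShareGPT_Vicuna_unfiltered/"
        "ShareGPT_V4.3_unfiltered_cleaned_split.json"
    ),
    split="train",
)
print(len(ultrachat), len(sharegpt))  # these are the counts I'm asking about

# The schemas differ (UltraChat uses "messages", ShareGPT uses
# "conversations"), so a real mix would first normalize to one format:
def sharegpt_to_messages(example):
    role_map = {"human": "user", "gpt": "assistant"}
    return {
        "messages": [
            {"role": role_map.get(t["from"], t["from"]), "content": t["value"]}
            for t in example["conversations"]
        ]
    }

sharegpt = sharegpt.map(sharegpt_to_messages, remove_columns=sharegpt.column_names)
ultrachat = ultrachat.select_columns(["messages"])
sharegpt = sharegpt.cast(ultrachat.features)  # align Arrow features before mixing

mixed = concatenate_datasets([ultrachat, sharegpt])
print(len(mixed))  # ~330k if the full concatenation was used
```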