Tokenizer difference between deepseek and qwen3
#227, opened by yangsketch
Hi,
Is the DeepSeek tokenizer different from the Qwen2.5 tokenizer? When you used DeepSeek-R1 to distill Qwen2.5, how did you align the two tokenizers? Could you describe the details? Thank you!
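To make the question concrete, here is a rough sketch of how the difference between two tokenizers could be measured. It only compares two token-to-id dictionaries; the helper name `compare_vocabs` and the toy vocabularies are made up for illustration. With Hugging Face `transformers`, such a dictionary can be obtained from `AutoTokenizer.from_pretrained(...).get_vocab()`, but that step is left out so the snippet runs offline. This is not a description of how the distillation was actually done.

```python
def compare_vocabs(vocab_a, vocab_b):
    """Summarize how two token -> id mappings differ."""
    tokens_a, tokens_b = set(vocab_a), set(vocab_b)
    shared = tokens_a & tokens_b
    # Shared tokens only line up directly if they also map to the same id.
    same_id = {t for t in shared if vocab_a[t] == vocab_b[t]}
    return {
        "only_in_a": len(tokens_a - tokens_b),
        "only_in_b": len(tokens_b - tokens_a),
        "shared": len(shared),
        "shared_same_id": len(same_id),
    }


# Toy vocabularies (invented for illustration, not real tokenizer data).
toy_a = {"hello": 0, "world": 1, "<think>": 2}
toy_b = {"hello": 0, "world": 2, "!": 1}

report = compare_vocabs(toy_a, toy_b)
print(report)
```

If the two real vocabularies show many tokens that exist on only one side, or shared tokens with different ids, then token-level outputs from the teacher cannot be reused by the student without some mapping step, which is the part I would like to understand.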