# Qwen2.5-1.5B-Malaysian-STT
Continued pretraining of Qwen/Qwen2.5-1.5B on malaysia-ai/Malaysian-STT, natively supporting:
- Streaming mode via the `<|streaming|>` prefix.
- Semantic VAD in streaming mode by reading the predicted probability of the `<|endofspeech|>` token (see the first sketch after this list).
- Whole mode via the `<|whole|>` prefix.
- Prediction on audio longer than 30 seconds.
- Plug and play in any continuous-batching serving framework such as vLLM; it is just another Qwen2.5 model (see the vLLM sketch below).
- Uses the GLM4 speech tokenizer at 12.5 tokens per second. Discrete tokens work like a charm with prefix caching, especially for streaming.
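
A minimal sketch of driving the two modes with `transformers`. The `<|speech_*|>` token names and the plain-prompt format are illustrative assumptions, while `<|streaming|>`, `<|whole|>`, and `<|endofspeech|>` come from the list above:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "malaysia-ai/Qwen2.5-1.5B-Malaysian-STT"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Hypothetical discrete speech tokens (12.5 per second of audio) already
# produced by the GLM4 speech tokenizer; the exact token names are assumed.
speech_tokens = "<|speech_101|><|speech_57|><|speech_998|>"

# Whole mode: transcribe the complete utterance in one pass.
whole_prompt = f"<|whole|>{speech_tokens}"

# Streaming mode: feed chunks as they arrive under the streaming prefix.
streaming_prompt = f"<|streaming|>{speech_tokens}"

inputs = tokenizer(streaming_prompt, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits[0, -1]

# Semantic VAD: the predicted probability of <|endofspeech|> at the current
# position indicates whether the speaker has finished talking.
end_id = tokenizer.convert_tokens_to_ids("<|endofspeech|>")
p_end = torch.softmax(logits, dim=-1)[end_id].item()
print(f"P(<|endofspeech|>) = {p_end:.3f}")
```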
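
Because the checkpoint is a standard Qwen2.5 causal LM with an extended vocabulary, serving it works like serving any Qwen2.5 model. A sketch with vLLM, enabling prefix caching so successive streaming requests that share a prefix reuse cached KV blocks (the prompt format is the same assumption as above):

```python
from vllm import LLM, SamplingParams

# Prefix caching lets streaming requests that share the same <|streaming|>
# prefix and earlier speech tokens reuse KV cache blocks across calls.
llm = LLM(
    model="malaysia-ai/Qwen2.5-1.5B-Malaysian-STT",
    enable_prefix_caching=True,
)
params = SamplingParams(temperature=0.0, max_tokens=256)

# Each new audio chunk only appends tokens; the shared prefix stays cached.
prompt = "<|streaming|><|speech_101|><|speech_57|><|speech_998|>"
outputs = llm.generate([prompt], params)
print(outputs[0].outputs[0].text)
```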
## How do we train
- Multipacking with proper document masking at a 10240-token context length (see the packing sketch after this list).
- FP32-BF16 mixed-precision training (see the autocast sketch below).
- Full-parameter finetuning.
- WandB run at https://wandb.ai/huseinzol05/Qwen-Qwen2.5-1.5B-STT-10k
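
A minimal sketch of multipacking with document masking, under the assumption of a block-diagonal causal mask and per-document position resets; this illustrates the technique, not the repo's exact implementation:

```python
import torch

def pack_with_document_mask(examples, max_len=10240, pad_id=0):
    """Pack token sequences into one row; build a block-diagonal causal
    mask so tokens never attend across document boundaries."""
    input_ids, position_ids, doc_ids = [], [], []
    for doc_id, seq in enumerate(examples):
        seq = seq[: max_len - len(input_ids)]
        input_ids += seq
        position_ids += list(range(len(seq)))  # reset positions per document
        doc_ids += [doc_id] * len(seq)
        if len(input_ids) >= max_len:
            break
    pad = max_len - len(input_ids)
    input_ids += [pad_id] * pad
    position_ids += [0] * pad
    doc_ids += [-1] * pad  # padding never matches any real document

    doc = torch.tensor(doc_ids)
    causal = torch.tril(torch.ones(max_len, max_len, dtype=torch.bool))
    same_doc = (doc[:, None] == doc[None, :]) & (doc[:, None] != -1)
    return torch.tensor(input_ids), torch.tensor(position_ids), causal & same_doc

# Three short "documents" packed into one 8-token row.
ids, pos, mask = pack_with_document_mask([[5, 6, 7], [8, 9], [10, 11]], max_len=8)
```

In practice, frameworks with flash-attention varlen kernels consume the reset `position_ids` (or cumulative sequence lengths) instead of materializing the full mask.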
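
And a toy sketch of one common realization of the FP32-BF16 recipe: compute runs in BF16 under autocast while master weights and optimizer state stay in FP32 (the actual run may use DeepSpeed or Accelerate instead):

```python
import torch
import torch.nn as nn

# Toy stand-in for the LM; the recipe is identical for the full model.
model = nn.Linear(16, 16).cuda().float()          # master weights in FP32
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

x = torch.randn(4, 16, device="cuda")
target = torch.randn(4, 16, device="cuda")

# Matmuls run in BF16 under autocast, while gradients accumulate into and
# the optimizer updates the FP32 master weights. BF16 keeps FP32's exponent
# range, so no loss scaler is needed (unlike FP16).
with torch.autocast(device_type="cuda", dtype=torch.bfloat16):
    loss = nn.functional.mse_loss(model(x), target)
loss.backward()
optimizer.step()
optimizer.zero_grad()
```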
## Source code
Source code at https://github.com/malaysia-ai/cooking/tree/main/qwen-stt