Qwen2.5-1.5B-Malaysian-STT

Continued pretraining of Qwen/Qwen2.5-1.5B on malaysia-ai/Malaysian-STT. The model natively supports:

  1. Streaming mode via the <|streaming|> prefix.
  2. Semantic VAD in streaming mode by monitoring the predicted probability of the <|endofspeech|> token; see the sketch after this list.
  3. Whole mode via the <|whole|> prefix.
  4. Transcription of audio longer than 30 seconds.
  5. Plug and play in any continuous-batching serving framework such as vLLM; it is just another Qwen2.5 model.
  6. Audio is encoded with the GLM4 Speech Tokenizer at 12.5 tokens per second. Discrete tokens work well with prefix caching, especially for streaming.
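
A minimal usage sketch with transformers, assuming the audio has already been passed through the GLM4 Speech Tokenizer and rendered as prompt text the same way as at training time. The `encode_audio` helper, the exact prompt layout, and the 0.5 VAD threshold are assumptions for illustration; only the `<|whole|>`, `<|streaming|>`, and `<|endofspeech|>` tokens come from the list above.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "malaysia-ai/Qwen2.5-1.5B-Malaysian-STT"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL, torch_dtype=torch.bfloat16)
model.eval()

def encode_audio(path: str) -> str:
    """Hypothetical helper: run the GLM4 Speech Tokenizer over the clip and
    render its discrete tokens (~12.5 per second) in the same textual format
    used at training time. Not a real API documented by this card."""
    raise NotImplementedError

speech_prompt = encode_audio("clip.wav")

# Whole mode: transcribe the entire clip in one pass.
inputs = tokenizer("<|whole|>" + speech_prompt, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))

# Streaming mode with semantic VAD: after each new chunk of speech tokens,
# inspect the predicted probability of <|endofspeech|> at the next position.
end_id = tokenizer.convert_tokens_to_ids("<|endofspeech|>")
inputs = tokenizer("<|streaming|>" + speech_prompt, return_tensors="pt")
with torch.no_grad():
    next_logits = model(**inputs).logits[0, -1]
p_end = torch.softmax(next_logits, dim=-1)[end_id].item()
if p_end > 0.5:  # threshold is an assumption; tune it on held-out audio
    print("speaker likely finished; finalize the transcript")
```

Because every prompt is an ordinary Qwen2.5 token sequence, the same prompts drop straight into vLLM's continuous batching, and the shared <|streaming|> speech-token prefix is exactly what prefix caching reuses between successive VAD checks.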

How we trained

  1. Multipacking with proper document masking at a 10240 context length; see the sketch after this list.
  2. FP32-BF16 mixed-precision training.
  3. Full-parameter finetuning.
  4. Training logs on WandB: https://wandb.ai/huseinzol05/Qwen-Qwen2.5-1.5B-STT-10k
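
A minimal sketch of what multipacking with document masking looks like. The 10240 context length comes from the list above, while the `pack` helper and the flash-attn-style `cu_seqlens` convention are assumptions for illustration, not the actual training code (see the repo below).

```python
import torch

CONTEXT_LEN = 10240

def pack(samples):
    """Concatenate tokenized samples into one packed row (multipacking).
    position_ids restart at every document boundary and cu_seqlens records
    the boundaries (the format flash-attn varlen kernels consume), so the
    attention mask never lets one sample attend into another."""
    input_ids, position_ids, cu_seqlens = [], [], [0]
    for sample in samples:
        room = CONTEXT_LEN - len(input_ids)
        if room <= 0:
            break
        take = sample[:room]
        input_ids.extend(take)
        position_ids.extend(range(len(take)))  # positions restart per document
        cu_seqlens.append(len(input_ids))
    return (
        torch.tensor(input_ids),
        torch.tensor(position_ids),
        torch.tensor(cu_seqlens, dtype=torch.int32),
    )

ids, pos, cu = pack([[5, 6, 7], [8, 9], [10, 11, 12, 13]])
print(pos.tolist())  # [0, 1, 2, 0, 1, 0, 1, 2, 3]
print(cu.tolist())   # [0, 3, 5, 9] -- three documents, masked from each other
```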

Source code

Source code is available at https://github.com/malaysia-ai/cooking/tree/main/qwen-stt
