Qwen2.5-1.5B-Malaysian-STT

Continued pretraining of Qwen/Qwen2.5-1.5B on malaysia-ai/Malaysian-STT. The model natively supports:

  1. Streaming mode via the <|streaming|> prefix.
  2. Semantic VAD in streaming mode by monitoring the predicted probability of the <|endofspeech|> token; see the sketch after this list.
  3. Whole mode via the <|whole|> prefix.
  4. Transcription of audio longer than 30 seconds.
  5. Plug and play in any continuous-batching serving framework such as vLLM; it is just another Qwen2.5 model.
  6. Audio is encoded with the GLM4 Speech Tokenizer at 12.5 tokens per second. Discrete tokens work well with prefix caching, especially for streaming.
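
A minimal usage sketch with transformers, assuming the audio has already been passed through the GLM4 Speech Tokenizer and rendered as prompt text the same way as at training time. The `encode_audio` helper, the exact prompt layout, and the 0.5 VAD threshold are assumptions for illustration; only the `<|whole|>`, `<|streaming|>`, and `<|endofspeech|>` tokens come from the list above.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "malaysia-ai/Qwen2.5-1.5B-Malaysian-STT"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL, torch_dtype=torch.bfloat16)
model.eval()

def encode_audio(path: str) -> str:
    """Hypothetical helper: run the GLM4 Speech Tokenizer over the clip and
    render its discrete tokens (~12.5 per second) in the same textual format
    used at training time. Not a real API documented by this card."""
    raise NotImplementedError

speech_prompt = encode_audio("clip.wav")

# Whole mode: transcribe the entire clip in one pass.
inputs = tokenizer("<|whole|>" + speech_prompt, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))

# Streaming mode with semantic VAD: after each new chunk of speech tokens,
# inspect the predicted probability of <|endofspeech|> at the next position.
end_id = tokenizer.convert_tokens_to_ids("<|endofspeech|>")
inputs = tokenizer("<|streaming|>" + speech_prompt, return_tensors="pt")
with torch.no_grad():
    next_logits = model(**inputs).logits[0, -1]
p_end = torch.softmax(next_logits, dim=-1)[end_id].item()
if p_end > 0.5:  # threshold is an assumption; tune it on held-out audio
    print("speaker likely finished; finalize the transcript")
```

Because every prompt is an ordinary Qwen2.5 token sequence, the same prompts drop straight into vLLM's continuous batching, and the shared <|streaming|> speech-token prefix is exactly what prefix caching reuses between successive VAD checks.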

How we trained

  1. Multipacking with proper document masking at a 10240 context length; see the sketch after this list.
  2. FP32-BF16 mixed-precision training.
  3. Full-parameter finetuning.
  4. Training logs on WandB: https://wandb.ai/huseinzol05/Qwen-Qwen2.5-1.5B-STT-10k
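
A minimal sketch of what multipacking with document masking looks like. The 10240 context length comes from the list above, while the `pack` helper and the flash-attn-style `cu_seqlens` convention are assumptions for illustration, not the actual training code (see the repo below).

```python
import torch

CONTEXT_LEN = 10240

def pack(samples):
    """Concatenate tokenized samples into one packed row (multipacking).
    position_ids restart at every document boundary and cu_seqlens records
    the boundaries (the format flash-attn varlen kernels consume), so the
    attention mask never lets one sample attend into another."""
    input_ids, position_ids, cu_seqlens = [], [], [0]
    for sample in samples:
        room = CONTEXT_LEN - len(input_ids)
        if room <= 0:
            break
        take = sample[:room]
        input_ids.extend(take)
        position_ids.extend(range(len(take)))  # positions restart per document
        cu_seqlens.append(len(input_ids))
    return (
        torch.tensor(input_ids),
        torch.tensor(position_ids),
        torch.tensor(cu_seqlens, dtype=torch.int32),
    )

ids, pos, cu = pack([[5, 6, 7], [8, 9], [10, 11, 12, 13]])
print(pos.tolist())  # [0, 1, 2, 0, 1, 0, 1, 2, 3]
print(cu.tolist())   # [0, 3, 5, 9] -- three documents, masked from each other
```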

Source code

Source code is available at https://github.com/malaysia-ai/cooking/tree/main/qwen-stt
