In a Training Loop 🔄

Chuanming Liu

Chuanming

AI & ML interests

Artificial Intelligence, AGI, NLP, LLMs, Multimodality, MLSys. Python/Golang/C/C++/Shell/awk&sed

Recent Activity

liked a model 3 days ago

mlx-community/Qwen3-TTS-12Hz-0.6B-CustomVoice-8bit

liked a model 3 days ago

Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice

liked a dataset 3 days ago

Qwen/DeepPlanning

upvoted an article 8 days ago

Article

SigLIP 2: A better multilingual vision language encoder

Feb 21, 2025 • 199

upvoted an article 13 days ago

Article

Scaling Real-Time Voice Agents with Cache-Aware Streaming ASR

Jan 5 • 77

upvoted an article 27 days ago

Article

Open Responses: What you need to know

28 days ago • 105

upvoted an article about 2 months ago

Article

Tokenization in Transformers v5: Simpler, Clearer, and More Modular

Dec 18, 2025 • 119

upvoted a paper about 2 months ago

Stabilizing Reinforcement Learning with LLMs: Formulation and Practices

Paper • 2512.01374 • Published Dec 1, 2025 • 104

upvoted 2 articles 4 months ago

Article

Fine-Tune Whisper For Multilingual ASR with 🤗 Transformers

Nov 3, 2022 • 356

Article

Supercharge your OCR Pipelines with Open Models

Oct 21, 2025 • 301

upvoted 2 papers 4 months ago

MiniMax-Speech: Intrinsic Zero-Shot Text-to-Speech with a Learnable Speaker Encoder

Paper • 2505.07916 • Published May 12, 2025 • 134

Finite Scalar Quantization Enables Redundant and Transmission-Robust Neural Audio Compression at Low Bit-rates

Paper • 2509.09550 • Published Sep 11, 2025 • 3

upvoted 2 collections 5 months ago

Qwen3Guard

7 items • Updated Dec 31, 2025 • 62

Qwen3-Omni

6 items • Updated Dec 31, 2025 • 182

upvoted an article 5 months ago

Article

Understanding Vector Quantization in VQ-VAE

Aug 28, 2024 • 53

upvoted a paper 5 months ago

Group Sequence Policy Optimization

Paper • 2507.18071 • Published Jul 24, 2025 • 316

upvoted an article 5 months ago

Article

From GRPO to DAPO and GSPO: What, Why, and How

Aug 9, 2025 • 83

upvoted 2 collections 5 months ago

PP-StructureV3

PP-StructureV3 is a SOTA document parsing solution on OmniDocBench, supporting the conversion of PDFs and document images to Markdown and JSON. • 17 items • Updated Sep 15, 2025 • 13

PP-OCRv5

PP-OCRv5 is the latest text recognition solution, supporting Simplified Chinese, Chinese Pinyin, Traditional Chinese, English, and Japanese • 13 items • Updated Sep 15, 2025 • 52

upvoted a paper 6 months ago

Step-Audio 2 Technical Report

Paper • 2507.16632 • Published Jul 22, 2025 • 73

upvoted 3 collections 6 months ago

Marvis-TTS-250m-v0.1

5 items • Updated Aug 26, 2025 • 26

AFM-Datasets

Training datasets for OPPO Personal AI Lab’s family of Agent Foundation Models. • 6 items • Updated 8 days ago • 6

AFM-Models

Models for OPPO Personal AI Lab’s family of Agent Foundation Models. • 13 items • Updated 8 days ago • 17