✨ M2-Base: 3.5TB web data (EN/ZH), with LLM-augmented content, Apache 2.0 ✨ M2-CoT: 4.2TB of auto-synthesized CoT reasoning data ✨ M2-Extra: domain-specific knowledge
DeepSeek, Alibaba, Skywork, Xiaomi, ByteDance... And that’s just some of the companies from the Chinese community that released open models in April 🤯
🎬 Video > MAGI-1 by SandAI > SkyReels-A2 & SkyReels-V2 by Skywork > Wan2.1-FLF2V by Alibaba-Wan
🎨 Image > HiDream-I1 by Vivago AI > Kimi-VL by Moonshot AI > InstantCharacter by InstantX & Tencent-Hunyuan > Step1X-Edit by StepFun > EasyControl by Shanghai Jiao Tong University
🧠 Reasoning > MiMo by Xiaomi > Skywork-R1V 2.0 by Skywork > ChatTS by ByteDance > Kimina by Moonshot AI & Numina > GLM-Z1 by Zhipu AI > Skywork OR1 by Skywork > Kimi-VL-Thinking by Moonshot AI
🔊 Audio > Kimi-Audio by Moonshot AI > IndexTTS by BiliBili > MegaTTS3 by ByteDance > Dolphin by DataOceanAI
🔢 Math > DeepSeek Prover V2 by DeepSeek
🌍 LLM > Qwen by Alibaba-Qwen > InternVL3 by Shanghai AI Lab > Ernie4.5 (demo) by Baidu
📊 Dataset > PHYBench by Eureka-Lab > ChildMandarin & SeniorTalk by BAAI
Kimi-Audio 🚀🎧 an OPEN audio foundation model released by Moonshot AI. Model: moonshotai/Kimi-Audio-7B-Instruct ✨ 7B ✨ 13M+ hours of pretraining data ✨ Novel hybrid input architecture ✨ Universal audio capabilities (ASR, AQA, AAC, SER, SEC/ASC, end-to-end conversation)
✨ Skywork-OR1-Math-7B > Optimized for math reasoning ✨ Skywork-OR1-7B-preview > Excels in math & coding ✨ Skywork-OR1-32B-preview > Matches DeepSeek-R1 on math (AIME24/25) and coding (LiveCodeBench)
Released under the Apache 2.0 license 🥳 Final version coming in 2 weeks!
✨ 1/2/8/9/14/38/78B with MIT license ✨ Stronger perception & reasoning vs InternVL 2.5 ✨ Native Multimodal Pre-Training for even better language performance
✨3B with MIT license ✨Long context windows up to 128K ✨Strong multimodal reasoning (36.8% on MathVision, on par with 10x larger models) and agent skills (34.5% on ScreenSpot-Pro)
IndexTTS 📢 a TTS built on XTTS + Tortoise, released by BiliBili - a Chinese video sharing platform/community. Model: IndexTeam/Index-TTS Demo: IndexTeam/IndexTTS
✨Chinese pronunciation correction via pinyin ✨Pause control via punctuation ✨Improved speaker conditioning & audio quality (BigVGAN2) ✨Trained on 10k+ hours
MegaTTS3 📢 an open TTS released by ByteDance ✨ 0.45B with Apache 2.0 ✨ Supports English & Chinese ✨ High-quality voice cloning ✨ Accent intensity control. Model: ByteDance/MegaTTS3