Hugging Face Chinese Localization
HuggingFace-CN-community's activity
OpenAudio S1-mini 🔊 a new open multilingual TTS model trained on 2M+ hours of data, by FishAudio
fishaudio/openaudio-s1-mini
✨ Supports 14 languages
✨ 50+ emotions & tones
✨ RLHF-optimized
✨ Special effects: laughing, crying, shouting, etc.
SynLogic 🧠 logical reasoning model & dataset by MiniMax
MiniMaxAI/synlogic-6836c3246fca0277657ff032
✨ 3 models: 7B / 32B / Mix-3-32B (MIT license)
✨ Dataset: 35 verifiable logic tasks (Sudoku, Cipher, Arrow Maze, etc.)
✨ RL training with auto-verifiable rewards
✨ Generalizes to math without explicit math training
✨ +6 pts on BBEH, +9.5 on KOR-Bench vs baselines
Video-XL-2 🔥 long video understanding model by BAAI & Shanghai Jiao Tong University
BAAI/Video-XL-2
✨ Apache 2.0
✨ Handles up to 10,000+ frames on a single GPU
✨ 2048-frame encoding in just 12s
✨ Efficient chunk-based prefilling & bi-granularity KV decoding
May highlights from China's open source ecosystem 🔥
zh-ai-community/may-2025-open-works-from-the-chinese-community-681a3494145f2914dc679b7c
✨ DeepSeek dropped R1 updates
- Both R1 & an 8B distilled smol model
✨ ByteDance goes big on open source
- BAGEL, Dolphin, Seed-Coder, DreamO...
✨ Multimodal is on fire!
- HunyuanCustom / HunyuanVideo-Avatar / HunyuanPortrait
- MiniMax: SynLogic / Orsta-7B
- Xiaomi: MiMo-VL
- Alibaba Wan: Wan2.1-VACE
- OpenGVLab: ZeroGUI
- StepFun: ACE-Step-v1 / Step1X-3D
✨ Specialized models/datasets excel
- Alibaba Qwen: WorldPM-72B
- BAAI: RoboBrain (MLLM for robotics)
- HiThink Research: BizFinBench (dataset)
- OpenBMB: Ultra-FineWeb (dataset)
- Bilibili: Index-AniSora (Anime/ACG)
- Skywork: Matrix-Game (game)
More awesome releases: Alibaba QwenLong-L1-32B, Skywork OR1, OpenS2V-5M, etc.
MiMo-VL 🔥 smol & mighty vision language model by Xiaomi
XiaomiMiMo/mimo-vl-68382ccacc7c2875500cd212
✨ 7B with RL & SFT
✨ Native-resolution ViT for fine-grained perception
✨ MORL = smarter alignment across perception, grounding & reasoning
🔥 New benchmark & dataset for Subject-to-Video generation: OpenS2V-Nexus by Peking University
✨ Fine-grained evaluation for subject consistency: BestWishYsh/OpenS2V-Eval
✨ 5M-scale dataset: BestWishYsh/OpenS2V-5M
✨ New metrics: automatic scores for identity, realism, and text match
HunyuanVideo-Avatar 🔥 another image-to-video model by Tencent Hunyuan
tencent/HunyuanVideo-Avatar
✨ Emotion-controlled, high-dynamic avatar videos
✨ Multi-character support with separate audio control
✨ Works with any style (cartoon, 3D, real face) while keeping identity consistent
Orsta 🔥 vision language models trained with V-Triune, a unified reinforcement learning system by MiniMax AI
One-RL-to-See-Them-All/one-rl-to-see-them-all-6833d27abce23898b2f9815a
✨ 7B & 32B with MIT license
✨ Masters 8 visual tasks: math, science QA, charts, puzzles, object detection, grounding, OCR, and counting
✨ Uses Dynamic IoU rewards for better visual understanding
✨ Strong performance in visual reasoning and perception
ByteDance is absolutely cooking lately 🔥
BAGEL 🥯 an open multimodal foundation model with 7B active parameters, by the ByteDance Seed team
ByteDance-Seed/BAGEL-7B-MoT
✨ Apache 2.0
✨ Outperforms top VLMs (Qwen2.5-VL & InternVL-2.5)
✨ Mixture-of-Transformer-Experts + dual encoders
✨ Trained on trillions of interleaved tokens
Dolphin 🔥 a multimodal document image parsing model from ByteDance, built on an analyze-then-parse paradigm
ByteDance/Dolphin
✨ MIT licensed
✨ Handles text, tables, figures & formulas via:
- Reading-order layout analysis
- Parallel parsing with smart prompts
Index-AniSora 🎬 an open anime video model released by Bilibili
👉 https://huggingface.co/IndexTeam/Index-anisora
✨ Apache 2.0
✨ Supports many 2D styles: anime, manga, VTubers, and more
✨ Fine control over characters and actions with smart masking
Data quality is the new frontier for LLM performance.
Ultra-FineWeb 📊 a high-quality bilingual dataset released by OpenBMB
openbmb/Ultra-FineWeb
✨ MIT License
✨ 1T English + 120B Chinese tokens
✨ Efficient model-driven filtering