
Recent Activity

littlebird13  updated a model about 1 hour ago
Qwen/Qwen3-Embedding-4B-GGUF
littlebird13  updated a model about 1 hour ago
Qwen/Qwen3-Embedding-8B-GGUF
littlebird13  updated a model about 4 hours ago
Qwen/Qwen3-Embedding-0.6B-GGUF

danielhanchen 
posted an update about 12 hours ago
merve 
posted an update about 14 hours ago
past week had huuuge releases 💗
here's our picks 🔥 find more models, datasets, demos here merve/releases-july-11-68750452c358c98b0fa663f7

> moonshotai/Kimi-K2-Instruct is the new SOTA LLM, with 1T total / 32B active parameters 🤯

> HuggingFaceTB/SmolLM3-3B is the new best LM for its size, offers a thinking mode 💭 as well as the dataset HuggingFaceTB/smoltalk2

> Alibaba-NLP/WebSailor-3B is the new agentic LLM for complex browsing

> Google DeepMind released medical vision LMs with an agentic doctor-patient app google/medgemma-release-680aade845f90bec6a3f60c4

> fal released a LoRA to improve details on face images fal/Realism-Detailer-Kontext-Dev-LoRA
AdinaY 
posted an update 4 days ago
Kimi-K2 is now available on the hub🔥🚀
This is a trillion-parameter MoE model focused on long context, code, reasoning, and agentic behavior.

moonshotai/kimi-k2-6871243b990f2af5ba60617d

✨ Base & Instruct
✨ 1T total / 32B active - Modified MIT License
✨ 128K context length
✨ Muon optimizer for stable trillion-scale training
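A "1T total / 32B active" split is characteristic of Mixture-of-Experts routing: each token is dispatched to only a few experts, so a forward pass touches a small fraction of the total weights. A minimal sketch of the arithmetic, where the shared/expert sizes and top-k below are illustrative placeholders and not Kimi-K2's actual configuration:

```python
def moe_param_counts(shared_params: int, n_experts: int,
                     expert_params: int, top_k: int) -> tuple[int, int]:
    """Total vs. per-token active parameters for a routed MoE model.

    shared_params: weights every token uses (attention, embeddings, ...)
    n_experts:     routed experts (summed across all MoE layers here)
    expert_params: parameters in one expert
    top_k:         experts each token is routed to
    """
    total = shared_params + n_experts * expert_params
    active = shared_params + top_k * expert_params
    return total, active

# Illustrative numbers only: 7B shared, 320 experts of 3.1B each, top-8 routing
total, active = moe_param_counts(7_000_000_000, 320, 3_100_000_000, 8)
print(f"total ≈ {total / 1e12:.2f}T, active ≈ {active / 1e9:.0f}B")
# → total ≈ 1.00T, active ≈ 32B
```

The point of the split is that serving cost scales with the active count, not the total.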
louisbrulenaudet 
posted an update 6 days ago
Because hackathons are often the starting point for many AI projects, I've created a Python-backend template that incorporates my accumulated lessons, to streamline collaboration and urgent deployments 🏎️

Over the past year, I had the opportunity to participate in hackathons organized by Mistral, OpenAI, and DeepMind. This GitHub template is structured around several fundamental building blocks and recommendations I offer developers eager to take part in their first hackathon, whether as part of a team or individually. Its emphasis is on rapid setup and deployment through:
- uv as a package manager, simplifying usage via a series of pre-configured make commands.
- FastAPI for API management, structured in a modular architecture designed to minimize branch conflicts during merges to main branches (using minimal health-check and ping routes to verify Docker’s proper execution and backend accessibility on the local network).
- Pydantic for validation and type handling, which simplifies debugging and enhances understanding of data objects.
- A set of custom instructions tailored for agents (Cline and GitHub Copilot), aimed at improving overall comprehension of the application and optimizing the vibe-coding experience.

This template includes unit tests with a 100% success rate and test coverage, as well as a minimal CI file ensuring that the FastAPI application runs correctly. Thus, merging code that breaks the server into production becomes impossible ⛔️

In general, I would reiterate an essential piece of advice: your two main adversaries are branch conflicts (particularly when the same file is modified concurrently within a brief period, especially if your architecture isn't built for scalability) and deployment issues under urgent circumstances ⏱️

Link to GitHub: https://github.com/louisbrulenaudet/hackathon-backend

Simply issue these commands and you can ship your code at the speed of light:
make init
make dev
merve 
posted an update 6 days ago
GitHub has refused to render notebooks for a long time now 💔

so smol-vision now lives in a Hugging Face model repository 🤗 merve/smol-vision
AdinaY 
posted an update 7 days ago
The tech report of RoboBrain 2.0 is now available on the Daily Papers page🔥

It's an embodied brain model that sees, thinks, and plans for many kinds of robots.

Leave your insights or questions, the authors are happy to respond.
RoboBrain 2.0 Technical Report (2507.02029)
AdinaY 
posted an update 7 days ago
POLAR🐻‍❄️ New reward modeling by Shanghai AI Lab

internlm/polar-68693f829d2e83ac5e6e124a

✨ 1.8B/7B - Apache 2.0
✨ Scalable policy discriminative pretraining
✨ Easy RLHF with minimal preference data
merve 
posted an update 7 days ago
ByteDance released Tar 1.5B and 7B: image-text in image-text out models, fully open-source 👏 ByteDance-Seed/tar-6864cf0d9fe59a3b91cc4260

They have an image tokenizer unified with text, and de-tokenize using either of two decoders (an LLM or a diffusion model).
The model itself is a full LLM (Qwen2); the tokenizer maps images into its token space 🤯