Clem 🤗 PRO

clem

AI & ML interests

multi-modal, time-series, biology and chemistry

Recent Activity

liked a model about 20 hours ago
google/derm-foundation
upvoted a collection 1 day ago
PaliGemma 2 Mix
liked a dataset 1 day ago
google/smol

Organizations

Hugging Face, Pied Piper, Objective Function, Society & Ethics, Organization, Text Generation Inference, testifly, HugGAN Community, Hugging Face Fellows, Gradio-Blocks-Party, HuggingFaceM4, Open-Source AI Meetup, Hugging Face OSS Metrics, Hugging Face Smol Cluster, huggingPartyParis, Unofficial Mistral Community, Journalists on Hugging Face, Major TOM, MLX Community, Miami AI Hub, Social Post Explorers, Paris AI Running Club, Hugging Face for Legal, Hugging Face Party @ PyTorch Conference, Nerdy Face, open/ acc, Bluesky Community

clem's activity

reacted to merterbak's post with 🚀 2 days ago
Post
3146
🔥 Meet Muse: a model that can generate a game environment based on visuals or players' controller actions. It was developed by Microsoft Research in collaboration with Ninja Theory (the Hellblade developer). It's built on the World and Human Action Model (WHAM-1.6B). They trained it on 7 years of Bleeding Edge gameplay, and it can generate 2-minute-long 3D game sequences with consistent physics and character behaviors, all from just a second of input. They've open-sourced it too: open weights, the WHAM Demonstrator, and sample data are on Azure AI Foundry for anyone to play with. Hopefully on Hugging Face soon 🤗.

📄 Paper: https://www.nature.com/articles/s41586-025-08600-3
Blog Post: https://www.microsoft.com/en-us/research/blog/introducing-muse-our-first-generative-ai-model-designed-for-gameplay-ideation/

  • 1 reply
·
reacted to AdinaY's post with 👀 3 days ago
reacted to AdinaY's post with ❤️ 3 days ago
Post
4062
🚀 StepFun 阶跃星辰 is making BIG open moves!

Last year, their GOT-OCR 2.0 took the community by storm 🔥 but many didn't know they were also building some amazing models. Now, they've just dropped something huge on the hub!

📺 Step-Video-T2V: a 30B bilingual open video model that generates 204 frames (8-10s) at 540P resolution with high information density & consistency.
stepfun-ai/stepvideo-t2v

🔊 Step-Audio-TTS-3B: a TTS model trained with the LLM-Chat paradigm on a large synthetic dataset, capable of generating RAP & Humming
stepfun-ai/step-audio-67b33accf45735bb21131b0b
·
reacted to fdaudens's post with ❤️ 3 days ago
Post
5466
🎯 Perplexity drops their FIRST open-weight model on Hugging Face: A decensored DeepSeek-R1 with full reasoning capabilities. Tested on 1000+ examples for unbiased responses.

Check it out: perplexity-ai/r1-1776
Blog post: https://perplexity.ai/hub/blog/open-sourcing-r1-1776
  • 1 reply
·
posted an update 3 days ago
Post
2491
What are the best organizations to follow on @huggingface ?

Off the top of my head:
- DeepSeek (35,000 followers): https://huggingface.co/deepseek-ai
- Meta Llama (27,000 followers): https://huggingface.co/meta-llama
- Black Forest Labs (11,000 followers): https://huggingface.co/black-forest-labs
- OpenAI (5,000 followers): https://huggingface.co/openai
- Nvidia (16,000 followers): https://huggingface.co/nvidia
- Microsoft (9,000 followers): https://huggingface.co/microsoft
- AllenAI (2,000 followers): https://huggingface.co/allenai
- Mistral (5,000 followers): https://huggingface.co/mistralai
- xAI (600 followers): https://huggingface.co/xai-org
- Stability AI (16,000 followers): https://huggingface.co/stabilityai
- Qwen (16,000 followers): https://huggingface.co/Qwen
- GoogleAI (8,000 followers): https://huggingface.co/google
- Unsloth (3,000 followers): https://huggingface.co/unsloth
- Bria AI (4,000 followers): https://huggingface.co/briaai
- NousResearch (1,300 followers): https://huggingface.co/NousResearch

Bonus, the agent course org with 17,000 followers: https://huggingface.co/agents-course
  • 1 reply
·
posted an update 4 days ago
Post
3252
We crossed 1B+ tokens routed to our inference provider partners on HF, a feature we released just a few days ago.

We're just getting started, of course, but early users seem to like it, and we're always happy to partner with cool startups in the ecosystem.

Have you been using any of the integrations, and how can we make them better?

https://huggingface.co/blog/inference-providers
reacted to AtAndDev's post with 😎 4 days ago
Post
2340
@nroggendorff is that you sama?
  • 2 replies
·
reacted to merve's post with 👍🔥❤️ 7 days ago
Post
4569
Your weekly recap of open AI is here, and it's packed with models! merve/feb-14-releases-67af876b404cc27c6d837767

👀 Multimodal
> OpenGVLab released InternVideo 2.5 Chat models, new video LMs with long context
> AIDC released the Ovis2 model family along with the Ovis dataset, new vision LMs in different sizes (1B, 2B, 4B, 8B, 16B, 34B), with video and OCR support
> ColQwenStella-2b is a multilingual visual retrieval model that is SOTA for its size
> Hoags-2B-Exp is a new multilingual vision LM with contextual reasoning and long-context video understanding

💬 LLMs
A lot of math models!
> Open-R1 team released the OpenR1-Math-220k large-scale math reasoning dataset, along with OpenR1-Qwen-7B, a Qwen2.5-Math model fine-tuned on the dataset
> Nomic AI released a new Nomic Embed multilingual retrieval model, a MoE with 500M params and 305M active params, outperforming other models
> DeepScaleR-1.5B-Preview is a new DeepSeek-R1-Distill fine-tune using distributed RL on math
> LIMO is a new fine-tune of Qwen2.5-32B-Instruct on math

๐Ÿ—ฃ๏ธ Audio
> Zonos-v0.1 is a new family of speech recognition models, which contains the model itself and embeddings

๐Ÿ–ผ๏ธ Vision and Image Generation
> We have ported DepthPro of Apple to transformers for your convenience!
> illustrious-xl-v1.0 is a new illustration generation model
·
reacted to albertvillanova's post with 🤗🔥 16 days ago
Post
3355
🚀 Introducing @huggingface Open Deep-Research 💥

In just 24 hours, we built an open-source agent that:
✅ Autonomously browses the web
✅ Searches, scrolls & extracts info
✅ Downloads & manipulates files
✅ Runs calculations on data

55% on the GAIA validation set! Help us improve it! 💡
https://huggingface.co/blog/open-deep-research
  • 3 replies
·
reacted to fdaudens's post with ❤️ 25 days ago
Post
8534
Yes, DeepSeek R1's release is impressive. But the real story is what happened in just 7 days after:

- Original release: 8 models, 540K downloads. Just the beginning...

- The community turned those open-weight models into 550+ NEW models on Hugging Face. Total downloads? 2.5M, nearly 5X the originals.

The reason? DeepSeek models are open-weight, letting anyone build on top of them. Interesting to note that the community focused on quantized versions for better efficiency & accessibility. They want models that use less memory, run faster, and are more energy-efficient.

When you empower builders, innovation explodes. For everyone. 🚀

The most popular community model? @bartowski's DeepSeek-R1-Distill-Qwen-32B-GGUF version: 1M downloads alone.
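
As a quick sanity check on the download figures quoted in this post, here is a minimal sketch of the arithmetic. It uses only the rounded numbers as written above; the variable and function names are illustrative, not from any library. With these rounded inputs the community-to-original ratio comes out at roughly 4.6x, so the "nearly 5X" in the post is either a loose rounding or based on unrounded counts.

```python
# Figures quoted in the post (rounded as written there).
original_downloads = 540_000      # the 8 original DeepSeek R1 models
community_downloads = 2_500_000   # the 550+ community-built models
top_model_downloads = 1_000_000   # the single most popular community model


def ratio(part: float, whole: float) -> float:
    """Return part / whole as a plain float (hypothetical helper)."""
    return part / whole


# How many times more downloads the community models got than the originals.
community_vs_original = ratio(community_downloads, original_downloads)

# Share of community downloads that went to the most popular model.
top_model_share = ratio(top_model_downloads, community_downloads)

print(f"community / original downloads: {community_vs_original:.1f}x")
print(f"top model share of community downloads: {top_model_share:.0%}")
```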
·
reacted to mitkox's post with 👍🚀 25 days ago
Post
2352
llama.cpp is 26.8% faster than ollama.
I have upgraded both and, using the same settings, I am running the same DeepSeek R1 Distill 1.5B on the same hardware. It's an apples-to-apples comparison.

Total duration:
llama.cpp 6.85 sec <- 26.8% faster
ollama 8.69 sec

Breakdown by phase:
Model loading
llama.cpp 241 ms <- 2x faster
ollama 553 ms

Prompt processing
llama.cpp 416.04 tokens/s with an eval time of 45.67 ms <- 10x faster
ollama 42.17 tokens/s with an eval time of 498 ms

Token generation
llama.cpp 137.79 tokens/s with an eval time of 6.62 sec <- 13% faster
ollama 122.07 tokens/s with an eval time of 7.64 sec
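
The speedup labels above can be reproduced from the quoted numbers. The sketch below is only a check on the arithmetic, not the benchmark itself: it assumes "X% faster" means the ratio of the slower time to the faster time (or of the higher throughput to the lower one), minus one. The helper name is hypothetical. With the rounded figures from the post, the total-duration speedup comes out at about 26.9%, in line with the 26.8% quoted.

```python
def times_faster(slower: float, faster: float) -> float:
    """Ratio by which `faster` beats `slower` (hypothetical helper).

    For durations, pass (longer time, shorter time).
    For throughputs, pass (higher tokens/s, lower tokens/s) in that order,
    since a higher throughput is the better result.
    """
    return slower / faster


# Durations from the post (lower is better).
total = times_faster(8.69, 6.85)        # seconds, ollama vs llama.cpp
loading = times_faster(553, 241)        # milliseconds

# Throughputs from the post (higher is better).
prompt = times_faster(416.04, 42.17)    # tokens/s, llama.cpp vs ollama
generation = times_faster(137.79, 122.07)

print(f"total duration:    {(total - 1) * 100:.1f}% faster")
print(f"model loading:     {loading:.1f}x faster")
print(f"prompt processing: {prompt:.1f}x faster")
print(f"token generation:  {(generation - 1) * 100:.1f}% faster")
```

Note that the percentage depends on which run is taken as the baseline: measured against ollama's 8.69 sec instead, the same gap reads as llama.cpp taking about 21% less time.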

llama.cpp is LLM inference in C/C++; ollama adds abstraction layers and marketing.

Make sure you own your AI. AI in the cloud is not aligned with you; it's aligned with the company that owns it.
·
posted an update 25 days ago
Post
7169
AI is not a zero-sum game. Open-source AI is the tide that lifts all boats!
reacted to merve's post with 👀🤗🔥 28 days ago
Post
5179
Oof, what a week! 🥵 So many things have happened, let's recap! merve/jan-24-releases-6793d610774073328eac67a9

Multimodal 💬
- We have released SmolVLM, the tiniest VLMs, which come in 256M and 500M sizes, along with their retrieval models, ColSmol, for multimodal RAG 💗
- UI-TARS are new models by ByteDance to unlock agentic GUI control 🤯 in 2B, 7B and 72B
- Alibaba DAMO lab released VideoLlama3, new video LMs that come in 2B and 7B
- MiniMaxAI released Minimax-VL-01, where the decoder is based on the MiniMax-Text-01 456B MoE model with long context
- Dataset: Yale released a new benchmark called MMVU
- Dataset: CAIS released Humanity's Last Exam (HLE), a new challenging MM benchmark

LLMs 📖
- DeepSeek-R1 & DeepSeek-R1-Zero: gigantic 660B reasoning models by DeepSeek, and six distilled dense models, on par with o1, with an MIT license! 🤯
- Qwen2.5-Math-PRM: new math models by Qwen in 7B and 72B
- NVIDIA released AceMath and AceInstruct, a new family of models and their datasets (SFT and reward ones too!)

Audio 🗣️
- Llasa is a new speech synthesis model based on Llama that comes in 1B, 3B, and 8B
- TangoFlux is a new audio generation model trained from scratch and aligned with CRPO

Image/Video/3D Generation ⏯️
- Flex.1-alpha is a new 8B pre-trained diffusion model by ostris similar to Flux
- Tencent released Hunyuan3D-2, new 3D asset generation from images
·
posted an update 28 days ago