
Clem 🤗 PRO

clem

AI & ML interests

multi-modal, time-series, biology and chemistry

Recent Activity

liked a Space 1 day ago
Tonic/audiocraft
liked a model 1 day ago
LatitudeGames/Wayfarer-12B
upvoted a collection 2 days ago
2025 Artifacts

Organizations

Hugging Face, Objective Function, Pied Piper, Society & Ethics, Organization, Text Generation Inference, testifly, HugGAN Community, Hugging Face Fellows, Gradio-Blocks-Party, HuggingFaceM4, Open-Source AI Meetup, Hugging Face OSS Metrics, Hugging Face Smol Cluster, huggingPartyParis, Unofficial Mistral Community, Journalists on Hugging Face, Major TOM, MLX Community, Miami AI Hub, Social Post Explorers, Paris AI Running Club, Hugging Face for Legal, Hugging Face Party @ PyTorch Conference, Nerdy Face, open/ acc, Bluesky Community

clem's activity

reacted to AdinaY's post with 🔥 3 days ago
MiniMax, the company behind Hailuo_AI, has joined the open-source community by releasing both models and demos of MiniMax-Text-01 & MiniMax-VL-01 🔥
- Model
MiniMaxAI/MiniMax-VL-01
MiniMaxAI/MiniMax-Text-01
- Demo
MiniMaxAI/MiniMax-VL-01
MiniMaxAI/MiniMax-Text-01

✨ MiniMax-Text-01:
- 456B parameters, with 45.9B activated per token
- Combines Lightning Attention, Softmax Attention, and MoE for optimal performance
- Training context up to 1M tokens; inference handles 4M tokens

✨ MiniMax-VL-01:
- ViT-MLP-LLM framework (non-transformer 👀)
- Handles image inputs from 336×336 to 2016×2016
- 694M image-caption pairs + 512B tokens processed across 4 stages
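
For the curious, here is a minimal sketch of loading the text model with 🤗 Transformers; it assumes the repo ships custom modeling code (hence trust_remote_code) and that you have enough GPUs to shard a 456B-parameter checkpoint:

```python
# A minimal sketch, assuming trust_remote_code is required for the custom
# architecture; prompt and generation settings are illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "MiniMaxAI/MiniMax-Text-01"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    device_map="auto",  # shard across available GPUs
)

inputs = tokenizer("Lightning Attention is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```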
reacted to lianghsun's post with 👍 3 days ago
🖖 Let me introduce the work I've done over the past three months: Llama-3.2-Taiwan-3B and Llama-3.2-Taiwan-3B-Instruct, now open-sourced on 🤗 Hugging Face.

๐—น๐—ถ๐—ฎ๐—ป๐—ด๐—ต๐˜€๐˜‚๐—ป/๐—Ÿ๐—น๐—ฎ๐—บ๐—ฎ-๐Ÿฏ.๐Ÿฎ-๐—ง๐—ฎ๐—ถ๐˜„๐—ฎ๐—ป-๐Ÿฏ๐—•: This model is built on top of ๐—บ๐—ฒ๐˜๐—ฎ-๐—น๐—น๐—ฎ๐—บ๐—ฎ/๐—Ÿ๐—น๐—ฎ๐—บ๐—ฎ-๐Ÿฏ.๐Ÿฎ-๐Ÿฏ๐—• with continual pretraining. The training dataset consists of a mixture of Traditional Chinese and multilingual texts in specific proportions, including 20B tokens of Traditional Chinese text.

๐—น๐—ถ๐—ฎ๐—ป๐—ด๐—ต๐˜€๐˜‚๐—ป/๐—Ÿ๐—น๐—ฎ๐—บ๐—ฎ-๐Ÿฏ.๐Ÿฎ-๐—ง๐—ฎ๐—ถ๐˜„๐—ฎ๐—ป-๐Ÿฏ๐—•-๐—œ๐—ป๐˜€๐˜๐—ฟ๐˜‚๐—ฐ๐˜: This is a fine-tuned conversational model based on the foundation model.

This Llama-3.2-Taiwan open-source project is currently a one-person effort (yes, I did everything myself, starting from text preparation; it was exhausting!). If you're interested, feel free to join the Discord server for discussions.

Benchmarking

The evaluation was conducted using ikala/tmmluplus, though the README page does not yet reflect the latest results. The performance is close to that of the previous versions, indicating that further improvements might require adding more specialized knowledge to the datasets.

A call for support

If anyone is willing to provide compute resources, it would be greatly appreciated and would help this project continue and grow. 💪

---
๐Ÿ”๏ธ Foundation model: lianghsun/Llama-3.2-Taiwan-3B
๐Ÿค– Instruction model: lianghsun/Llama-3.2-Taiwan-3B-Instruct
โšก GGUF: lianghsun/Llama-3.2-Taiwan-3B-Instruct-GGUF
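
A minimal sketch of chatting with the instruct model via the transformers pipeline; the prompt and generation length are illustrative assumptions, not the author's recommended configuration:

```python
# A minimal sketch; prompt and settings are illustrative assumptions.
from transformers import pipeline

chat = pipeline("text-generation", model="lianghsun/Llama-3.2-Taiwan-3B-Instruct")
messages = [{"role": "user", "content": "請用繁體中文簡單介紹台灣。"}]
result = chat(messages, max_new_tokens=128)
print(result[0]["generated_text"][-1]["content"])  # the assistant's reply
```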
reacted to AdinaY's post with 🔥 3 days ago
reacted to MoritzLaurer's post with 🚀 3 days ago
Microsoft's rStar-Math paper claims that 🤏 ~7B models can match the math skills of o1 using clever train- and test-time techniques. You can now download their prompt templates from Hugging Face!

๐Ÿ“ The paper introduces rStar-Math, which claims to rival OpenAI o1's math reasoning capabilities by integrating Monte Carlo Tree Search (MCTS) with step-by-step verified reasoning trajectories.
๐Ÿค– A Process Preference Model (PPM) enables fine-grained evaluation of intermediate steps, improving training data quality.
๐Ÿงช The system underwent four rounds of self-evolution, progressively refining both the policy and reward models to tackle Olympiad-level math problemsโ€”without GPT-4-based data distillation.
๐Ÿ’พ While we wait for the release of code and datasets, you can already download the prompts they used from the HF Hub!

Details and links here 👇
Prompt-templates docs: https://moritzlaurer.github.io/prompt_templates/
Templates on the hub: MoritzLaurer/rstar-math-prompts
Prompt-templates collection: MoritzLaurer/prompt-templates-6776aa0b0b8a923957920bb4
Paper: https://arxiv.org/pdf/2501.04519
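
If you just want the raw files, here is a minimal sketch with huggingface_hub; the YAML filename below is a hypothetical placeholder, so browse the repo's file list for the real names:

```python
# A minimal sketch; the filename is a hypothetical placeholder.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="MoritzLaurer/rstar-math-prompts",
    filename="policy_prompt.yaml",  # hypothetical filename
)
with open(path) as f:
    print(f.read())
```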
reacted to fdaudens's post with ❤️👀 3 days ago
AI agents are coming. But who's in control?

@meg, one of the best researchers in AI ethics, makes a critical point about autonomy: fully autonomous systems carry unknowable risks because they operate on computer logic rather than human logic.

The solution? Build systems that support & assist rather than override human decisions.

I highly recommend reading the blog post written by Meg, @evijit, @sasha, and @giadap. They define different levels of agent autonomy and provide a values-based analysis of risks, benefits, and uses of AI agents to help you make better decisions.

👉 https://huggingface.co/blog/ethics-soc-7

reacted to burtenshaw's post with 🔥 3 days ago
We're launching a FREE and CERTIFIED course on Agents!

We're thrilled to announce the launch of the Hugging Face Agents course on Learn! This interactive, certified course will guide you through building and deploying your own AI agents.

Here's what you'll learn:

- Understanding Agents: We'll break down the fundamentals of AI agents, showing you how they use LLMs to perceive their environment (observations), reason about it (thoughts), and take actions. Think of a smart assistant that can book appointments, answer emails, or even write code based on your instructions.
- Building with Frameworks: You'll dive into popular agent frameworks like LangChain, LlamaIndex, and smolagents. These tools provide the building blocks for creating complex agent behaviors (see the sketch after this list).
- Real-World Applications: See how agents are used in practice, from automating SQL queries to generating code and summarizing complex documents.
- Certification: Earn a certification by completing the course modules, implementing a use case, and passing a benchmark assessment. This proves your skills in building and deploying AI agents.
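
As a taste of the frameworks section, here is a minimal smolagents sketch; the tool and model choices are illustrative assumptions, not the course's official example:

```python
# A minimal smolagents sketch (illustrative, not the course's official example).
from smolagents import CodeAgent, DuckDuckGoSearchTool, HfApiModel

agent = CodeAgent(
    tools=[DuckDuckGoSearchTool()],  # gives the agent a web-search action
    model=HfApiModel(),              # defaults to a hosted inference model
)

# The agent writes and executes Python code to reason toward an answer.
agent.run("How many seconds are in a leap year?")
```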
Audience

This course is designed for anyone interested in the future of AI. Whether you're a developer, data scientist, or simply curious about AI, this course will equip you with the knowledge and skills to build your own intelligent agents.

Enroll today and start building the next generation of AI agent applications!

https://bit.ly/hf-learn-agents
reacted to tomaarsen's post with ❤️ 3 days ago
๐ŸŽ๏ธ Today I'm introducing a method to train static embedding models that run 100x to 400x faster on CPU than common embedding models, while retaining 85%+ of the quality! Including 2 fully open models: training scripts, datasets, metrics.

We applied our recipe to train 2 Static Embedding models that we release today:
2️⃣ an English Retrieval model and a general-purpose Multilingual similarity model (e.g. for classification, clustering, etc.), both Apache 2.0
🧠 my modern training strategy: ideation -> dataset choice -> implementation -> evaluation
📜 my training scripts, using the Sentence Transformers library
📊 my Weights & Biases reports with losses & metrics
📕 my list of 30 training and 13 evaluation datasets

The 2 Static Embedding models have the following properties:
🏎️ Extremely fast, e.g. 107,500 sentences per second on a consumer CPU, compared to 270 for 'all-mpnet-base-v2' and 56 for 'gte-large-en-v1.5'
0️⃣ Zero active parameters: no Transformer blocks, no attention, not even a matrix multiplication. Super speed!
📏 No maximum sequence length! Embed texts of any length (note: longer texts may embed worse)
📈 Linear instead of quadratic complexity: 2x longer text takes 2x longer, instead of 2.5x or more
🪆 Matryoshka support: allows you to truncate embeddings with minimal performance loss (e.g. 4x smaller with a 0.56% perf. decrease for English Similarity tasks)

Check out the full blogpost if you'd like to 1) use these lightning-fast models or 2) learn how to train them with consumer-level hardware: https://huggingface.co/blog/static-embeddings

The blogpost contains a lengthy list of possible advancements; I'm very confident that our 2 models are only the tip of the iceberg, and we may be able to get even better performance.

Alternatively, check out the models:
* sentence-transformers/static-retrieval-mrl-en-v1
* sentence-transformers/static-similarity-mrl-multilingual-v1
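
A minimal sketch of using the retrieval model with Sentence Transformers; truncate_dim exercises the Matryoshka property mentioned above, and 256 is an illustrative choice:

```python
# A minimal sketch; truncate_dim=256 is an illustrative Matryoshka setting.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer(
    "sentence-transformers/static-retrieval-mrl-en-v1",
    truncate_dim=256,
)
embeddings = model.encode(["What is a static embedding model?"])
print(embeddings.shape)  # (1, 256)
```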
upvoted an article 4 days ago
upvoted an article 6 days ago

๐Ÿฆธ๐Ÿป#7: From Agentic AI to Physical AI

By Kseniase
reacted to reddgr's post with 🔥 8 days ago
Major update on the Talking to Chatbots dataset! Expanded the 'wrapped' dataset (one row per chat) to 2.86k records, and the 'unwrapped' version (one row per conversation turn) to 11k records. The main source is my ChatGPT archive with nearly 2 years of chats. It is still a work in progress as I incorporate chats from other sources and qualitative metrics (SCBN) for responses.

reddgr/talking-to-chatbots-unwrapped-chats

reddgr/talking-to-chatbots-chats
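
A minimal sketch of loading the unwrapped version with 🤗 Datasets; the "train" split name is an assumption based on the usual Hub default:

```python
# A minimal sketch; the "train" split name is an assumption.
from datasets import load_dataset

unwrapped = load_dataset("reddgr/talking-to-chatbots-unwrapped-chats", split="train")
print(unwrapped)  # one row per conversation turn (~11k records)
```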