clefourrier (Clémentine Fourrier)

reacted to danaaubakirova's post with ❤️ 8 months ago

Post

3085

We just dropped SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics!

check out the blog: https://huggingface.co/blog/smolvla
read the technical report: SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics (2506.01844)
access the model weights: lerobot/smolvla_base

reacted to AdinaY's post with 🔥 8 months ago

Post

2225

May highlights from China’s open source ecosystem 🔥

zh-ai-community/may-2025-open-works-from-the-chinese-community-681a3494145f2914dc679b7c

✨ DeepSeek dropped R1 updates
- Both R1 & an 8B distilled smol model

✨ Bytedance goes big on open source:
- BAGEL, Dolphin, Seedcoder, Dream0...

✨ Multimodal is on fire!
- HunyuanCustom / HunyuanVideo-Avatar / HunyuanPortrait
- MiniMax: SynLogic / Orsta-7B
- Xiaomi: MiMo VL
- Alibaba Wan: Wan2.1-VACE
- OpenGVlab: ZeroGUI
- StepFun: ACE-Step-v1/Step1X-3D

✨ Specialized models/datasets excel
- Alibaba Qwen: World PM 72B
- BAAI: RobotBrain (MLLM for robotics)
- HiThink Research: BizFinBench (dataset)
- OpenBMB: Ultra FineWeb (dataset)
- Bilibili: Index-anisora (Anime/ACG)
- Skywork: Matrix-Game (game)

More awesome releases: Alibaba QwenLong-L1-32B, SkyWork OR1, OpenS2V-5M etc...

reacted to cgeorgiaw's post with 🚀 8 months ago

Post

529

Just dropped two bigger physics datasets (both on photonics)!

NUMBA 1: SIB-CL
This dataset contains the Surrogate- and Invariance-Boosted Contrastive Learning (SIB-CL) datasets for two scientific problems:
- PhC2D: 2D photonic crystal density-of-states (DOS) and bandstructure data.
- TISE: 3D time-independent Schrödinger equation eigenvalue and eigenvector solutions.

NUMBA 2: 2D Photonic Topology
Symmetry-driven analysis of 2D photonic crystals: 10k random unit cells across 11 symmetries, 2 polarizations, 5 contrasts. Includes time-reversal breaking cases for 4 symmetries at high contrast.

Check them out: cgeorgiaw/sib-cl & cgeorgiaw/2d-photonic-topology

reacted to fdaudens's post with ❤️ 8 months ago

Post

2519

Here’s what happens when a national institution builds its own digital intelligence: France’s Ministry of Culture just released data from 17K+ real users testing 30+ chatbots in French. Raw, diverse, and a goldmine for studying LLMs in the wild.

ministere-culture/comparia-conversations

replied to their post 8 months ago

Hi! The official GAIA leaderboard is unrelated to the test leaderboard for the agents classes :)
You should contact @burtenshaw, who's managing the courses :)

reacted to m-ric's post with ❤️ 8 months ago

Post

3031

𝗚𝗿𝗲𝗮𝘁 𝗳𝗲𝗮𝘁𝘂𝗿𝗲 𝗮𝗹𝗲𝗿𝘁: you can now share agents to the Hub! 🥳🥳

And any agent pushed to the Hub gets a cool Space interface to directly chat with it.

This was a real technical challenge: for instance, serializing tools to export them meant that you needed to get all the source code for a tool, verify that it was standalone (not relying on external variables), and gather all the packages required to make it run.
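
To make the checks above concrete, here is a minimal sketch. It is not smolagents' actual implementation. It parses a tool's source with Python's standard `ast` module, lists the packages the tool imports, and flags names the tool uses without defining them. The function `analyze_tool_source` and both sample tools are hypothetical names made up for this illustration.

```python
import ast
import builtins

# Hypothetical sample tools, given as source strings (they are parsed, never run).
STANDALONE_TOOL = '''
def mean(values: list) -> float:
    import statistics
    return statistics.mean(values)
'''

LEAKY_TOOL = '''
def scaled_mean(values: list) -> float:
    import numpy as np
    return float(np.mean(values)) * SCALE
'''


def analyze_tool_source(source: str) -> dict:
    """Report imported packages and externally defined names for the first
    function found in `source` (illustrative helper, assumed for this sketch)."""
    tree = ast.parse(source)
    func = next(
        node for node in tree.body
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    )

    packages = set()
    defined = set(dir(builtins))  # builtins are always available
    used = set()

    for node in ast.walk(func):
        if isinstance(node, ast.Import):
            for alias in node.names:
                packages.add(alias.name.split(".")[0])
                defined.add((alias.asname or alias.name).split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            if node.module:
                packages.add(node.module.split(".")[0])
            for alias in node.names:
                defined.add(alias.asname or alias.name)
        elif isinstance(node, ast.arg):
            defined.add(node.arg)  # parameters of the tool, lambdas, nested defs
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            defined.add(node.name)  # the tool itself and anything nested in it
        elif isinstance(node, ast.ExceptHandler) and node.name:
            defined.add(node.name)
        elif isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                defined.add(node.id)
            else:
                used.add(node.id)

    return {
        # Includes standard-library modules; a real exporter would filter those.
        "packages": sorted(packages),
        # Non-empty means the tool relies on variables defined outside of it.
        "external_names": sorted(used - defined),
    }


if __name__ == "__main__":
    print(analyze_tool_source(STANDALONE_TOOL))
    print(analyze_tool_source(LEAKY_TOOL))
```

The check is deliberately coarse: it ignores scoping order, so a name assigned anywhere in the function counts as defined everywhere in it.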

Go try it out! 👉 https://github.com/huggingface/smolagents

2 replies

reacted to nicolay-r's post with 🔥 8 months ago

Post

2458

🚀 For those interested in minimalistic integration of LLM inference with a predefined reasoning schema, I'm excited to share the latest bulk-chain 1.1.0. It is a no-strings solution for deploying your LLM for efficient inference over data iterators.
✨ Key Features:
- Full async inference support, including streaming mode for real-time output
- Simplified inference API
🔗 Check out the repo: https://github.com/nicolay-r/bulk-chain

💡 Special thanks to @RicardoLee for his work on effective async LLaMA-3 deployment that helped shape this release:
https://github.com/RicardoLeeV587/Llama3-FastInference
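
For readers unfamiliar with the pattern, async streaming inference over a data iterator can be sketched with plain `asyncio`. This sketch does not use the bulk-chain API. `fake_llm_stream` is a hypothetical stand-in for a streaming model client, and `infer_one`, `infer_all`, and `run` are names assumed for this illustration.

```python
import asyncio
from typing import AsyncIterator, Iterable, List


async def fake_llm_stream(prompt: str) -> AsyncIterator[str]:
    """Hypothetical streaming client: yields the prompt's words upper-cased,
    one chunk at a time, in place of real model output."""
    for word in prompt.split():
        await asyncio.sleep(0)  # hand control back to the event loop
        yield word.upper()


async def infer_one(prompt: str, semaphore: asyncio.Semaphore) -> str:
    """Consume one stream chunk by chunk and join the chunks into a response."""
    async with semaphore:  # cap the number of in-flight requests
        chunks = []
        async for chunk in fake_llm_stream(prompt):
            chunks.append(chunk)
        return " ".join(chunks)


async def infer_all(prompts: Iterable[str], max_concurrency: int = 4) -> List[str]:
    """Run inference over every prompt concurrently, preserving input order."""
    semaphore = asyncio.Semaphore(max_concurrency)
    tasks = [infer_one(prompt, semaphore) for prompt in prompts]
    return await asyncio.gather(*tasks)


def run(prompts: Iterable[str]) -> List[str]:
    """Synchronous entry point for scripts."""
    return asyncio.run(infer_all(prompts))


if __name__ == "__main__":
    print(run(["hello world", "a b c"]))
```

`asyncio.gather` returns results in the order the tasks were passed in, so outputs line up with the input iterator even when requests finish out of order.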

posted an update 8 months ago

Post

2201

Always surprised that so few people actually read the FineTasks blog, on
✨how to select training evals with the highest signal✨

If you're serious about training models without wasting compute on shitty runs, you absolutely should read it!!

A high-signal eval actually tells you precisely, during training, how well & what your model is learning, allowing you to discard the bad runs/bad samplings/...!

The blog covers in depth prompt choice, metrics, dataset, across languages/capabilities, and my fave section is "which properties should evals have"👌
(to know how to select the best evals for your use case)

Blog: HuggingFaceFW/blogpost-fine-tasks

2 replies

reacted to fdaudens's post with ❤️ 8 months ago

Post

6228

Tried something new: an AI-generated podcast that breaks down the top research paper each day. Fully automated, now live on Spotify.

I built this prototype to help keep up with the rapid pace of AI developments and, hopefully, make cutting-edge research more accessible. I don’t know about you, but just listening to a conversation about a paper really helps the content sink in for me.

This build taught me a lot about full automation. If you’re into the technical weeds: Qwen3 runs on Inference to handle the script, Kokoro does the voice, and the whole thing gets published automatically thanks to the Hugging Face Jobs API and Gradio deployment.

It’s not perfect yet — I’ll be monitoring for hallucinations and incoherence. The voice model still needs polish, but it’s a promising start. Would love to build this with the community — submit a PR or send feedback. It’s just a beta of an experimental idea!

Big kudos to @m-ric , whose Open NotebookLM this is based on, and to @nielsr for his terrific work making research papers more accessible.

- Podcast on Spotify: https://open.spotify.com/show/3PTucIW1w1GIkqTYm32ka7?si=c7a851f83e6d4331 (Apple Podcasts soon)
- Code: fdaudens/podcast-jobs
- Open NotebookLM: m-ric/open-notebooklm
- Also super helpful, @qgallouedec 's tutorial on HF Jobs API: https://huggingface.co/spaces/qgallouedec/run-hello-world/blob/main/README.md

2 replies

posted an update 10 months ago

Post

2674

Gemma3 family is out! Reading the tech report, and this section was really interesting to me from a methods/scientific fairness pov.

Instead of doing over-hyped comparisons, they clearly state that **results are reported in a setup which is advantageous to their models**.
(Which everybody does, but people usually don't say)

For a tech report, it makes a lot of sense to report model performance when used optimally!
On leaderboards, on the other hand, comparisons will be apples to apples, but potentially in a suboptimal setup for a given model family (much like some users interact suboptimally with models)

It also contains a cool section (6) on training data memorization rate! Important to see if your model will output the training data it has seen as such: always an issue for privacy/copyright/... but also very much for evaluation!

Because if your model knows its evals by heart, you're not testing for generalization.
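
A toy version of a memorization check makes the idea concrete. This is a minimal sketch, not the protocol used in the Gemma 3 report: it flags a generation as memorized when it shares at least one verbatim run of `n` whitespace-separated tokens with the training text. The function names, the tokenization, and the default `n` are all assumptions for illustration.

```python
from typing import Iterable, List, Set, Tuple


def ngrams(tokens: List[str], n: int) -> Set[Tuple[str, ...]]:
    """All contiguous runs of n tokens (empty if the text is shorter than n)."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def memorization_rate(
    generations: List[str], training_texts: Iterable[str], n: int = 5
) -> float:
    """Fraction of generations that reproduce some n-token run from training data."""
    train_ngrams: Set[Tuple[str, ...]] = set()
    for text in training_texts:
        train_ngrams |= ngrams(text.split(), n)

    if not generations:
        return 0.0

    flagged = sum(
        1 for generation in generations
        if ngrams(generation.split(), n) & train_ngrams
    )
    return flagged / len(generations)


if __name__ == "__main__":
    training = ["the quick brown fox jumps over the lazy dog"]
    outputs = [
        "he said the quick brown fox jumps high",      # copies a 5-token run
        "a completely different sentence about cats",  # no overlap
    ]
    print(memorization_rate(outputs, training, n=5))
```

The same overlap test, pointed at benchmark prompts instead of training text, gives a crude contamination signal for evals.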

reacted to m-ric's post with 🔥 12 months ago

Post

3189

Now you can launch a code agent directly from your terminal!
✨ 𝚜𝚖𝚘𝚕𝚊𝚐𝚎𝚗𝚝 "𝚈𝚘𝚞𝚛 𝚝𝚊𝚜𝚔" directly launches a CodeAgent
▶️ This also works with web agents (replace 𝚜𝚖𝚘𝚕𝚊𝚐𝚎𝚗𝚝 with 𝚠𝚎𝚋𝚊𝚐𝚎𝚗𝚝) thanks to @merve !

💾 Another treat from smolagents release 1.7.0:
Now agents have a memory mechanism, enabling many possibilities like replaying the last run with 𝚊𝚐𝚎𝚗𝚝.𝚛𝚎𝚙𝚕𝚊𝚢(), thank you @clefourrier !

Check the release notes here 👉 https://github.com/huggingface/smolagents/releases/tag/v1.7.0

reacted to BrigitteTousi's post with ❤️ about 1 year ago

Post

1357

Community fine-tuned models are more carbon efficient than the models they are derived from! 🥳🌿

@alozowski @clefourrier @SaylorTwift @albertvillanova evaluated CO₂ emissions associated with model inference for over 3000 models on the Open LLM Leaderboard. Interesting trends and new insights emerged...👀

Blog Post: https://huggingface.co/blog/leaderboard-emissions-analysis

Leaderboard: open-llm-leaderboard/open_llm_leaderboard

reacted to fdaudens's post with ❤️ about 1 year ago

Post

1861

Keeping up with open-source AI in 2024 = overwhelming.

Here's help: We're launching our Year in Review on what actually matters, starting today!

Fresh content dropping daily until year end. Come along for the ride - first piece out now with @clem 's predictions for 2025.

Think of it as your end-of-year AI chocolate calendar.

Kudos to @BrigitteTousi @clefourrier @Wauplin @thomwolf for making it happen. We teamed up with aiworld.eu for awesome visualizations to make this digestible—it's a charm to work with their team.

Check it out: huggingface/open-source-ai-year-in-review-2024

reacted to thomwolf's post with 🧠🔥 about 1 year ago

Post

1880

Interesting long read from @evanmiller-anthropic on taking a better-founded statistical approach to Language Model Evaluations:
https://www.anthropic.com/research/statistical-approach-to-model-evals

Worth a read if you're into LLM evaluations!

Cc @clefourrier

1 reply

reacted to malhajar's post with 🔥 about 1 year ago

Post

5293

🇫🇷 Official launch of the OpenLLM French Leaderboard: an open-source initiative to benchmark the evaluation of French-language LLMs

After a lot of effort and sweat with Alexandre Lavallee, we are delighted to announce that the OpenLLMFrenchLeaderboard is live on Hugging Face (space url: le-leadboard/OpenLLMFrenchLeaderboard), the very first platform dedicated to evaluating large language models (LLMs) in French. 🇫🇷✨

This long-running project is above all a labor of passion, but even more an absolute necessity. It is becoming urgent and vital to work toward more transparency in the strategic field of so-called multilingual LLMs. The first building block is therefore a systematic and systemic evaluation of current and future models.

Is your French AI model ready to stand out? Submit it in our space and see how you compare with the other models.

❓ How it works:
Submit your French LLM for evaluation, and we will test it on reference benchmarks specifically adapted for the French language. Our benchmark suite includes:

- BBH-fr: Complex reasoning
- IFEval-fr: Instruction following
- GPQA-fr: Advanced knowledge
- MUSR-fr: Narrative reasoning
- MATH_LVL5-fr: Mathematical abilities
- MMMLU-fr: Multitask understanding

The process is still manual, but we are working on automating it, with the support of the Hugging Face community.

@clem , are we getting ready for a space upgrade? 😏👀

This is not just about numbers; it is about creating an AI that truly reflects our language, our culture, and our values. OpenLLMFrenchLeaderboard is our personal contribution to shaping the future of LLMs in France.

1 reply

reacted to fdaudens's post with ❤️ over 1 year ago

Post

1904

Look at that 👀

Current benchmarks have become too easy for recent models, much like grading high school students on middle school problems makes little sense. So the team worked on a new version of the Open LLM Leaderboard with new benchmarks.

Stellar work from @clefourrier @SaylorTwift and the team!

👉 Read the blog post: open-llm-leaderboard/blog
👉 Explore the leaderboard: open-llm-leaderboard/open_llm_leaderboard

1 reply

reacted to alvdansen's post with 👍 over 1 year ago

Post

6930

I had a backlog of LoRA model weights for SDXL that I decided to prioritize this weekend and publish. I know many are using SD3 right now, however if you have the time to try them, I hope you enjoy them.

I intend to start writing more fully on the thought process behind my approach to curating and training style and subject finetuning, beginning this next week.

Thank you for reading this post! You can find the models on my page and I'll drop a few previews here.

4 replies

reacted to fffiloni's post with ❤️ over 1 year ago

Post

19565

🇫🇷
What impact will AI have on the film, audiovisual, and video game industries?
A forward-looking study for professionals
By CNC & BearingPoint | 09/04/2024

While artificial intelligence (AI) has long been used in the film, audiovisual, and video game sectors, new generative AI applications are upending our view of what a machine can do and have unprecedented transformative potential. They impress with the quality of their output and consequently spark much debate, between expectations and apprehension.

The CNC has therefore decided to launch a new AI Observatory in order to better understand the uses of AI and its real impact on the image industry. As part of this Observatory, the CNC set out to draw up an initial assessment by mapping current or potential uses of AI at each stage of the creation and distribution of a work, identifying the associated opportunities and risks, particularly in terms of professions and employment. The main findings of this CNC / BearingPoint study were presented on March 6, during the CNC day "Creating, producing, distributing in the age of artificial intelligence".

The CNC is publishing the expanded version of the map of AI uses in the film, audiovisual, and video game industries.

Link to the full map: https://www.cnc.fr/documents/36995/2097582/Cartographie+des+usages+IA_rapport+complet.pdf/96532829-747e-b85e-c74b-af313072cab7?t=1712309387891

4 replies

reacted to alielfilali01's post with 🔥 over 1 year ago

Post

1489

Just passed the 25 models milestone on the OALL/Open-Arabic-LLM-Leaderboard 🥳

And now meta-llama/Meta-Llama-3-70B-Instruct is the new hero of the leaderboard beating https://huggingface.co/CohereForAI/c4ai-command-r-v01 by 5.43 points 🔥

Almost another 80 models are still PENDING! So this might change very fast in the coming days

Clémentine Fourrier

AI & ML interests

Recent Activity

Organizations

clefourrier's activity