Pedro Cuenca
pcuenq
AI & ML interests
None yet
Recent Activity
upvoted an article 4 minutes ago:
Falcon-Edge: A series of powerful, universal, fine-tunable 1.58bit language models.
liked a dataset about 15 hours ago:
allenai/tulu-3-sft-mixture
liked a model about 16 hours ago:
Hcompany/Holo1-3B
pcuenq's activity

reacted to clem's post with 🔥 (7 days ago)

reacted to merve's post (7 days ago)
emerging trend: models that can understand image + text and generate image + text
don't miss out:
> MMaDA: single 8B diffusion model aligned with CoT (reasoning!) + UniGRPO Gen-Verse/MMaDA
> BAGEL: 7B MoT model based on Qwen2.5, SigLIP-so-400M, Flux VAE ByteDance-Seed/BAGEL
both by ByteDance! 😱
I keep track of all any-input → any-output models here: https://huggingface.co/collections/merve/any-to-any-models-6822042ee8eb7fb5e38f9b62

reacted to jsulz's post with 🔥 (13 days ago)
Heyo @RichardErkhov, the xet-team at Hugging Face was wondering if you wanted to join the fun and jump over to Xet storage. 🤗
We've been onboarding folks (https://huggingface.co/blog/xet-on-the-hub), we know the backend can scale (Llama 4 and Qwen 3 are on Xet) and that it's great for working with quants (see xet-team/quantization-dedup), and we're pushing on inviting impactful orgs and users on the Hub. You fit the bill.
We'd love to onboard you, get some feedback, and create some excitement!
The steps are pretty straightforward - join the waitlist at hf.co/join/xet and we'll take care of the rest.
The system is fully backward compatible, so you shouldn't notice a thing. BUT to get the best experience when uploading/downloading, make sure you have hf_xet installed alongside the latest huggingface_hub.
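For anyone following along, here's a minimal sketch of what that setup looks like in practice; the repo id below is just an example, and the key point is that huggingface_hub picks up hf_xet automatically once it's installed, so no code changes are needed:

```python
# Minimal sketch (not official docs): with hf_xet installed alongside
# huggingface_hub, downloads of Xet-backed repos go through Xet transparently.
#   pip install -U huggingface_hub hf_xet
from huggingface_hub import snapshot_download

# Example repo id; substitute whatever model you iterate on.
local_dir = snapshot_download(repo_id="Qwen/Qwen3-0.6B")
print(f"Files downloaded to {local_dir}")
```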
What do you think?

reacted to Avelina's post (19 days ago)
Found out my ECCV paper is getting rejected because of a LaTeX compile error :(

reacted to Beegbrain's post (22 days ago)
Hello, I've just written an article explaining the project my team and I made at the Mistral AI Robotic Hackathon one week ago: https://huggingface.co/blog/Beegbrain/guess-who-so100-mistral. Feel free to take a look; we are open-sourcing the code and starting to launch a community project around the idea, so reach out to participate!

reacted to jsulz's post with 🔥 (about 2 months ago)
Huge week for the xet-team as Llama 4 is the first major model on Hugging Face uploaded with Xet providing the backing! Every byte downloaded comes through our infrastructure.
Using Xet on Hugging Face is the fastest way to download and iterate on open source models, and we've proved it with Llama 4, which is seeing a boost of ~25% across all models.
We expect builders on the Hub to see even more improvements, helping power innovation across the community.
With the models on our infrastructure, we can peer in and see how well our dedupe performs across the Llama 4 family. On average, we're seeing ~25% dedupe, providing huge savings to the community who iterate on these state-of-the-art models. The attached image shows a few selected models and how they perform on Xet.
Thanks to the meta-llama team for launching on Xet!
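For intuition only (this is not Xet's actual algorithm, which uses content-defined chunking rather than the fixed-size chunks assumed here), here is a toy sketch of how a chunk-level dedupe ratio between two files can be estimated:

```python
# Illustrative only: estimate how much of file_b's content already exists in
# file_a by hashing fixed-size chunks. Xet's real dedupe uses content-defined
# chunking, so treat this as a back-of-the-envelope sketch, not the real thing.
import hashlib

def chunk_hashes(path: str, chunk_size: int = 64 * 1024) -> list[str]:
    """Hash a file in fixed-size chunks."""
    hashes = []
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            hashes.append(hashlib.sha256(chunk).hexdigest())
    return hashes

def dedupe_ratio(path_a: str, path_b: str) -> float:
    """Fraction of path_b's chunks already present in path_a."""
    seen = set(chunk_hashes(path_a))
    new = chunk_hashes(path_b)
    if not new:
        return 0.0
    return sum(h in seen for h in new) / len(new)

# A result around 0.25 would mirror the ~25% dedupe reported for the Llama 4 family.
# print(dedupe_ratio("model-v1.safetensors", "model-v2.safetensors"))
```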

reacted to clem's post with 🔥 (3 months ago)
Super happy to welcome Nvidia as our latest enterprise hub customer. They have almost 2,000 team members using Hugging Face, and close to 20,000 followers of their org. Can't wait to see what they'll open-source for all of us in the coming months!
Nvidia's org: nvidia
Enterprise hub: https://huggingface.co/enterprise

reacted to m-ric's post (3 months ago)
We now have a Deep Research for academia: SurveyX automatically writes academic surveys nearly indistinguishable from human-written ones 🔥
Researchers from Beijing and Shanghai just published the first application of a deep research system to academia: their algorithm, given a question, can give you a survey of all papers on the subject.
To make a research survey, you generally follow two steps: preparation (collect and organize papers) and writing (outline creation, writing, polishing). Researchers followed the same two steps and automated them.
🎯 For the preparation part, a key step is finding all the important references on the given subject.
Researchers first cast a wide net of all relevant papers. But then finding the really important ones is like distilling knowledge from a haystack of information. To solve this challenge, they built an "AttributeTree" object that structures key information from citations. Ablating these AttributeTrees significantly decreased structure and synthesis scores, so they were really useful!
For the writing part, the key was to get a synthesis that's both short and true. This is not easy to get with LLMs! So they used methods like LLM-based deduplication to shorten the overly verbose listings made by LLMs, and RAG to grab original quotes instead of made-up ones.
As a result, their system outperforms previous approaches by far!
As assessed by LLM judges, the quality score of SurveyX even approaches that of human experts, with 4.59/5 vs 4.75/5.
I advise you to read the paper; it's a great overview of the kind of assistants we'll get in the near future! SurveyX: Academic Survey Automation via Large Language Models (2502.14776)
Their website shows examples of generated surveys: http://www.surveyx.cn/
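Purely as a reading aid, here is a hypothetical sketch of the two-phase flow described above (collect references, structure them into AttributeTrees, then outline, draft, deduplicate, and ground quotes with RAG); every helper name here is made up and is not SurveyX's actual API:

```python
# Hypothetical sketch of the two-phase pipeline described in the post.
# All helpers (search, extract, llm, retrieve_quote) are illustrative stand-ins.
from dataclasses import dataclass
from typing import Callable

@dataclass
class AttributeTree:
    """Structured key information extracted from one citation."""
    title: str
    attributes: dict[str, str]

def prepare(topic: str,
            search: Callable[[str], list[str]],
            extract: Callable[[str], AttributeTree]) -> list[AttributeTree]:
    # Phase 1: cast a wide net, then keep structured info for each reference.
    return [extract(paper) for paper in search(topic)]

def write(trees: list[AttributeTree],
          llm: Callable[[str], str],
          retrieve_quote: Callable[[str], str]) -> str:
    # Phase 2: outline, draft, deduplicate verbose listings, ground quotes via RAG.
    outline = llm("Outline a survey covering: " + "; ".join(t.title for t in trees))
    draft = llm("Write survey sections following this outline: " + outline)
    deduped = llm("Remove redundant or repeated points: " + draft)
    quote = retrieve_quote(outline)  # RAG step: pull an original quote, not a made-up one
    return deduped + "\n\nGrounding quote: " + quote
```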

reacted to lysandre's post with 🔥❤️ (3 months ago)
SmolVLM-2 and SigLIP-2 are now part of transformers in dedicated releases! They're added on top of the v4.49.0 release, and can be installed from the following tags: v4.49.0-SmolVLM-2 and v4.49.0-SigLIP-2.
This marks a new beginning for the release process of transformers. For the past five years, we've been doing monthly releases featuring many models (v4.49.0, the latest release, features 9 new architectures).
Starting with SmolVLM-2 & SigLIP2, we'll now additionally release tags supporting new models on a stable branch. These models are therefore directly available for use by installing from the tag itself. These tags will continue to be updated with fixes applied to these models.
Going forward, continue expecting software releases following semantic versioning: v4.50.0 will have ~10 new architectures compared to v4.49.0, as well as a myriad of new features, improvements and bug fixes. Accompanying these software releases, we'll release tags offering brand new models as fast as possible, to make them accessible to all immediately.
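As a quick sketch of how installing from one of those tags and loading a model can look (the checkpoint id below is an example, not something the post specifies):

```python
# Install transformers from one of the model-specific tags (shell):
#   pip install git+https://github.com/huggingface/transformers@v4.49.0-SigLIP-2
# Then load a SigLIP-2 checkpoint as usual. The model id is an example;
# pick whichever SigLIP-2 size fits your use case.
from transformers import AutoModel, AutoProcessor

model_id = "google/siglip2-base-patch16-224"  # example checkpoint id
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)
print(model.config.model_type)
```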

reacted to merve's post with 🔥 (4 months ago)
Interesting releases in open AI this week, let's recap 🤗
merve/feb-7-releases-67a5f7d7f172d8bfe0dd66f4
🤖 Robotics
> Pi0, the first open-source foundation vision-language-action model, was released in LeRobot (Apache 2.0)
💬 LLMs
> Groundbreaking: s1 is a simpler approach to test-time scaling; the release comes with the small s1K dataset of 1k question-reasoning-trace pairs (from Gemini-Thinking Exp). They fine-tune Qwen2.5-32B-Instruct to get s1-32B, outperforming o1-preview on math 🤯 s1-32B and s1K are out!
> Adyen released DABstep, a new benchmark along with its leaderboard demo for agents doing data analysis
> Krutrim released Krutrim-2 Instruct, a new 12B model based on NeMo12B, trained and aligned on Indic languages, along with a new multilingual sentence embedding model (based on STSB-XLM-R) and a translation model for Indic languages
Multimodal
> PKU released Align-DS-V, a model aligned using their new technique called LLF for all modalities (image-text-audio), along with the dataset Align Anything
> OLA-7B is a new any-to-any model by Tencent that can take text, image, video, and audio data with a context window of 32k tokens, and output text and speech in English and Chinese
> Krutrim released Chitrarth, a new vision language model for Indic languages and English
🖼️ Vision
> BiRefNet_HR is a new higher resolution BiRefNet for background removal
🗣️ Audio
> kyutai released Hibiki, a real-time speech-to-speech translation model 🤯 It's available for French-English translation
> Krutrim released Dhwani, a new STT model for Indic languages
> They also released a new dataset for STT-TTS
🖼️ Image Generation
> Lumina released Lumina-Image-2.0, a 2B-parameter flow-based DiT for text-to-image generation
> Tencent released Hunyuan3D-2, a 3D asset generation model based on DiT and Hunyuan3D-Paint
> boreal-hl-v1 is a new boring photorealistic image generation LoRA based on Hunyuan

reacted to burtenshaw's post with 🤗❤️🔥 (4 months ago)
We're launching a FREE and CERTIFIED course on Agents!
We're thrilled to announce the launch of the Hugging Face Agents course on Learn! This interactive, certified course will guide you through building and deploying your own AI agents.
Here's what you'll learn:
- Understanding Agents: We'll break down the fundamentals of AI agents, showing you how they use LLMs to perceive their environment (observations), reason about it (thoughts), and take actions, as illustrated in the sketch after this list. Think of a smart assistant that can book appointments, answer emails, or even write code based on your instructions.
- Building with Frameworks: You'll dive into popular agent frameworks like LangChain, LlamaIndex and smolagents. These tools provide the building blocks for creating complex agent behaviors.
- Real-World Applications: See how agents are used in practice, from automating SQL queries to generating code and summarizing complex documents.
- Certification: Earn a certification by completing the course modules, implementing a use case, and passing a benchmark assessment. This proves your skills in building and deploying AI agents.
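Purely as an illustration of that observe → think → act cycle, here is a framework-free toy loop; the tools and the pick_action heuristic are hypothetical stand-ins for the LLM-driven reasoning the course actually teaches:

```python
# Toy agent loop illustrating observations -> thoughts -> actions.
# Everything here (tools, pick_action) is a hypothetical stand-in for an
# LLM-driven agent built with frameworks like smolagents.
from typing import Callable

def book_appointment(request: str) -> str:
    return f"Appointment booked for: {request}"

def answer_email(request: str) -> str:
    return f"Draft reply written for: {request}"

TOOLS: dict[str, Callable[[str], str]] = {
    "book": book_appointment,
    "email": answer_email,
}

def pick_action(observation: str) -> str:
    # Stand-in for the LLM "thought" step: decide which tool to call.
    return "book" if "appointment" in observation.lower() else "email"

def run_agent(task: str, max_steps: int = 3) -> str:
    observation = task
    for _ in range(max_steps):
        thought = pick_action(observation)        # reason about the observation
        observation = TOOLS[thought](observation)  # act, producing a new observation
        if observation.startswith(("Appointment", "Draft")):
            return observation                     # goal reached
    return observation

print(run_agent("Please set up an appointment with the dentist"))
```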
Audience
This course is designed for anyone interested in the future of AI. Whether you're a developer, data scientist, or simply curious about AI, this course will equip you with the knowledge and skills to build your own intelligent agents.
Enroll today and start building the next generation of AI agent applications!
https://bit.ly/hf-learn-agents