Social Post Explorers

Recent Activity

social-post-explorers's activity

clem 
posted an update about 15 hours ago
Enterprise Hub organizations can now centralize billing not only for HF usage but also for inference through our inference partners.

Will prevent some headaches for your finance & accounting teams haha (so feel free to share that with them).
clem 
posted an update 2 days ago
Before 2020, most of the AI field was open and collaborative. For me, that was the key factor that accelerated scientific progress and made the impossible possible—just look at the “T” in ChatGPT, which comes from the Transformer architecture openly shared by Google.

Then came the myth that AI was too dangerous to share, and companies started optimizing for short-term revenue. That led many major AI labs and researchers to stop sharing and collaborating.

With OAI and sama now saying they're willing to share open weights again, we have a real chance to return to a golden age of AI progress and democratization—powered by openness and collaboration, in the US and around the world.

This is incredibly exciting. Let’s go, open science and open-source AI!
mmhamdy 
posted an update 4 days ago
What inspired the Transformer architecture in the "Attention Is All You Need" paper? And how were various ideas combined to create this groundbreaking model?

In this lengthy article, I explore the story and the origins of some of the ideas introduced in the paper. We'll explore everything from the fundamental attention mechanism that lies at its heart to the surprisingly simple explanation for its name, Transformer.

💡 Examples of ideas explored in the article:

✅ What was the inspiration for the attention mechanism?
✅ How did we go from attention to self-attention?
✅ Did the team have any other names in mind for the model?

and more...

I aim to tell the story of Transformers as I would have wanted to read it, and hopefully, one that appeals to others interested in the details of this fascinating idea. This narrative draws from video interviews, lectures, articles, tweets/Xs, and some digging into the literature. I have done my best to be accurate, but errors are possible. If you find inaccuracies or have any additions, please do reach out, and I will gladly make the necessary updates.

Read the article: https://huggingface.co/blog/mmhamdy/pandemonium-the-transformers-story
clem 
posted an update 5 days ago
What's this cool purple banner haha 😶😶😶
diwank 
posted an update 8 days ago
Excited to announce *Open Responses*, a self-hosted alternative to OpenAI's new _Responses API_ that you can run locally and use with ANY LLM or provider, not just OpenAI's. What's more, it is also compatible with their agents-sdk, so everything just works out of the box!

To try it out, just run npx -y open-responses init (or uvx) and that's it! :)
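
As a rough illustration of what "compatible with the Responses API" could look like from the client side, here is a minimal sketch that builds (but does not send) a Responses-style request aimed at a locally running server. The base URL, port, model name, and API key below are all placeholder assumptions, not values from the Open Responses docs; check the quickstart for the real ones.

```python
import json
import urllib.request

# Assumption: address of the locally running server. This value is a
# placeholder, not taken from the Open Responses documentation.
BASE_URL = "http://localhost:8080/v1"


def build_request(prompt: str, model: str, api_key: str = "local-key") -> urllib.request.Request:
    """Build (but do not send) a POST request in the Responses API shape."""
    # The Responses API takes a model name and an `input` field.
    body = json.dumps({"model": model, "input": prompt}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/responses",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# "my-local-model" is a hypothetical model name used only for illustration.
req = build_request("Say hello", model="my-local-model")

# To actually send it once a server is running:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```

Keeping request construction separate from sending makes it easy to inspect the payload before pointing it at a real endpoint.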

Would love feedback and support for adding local HF models, @akhaliq @bartowski @prithivMLmods @julien-c @clefourrier @philschmid

We’d love feedback from the Hugging Face community on how it integrates with your pipelines (support for Hugging Face models landing soon!). Let’s push open-source AI forward together!

Docs:
https://docs.julep.ai/responses/quickstart

Repo:
https://github.com/julep-ai/open-responses

agents-sdk:
https://platform.openai.com/docs/guides/agents
clem 
posted an update 14 days ago
Should we assemble affordable open-source robots at Hugging Face for the community? Would you buy them? At what price?
clem 
posted an update 14 days ago
Nice new space to see how fast your personal or organization followers are growing on HF:
julien-c/follow-history

As you can see, I still have more followers than @julien-c even if he's trying to change this by building such cool spaces 😝😝😝
clem 
posted an update 21 days ago
We just crossed 1,500,000 public models on Hugging Face (and 500k spaces, 330k datasets, 50k papers). One new repository is created every 15 seconds. Congratulations all!
clem 
posted an update 26 days ago
I was chatting with @peakji, one of the cofounders of Manus AI, who told me he was on Hugging Face (very cool!).

He shared an interesting insight: agentic capabilities might be more of an alignment problem than a foundational capability issue. Similar to the difference between GPT-3 and InstructGPT, some open-source foundation models are simply trained to 'answer everything in one response regardless of the complexity of the question'. After all, that's the user preference in chatbot use cases. Just a bit of post-training on agentic trajectories can make an immediate and dramatic difference.

As a thank you to the community, he shared 100 invite codes, first come, first served. Just use “HUGGINGFACE” to get access!
clem 
posted an update 30 days ago
Super happy to welcome Nvidia as our latest Enterprise Hub customer. They have almost 2,000 team members using Hugging Face, and close to 20,000 followers of their org. Can't wait to see what they'll open-source for all of us in the coming months!

Nvidia's org: nvidia
Enterprise hub: https://huggingface.co/enterprise
mmhamdy 
posted an update about 1 month ago
🎉 We're excited to introduce MemoryCode, a novel synthetic dataset designed to rigorously evaluate LLMs' ability to track and execute coding instructions across multiple sessions. MemoryCode simulates realistic workplace scenarios where a mentee (the LLM) receives coding instructions from a mentor amidst a stream of both relevant and irrelevant information.

💡 But what makes MemoryCode unique?! The combination of the following:

✅ Multi-Session Dialogue Histories: MemoryCode consists of chronological sequences of dialogues between a mentor and a mentee, mirroring real-world interactions between coworkers.

✅ Interspersed Irrelevant Information: Critical instructions are deliberately interspersed with unrelated content, replicating the information overload common in office environments.

✅ Instruction Updates: Coding rules and conventions can be updated multiple times throughout the dialogue history, requiring LLMs to track and apply the most recent information.

✅ Prospective Memory: Unlike previous datasets that cue information retrieval, MemoryCode requires LLMs to spontaneously recall and apply relevant instructions without explicit prompts.

✅ Practical Task Execution: LLMs are evaluated on their ability to use the retrieved information to perform practical coding tasks, bridging the gap between information recall and real-world application.
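
To make the "Instruction Updates" idea concrete, here is a toy sketch of what tracking the most recent rule per topic across sessions might look like. This is not the MemoryCode data format or evaluation code; the `Instruction` fields, the `current_rules` helper, and the example rules are all hypothetical and only illustrate the bookkeeping a model has to do implicitly.

```python
from dataclasses import dataclass


@dataclass
class Instruction:
    session: int  # index of the session in which the mentor gave the rule
    topic: str    # what the rule governs, e.g. "function_names"
    rule: str     # the rule text itself


def current_rules(history: list[Instruction]) -> dict[str, str]:
    """Return the latest rule per topic; later sessions override earlier ones."""
    latest: dict[str, str] = {}
    for item in sorted(history, key=lambda i: i.session):
        latest[item.topic] = item.rule
    return latest


# Hypothetical dialogue history: the naming rule from session 1 is
# replaced by an update in session 4.
history = [
    Instruction(1, "function_names", "start with 'x_'"),
    Instruction(2, "docstrings", "always include one"),
    Instruction(4, "function_names", "end with '_fn'"),
]
rules = current_rules(history)
```

With explicit bookkeeping the problem is trivial. The challenge the dataset poses is that an LLM has to recover the same state from free-form dialogue mixed with irrelevant content.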

📌 Our Findings

1️⃣ While even small models can handle isolated coding instructions, the performance of top-tier models like GPT-4o dramatically deteriorates when instructions are spread across multiple sessions.

2️⃣ This performance drop isn't simply due to the length of the context. Our analysis indicates that LLMs struggle to reason compositionally over sequences of instructions and updates. They have difficulty keeping track of which instructions are current and how to apply them.

🔗 Paper: From Tools to Teammates: Evaluating LLMs in Multi-Session Coding Interactions (2502.13791)
📦 Code: https://github.com/for-ai/MemoryCode