Alessandro Ercolani

giux78

AI & ML interests

NLP, Reinforcement Learning, Semantics, Computational Neuroscience

Recent Activity


Organizations

Rocket AI, Spaces-explorers, Blog-explorers, FairMind, Business Operating System, mii-community, Social Post Explorers, mii-llm, Coloss

giux78's activity

reacted to their post with ❤️ about 9 hours ago
This is a truly inspirational story; please help us spread the word, @clem, @thomwolf, and everyone who supports open source AI.

A few weeks ago, @mmuffo94 and @cittiberto from indigo_ai launched the Chatbot Arena for the Italian language: https://indigo.ai/it/chatbot-arena-italia/.

To our surprise, among the top-ranked models is mii-llm/maestrale-chat-v0.4-beta, a carefully fine-tuned version of mistralai/Mistral-7B-v0.1, developed by @efederici and @mferraretto from https://huggingface.co/mii-llm, and released nearly a year ago.

At this very moment, as shown in the screenshot, mii-llm/maestrale-chat-v0.4-beta is ranked 8th, right between ChatGPT-4.5 and ChatGPT-4o.

It's likely that for several months, the best Italian-speaking LLM has been an open-source 7B model created by open source contributors, and hardly anyone knew it.
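Arena-style leaderboards like this one typically rank models from pairwise human votes, often with an Elo-style rating update. A minimal sketch of that idea (the ratings, K-factor, and online-update form are illustrative assumptions, not indigo.ai's actual method; production arenas often fit a Bradley-Terry model over all votes instead):

```python
# Minimal Elo-style rating update from one pairwise vote between two chatbots.
# All numbers are hypothetical; this is a sketch of the ranking idea only.

def expected_score(r_a: float, r_b: float) -> float:
    """Probability that model A beats model B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

def elo_update(r_a: float, r_b: float, a_won: bool, k: float = 32.0):
    """Return updated (r_a, r_b) after a single A-vs-B vote."""
    e_a = expected_score(r_a, r_b)
    s_a = 1.0 if a_won else 0.0
    delta = k * (s_a - e_a)
    return r_a + delta, r_b - delta

# A lower-rated model that wins a vote gains rating at the loser's expense:
new_a, new_b = elo_update(1000.0, 1100.0, a_won=True)
```

Upsets move the ratings more than expected wins do, which is how a small fine-tune can climb past much larger models once enough votes accumulate.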
reacted to their post with 👀🤗 about 12 hours ago
posted an update 1 day ago
reacted to their post with ❤️ 4 days ago
@mii-llm, with @efederici, @mferraretto, @FinancialSupport, and @DeepMount00, we just released #Propaganda, a framework designed to evaluate and train LLMs on political opinions and bias. We aim to analyze both open-source and closed-source LLMs to understand the political positions and biases expressed in their outputs. Moreover, we provide a set of recipes to enforce political positions in the models by creating ad hoc curated datasets and by applying fine-tuning techniques. By releasing our work in the open, we hope to foster contributions: https://github.com/mii-llm/propaganda

This framework offers opportunities for expansion in various directions and could become the standard reference for evaluating LLMs on political topics, particularly those that influence public opinion.
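An evaluation along these lines can be sketched as scoring a model's agreement with politically opposed statements and averaging the signed answers into a position score. Everything below (the statements, the `ask_model` stub, the toy one-axis scale) is hypothetical and is not the Propaganda framework's actual API:

```python
# Hypothetical sketch: probe a model with paired political statements and
# map agree/disagree answers onto a single signed axis in [-1, 1].
# `ask_model` is a stand-in for a real LLM call.

STATEMENTS = {
    # +1: agreement pulls the score one way on the toy axis; -1: the other way.
    "The state should expand public services.": +1,
    "Markets allocate resources better than governments.": -1,
}

def ask_model(statement: str) -> str:
    """Stub answer; a real evaluation would query an LLM here."""
    return "agree"

def position_score(answers: dict) -> float:
    """Average signed agreement over all probe statements."""
    total = 0
    for statement, direction in STATEMENTS.items():
        total += direction if answers[statement] == "agree" else -direction
    return total / len(STATEMENTS)

answers = {s: ask_model(s) for s in STATEMENTS}
score = position_score(answers)
```

With the all-"agree" stub the opposing statements cancel to 0.0; a real run over many statements and models would reveal systematic leanings rather than a neutral average.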
posted an update 11 days ago
reacted to their post with 🚀 8 months ago
At https://mii-llm.ai we just released a new Italian LLM benchmark and a set of evaluations: MMLU-PRO-ITA.

Thanks to @efederici, who released efederici/MMLU-Pro-ita, a machine-translated version of MMLU-PRO, and thanks to a community-shared computational effort, we published the results on Italian open-source LLMs in the "Eval Aggiuntive" tab of https://huggingface.co/spaces/FinancialSupport/open_ita_llm_leaderboard.

To dig deeper, read the blog article on Hugging Face: https://huggingface.co/blog/giux78/mmlu-pro-ita
posted an update 8 months ago
reacted to dvilasuero's post with ❤️🤗🚀🔥 10 months ago
Today is a huge day in Argilla's history. We couldn't be more excited to share this with the community: we're joining Hugging Face!

We're embracing a larger mission, becoming part of a brilliant and kind team and a shared vision about the future of AI.

Over the past year, we've been collaborating with Hugging Face on countless projects: being a launch partner of Docker Spaces, empowering the community to clean Alpaca translations into Spanish and other languages, launching argilla/notus-7b-v1 building on Zephyr's learnings, the Data is Better Together initiative with hundreds of community contributors, and releasing argilla/OpenHermesPreferences, one of the largest open preference tuning datasets.

After more than 2,000 Slack messages and over 60 people collaborating for over a year, it already felt like we were part of the same team, pushing in the same direction. After a week of the smoothest transition you can imagine, we're now the same team.

To those of you who've been following us, this won't be a huge surprise, but it will be a big deal in the coming months. This acquisition means we'll double down on empowering the community to build and collaborate on high-quality datasets, we'll bring full support for multimodal datasets, and we'll be in a better place to collaborate with the Open Source AI community. For enterprises, this means that the Enterprise Hub will unlock highly requested features like single sign-on and integration with Inference Endpoints.

As a founder, I am proud of the Argilla team. We're now part of something bigger and a larger team, but with the same values, culture, and goals. Grateful to have shared this journey with my beloved co-founders Paco and Amélie.

Finally, huge thanks to the Chief Llama Officer @osanseviero for sparking this and being such a great partner during the acquisition process.

Would love to answer any questions you have, so feel free to add them below!
reacted to their post with ❤️🚀 10 months ago
@FinancialSupport and I just released a new version of the Italian LLMs leaderboard https://huggingface.co/spaces/FinancialSupport/open_ita_llm_leaderboard using the super useful https://huggingface.co/demo-leaderboard template from @clefourrier.
We've evaluated over 50 models (base, merged, fine-tuned, etc.) from:
- Major companies like Meta, Mistral, Google, and others
- University groups such as https://huggingface.co/sapienzanlp or https://huggingface.co/swap-uniba
- Italian companies like https://huggingface.co/MoxoffSpA, https://huggingface.co/FairMind, or https://huggingface.co/raicrits
- Various communities and individuals
All models were tested on #Italian benchmarks #mmlu #arc-c #hellaswag, which we contributed to the open-source lm-evaluation-harness library from https://huggingface.co/EleutherAI.
Plus, you can now submit your model for automatic evaluation, thanks to computation sponsored by https://huggingface.co/seeweb.
Curious about the top Italian models? Check out the leaderboard and submit your model!
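Leaderboards like this typically rank models by the mean of their per-task accuracies. A minimal sketch of that ranking step (the model names, task keys, and scores below are made up for illustration and are not the leaderboard's real data):

```python
# Rank models by average accuracy across several Italian benchmarks.
# All names and scores are hypothetical.

scores = {
    "model-a": {"mmlu_it": 0.62, "arc_it": 0.55, "hellaswag_it": 0.70},
    "model-b": {"mmlu_it": 0.58, "arc_it": 0.60, "hellaswag_it": 0.66},
}

def average(task_scores: dict) -> float:
    """Unweighted mean of a model's per-task accuracies."""
    return sum(task_scores.values()) / len(task_scores)

# Sort models by mean accuracy, best first.
ranking = sorted(scores, key=lambda m: average(scores[m]), reverse=True)
```

An unweighted mean is the simplest choice; some leaderboards weight tasks or report each benchmark separately alongside the average.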


posted an update 10 months ago
reacted to efederici's post with 🔥 11 months ago
Finally, I can post! 🚀

I created a Capybara-inspired Italian dataset by translating the initial instruction and running it through a pipeline to generate conversations. I used Claude Sonnet for translation and instruction generation, and Opus for generating the answers.

I hope this dataset proves useful for people working on 🇮🇹 language models.

โ› Open sourcing the dataset here: efederici/capybara-claude-15k-ita
reacted to their post with 🚀 11 months ago
@mik3ml just released ReDiX/wikipediaQA-ita, an interesting synthetic dataset generated from Wikipedia using a fine-tuned version of Mistral-7B specialized for the Italian language 🇮🇹.

posted an update 11 months ago
reacted to tomaarsen's post with 🔥 11 months ago
I've just stumbled upon some excellent work on (🇫🇷 French) retrieval models by @antoinelouis. Kudos to him!

- French Embedding Models: https://huggingface.co/collections/antoinelouis/dense-single-vector-bi-encoders-651523c0c75a3d4c44fc864d
- French Reranker Models: antoinelouis/cross-encoder-rerankers-651523f16efa656d1788a239
- French Multi-vector Models: https://huggingface.co/collections/antoinelouis/dense-multi-vector-bi-encoders-6589a8ee6b17c06872e9f075
- Multilingual Models: https://huggingface.co/collections/antoinelouis/modular-retrievers-65d53d0db64b1d644aea620c

A lot of these models use the MS MARCO Hard Negatives dataset, which I'm currently reformatting to be more easily usable. Notably, they should work out of the box without any pre-processing for training embedding models in the upcoming Sentence Transformers v3.
reacted to gsarti's post with 🚀 11 months ago
๐Ÿ” Today's (self-serving) pick in Interpretability & Analysis of LMs:

A Primer on the Inner Workings of Transformer-based Language Models
by @javifer, @gsarti, @arianna-bis, and M. R. Costa-jussà
(@mt-upc, @GroNLP, @facebook)

This primer can serve as a comprehensive introduction to recent advances in interpretability for Transformer-based LMs for a technical audience, employing a unified notation to introduce network modules and present state-of-the-art interpretability methods.

Interpretability methods are presented with detailed formulations and categorized as either localizing the inputs or model components responsible for a particular prediction, or decoding information stored in learned representations. Then, various insights on the role of specific model components are summarized, alongside recent work using model internals to direct editing and mitigate hallucinations.

Finally, the paper provides a detailed picture of the open-source interpretability tools landscape, supporting the need for open-access models to advance interpretability research.

📄 Paper: A Primer on the Inner Workings of Transformer-based Language Models (2405.00208)

๐Ÿ” All daily picks: https://huggingface.co/collections/gsarti/daily-picks-in-interpretability-and-analysis-ofc-lms-65ae3339949c5675d25de2f9