Today I'm introducing a method to train static embedding models that run 100x to 400x faster on CPU than common embedding models, while retaining 85%+ of the quality - including 2 fully open models with training scripts, datasets, and metrics.
We apply our recipe to train 2 Static Embedding models that we release today. We release:
- an English Retrieval model and a general-purpose Multilingual similarity model (e.g. classification, clustering, etc.), both Apache 2.0
- my modern training strategy: ideation -> dataset choice -> implementation -> evaluation
- my training scripts, using the Sentence Transformers library
- my Weights & Biases reports with losses & metrics
- my list of 30 training and 13 evaluation datasets
The 2 Static Embedding models have the following properties:
- Extremely fast: e.g. 107,500 sentences per second on a consumer CPU, compared to 270 for 'all-mpnet-base-v2' and 56 for 'gte-large-en-v1.5'
- Zero active parameters: no Transformer blocks, no attention, not even a matrix multiplication. Super speed!
- No maximum sequence length: embed texts of any length (note: longer texts may embed worse)
- Linear instead of quadratic complexity: 2x longer text takes 2x longer to embed, instead of 2.5x or more
- Matryoshka support: truncate embeddings with minimal performance loss (e.g. 4x smaller with a 0.56% performance decrease on English Similarity tasks)
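As a rough illustration of why these models are so fast, here is a minimal sketch of what a static embedding model computes: a table lookup per token, mean pooling, and optional Matryoshka-style truncation. The vocabulary, dimensions, and vectors below are toy placeholders, not the released models:

```python
import numpy as np

# Toy static embedding model: each token maps to a fixed vector, and the
# sentence embedding is the mean of its token vectors. No attention, no
# matrix multiplication - just lookups and averaging, hence the speed and
# the linear scaling with text length.
rng = np.random.default_rng(0)
vocab = {"the": 0, "cat": 1, "sat": 2, "mat": 3, "<unk>": 4}
dim = 8  # real models use much larger dimensions, e.g. 1024
table = rng.normal(size=(len(vocab), dim)).astype(np.float32)

def embed(text, truncate_dim=None):
    # Whitespace tokenization stands in for a real tokenizer.
    ids = [vocab.get(tok, vocab["<unk>"]) for tok in text.lower().split()]
    emb = table[ids].mean(axis=0)   # pure lookup + mean pooling
    if truncate_dim is not None:    # Matryoshka-style truncation
        emb = emb[:truncate_dim]
    return emb / np.linalg.norm(emb)  # L2-normalize

full = embed("the cat sat")
small = embed("the cat sat", truncate_dim=4)  # 2x smaller embedding
```

There is no per-position interaction between tokens, which is also why such models have no maximum sequence length.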
The blog post contains a lengthy list of possible advancements; I'm very confident that our 2 models are only the tip of the iceberg, and we may be able to get even better performance.
Hey, it has been a while... I was busy participating in a Gemma Competition!
Here's the idea: Gemma open models have a large vocabulary size (256K), so improving them for a specific language or cultural context should be pretty affordable - no need for continued pre-training.
In this notebook, I show how I improve the performance of Gemma 2 2B on Italian via Post-Training. I believe this method is adaptable to other languages and model sizes.
Key Steps
- Choose reference metrics
- Data curation for Instruction Fine-Tuning: identify existing datasets + generate synthetic data
- Efficient Instruction Fine-Tuning with Spectrum
- Data curation for Preference Tuning: identify existing datasets + generate synthetic data
- Efficient Direct Preference Optimization with Spectrum
- Evaluation
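To make the preference-tuning step concrete, here is a hedged sketch of the Direct Preference Optimization loss on a single preference pair; the log-probability values are invented for illustration, and an actual training run would use a library such as TRL rather than this hand-rolled function:

```python
import math

def dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO loss for one (chosen, rejected) pair, given summed log-probs of
    each response under the policy and the frozen reference model."""
    # Margin: how much more the policy prefers chosen over rejected,
    # relative to the reference model's preference.
    margin = (policy_chosen - ref_chosen) - (policy_rejected - ref_rejected)
    # Negative log-sigmoid: minimizing it pushes the margin up.
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# Made-up numbers: the loss shrinks as the policy more strongly
# prefers the chosen response over the rejected one.
loose = dpo_loss(-10.0, -12.0, ref_chosen=-10.0, ref_rejected=-10.0)
tight = dpo_loss(-5.0, -20.0, ref_chosen=-10.0, ref_rejected=-10.0)
```

Spectrum fits in upstream of this: it selects which layers to unfreeze before fine-tuning, so both the SFT and DPO stages update only a subset of the model's parameters.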
I'm also planning a Gemma Giveaway (on LinkedIn - https://www.linkedin.com/in/stefano-fiorucci) in the next few days - sharing techniques, datasets, and models I used for my project... so stay tuned!