We are reproducing the full DeepSeek R1 data and training pipeline so everybody can use their recipe. Instead of doing it in secret, we can do it together in the open!
🧪 Step 1: replicate the R1-Distill models by distilling a high-quality reasoning corpus from DeepSeek-R1 (a minimal SFT sketch follows after this list).
🧠 Step 2: replicate the pure RL pipeline that DeepSeek used to create R1-Zero. This will involve curating new, large-scale datasets for math, reasoning, and code.
🔥 Step 3: show we can go from base model -> SFT -> RL via multi-stage training.
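To make Step 1 concrete, here is a minimal sketch of the distillation-style SFT stage using TRL's SFTTrainer. This is not the exact Open R1 recipe: the dataset name and output directory are placeholders, and the small Qwen base model is just one reasonable choice for fine-tuning on R1-style reasoning traces.

```python
# Minimal sketch of Step 1: supervised fine-tuning on reasoning traces
# distilled from DeepSeek-R1. Dataset and output names are placeholders.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Placeholder dataset: assumed to have a "messages" column in chat format,
# where assistant turns contain the R1-style chain-of-thought + final answer.
dataset = load_dataset("your-org/r1-distilled-reasoning", split="train")

training_args = SFTConfig(
    output_dir="qwen2.5-1.5b-r1-distill",  # hypothetical output name
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,         # reasoning traces are long, keep batches small
    learning_rate=2e-5,
    num_train_epochs=1,
    packing=True,
    bf16=True,
)

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-1.5B",  # any small base model works for a first pass
    train_dataset=dataset,
    args=training_args,
)
trainer.train()
```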
You can now use the "Synthetic Data Generator" at a much larger scale with your preferred inference engine: Ollama, vLLM, TGI, or serverless inference! 🔥
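As a rough illustration of what "your preferred inference engine" means in practice: vLLM, TGI, and Ollama all expose OpenAI-compatible endpoints, so a plain client can drive synthetic generation at scale against your own hardware. The sketch below is not the Synthetic Data Generator's internals; the endpoint, model, and prompts are assumptions.

```python
# Sketch: generate synthetic Q&A rows against a locally served model.
# Assumes a vLLM server started with e.g.:
#   vllm serve Qwen/Qwen2.5-7B-Instruct --port 8000
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

seed_topics = ["gradient accumulation", "KV cache", "LoRA fine-tuning"]

synthetic_rows = []
for topic in seed_topics:
    response = client.chat.completions.create(
        model="Qwen/Qwen2.5-7B-Instruct",  # must match the served model name
        messages=[
            {"role": "system", "content": "You write one clear Q&A pair about the given topic."},
            {"role": "user", "content": f"Topic: {topic}"},
        ],
        temperature=0.8,
    )
    synthetic_rows.append({"topic": topic, "text": response.choices[0].message.content})

print(synthetic_rows[0])
```

Swapping the `base_url` to Ollama's or TGI's OpenAI-compatible endpoint (or a serverless provider) is all it takes to change backends.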
OpenAI's sales revenue is forecast at $11.6 billion for 2025, so that number at least will probably be positive. Maybe you can afford to burn cash when you have a $157B valuation?! The numbers are really crazy; only history will tell…
💵 Polymarket is leveraging the “Chatbot Arena LLM Leaderboard” on HuggingFace for online betting on the question “Top AI model on January 31?”. 🤗
As of January 3rd, 2025:
1. Gemini (83%)
2. ChatGPT (13%)
3. Other (2%)
4. Claude (2%)
5. Grok (1%)
6. Llama (<1%)
🇺🇸 The market opinion follows the historical data: it's clearly biased towards the historical US AI giants, yet Polymarket is forbidden in the USA and for US citizens.
🇨🇳 In the “Other” bucket, you might find the Chinese AI labs that are probably the future AI leaders (Qwen, DeepSeek, Yi).
⚖️ In the market resolution, if two models are tied in the evaluation, alphabetical order breaks the tie (e.g. if Google and xAI were tied, “Google” would resolve to “Yes” and “xAI” to “No”). 🙃
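For the curious, that tie-break rule boils down to a two-key sort; the scores below are made up, and only the ordering logic matters.

```python
# Tiny illustration of the alphabetical tie-break described above.
scores = {"xAI": 90.0, "Google": 90.0, "OpenAI": 88.5}

# Sort by score descending, then by name ascending: on a tie,
# the alphabetically-first lab ("Google") wins the market.
winner = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[0][0]
print(winner)  # -> "Google"
```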
Might this violate the Chatbot Arena usage policy? And maybe HuggingFace's? @clem Or maybe the authors and contributors should get a cut each month as “market makers”. @weichiang @angelopoulos