Mert Erbak PRO

merterbak

AI & ML interests

NLP and Image Processing

Organizations

Open-Source AI Meetup, MLX Community, Social Post Explorers, Hugging Face Discord Community, open/ acc, AI Starter Pack

Posts 10

Microsoft released their new Phi-4 models, fine-tuned on reasoning data, yesterday. They outperform or rival much larger models. Check them out if you haven't yet. 🚀

Phi-4 mini reasoning (SFT): microsoft/Phi-4-mini-reasoning
Phi-4 reasoning(SFT): microsoft/Phi-4-reasoning
Phi-4 reasoning plus (SFT + RL): microsoft/Phi-4-reasoning-plus
Demo: https://github.com/marketplace/models/azureml/Phi-4-reasoning/playground
Articles: https://arxiv.org/pdf/2504.21318
https://arxiv.org/pdf/2504.21233
Blog: https://azure.microsoft.com/en-us/blog/one-year-of-phi-small-language-models-making-big-leaps-in-ai/

Qwen 3 models released 🔥
It offers 2 MoE and 6 dense models with the following parameter sizes: 0.6B, 1.7B, 4B, 8B, 14B, 30B (MoE), 32B, and 235B (MoE).
Models: Qwen/qwen3-67dd247413f0e2e4f653967f
Blog: https://qwenlm.github.io/blog/qwen3/
Demo: Qwen/Qwen3-Demo
GitHub: https://github.com/QwenLM/Qwen3

✅ Pre-trained on 36 trillion tokens covering 119 languages and dialects, with strong translation and instruction-following abilities. (Qwen2.5 was pre-trained on 18 trillion tokens.)
✅ Qwen3 dense models match the performance of larger Qwen2.5 models. For example, Qwen3-1.7B/4B/8B/14B/32B perform like Qwen2.5-3B/7B/14B/32B/72B, respectively.
✅ Pretraining was done in three stages:
• Stage 1: General language learning and knowledge building.
• Stage 2: Reasoning boost with STEM, coding, and logic skills.
• Stage 3: Long-context training.
✅ Supports MCP (Model Context Protocol)
✅ Strong agent skills
✅ Supports seamless switching between thinking mode (for hard tasks like math and coding) and non-thinking mode (for fast chatting) inside the chat template.
✅ Better human alignment for creative writing, roleplay, multi-turn conversations, and following detailed instructions.
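
To make the dense-model comparison above easier to read, here is a minimal Python sketch that pairs each Qwen3 size with the Qwen2.5 size it is said to match and computes the size ratio. The helper name `size_pairs` is made up for illustration, and the numbers are just the nominal sizes from the model names, not exact parameter counts.

```python
# Pairings as listed in the post: each Qwen3 dense model is said to perform
# like the larger Qwen2.5 model in the same position.
# Sizes are nominal (taken from the model names), in billions of parameters.
qwen3_sizes_b = [1.7, 4, 8, 14, 32]
qwen25_sizes_b = [3, 7, 14, 32, 72]


def size_pairs(small, large):
    """Hypothetical helper: return (qwen3, qwen2.5, ratio) tuples, ratio = large / small."""
    if len(small) != len(large):
        raise ValueError("size lists must have the same length")
    return [(s, l, round(l / s, 2)) for s, l in zip(small, large)]


pairs = size_pairs(qwen3_sizes_b, qwen25_sizes_b)
for q3, q25, ratio in pairs:
    print(f"Qwen3-{q3}B ~ Qwen2.5-{q25}B ({ratio}x the nominal size)")
```

By these nominal sizes, each Qwen2.5 model is roughly 1.75x to 2.3x larger than the Qwen3 model it is compared with.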