@philschmid on Hugging Face: "Gemini 2.5 Flash is here! We are excited to launch our first hybrid reasoning Gemini…"

philschmid

posted an update Apr 18

Gemini 2.5 Flash is here! We are excited to launch our first hybrid reasoning Gemini model. In 2.5 Flash, developers can turn thinking off.

**TL;DR:**
- 🧠 Controllable "Thinking" with a thinking budget of up to 24k tokens
- 🌌 1 million token multimodal input context for text, image, video, audio, and PDF
- 🛠️ Function calling, structured output, Google Search & code execution
- 🏦 $0.15 per 1M input tokens; $0.60 (thinking off) or $3.50 (thinking on) per 1M output tokens (thinking tokens are billed as output tokens)
- 💡 Knowledge cutoff of January 2025
- 🚀 Rate limits: free tier 10 RPM, 500 requests/day
- 🏅 Outperforms 2.0 Flash on every benchmark
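
To make the pricing concrete, here is a rough cost sketch in Python. It uses only the prices listed above and assumes that, with thinking on, the $3.50 rate applies to all output tokens (answer plus thinking tokens). The function name `estimate_cost` and its parameters are made up for illustration, not part of any SDK.

```python
# Prices from the TL;DR above, in USD per 1M tokens.
INPUT_PRICE = 0.15
OUTPUT_PRICE_THINKING_OFF = 0.60
OUTPUT_PRICE_THINKING_ON = 3.50


def estimate_cost(input_tokens, output_tokens, thinking_tokens=0, thinking=False):
    """Estimate the USD cost of one request (hypothetical helper).

    Assumption: thinking tokens are billed as output tokens, and with
    thinking on the higher output rate applies to all output tokens.
    """
    if not thinking and thinking_tokens:
        raise ValueError("thinking_tokens must be 0 when thinking is off")
    output_price = OUTPUT_PRICE_THINKING_ON if thinking else OUTPUT_PRICE_THINKING_OFF
    billed_output = output_tokens + thinking_tokens
    return (input_tokens * INPUT_PRICE + billed_output * output_price) / 1_000_000


# 10k input tokens, 1k answer tokens, thinking off.
print(f"{estimate_cost(10_000, 1_000):.4f}")  # 0.0021
# Same request with thinking on and 4k thinking tokens.
print(f"{estimate_cost(10_000, 1_000, thinking_tokens=4_000, thinking=True):.4f}")  # 0.0190
```

In this example the thinking-on request costs roughly nine times as much, which is why the budget control matters for high-volume workloads.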

Try it ⬇️
https://aistudio.google.com/prompts/new_chat?model=gemini-2.5-flash-preview-04-17
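
Outside AI Studio, the thinking budget is set per request. The sketch below only builds a request body and does not send anything. The model ID comes from the link above; the endpoint and the `generationConfig.thinkingConfig.thinkingBudget` field names follow the Gemini REST API as commonly documented, but treat them as assumptions and check the official docs. `build_request` is a hypothetical helper, and reading "24k" as 24,000 is also an assumption.

```python
import json

MODEL = "gemini-2.5-flash-preview-04-17"  # model ID from the AI Studio link
# Assumed REST endpoint; verify against the official API reference.
ENDPOINT = (
    "https://generativelanguage.googleapis.com/v1beta/models/"
    f"{MODEL}:generateContent"
)
# "up to 24k" from the TL;DR, read here as 24,000; the exact cap may differ.
MAX_THINKING_BUDGET = 24_000


def build_request(prompt, thinking_budget):
    """Build a JSON-serializable request body (hypothetical helper).

    A thinking_budget of 0 is meant to turn thinking off.
    """
    if not 0 <= thinking_budget <= MAX_THINKING_BUDGET:
        raise ValueError(f"thinking_budget must be in [0, {MAX_THINKING_BUDGET}]")
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {
            "thinkingConfig": {"thinkingBudget": thinking_budget},
        },
    }


# Thinking off: cheapest output rate.
print(json.dumps(build_request("Summarize this PDF in one line.", 0), indent=2))
```

Sending the body requires an API key and an HTTP client, both left out here on purpose.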

kaveeshwaran

Apr 18

🔹 Tech Enthusiast Style
"Whoa, controllable thinking? That’s not just smart — it’s brilliant. Hybrid reasoning just leveled up."

🔹 Developer Style
"Thinking tokens ON or OFF — finally, a model with a switch for my compute bill and creativity at the same time. Gemini 2.5 Flash just rewrote the rules."

🔹 Sassy & Fun
"Gemini 2.5 Flash said: Why think all the time? Take a break. Save money. Stay genius."

🔹 Minimal & Cool
"Multimodal. Million token input. Toggleable thought. Mind. Blown. 💥"

🔹 Product-Led Message
"Controllable cognition with multimodal scale? Gemini 2.5 Flash is the tool we didn’t know we needed. Now it's essential."

🔹 Futuristic Vibe
"AI that knows when to think and when to move fast? Welcome to the age of intelligent restraint."