𝗔𝗜𝟮𝟭 𝗶𝘁𝗲𝗿𝗮𝘁𝗲𝘀 𝘄𝗶𝘁𝗵 𝗻𝗲𝘄 𝗝𝗮𝗺𝗯𝗮 𝟭.𝟱 𝗿𝗲𝗹𝗲𝗮𝘀𝗲: 𝗡𝗲𝘄 𝘀𝘁𝗮𝗻𝗱𝗮𝗿𝗱 𝗳𝗼𝗿 𝗹𝗼𝗻𝗴-𝗰𝗼𝗻𝘁𝗲𝘅𝘁 𝘂𝘀𝗲-𝗰𝗮𝘀𝗲𝘀!🚀
@ai21labs used a different architecture to beat the status-quo Transformer models: the Jamba architecture combines classic Transformer layers with the new Mamba layers, whose complexity is a linear (instead of quadratic) function of the context length.
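To make that concrete, here is a toy sketch (my own illustration, not AI21's code) of why a Jamba-style stack is so much cheaper than pure attention: only a few of the layers pay the quadratic cost, the rest scan the sequence once.

```python
# Toy illustration (not AI21's code): a Jamba-style stack keeps only a few
# attention layers, so most of the compute scales linearly in context length n.

def layer_cost(kind: str, n: int) -> int:
    # attention compares every token pair -> O(n^2);
    # a Mamba (state-space) layer scans the sequence once -> O(n)
    return n * n if kind == "attention" else n

def hybrid_stack_cost(n_tokens: int, n_layers: int = 32, attn_every: int = 8) -> int:
    # e.g. 1 attention layer out of every 8, the rest Mamba
    kinds = ["attention" if i % attn_every == 0 else "mamba" for i in range(n_layers)]
    return sum(layer_cost(k, n_tokens) for k in kinds)

def full_attention_cost(n_tokens: int, n_layers: int = 32) -> int:
    return n_layers * n_tokens * n_tokens

n = 64_000
print(hybrid_stack_cost(n) / full_attention_cost(n))  # ~1/8: only 4 of 32 layers pay O(n^2)
```

The layer counts and the 1-in-8 ratio above are assumptions for illustration, not Jamba's actual configuration; the point is simply that the quadratic term applies to a small fraction of the stack.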
What does this imply?
⚡️ Jamba models are much more efficient for long contexts: faster (up to 2.5x faster at long context), lighter on memory, and also better at recalling everything in the prompt.
That means it's a new go-to model for RAG or agentic applications!
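If you want to try it, here is a minimal sketch using 🤗 transformers (the checkpoint name follows the Hub collection linked below; device and generation settings are just one reasonable choice, and you will need a recent transformers version with Jamba support):

```python
# Minimal sketch: run Jamba 1.5 Mini with transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ai21labs/AI21-Jamba-1.5-Mini"  # assumed checkpoint name from the Hub collection
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "Summarize the following report: ..."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output[0][input_ids.shape[1]:], skip_special_tokens=True))
```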
And the performance is not too shabby: Jamba 1.5 models are comparable in perf to similar-sized Llama-3.1 models! The largest model even outperforms Llama-3.1 405B on Arena-Hard.
✌️ Comes in 2 sizes: Mini (12B active / 52B total params) and Large (94B active / 399B total params)
📏 Both deliver 256k context length with low memory use: Jamba 1.5 Mini fits a 140k-token context on one single A100.
⚙️ New quantization method: ExpertsInt8 quantizes only the weights of the MoE layers, which account for 85% of all weights (toy sketch after this list)
🤖 Natively supports JSON format generation & function calling (sketch at the end of this post).
📜 Permissive license *if your org makes <$50M revenue*
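Here is the promised toy sketch of the idea behind ExpertsInt8: weight-only int8 quantization with per-output-channel scales, applied to an expert's weight matrix. This is only an illustration of the concept; the real method dequantizes inside fused inference kernels rather than materializing float weights like this.

```python
# Toy weight-only int8 quantization (illustrative, not AI21's actual kernels).
import torch

def quantize_int8(w: torch.Tensor):
    # symmetric per-output-channel scale: store int8 weights + float scales
    scale = w.abs().amax(dim=1, keepdim=True) / 127.0
    q = torch.round(w / scale).clamp(-127, 127).to(torch.int8)
    return q, scale

def int8_linear(x: torch.Tensor, q: torch.Tensor, scale: torch.Tensor):
    # dequantize on the fly, then a normal matmul
    return x @ (q.float() * scale).t()

w = torch.randn(1024, 1024)      # stand-in for one expert's weight matrix
q, scale = quantize_int8(w)
x = torch.randn(4, 1024)
print((int8_linear(x, q, scale) - x @ w.t()).abs().max())  # small quantization error
```

Since the MoE experts hold 85% of the weights, storing just those in int8 cuts most of the memory footprint while leaving the rest of the model untouched.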
Available on the Hub 👉 ai21labs/jamba-15-66c44befa474a917fcf55251
Read their release blog post 👉 https://www.ai21.com/blog/announcing-jamba-model-family
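And the function-calling sketch promised above: recent transformers versions let you pass tool schemas through the chat template's `tools` argument, and the model replies with a JSON tool call you can parse and execute. The weather tool here is a made-up example.

```python
# Hedged sketch of native function calling via the chat template.
from transformers import AutoTokenizer

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool, just for illustration
        "description": "Get the current weather in a given city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

tokenizer = AutoTokenizer.from_pretrained("ai21labs/AI21-Jamba-1.5-Mini")
prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "What's the weather in Paris right now?"}],
    tools=tools,
    add_generation_prompt=True,
    tokenize=False,
)
print(prompt)  # the tool schema is embedded in the prompt for the model
```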