Tolga Cangöz's picture

Tolga Cangöz

tolgacangoz

AI & ML interests

AIGC

Recent Activity

liked a model about 11 hours ago
diffusers/Wan2.1-VAE
reacted to tomaarsen's post with ❤️ about 11 hours ago
An assembly of 18 European companies, labs, and universities have banded together to launch 🇪🇺 EuroBERT! It's a state-of-the-art multilingual encoder for 15 European languages, designed to be finetuned for retrieval, classification, etc. 🇪🇺 15 Languages: English, French, German, Spanish, Chinese, Italian, Russian, Polish, Portuguese, Japanese, Vietnamese, Dutch, Arabic, Turkish, Hindi 3️⃣ 3 model sizes: 210M, 610M, and 2.1B parameters - very very useful sizes in my opinion ➡️ Sequence length of 8192 tokens! Nice to see these higher sequence lengths for encoders becoming more common. ⚙️ Architecture based on Llama, but with bi-directional (non-causal) attention to turn it into an encoder. Flash Attention 2 is supported. 🔥 A new Pareto frontier (stronger *and* smaller) for multilingual encoder models 📊 Evaluated against mDeBERTa, mGTE, XLM-RoBERTa for Retrieval, Classification, and Regression (after finetuning for each task separately): EuroBERT punches way above its weight. 📝 Detailed paper with all details, incl. data: FineWeb for English and CulturaX for multilingual data, The Stack v2 and Proof-Pile-2 for code. Check out the release blogpost here: https://huggingface.co/blog/EuroBERT/release * https://huggingface.co/EuroBERT/EuroBERT-210m * https://huggingface.co/EuroBERT/EuroBERT-610m * https://huggingface.co/EuroBERT/EuroBERT-2.1B The next step is for researchers to build upon the 3 EuroBERT base models and publish strong retrieval, zero-shot classification, etc. models for all to use. I'm very much looking forward to it!
View all activity

Organizations

Spaces-explorers's profile picture Blog-explorers's profile picture open/ acc's profile picture

tolgacangoz's activity

upvoted an article 1 day ago
upvoted an article 9 days ago
view article
Article

You could have designed state of the art positional encoding

180
upvoted an article 16 days ago
view article
Article

Remote VAEs for decoding with HF endpoints 🤗

34
upvoted an article 28 days ago
view article
Article

Build awesome datasets for video generation

26