Tolga Cangöz

tolgacangoz

AI & ML interests

AIGC

Recent Activity

liked a model about 11 hours ago
diffusers/Wan2.1-VAE
reacted to tomaarsen's post with ❤️ about 11 hours ago
An assembly of 18 European companies, labs, and universities have banded together to launch 🇪🇺 EuroBERT! It's a state-of-the-art multilingual encoder for 15 European languages, designed to be finetuned for retrieval, classification, etc.

🇪🇺 15 Languages: English, French, German, Spanish, Chinese, Italian, Russian, Polish, Portuguese, Japanese, Vietnamese, Dutch, Arabic, Turkish, Hindi
3️⃣ 3 model sizes: 210M, 610M, and 2.1B parameters - very very useful sizes in my opinion
➡️ Sequence length of 8192 tokens! Nice to see these higher sequence lengths for encoders becoming more common.
⚙️ Architecture based on Llama, but with bi-directional (non-causal) attention to turn it into an encoder. Flash Attention 2 is supported.
🔥 A new Pareto frontier (stronger *and* smaller) for multilingual encoder models
📊 Evaluated against mDeBERTa, mGTE, XLM-RoBERTa for Retrieval, Classification, and Regression (after finetuning for each task separately): EuroBERT punches way above its weight.
📝 Detailed paper with all details, incl. data: FineWeb for English and CulturaX for multilingual data, The Stack v2 and Proof-Pile-2 for code.

Check out the release blogpost here: https://huggingface.co/blog/EuroBERT/release
* https://huggingface.co/EuroBERT/EuroBERT-210m
* https://huggingface.co/EuroBERT/EuroBERT-610m
* https://huggingface.co/EuroBERT/EuroBERT-2.1B

The next step is for researchers to build upon the 3 EuroBERT base models and publish strong retrieval, zero-shot classification, etc. models for all to use. I'm very much looking forward to it!

Organizations

Spaces-explorers, Blog-explorers, open/ acc

tolgacangoz's activity

New activity in modelscope/AnyText 14 days ago

Runtime error

2
#8 opened 15 days ago by
tolgacangoz
New activity in pcuenq/mdm 5 months ago
New activity in madebyollin/megalith-10m 5 months ago

Update README.md

1
#7 opened 5 months ago by
tolgacangoz
New activity in pcuenq/mdm-flickr-64 6 months ago

Update the license to MIT

#2 opened 6 months ago by
tolgacangoz
New activity in pcuenq/mdm-flickr-256 6 months ago

Update the license to MIT

#1 opened 6 months ago by
tolgacangoz
New activity in pcuenq/mdm-flickr-1024 6 months ago

Update the license to MIT

#1 opened 6 months ago by
tolgacangoz

FP16 vs FP32?

1
#48 opened 6 months ago by
tolgacangoz
New activity in diffusers/controlnet-zoe-depth-sdxl-1.0 7 months ago

Fix cpu offloading

3
#5 opened 7 months ago by
tolgacangoz
New activity in lllyasviel/sd-controlnet-canny 7 months ago

Update the bird's url

1
#6 opened almost 2 years ago by
lz1oceani
New activity in diffusers/controlnet-zoe-depth-sdxl-1.0 7 months ago

Update README.md

#6 opened 7 months ago by
tolgacangoz
New activity in a-r-r-o-w/AnyText 9 months ago

Fix `eps`

2
#1 opened 9 months ago by
tolgacangoz
New activity in ali-vilab/i2vgen-xl 9 months ago

The link is broken.

1
#14 opened 9 months ago by
tolgacangoz
New activity in diffusers/controlnet-depth-sdxl-1.0 11 months ago

Fix higher vRAM usage

1
#10 opened 11 months ago by
tolgacangoz
New activity in mfidabel/controlnet-segment-anything about 1 year ago

Runtime Error

#2 opened about 1 year ago by
tolgacangoz
New activity in r23/ldm3d-space over 1 year ago

Demo - Runtime error

#1 opened over 1 year ago by
tolgacangoz

Demo - Runtime error

1
#6 opened over 1 year ago by
tolgacangoz