mmBERT: a modern multilingual encoder
mmBERT is trained on 3T tokens from over 1800 languages, showing SoTA scores on benchmarks and exceptional low-resource performance.

Models in this collection:
- jhu-clsp/mmBERT-base (Fill-Mask)
- jhu-clsp/mmBERT-small (Fill-Mask)
- jhu-clsp/mmBERT-checkpoints
- jhu-clsp/mmBERT-pretrain-p1-fineweb2-langs
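Since the base and small checkpoints are published for the Fill-Mask task, a minimal sketch of querying one with the transformers fill-mask pipeline is shown below. This assumes a transformers version recent enough to include the model's architecture and network access to the Hub; the example sentence is purely illustrative.

```python
# Minimal fill-mask sketch for mmBERT-base (model ID from the collection above).
# Assumes: recent transformers with support for this architecture, Hub access.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="jhu-clsp/mmBERT-base")

# Read the mask token from the tokenizer instead of hard-coding it,
# since different tokenizers use different mask strings.
mask = fill_mask.tokenizer.mask_token

for pred in fill_mask(f"The capital of France is {mask}."):
    print(f"{pred['token_str']!r}: {pred['score']:.3f}")
```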
Encoders vs Decoders: the Ettin Suite
A collection of SoTA, open-data, paired encoder-only and decoder-only models ranging from 17M params to 1B. See the paper "Seq vs Seq: An Open Suite of Paired Encoders and Decoders" (published Jul 15) at https://arxiv.org/abs/2507.11412

Models in this collection:
- jhu-clsp/ettin-encoder-17m (Fill-Mask)
- jhu-clsp/ettin-encoder-32m (Feature Extraction)
- jhu-clsp/ettin-encoder-68m (Fill-Mask)
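One of the listed checkpoints is tagged Feature Extraction, so here is a minimal sketch of pulling mean-pooled sentence embeddings from an Ettin encoder with plain transformers. The pooling strategy and example sentences are my own illustrative choices, not prescribed by the collection; the model ID comes from the list above.

```python
# Minimal feature-extraction sketch for an Ettin encoder.
# Assumes: recent transformers with support for this architecture, torch installed.
import torch
from transformers import AutoModel, AutoTokenizer

model_id = "jhu-clsp/ettin-encoder-17m"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

sentences = ["Encoders excel at retrieval.", "Decoders excel at generation."]
batch = tokenizer(sentences, padding=True, return_tensors="pt")

with torch.no_grad():
    hidden = model(**batch).last_hidden_state  # (batch, seq_len, hidden_dim)

# Mean-pool over non-padding tokens to get one vector per sentence.
mask = batch["attention_mask"].unsqueeze(-1).float()
embeddings = (hidden * mask).sum(dim=1) / mask.sum(dim=1)
print(embeddings.shape)  # (2, hidden_dim)
```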