---
datasets:
- tiiuae/falcon-refinedweb
language:
- en
library_name: transformers.js
license: mit
pipeline_tag: feature-extraction
---

# NeoBERT

NeoBERT is a **next-generation encoder** model for English text representation, pre-trained from scratch on the RefinedWeb dataset. It integrates state-of-the-art advances in architecture, modern data, and optimized pre-training methodology. It is designed for seamless adoption: it serves as a plug-and-play replacement for existing base models, relies on an **optimal depth-to-width ratio**, and supports an extended context length of **4,096 tokens**. Despite its compact 250M-parameter footprint, it is the most efficient model of its kind and achieves **state-of-the-art results** on the Massive Text Embedding Benchmark (MTEB), outperforming BERT-large, RoBERTa-large, NomicBERT, and ModernBERT under identical fine-tuning conditions.

- Paper: [arXiv:2502.19587](https://arxiv.org/abs/2502.19587)
- Repository: [chandar-lab/NeoBERT](https://github.com/chandar-lab/NeoBERT)
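## Usage

Since this card declares `library_name: transformers.js` and `pipeline_tag: feature-extraction`, here is a minimal sketch of extracting embeddings with Transformers.js. The model ID `chandar-lab/NeoBERT` and the availability of ONNX weights are assumptions; substitute the repository ID that actually hosts this card.

```js
// Minimal sketch: sentence embeddings with Transformers.js.
import { pipeline } from '@huggingface/transformers';

// Assumed model ID; replace with the repository hosting this card.
const extractor = await pipeline('feature-extraction', 'chandar-lab/NeoBERT');

// Mean-pool the encoder outputs and L2-normalize, yielding one
// unit-length embedding vector per input string.
const output = await extractor(
  ['NeoBERT is a next-generation encoder for English text representation.'],
  { pooling: 'mean', normalize: true },
);

console.log(output.dims); // [1, hidden_size]
```

With `pooling: 'mean'` and `normalize: true`, the returned embeddings are unit vectors, so a plain dot product between two of them directly gives their cosine similarity.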