The table below shows the average BEIR benchmark scores (nDCG@10) for `mdbr-leaf-ir` compared to other retrieval models.

`mdbr-leaf-ir` ranks #1 on the BEIR public leaderboard, and when run in asymmetric "**(asym.)**" mode as described [here](#asymmetric-retrieval-setup), the results improve even further.

| Model                              | Size    | BEIR Avg. (nDCG@10) |
|------------------------------------|---------|---------------------|
| OpenAI text-embedding-3-large      | Unknown | 55.43               |
| **mdbr-leaf-ir (asym.)**           | 23M     | **54.03**           |
| **mdbr-leaf-ir**                   | 23M     | **53.55**           |
| snowflake-arctic-embed-s           | 32M     | 51.98               |
| bge-small-en-v1.5                  | 33M     | 51.65               |
| OpenAI text-embedding-3-small      | Unknown | 51.08               |
| granite-embedding-small-english-r2 | 47M     | 50.87               |
| snowflake-arctic-embed-xs          | 23M     | 50.15               |
| e5-small-v2                        | 33M     | 49.04               |
| SPLADE++                           | 110M    | 48.88               |
| MiniLM-L6-v2                       | 23M     | 41.95               |
| BM25                               | –       | 41.14               |

# Quickstart
See the full example notebook [here](https://huggingface.co/MongoDB/mdbr-leaf-ir/blob/main/transformers_example.ipynb).
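As a minimal, illustrative sketch of the scoring step that embedding-based retrieval performs (placeholder vectors stand in for model outputs, and the helper `rank_documents` is hypothetical, not taken from the notebook), ranking reduces to cosine similarity between the query embedding and each document embedding:

```python
import numpy as np

def rank_documents(query_emb: np.ndarray, doc_embs: np.ndarray) -> np.ndarray:
    """Return document indices sorted by cosine similarity, best first."""
    # Normalize so the dot product equals cosine similarity.
    q = query_emb / np.linalg.norm(query_emb)
    d = doc_embs / np.linalg.norm(doc_embs, axis=1, keepdims=True)
    return np.argsort(-(d @ q))

# Placeholder embeddings; in practice these come from the encoder model(s).
rng = np.random.default_rng(0)
doc_embs = rng.normal(size=(4, 8))
query_emb = doc_embs[2] + 0.01 * rng.normal(size=8)  # query nearest to doc 2

print(rank_documents(query_emb, doc_embs)[0])  # prints 2
```

In the asymmetric setup below, `doc_embs` would come from the larger aligned model while `query_emb` comes from `mdbr-leaf-ir`; because the two embedding spaces are aligned, the same dot-product scoring applies unchanged.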
## Asymmetric Retrieval Setup
`mdbr-leaf-ir` is *aligned* to [`snowflake-arctic-embed-m-v1.5`](https://huggingface.co/Snowflake/snowflake-arctic-embed-m-v1.5), the model it has been distilled from. This enables flexible architectures in which, for example, documents are encoded using the larger model, while queries can be encoded faster and more efficiently with the compact `leaf` model:
```python