rvo committed (verified)
Commit 71b1ecb · Parent(s): d2e2fba

Upload README.md

Files changed (1):
  1. README.md +17 -14
README.md CHANGED
````diff
@@ -53,19 +53,22 @@ A technical report detailing our proposed `LEAF` training procedure will be avai
 
 The table below shows the average BEIR benchmark scores (nDCG@10) for `mdbr-leaf-ir` compared to other retrieval models.
 
-| Model                              | Size | BEIR Avg. (nDCG@10) |
-|------------------------------------|------|----------------------|
-| **mdbr-leaf-ir**                   | 23M  | **53.55**            |
-| snowflake-arctic-embed-s           | 32M  | 51.98                |
-| bge-small-en-v1.5                  | 33M  | 51.65                |
-| granite-embedding-small-english-r2 | 47M  | 50.87                |
-| snowflake-arctic-embed-xs          | 23M  | 50.15                |
-| e5-small-v2                        | 33M  | 49.04                |
-| SPLADE++                           | 110M | 48.88                |
-| MiniLM-L6-v2                       | 23M  | 41.95                |
-| BM25                               |      | 41.14                |
-
-[//]: # (| **mdbr-leaf-ir (asym.)** | 23M | **?** | )
+`mdbr-leaf-ir` ranks #1 on the BEIR public leaderboard, and when run in asymmetric "**(asym.)**" mode as described [here](#asymmetric-retrieval-setup), the results improve even further.
+
+| Model                              | Size    | BEIR Avg. (nDCG@10) |
+|------------------------------------|---------|----------------------|
+| OpenAI text-embedding-3-large      | Unknown | 55.43                |
+| **mdbr-leaf-ir (asym.)**           | 23M     | **54.03**            |
+| **mdbr-leaf-ir**                   | 23M     | **53.55**            |
+| snowflake-arctic-embed-s           | 32M     | 51.98                |
+| bge-small-en-v1.5                  | 33M     | 51.65                |
+| OpenAI text-embedding-3-small      | Unknown | 51.08                |
+| granite-embedding-small-english-r2 | 47M     | 50.87                |
+| snowflake-arctic-embed-xs          | 23M     | 50.15                |
+| e5-small-v2                        | 33M     | 49.04                |
+| SPLADE++                           | 110M    | 48.88                |
+| MiniLM-L6-v2                       | 23M     | 41.95                |
+| BM25                               | –       | 41.14                |
 
 
 # Quickstart
@@ -115,7 +118,7 @@ for i, query in enumerate(queries):
 
 See full example notebook [here](https://huggingface.co/MongoDB/mdbr-leaf-ir/blob/main/transformers_example.ipynb).
 
-## Asymmetric Retrieval Setup
+## Asymmetric Retrieval Setup
 
 `mdbr-leaf-ir` is *aligned* to [`snowflake-arctic-embed-m-v1.5`](https://huggingface.co/Snowflake/snowflake-arctic-embed-m-v1.5), the model it has been distilled from. This enables flexible architectures in which, for example, documents are encoded using the larger model, while queries can be encoded faster and more efficiently with the compact `leaf` model:
 ```python
````
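The asymmetric setup added in this commit can be sketched in a few lines. The following is a minimal illustration assuming the standard `sentence-transformers` API; the example query/document strings are invented, and the `prompt_name="query"` argument assumes the model configs define a query prompt — it is not taken from the diff above, whose code block is truncated here:

```python
# Minimal sketch of the asymmetric retrieval setup (assumed usage, not the commit's code).
from sentence_transformers import SentenceTransformer

# Documents are embedded once, offline, with the larger teacher model ...
doc_model = SentenceTransformer("Snowflake/snowflake-arctic-embed-m-v1.5")
# ... while queries are embedded at serving time with the compact LEAF model.
query_model = SentenceTransformer("MongoDB/mdbr-leaf-ir")

documents = ["MongoDB was founded in 2007."]   # hypothetical example data
queries = ["when was MongoDB founded?"]

doc_emb = doc_model.encode(documents, normalize_embeddings=True)
# prompt_name="query" assumes a query prompt is defined in the model config.
query_emb = query_model.encode(queries, prompt_name="query", normalize_embeddings=True)

# Because the two models are aligned, cross-model cosine similarities are meaningful.
print(query_model.similarity(query_emb, doc_emb))
```

In this arrangement the index keeps the larger model's document embeddings, while query latency benefits from the 23M-parameter encoder — which is what the "(asym.)" row in the updated table measures.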