Update README.md
README.md CHANGED

(Removed: the previous template content of this card, including a placeholder heading, the generic Model2Vec introduction, and a usage section that ended with an example of distilling your own model from `BAAI/bge-base-en-v1.5` via the `distill` method.)

tags:
- sentence-transformers
---

# alikia2x/jina-embedding-v3-m2v-1024

This [Model2Vec](https://github.com/MinishLab/model2vec) model is a distilled version of the [jinaai/jina-embeddings-v3](https://huggingface.co/jinaai/jina-embeddings-v3) Sentence Transformer. It uses static embeddings, allowing text embeddings to be computed orders of magnitude faster on both GPU and CPU. It is designed for applications where computational resources are limited or where real-time performance is critical.

## Installation

Install `model2vec` using pip:

```bash
pip install model2vec
```

## Usage

### Via `model2vec`

Load this model using the `from_pretrained` method:

```python
from model2vec import StaticModel

# Load a pretrained Model2Vec model
model = StaticModel.from_pretrained("alikia2x/jina-embedding-v3-m2v-1024")

# Compute text embeddings
embeddings = model.encode(["Hello"])
```
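
The `encode` call accepts a list of texts and returns one embedding per entry; for this model each embedding has 1024 dimensions. A small illustrative check (the second sentence is only an example, not from the model card):

```python
from model2vec import StaticModel

model = StaticModel.from_pretrained("alikia2x/jina-embedding-v3-m2v-1024")

# Encode a small batch; the result is one 1024-dimensional vector per text
embeddings = model.encode(["Hello", "How are you?"])
print(embeddings.shape)  # expected: (2, 1024)
```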

### Via `sentence-transformers`

```bash
pip install sentence-transformers
```

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("alikia2x/jina-embedding-v3-m2v-1024")

embeddings = model.encode(["Hello"])
# embedding:
# array([[ 1.1825741e-01, -1.2899181e-02, -1.0492010e-01, ...,
#          1.1131058e-03,  8.2779792e-04, -7.6874542e-08]],
#       shape=(1, 1024), dtype=float32)
```
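
Since the embeddings are plain vectors, they can be compared directly for semantic similarity. A minimal sketch (the sentence pair and the cosine-similarity computation are illustrative, not part of the model card):

```python
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("alikia2x/jina-embedding-v3-m2v-1024")

# Cosine similarity between two illustrative sentences
a, b = model.encode(["I like cats", "I love felines"])
print(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
```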

### Via ONNX

```bash
pip install onnxruntime transformers
```

You need to download `onnx/model.onnx` from this repository first.

```python
import numpy as np
import onnxruntime
from transformers import AutoTokenizer

tokenizer_model = "alikia2x/jina-embedding-v3-m2v-1024"
onnx_embedding_path = "path/to/your/model.onnx"

texts = ["Hello"]
tokenizer = AutoTokenizer.from_pretrained(tokenizer_model)
session = onnxruntime.InferenceSession(onnx_embedding_path)

# The model expects all token ids flattened into a single 1-D array, plus an
# "offsets" array giving the position where each text's tokens start.
inputs = tokenizer(texts, add_special_tokens=False, return_tensors="np")
input_ids = inputs["input_ids"]
lengths = [len(seq) for seq in input_ids[:-1]]
offsets = [0] + np.cumsum(lengths).tolist()
flattened_input_ids = input_ids.flatten().astype(np.int64)

inputs = {
    "input_ids": flattened_input_ids,
    "offsets": np.array(offsets, dtype=np.int64),
}

# One embedding row per input text
outputs = session.run(None, inputs)
embeddings = outputs[0]
embeddings = embeddings.flatten()

# [ 1.1825741e-01 -1.2899181e-02 -1.0492010e-01 ...  1.1131058e-03
#   8.2779792e-04 -7.6874542e-08]
print(embeddings)
```
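
The snippet above embeds a single text. Assuming the same `input_ids`/`offsets` interface, several texts can be embedded in one `session.run` call by concatenating their token ids and recording where each text starts; a sketch (the second sentence is illustrative):

```python
import numpy as np
import onnxruntime
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("alikia2x/jina-embedding-v3-m2v-1024")
session = onnxruntime.InferenceSession("path/to/your/model.onnx")

# Tokenize without padding so each text keeps its own length
texts = ["Hello", "A longer sentence to embed"]
token_ids = tokenizer(texts, add_special_tokens=False)["input_ids"]

# Flatten all token ids and record the start offset of each text
flattened = np.array([t for seq in token_ids for t in seq], dtype=np.int64)
starts = [0] + np.cumsum([len(seq) for seq in token_ids[:-1]]).tolist()
offsets = np.array(starts, dtype=np.int64)

embeddings = session.run(None, {"input_ids": flattened, "offsets": offsets})[0]
print(embeddings.shape)  # expected: (2, 1024)
```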

Note: A quantized (INT8) version of this model is also available, offering reduced memory usage with minimal performance impact. Simply replace `onnx/model.onnx` with the `onnx/model_INT8.onnx` file. Our testing shows less than a 1% drop in the F1 score on a real downstream task.
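
For the ONNX path, this only changes the file passed to the session (the local path below is illustrative):

```python
import onnxruntime

# Same inputs and outputs as above; only the model file changes
session = onnxruntime.InferenceSession("path/to/your/model_INT8.onnx")
```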

## How it works

Model2vec creates a small, fast, and powerful model that outperforms other static embedding models by a large margin on all tasks we could find, while being much faster to create than traditional static embedding models such as GloVe. Best of all, you don't need any data to distill a model using Model2Vec.
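
You can also distill your own Model2Vec model from any Sentence Transformer. A minimal sketch using the `model2vec` distillation API (the base model and `pca_dims` value below are illustrative choices, not the ones used for this card's model):

```python
from model2vec.distill import distill

# Distill a Sentence Transformer into a static Model2Vec model
m2v_model = distill(model_name="BAAI/bge-base-en-v1.5", pca_dims=256)

# Save the distilled model for later use with StaticModel.from_pretrained
m2v_model.save_pretrained("m2v_model")
```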