adrien-riaux committed on
Commit 514a4d2 · verified · 1 Parent(s): 8970a21

docs: update README

Files changed (1)
  1. README.md +9 -12
README.md CHANGED
@@ -6,23 +6,22 @@ tags:
6
  base_model: nomic-ai/modernbert-embed-base
7
  pipeline_tag: sentence-similarity
8
  library_name: sentence-transformers
 
9
  ---
10
 
11
- # SentenceTransformer based on nomic-ai/modernbert-embed-base
12
 
13
- This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [nomic-ai/modernbert-embed-base](https://huggingface.co/nomic-ai/modernbert-embed-base). It maps sentences & paragraphs to a 256-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
14
 
15
  ## Model Details
16
 
17
  ### Model Description
18
  - **Model Type:** Sentence Transformer
19
  - **Base model:** [nomic-ai/modernbert-embed-base](https://huggingface.co/nomic-ai/modernbert-embed-base) <!-- at revision d556a88e332558790b210f7bdbe87da2fa94a8d8 -->
20
- - **Maximum Sequence Length:** inf tokens
21
  - **Output Dimensionality:** 256 dimensions
22
  - **Similarity Function:** Cosine Similarity
23
- <!-- - **Training Dataset:** Unknown -->
24
- <!-- - **Language:** Unknown -->
25
- <!-- - **License:** Unknown -->
26
 
27
  ### Model Sources
28
 
@@ -110,19 +109,17 @@ You can finetune this model on your own dataset.
110
 
111
  ## Training Details
112
 
 
 
 
 
113
  ### Framework Versions
114
  - Python: 3.11.9
115
  - Sentence Transformers: 3.4.1
116
  - Transformers: 4.48.3
117
  - PyTorch: 2.2.2
118
- - Accelerate:
119
- - Datasets:
120
  - Tokenizers: 0.21.0
121
 
122
- ## Citation
123
-
124
- ### BibTeX
125
-
126
  <!--
127
  ## Glossary
128
 
 
6
  base_model: nomic-ai/modernbert-embed-base
7
  pipeline_tag: sentence-similarity
8
  library_name: sentence-transformers
9
+ license: mit
10
  ---
11
 
12
+ # ModernBERT Embed Base Distilled
13
 
14
+ This is a [sentence-transformers](https://www.SBERT.net) model distilled from [nomic-ai/modernbert-embed-base](https://huggingface.co/nomic-ai/modernbert-embed-base). It maps sentences & paragraphs to a 256-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
15
 
16
  ## Model Details
17
 
18
  ### Model Description
19
  - **Model Type:** Sentence Transformer
20
  - **Base model:** [nomic-ai/modernbert-embed-base](https://huggingface.co/nomic-ai/modernbert-embed-base) <!-- at revision d556a88e332558790b210f7bdbe87da2fa94a8d8 -->
21
+ - **Maximum Sequence Length:** 8 192 tokens
22
  - **Output Dimensionality:** 256 dimensions
23
  - **Similarity Function:** Cosine Similarity
24
+
 
 
25
 
26
  ### Model Sources
27
 
 
109
 
110
  ## Training Details
111
 
112
+ ### Distillation Process
113
+
114
+ The model is distilled using the [Model2Vec](https://huggingface.co/blog/Pringled/model2vec) framework, a technique for creating small, fast static embedding models from any Sentence Transformer.
115
+
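Note on the added "Distillation Process" section: a static embedding model stores one fixed vector per token, so encoding a sentence reduces to a table lookup followed by pooling, with no transformer forward pass. The sketch below illustrates that idea only, combined with the cosine similarity named in the model description. It is not the Model2Vec implementation: the vocabulary, the 3-dimensional vectors (the card states 256 dimensions), the whitespace tokenizer, and the names `TOKEN_VECTORS`, `embed`, and `cosine_similarity` are all assumptions made for illustration.

```python
import math

# Hypothetical toy lookup table: one fixed vector per token.
# A real distilled model would hold one 256-dimensional vector per vocabulary entry.
TOKEN_VECTORS = {
    "fast": [0.9, 0.1, 0.0],
    "quick": [0.8, 0.2, 0.1],
    "model": [0.1, 0.9, 0.2],
    "banana": [0.0, 0.1, 0.9],
}
DIM = 3


def embed(text):
    """Mean-pool the static vectors of the known tokens in `text`.

    Tokenization here is a plain lowercase whitespace split (an assumption);
    tokens missing from the table are skipped.
    """
    vectors = [TOKEN_VECTORS[t] for t in text.lower().split() if t in TOKEN_VECTORS]
    if not vectors:
        return [0.0] * DIM
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(DIM)]


def cosine_similarity(a, b):
    """Cosine similarity of two vectors; returns 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    query = embed("fast model")
    print(cosine_similarity(query, embed("quick model")))
    print(cosine_similarity(query, embed("banana")))
```

With these made-up vectors, "fast model" scores closer to "quick model" than to "banana", which is the behavior a similarity search over static embeddings relies on.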
116
  ### Framework Versions
117
  - Python: 3.11.9
118
  - Sentence Transformers: 3.4.1
119
  - Transformers: 4.48.3
120
  - PyTorch: 2.2.2
 
 
121
  - Tokenizers: 0.21.0
122
 
 
 
 
 
123
  <!--
124
  ## Glossary
125