ekolodin committed
Commit 3c12a84 · verified · 1 Parent(s): fb58adb

3b-september-2025 upload

1_Pooling/config.json CHANGED
@@ -1,10 +1,10 @@
  {
-   "word_embedding_dimension": 2048,
-   "pooling_mode_cls_token": false,
-   "pooling_mode_mean_tokens": true,
-   "pooling_mode_max_tokens": false,
-   "pooling_mode_mean_sqrt_len_tokens": false,
-   "pooling_mode_weightedmean_tokens": false,
-   "pooling_mode_lasttoken": false,
-   "include_prompt": true
  }

  {
+   "word_embedding_dimension": 2048,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
  }
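
The pooling config enables mean pooling only (`pooling_mode_mean_tokens: true`). A minimal sketch of what that computes, mirroring the `mean_pool` method visible in the modeling_gigarembed.py diff further down (the final sum/divide line is an assumption, since that diff hunk ends before it):

```python
import torch

# Masked mean pooling: zero out padding positions, then average over the
# real tokens only. Mirrors mean_pool in modeling_gigarembed.py; the final
# reduction is assumed, as the diff window cuts off before it.
def mean_pool(last_hidden: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    last_hidden = last_hidden.masked_fill(~attention_mask[..., None].bool(), 0.0)
    return last_hidden.sum(dim=1) / attention_mask.sum(dim=1, keepdim=True)
```
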
README.md CHANGED
@@ -1,138 +1,141 @@
  ---
- license: mit
- language:
- - ru
- - en
- pipeline_tag: feature-extraction
  tags:
- - MTEB
  ---
- ## Giga-Embeddings-instruct
- - Base Decoder-only LLM: GigaChat-3b
- - Pooling Type: Latent-Attention
- - Embedding Dimension: 2048
-
- ## Usage
-
- Below is an example of encoding queries and texts.
-
- ### Requirements
-
- ```bash
- pip install -q transformers==4.46.3 sentence-transformers==3.3.1 datasets langchain_community langchain_huggingface langchain_gigachat
- ```
-
- ### Transformers
-
- ```python
- import torch
- from transformers import AutoModel
-
- # Each query needs to be accompanied by a corresponding instruction describing the task.
- task_name_to_instruct = {"example": "Given a question, retrieve passages that answer the question"}
-
- query_prefix = task_name_to_instruct["example"] + "\nquestion: "
- queries = [
-     'are judo throws allowed in wrestling?',
-     'how to become a radiology technician in michigan?'
- ]
-
- # No instruction needed for retrieval passages
- passage_prefix = ""
- passages = [
-     "Since you're reading this, you are probably someone from a judo background or someone who is just wondering how judo techniques can be applied under wrestling rules. So without further ado, let's get to the question. Are Judo throws allowed in wrestling? Yes, judo throws are allowed in freestyle and folkstyle wrestling. You only need to be careful to follow the slam rules when executing judo throws. In wrestling, a slam is lifting and returning an opponent to the mat with unnecessary force.",
-     "Below are the basic steps to becoming a radiologic technologist in Michigan:Earn a high school diploma. As with most careers in health care, a high school education is the first step to finding entry-level employment. Taking classes in math and science, such as anatomy, biology, chemistry, physiology, and physics, can help prepare students for their college studies and future careers.Earn an associate degree. Entry-level radiologic positions typically require at least an Associate of Applied Science. Before enrolling in one of these degree programs, students should make sure it has been properly accredited by the Joint Review Committee on Education in Radiologic Technology (JRCERT).Get licensed or certified in the state of Michigan."
- ]
-
- # Load model with tokenizer
- model = AutoModel.from_pretrained('ai-sage/Giga-Embeddings-instruct', trust_remote_code=True)
-
- # Get the embeddings
- query_embeddings = model.encode(queries, instruction=query_prefix)
- passage_embeddings = model.encode(passages, instruction=passage_prefix)
-
- scores = (query_embeddings @ passage_embeddings.T) * 100
- print(scores.tolist())
- ```
-
- ### LangChain
-
  ```python
- import torch
-
- from langchain_huggingface import HuggingFaceEmbeddings
-
- # Load model
- embeddings = HuggingFaceEmbeddings(
-     model_name='ai-sage/Giga-Embeddings-instruct',
-     encode_kwargs={},
-     model_kwargs={
-         'device': 'cuda',  # or 'cpu'
-         'trust_remote_code': True,
-         'model_kwargs': {'torch_dtype': torch.bfloat16},
-         'prompts': {'query': 'Given a question, retrieve passages that answer the question\nquestion: '}
-     }
- )
-
- # Tokenizer
- embeddings._client.tokenizer.tokenize("Hello world! I am GigaChat")
-
- # Query embeddings
- query_embeddings = embeddings.embed_query("Hello world!")
- print(f"Your embeddings: {query_embeddings[0:20]}...")
- print(f"Vector size: {len(query_embeddings)}")
-
- # Document embeddings
- documents = ["foo bar", "bar foo"]
- documents_embeddings = embeddings.embed_documents(documents)
- print(f"Vector size: {len(documents_embeddings)} x {len(documents_embeddings[0])}")
- ```
-
- ## Using instructions
-
- **Using instructions to improve embedding quality**
-
- To get more accurate results when working with embeddings, especially in retrieval tasks, we recommend adding a natural-language instruction before the text query. This helps the model better understand the context and intent of the query, which improves the quality of the results. Note that the instruction is added only before the query, not before the document.
-
- For **symmetric tasks**, such as classification or semantic text similarity, the instruction must be added before every input. Such tasks require the same context for all inputs so that the model can compare or classify them correctly.
-
- **Example instructions for symmetric tasks:**
- - `"Retrieve semantically similar text \ntext: {query}"`
- - `"Given a text, retrieve semantically similar text \ntext: {query}"`
- - `"Дано предложение, необходимо найти его парафраз \nпредложение: {query}"`
- - `"Классифицируй отзыв на товар как положительный, отрицательный или нейтральный \nотзыв: {query}"`
- - `"Классифицируй чувствительную тему по запросу \nзапрос: {query}"`
-
- For **retrieval tasks** (for example, finding an answer in a text), you can use an instruction such as
- `'Дан вопрос, необходимо найти абзац текста с ответом \nвопрос: {query}'`.
-
- This approach is especially effective for search and information-retrieval tasks, such as finding relevant documents or extracting answers from text.
-
- **Example instructions for retrieval tasks:**
- - `'Дан вопрос, необходимо найти абзац текста с ответом \nвопрос: {query}'`
- - `'Given the question, find a paragraph with the answer \nquestion: {query}'`
-
- Using instructions substantially improves search quality and result relevance, as confirmed on benchmarks such as RuBQ. For symmetric tasks, adding the instruction before every input keeps the context consistent and improves the model's accuracy.
-
- ## Supported languages
-
- This model is initialized from a pretrained GigaChat model and further trained on a mix of English and Russian data. However, since GigaChat was pretrained mostly on Russian-language data, we recommend using this model for Russian only.
-
- ## FAQ
-
- 1. Do I need to add an instruction to the query?
-
- Yes, this is how the model was trained; otherwise you will see a drop in quality. The task definition should be a one-sentence instruction that describes the task. This is how text embeddings are customized for different scenarios using natural-language instructions.
-
- On the other hand, no instruction is needed on the document side.
-
- 2. Why do my reproduced results differ slightly from those reported in the model card?
-
- Different versions of the transformers and pytorch libraries can cause small but nonzero differences in results.
-
- ## Limitations
-
- This model cannot be used with inputs longer than 4096 tokens.
 
  ---
  tags:
+ - sentence-transformers
+ - sentence-similarity
+ - feature-extraction
+ - dense
+ pipeline_tag: sentence-similarity
+ library_name: sentence-transformers
  ---
+
+ # SentenceTransformer
+
+ This is a [sentence-transformers](https://www.SBERT.net) model. It maps sentences & paragraphs to a 2048-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** Sentence Transformer
+ <!-- - **Base model:** [Unknown](https://huggingface.co/unknown) -->
+ - **Maximum Sequence Length:** None tokens
+ - **Output Dimensionality:** 2048 dimensions
+ - **Similarity Function:** Cosine Similarity
+ <!-- - **Training Dataset:** Unknown -->
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+
+ ### Model Sources
+
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+
+ ### Full Model Architecture
+
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': None, 'do_lower_case': False, 'architecture': 'GigarEmbedModel'})
+   (1): Pooling({'word_embedding_dimension': 2048, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+ )
+ ```
+
+ ## Usage
+
+ ### Direct Usage (Sentence Transformers)
+
+ First install the Sentence Transformers library:
+
+ ```bash
+ pip install -U sentence-transformers
+ ```
+
+ Then you can load this model and run inference.
  ```python
+ from sentence_transformers import SentenceTransformer
+
+ # Download from the 🤗 Hub
+ model = SentenceTransformer("sentence_transformers_model_id")
+ # Run inference
+ sentences = [
+     'The weather is lovely today.',
+     "It's so sunny outside!",
+     'He drove to the stadium.',
+ ]
+ embeddings = model.encode(sentences)
+ print(embeddings.shape)
+ # [3, 2048]
+
+ # Get the similarity scores for the embeddings
+ similarities = model.similarity(embeddings, embeddings)
+ print(similarities.shape)
+ # [3, 3]
+ ```
+
+ <!--
+ ### Direct Usage (Transformers)
+
+ <details><summary>Click to see the direct usage in Transformers</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Downstream Usage (Sentence Transformers)
+
+ You can finetune this model on your own dataset.
+
+ <details><summary>Click to expand</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Framework Versions
+ - Python: 3.10.12
+ - Sentence Transformers: 5.1.1
+ - Transformers: 4.51.0
+ - PyTorch: 2.5.1+cu124
+ - Accelerate: 1.2.1
+ - Datasets: 2.21.0
+ - Tokenizers: 0.21.4
+
+ ## Citation
+
+ ### BibTeX
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
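
The instruction-prefix convention from the old README carries over to the new sentence-transformers loading path. A minimal sketch, assuming the repo id from `_name_or_path` in config.json below and the standard `prompt=` argument of `SentenceTransformer.encode`:

```python
from sentence_transformers import SentenceTransformer

# Sketch only: the repo id and prompt wording are taken from elsewhere in
# this commit (config.json and the old README), not from the new model card.
model = SentenceTransformer("ai-sage/Giga-Embeddings-instruct", trust_remote_code=True)

# Per the old README, the instruction is prepended to queries only, never to documents.
query_prompt = "Given a question, retrieve passages that answer the question\nquestion: "

query_emb = model.encode(["are judo throws allowed in wrestling?"], prompt=query_prompt)
doc_emb = model.encode(["Judo throws are allowed in freestyle and folkstyle wrestling."])

print(model.similarity(query_emb, doc_emb))  # 1x1 cosine-similarity matrix
```
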
config.json CHANGED
@@ -1,96 +1,218 @@
  {
-   "_name_or_path": "ai-sage/Giga-Embeddings-instruct",
-   "_non_freeze_layers_idxs": null,
-   "activation_checkpoint_layers_num": null,
-   "apply_torch_compile_to_projections": true,
-   "architectures": [
-     "GigarEmbedModel"
-   ],
-   "auto_map": {
-     "AutoConfig": "configuration_gigarembed.GigarEmbedConfig",
-     "AutoModel": "modeling_gigarembed.GigarEmbedModel"
    },
-   "latent_attention_config": {
-     "model_type": "latent_attention",
-     "num_latents_value": 512,
-     "num_cross_heads": 8,
-     "cross_dim_head": 2048,
-     "hidden_dim": 2048,
-     "latent_dim": 2048,
-     "mult": 4
    },
    "hidden_size": 2048,
-   "text_config": {
-     "_name_or_path": "ai-sage/Giga-Embeddings-instruct",
-     "apply_qk_norm": true,
-     "attention_bias": false,
-     "attention_dropout": 0.0,
-     "attention_hidden_size": null,
-     "attention_type": "LlamaLatentAttention",
-     "bos_token_id": 1,
-     "delete_logits": true,
-     "deterministic_attention": false,
-     "enable_async_tp": false,
-     "eos_token_id": 2,
-     "freeze_non_embed": false,
-     "fused_mlp": true,
-     "fused_mlp_checkpoint_lvl": 3,
-     "head_dim": 64,
-     "hidden_act": "silu",
-     "hidden_size": 2048,
-     "ignore_index": -100,
-     "init_device": "meta",
-     "initializer_range": 0.02,
-     "intermediate_size": 11008,
-     "kv_lora_rank": 1024,
-     "lora_alpha": null,
-     "lora_r": null,
-     "loss_inplace_backward": false,
-     "max_position_embeddings": 4096,
-     "max_window_layers": 36,
-     "mla_config": {
-       "kv_lora_rank": 1024,
-       "q_lora_rank": 0,
-       "qk_nope_head_dim": 64,
-       "qk_rope_head_dim": 64,
-       "v_head_dim": 128
-     },
-     "mlp_bias": false,
-     "model_type": "gigar",
-     "mtp_loss_weight": 0.1,
-     "mtp_predictor_num": 1,
-     "norm_type": "LlamaRMSNorm",
-     "num_attention_heads": 16,
-     "num_hidden_layers": 36,
-     "num_key_value_heads": 16,
-     "pad_token_id": 2,
-     "parallel_embedding_type": "EmbeddingParallelEmbedding",
-     "pretraining_tp": 1,
-     "q_lora_rank": 0,
-     "qk_nope_head_dim": 64,
-     "qk_rope_head_dim": 64,
-     "rms_norm_eps": 1e-06,
-     "rope_scaling": null,
-     "rope_theta": 100000.0,
-     "skip_init_tp_modules": true,
-     "sliding_window": null,
-     "sp_split_type": "equal",
-     "tie_word_embeddings": false,
-     "tp_group": null,
-     "tp_size": 1,
-     "unk_token_id": 0,
-     "use_cache": false,
-     "use_cache_force": false,
-     "use_custom_rotary_kernel": false,
-     "use_liger": false,
-     "use_mrope": false,
-     "use_mtp": true,
-     "use_sliding_window": false,
-     "v_head_dim": 128,
-     "varlen_input": true,
-     "vocab_size": 128256,
-     "z_loss_eps": 5e-05
    },
-   "torch_dtype": "bfloat16",
-   "transformers_version": "4.48.0"
  }

  {
+   "_non_freeze_layers_idxs": null,
+   "activation_checkpoint_layers_num": null,
+   "add_eos": true,
+   "add_pad_token": true,
+   "apply_torch_compile_to_projections": true,
+   "architectures": [
+     "GigarEmbedModel"
+   ],
+   "auto_map": {
+     "AutoConfig": "configuration_gigarembed.GigarEmbedConfig",
+     "AutoModel": "modeling_gigarembed.GigarEmbedModel"
+   },
+   "hidden_size": 2048,
+   "is_mask_instruction": true,
+   "latent_attention_config": {
+     "_attn_implementation_autoset": false,
+     "_name_or_path": "",
+     "add_cross_attention": false,
+     "architectures": null,
+     "bad_words_ids": null,
+     "begin_suppress_tokens": null,
+     "bos_token_id": null,
+     "chunk_size_feed_forward": 0,
+     "cross_attention_hidden_size": null,
+     "cross_dim_head": 2048,
+     "decoder_start_token_id": null,
+     "diversity_penalty": 0.0,
+     "do_sample": false,
+     "early_stopping": false,
+     "encoder_no_repeat_ngram_size": 0,
+     "eos_token_id": null,
+     "exponential_decay_length_penalty": null,
+     "finetuning_task": null,
+     "forced_bos_token_id": null,
+     "forced_eos_token_id": null,
+     "hidden_dim": 2048,
+     "id2label": {
+       "0": "LABEL_0",
+       "1": "LABEL_1"
      },
+     "is_decoder": false,
+     "is_encoder_decoder": false,
+     "label2id": {
+       "LABEL_0": 0,
+       "LABEL_1": 1
      },
+     "latent_dim": 2048,
+     "length_penalty": 1.0,
+     "max_length": 20,
+     "min_length": 0,
+     "model_type": "latent_attention",
+     "mult": 4,
+     "no_repeat_ngram_size": 0,
+     "num_beam_groups": 1,
+     "num_beams": 1,
+     "num_cross_heads": 8,
+     "num_latents_value": 512,
+     "num_return_sequences": 1,
+     "output_attentions": false,
+     "output_hidden_states": false,
+     "output_scores": false,
+     "pad_token_id": null,
+     "prefix": null,
+     "problem_type": null,
+     "pruned_heads": {},
+     "remove_invalid_values": false,
+     "repetition_penalty": 1.0,
+     "return_dict": true,
+     "return_dict_in_generate": false,
+     "sep_token_id": null,
+     "suppress_tokens": null,
+     "task_specific_params": null,
+     "temperature": 1.0,
+     "tf_legacy_loss": false,
+     "tie_encoder_decoder": false,
+     "tie_word_embeddings": true,
+     "tokenizer_class": null,
+     "top_k": 50,
+     "top_p": 1.0,
+     "torch_dtype": null,
+     "torchscript": false,
+     "typical_p": 1.0,
+     "use_bfloat16": false
+   },
+   "mask_type": "b",
+   "model_type": "gigarembed",
+   "padding_side": "right",
+   "text_config": {
+     "_attn_implementation_autoset": false,
+     "_name_or_path": "ai-sage/Giga-Embeddings-instruct",
+     "add_cross_attention": false,
+     "apply_qk_norm": true,
+     "architectures": null,
+     "attention_bias": false,
+     "attention_dropout": 0.0,
+     "attention_hidden_size": null,
+     "attention_type": "LlamaLatentAttention",
+     "bad_words_ids": null,
+     "begin_suppress_tokens": null,
+     "bos_token_id": 1,
+     "chunk_size_feed_forward": 0,
+     "cross_attention_hidden_size": null,
+     "decoder_start_token_id": null,
+     "delete_logits": true,
+     "deterministic_attention": false,
+     "diversity_penalty": 0.0,
+     "do_sample": false,
+     "early_stopping": false,
+     "enable_async_tp": false,
+     "encoder_no_repeat_ngram_size": 0,
+     "eos_token_id": 2,
+     "exponential_decay_length_penalty": null,
+     "finetuning_task": null,
+     "forced_bos_token_id": null,
+     "forced_eos_token_id": null,
+     "freeze_non_embed": false,
+     "fused_mlp": true,
+     "fused_mlp_checkpoint_lvl": 3,
+     "head_dim": 64,
+     "hidden_act": "silu",
      "hidden_size": 2048,
+     "id2label": {
+       "0": "LABEL_0",
+       "1": "LABEL_1"
+     },
+     "ignore_index": -100,
+     "init_device": "meta",
+     "initializer_range": 0.02,
+     "intermediate_size": 11008,
+     "is_decoder": false,
+     "is_encoder_decoder": false,
+     "kv_lora_rank": 1024,
+     "label2id": {
+       "LABEL_0": 0,
+       "LABEL_1": 1
+     },
+     "length_penalty": 1.0,
+     "lora_alpha": null,
+     "lora_r": null,
+     "loss_inplace_backward": false,
+     "max_length": 20,
+     "max_position_embeddings": 4096,
+     "max_window_layers": 36,
+     "min_length": 0,
+     "mla_config": {
+       "kv_lora_rank": 1024,
+       "q_lora_rank": 0,
+       "qk_nope_head_dim": 64,
+       "qk_rope_head_dim": 64,
+       "v_head_dim": 128
      },
+     "mlp_bias": false,
+     "model_type": "gigar",
+     "mtp_loss_weight": 0.1,
+     "mtp_predictor_num": 1,
+     "no_repeat_ngram_size": 0,
+     "norm_type": "LlamaRMSNorm",
+     "num_attention_heads": 16,
+     "num_beam_groups": 1,
+     "num_beams": 1,
+     "num_hidden_layers": 36,
+     "num_key_value_heads": 16,
+     "num_return_sequences": 1,
+     "output_attentions": false,
+     "output_hidden_states": false,
+     "output_scores": false,
+     "pad_token_id": 2,
+     "parallel_embedding_type": "EmbeddingParallelEmbedding",
+     "prefix": null,
+     "pretraining_tp": 1,
+     "problem_type": null,
+     "pruned_heads": {},
+     "q_lora_rank": 0,
+     "qk_nope_head_dim": 64,
+     "qk_rope_head_dim": 64,
+     "remove_invalid_values": false,
+     "repetition_penalty": 1.0,
+     "return_dict": true,
+     "return_dict_in_generate": false,
+     "rms_norm_eps": 1e-06,
+     "rope_scaling": null,
+     "rope_theta": 100000.0,
+     "sep_token_id": null,
+     "skip_init_tp_modules": true,
+     "sliding_window": null,
+     "sp_split_type": "equal",
+     "suppress_tokens": null,
+     "task_specific_params": null,
+     "temperature": 1.0,
+     "tf_legacy_loss": false,
+     "tie_encoder_decoder": false,
+     "tie_word_embeddings": false,
+     "tokenizer_class": null,
+     "top_k": 50,
+     "top_p": 1.0,
+     "torch_dtype": null,
+     "torchscript": false,
+     "tp_group": null,
+     "tp_size": 1,
+     "typical_p": 1.0,
+     "unk_token_id": 0,
+     "use_bfloat16": false,
+     "use_cache": false,
+     "use_cache_force": false,
+     "use_custom_rotary_kernel": false,
+     "use_liger": false,
+     "use_mrope": false,
+     "use_mtp": true,
+     "use_sliding_window": false,
+     "v_head_dim": 128,
+     "varlen_input": true,
+     "vocab_size": 128256,
+     "z_loss_eps": 5e-05
+   },
+   "torch_dtype": "float32",
+   "transformers_version": "4.51.0"
  }
config_sentence_transformers.json CHANGED
@@ -1,10 +1,14 @@
  {
    "__version__": {
-     "sentence_transformers": "3.3.1",
-     "transformers": "4.48.0",
-     "pytorch": "2.1.1+cu121"
    },
-   "prompts": {},
    "default_prompt_name": null,
    "similarity_fn_name": "cosine"
  }

  {
+   "model_type": "SentenceTransformer",
    "__version__": {
+     "sentence_transformers": "5.1.1",
+     "transformers": "4.51.0",
+     "pytorch": "2.5.1+cu124"
+   },
+   "prompts": {
+     "query": "",
+     "document": ""
    },
    "default_prompt_name": null,
    "similarity_fn_name": "cosine"
  }
modeling_gigarembed.py CHANGED
@@ -1135,7 +1135,8 @@ class GigarEmbedModel(PreTrainedModel):
          if return_embeddings:
              return self.mean_pool(last_hidden, attention_mask)

-         return last_hidden

      def mean_pool(self, last_hidden: torch.Tensor, attention_mask: torch.Tensor):
          last_hidden = last_hidden.masked_fill(~attention_mask[..., None].bool(), 0.0)

          if return_embeddings:
              return self.mean_pool(last_hidden, attention_mask)

+         # return last_hidden
+         return BaseModelOutputWithPast(last_hidden_state=last_hidden)

      def mean_pool(self, last_hidden: torch.Tensor, attention_mask: torch.Tensor):
          last_hidden = last_hidden.masked_fill(~attention_mask[..., None].bool(), 0.0)
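
One reading of the change above (an inference, not stated in the commit): wrapping the hidden states in `BaseModelOutputWithPast` gives callers the standard `ModelOutput` interface instead of a bare tensor, so both attribute access and tuple-style indexing keep working. A minimal sketch:

```python
import torch
from transformers.modeling_outputs import BaseModelOutputWithPast

hidden = torch.randn(1, 4, 2048)  # (batch, seq_len, hidden_size)
output = BaseModelOutputWithPast(last_hidden_state=hidden)

# ModelOutput supports both access patterns; a bare tensor return would
# break callers that expect either of them.
assert output.last_hidden_state is hidden
assert output[0] is hidden
```
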
sentence_bert_config.json CHANGED
@@ -1,4 +1,4 @@
  {
-   "max_seq_length": null,
-   "do_lower_case": false
  }

  {
+   "max_seq_length": null,
+   "do_lower_case": false
  }
tokenizer_config.json CHANGED
@@ -2086,7 +2086,7 @@
    "padding_side": "right",
    "sep_token": "<unk>",
    "stride": 0,
-   "tokenizer_class": "PreTrainedTokenizerFast",
    "truncation_side": "right",
    "truncation_strategy": "longest_first",
    "unk_token": "<unk>"

    "padding_side": "right",
    "sep_token": "<unk>",
    "stride": 0,
+   "tokenizer_class": "PreTrainedTokenizer",
    "truncation_side": "right",
    "truncation_strategy": "longest_first",
    "unk_token": "<unk>"