flaviawallen committed
Commit e2738f1 · verified · 1 parent: 6b9088b

Upload 16 files
1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
{
  "word_embedding_dimension": 384,
  "pooling_mode_cls_token": true,
  "pooling_mode_mean_tokens": false,
  "pooling_mode_max_tokens": false,
  "pooling_mode_mean_sqrt_len_tokens": false,
  "pooling_mode_weightedmean_tokens": false,
  "pooling_mode_lasttoken": false,
  "include_prompt": true
}
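The pooling config above sets `pooling_mode_cls_token: true`: the sentence embedding is simply the hidden state of the first (`[CLS]`) token rather than a mean over all tokens. A minimal numpy sketch of the idea (shapes match `word_embedding_dimension: 384`; the values are illustrative, not from the model):

```python
import numpy as np

def cls_pooling(token_embeddings: np.ndarray) -> np.ndarray:
    """CLS pooling: take the first token's vector as the sentence embedding.

    token_embeddings: (seq_len, hidden_dim) array of per-token hidden states.
    """
    return token_embeddings[0]

# Illustrative input: 5 tokens, 384-dim hidden states
rng = np.random.default_rng(0)
hidden_states = rng.normal(size=(5, 384))

embedding = cls_pooling(hidden_states)
print(embedding.shape)  # (384,)
```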
README.md ADDED
@@ -0,0 +1,418 @@
---
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:10481
- loss:MultipleNegativesRankingLoss
base_model: abhinand/MedEmbed-small-v0.1
widget:
- source_sentence: In the chest, the trachea divides as it enters the lungs to form
    the right and left what?
  sentences:
  - Adulthood is divided into the stages of early, middle, and late adulthood.
  - Motor vehicles account for almost half of fossil fuel use. Most vehicles run on
    gasoline, which comes from petroleum.
  - In the chest, the trachea divides as it enters the lungs to form the right and
    left bronchi . The bronchi contain cartilage, which prevents them from collapsing.
    Mucus in the bronchi traps any remaining particles in air. Tiny, hair-like structures
    called cilia line the bronchi and sweep the particles and mucus toward the throat
    so they can be expelled from the body.
- source_sentence: What atmospheric layer lies above the highest altitude an airplane
    can go and below the lowest altitude a spacecraft can orbit?
  sentences:
  - Renal plasma flow equals the blood flow per minute times the hematocrit. If a
    person has a hematocrit of 45, then the renal plasma flow is 55 percent. 1050*0.55
    = 578 mL plasma/min.
  - Not so fast. The mesosphere is the least known layer of the atmosphere. The mesosphere
    lies above the highest altitude an airplane can go. It lies below the lowest altitude
    a spacecraft can orbit. Maybe that's just as well. If you were in the mesosphere
    without a space suit, your blood would boil! This is because the pressure is so
    low that liquids would boil at normal body temperature.
  - 'Cell division is just one of several stages that a cell goes through during its
    lifetime. The cell cycle is a repeating series of events that include growth,
    DNA synthesis, and cell division. The cell cycle in prokaryotes is quite simple:
    the cell grows, its DNA replicates, and the cell divides. In eukaryotes, the cell
    cycle is more complicated.'
- source_sentence: What distinctive dna shape forms when the two nucleotide chains
    wrap around the same axis?
  sentences:
  - Simple Model of DNA. In this simple model of DNA, each line represents a nucleotide
    chain. The double helix shape forms when the two chains wrap around the same axis.
  - Most biochemical molecules are macromolecules, meaning that they are very large.
    Some contain thousands of monomer molecules.
  - The continental slope lies between the continental shelf and the abyssal plain.
    It has a steep slope with a sharp drop to the deep ocean floor.
- source_sentence: Einstein’s equation helps scientists understand what happens in
    nuclear reactions and why they produce so much what?
  sentences:
  - Einstein’s equation helps scientists understand what happens in nuclear reactions
    and why they produce so much energy. When the nucleus of a radioisotope undergoes
    fission or fusion in a nuclear reaction, it loses a tiny amount of mass. What
    happens to the lost mass? It isn’t really lost at all. It is converted to energy.
    How much energy? E = mc 2 . The change in mass is tiny, but it results in a great
    deal of energy.
  - Water is the main ingredient of many solutions. A solution is a mixture of two
    or more substances that has the same composition throughout. Some solutions are
    acids and some are bases. To understand acids and bases, you need to know more
    about pure water. In pure water (such as distilled water), a tiny fraction of
    water molecules naturally breaks down to form ions. An ion is an electrically
    charged atom or molecule. The breakdown of water is represented by the chemical
    equation.
  - 'The muscular system consists of all the muscles of the body. Muscles are organs
    composed mainly of muscle cells, which are also called muscle fibers . Each muscle
    fiber is a very long, thin cell that can do something no other cell can do. It
    can contract, or shorten. Muscle contractions are responsible for virtually all
    the movements of the body, both inside and out. There are three types of muscle
    tissues in the human body: cardiac, smooth, and skeletal muscle tissues. They
    are shown in Figure below and described below.'
- source_sentence: Microfilaments are mostly concentrated just beneath what?
  sentences:
  - Vertebrates have a closed circulatory system with a heart. Blood is completely
    contained within blood vessels that carry the blood throughout the body. The heart
    is divided into chambers that work together to pump blood. There are between two
    and four chambers in the vertebrate heart. With more chambers, there is more oxygen
    in the blood and more vigorous pumping action.
  - Weight measures the force of gravity pulling on an object. The SI unit for weight
    is the Newton (N).
  - Microfilaments , shown as (b) in Figure below , are made of two thin actin chains
    that are twisted around one another. Microfilaments are mostly concentrated just
    beneath the cell membrane, where they support the cell and help the cell keep
    its shape. Microfilaments form cytoplasmatic extentions, such as pseudopodia and
    microvilli , which allow certain cells to move. The actin of the microfilaments
    interacts with the protein myosin to cause contraction in muscle cells. Microfilaments
    are found in almost every cell, and are numerous in muscle cells and in cells
    that move by changing shape, such as phagocytes (white blood cells that search
    the body for bacteria and other invaders).
datasets:
- flaviawallen/MNLP_M3_rag_embedding_training
pipeline_tag: sentence-similarity
library_name: sentence-transformers
---

# SentenceTransformer based on abhinand/MedEmbed-small-v0.1

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [abhinand/MedEmbed-small-v0.1](https://huggingface.co/abhinand/MedEmbed-small-v0.1) on the [train](https://huggingface.co/datasets/flaviawallen/MNLP_M3_rag_embedding_training) dataset. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [abhinand/MedEmbed-small-v0.1](https://huggingface.co/abhinand/MedEmbed-small-v0.1) <!-- at revision 40a5850d046cfdb56154e332b4d7099b63e8d50e -->
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 384 dimensions
- **Similarity Function:** Cosine Similarity
- **Training Dataset:**
    - [train](https://huggingface.co/datasets/flaviawallen/MNLP_M3_rag_embedding_training)
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```
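Because the final `Normalize()` module L2-normalizes every embedding, cosine similarity between the model's outputs reduces to a plain dot product. A small numpy sketch of that property (the vectors here are synthetic stand-ins, not real model outputs):

```python
import numpy as np

def normalize(v: np.ndarray) -> np.ndarray:
    """L2-normalize a batch of vectors along the last axis, like the Normalize() module."""
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

rng = np.random.default_rng(42)
raw = rng.normal(size=(3, 384))   # stand-in for pooled embeddings
emb = normalize(raw)

cosine = emb @ emb.T              # dot product of unit vectors = cosine similarity
norms = np.linalg.norm(emb, axis=1)

print(np.allclose(norms, 1.0))            # True: all embeddings are unit length
print(np.allclose(np.diag(cosine), 1.0))  # True: self-similarity is exactly 1
```

This is why the card lists Cosine Similarity as the similarity function while the similarity matrix can still be computed with a single matrix multiplication.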

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
    'Microfilaments are mostly concentrated just beneath what?',
    'Microfilaments , shown as (b) in Figure below , are made of two thin actin chains that are twisted around one another. Microfilaments are mostly concentrated just beneath the cell membrane, where they support the cell and help the cell keep its shape. Microfilaments form cytoplasmatic extentions, such as pseudopodia and microvilli , which allow certain cells to move. The actin of the microfilaments interacts with the protein myosin to cause contraction in muscle cells. Microfilaments are found in almost every cell, and are numerous in muscle cells and in cells that move by changing shape, such as phagocytes (white blood cells that search the body for bacteria and other invaders).',
    'Weight measures the force of gravity pulling on an object. The SI unit for weight is the Newton (N).',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```

<!--
### Direct Usage (Transformers)

<details><summary>Click to see the direct usage in Transformers</summary>

</details>
-->

<!--
### Downstream Usage (Sentence Transformers)

You can finetune this model on your own dataset.

<details><summary>Click to expand</summary>

</details>
-->

<!--
### Out-of-Scope Use

*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->

<!--
## Bias, Risks and Limitations

*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->

<!--
### Recommendations

*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->

## Training Details

### Training Dataset

#### train

* Dataset: [train](https://huggingface.co/datasets/flaviawallen/MNLP_M3_rag_embedding_training) at [0b344ac](https://huggingface.co/datasets/flaviawallen/MNLP_M3_rag_embedding_training/tree/0b344ac3e3513dac08101975f56504971505c425)
* Size: 10,481 training samples
* Columns: <code>anchor</code> and <code>positive</code>
* Approximate statistics based on the first 1000 samples:
  |         | anchor | positive |
  |:--------|:-------|:---------|
  | type    | string | string   |
  | details | <ul><li>min: 7 tokens</li><li>mean: 18.22 tokens</li><li>max: 63 tokens</li></ul> | <ul><li>min: 10 tokens</li><li>mean: 99.59 tokens</li><li>max: 512 tokens</li></ul> |
* Samples:
  | anchor | positive |
  |:-------|:---------|
  | <code>What type of organism is commonly used in preparation of foods such as cheese and yogurt?</code> | <code>Mesophiles grow best in moderate temperature, typically between 25°C and 40°C (77°F and 104°F). Mesophiles are often found living in or on the bodies of humans or other animals. The optimal growth temperature of many pathogenic mesophiles is 37°C (98°F), the normal human body temperature. Mesophilic organisms have important uses in food preparation, including cheese, yogurt, beer and wine.</code> |
  | <code>What phenomenon makes global winds blow northeast to southwest or the reverse in the northern hemisphere and northwest to southeast or the reverse in the southern hemisphere?</code> | <code>Without Coriolis Effect the global winds would blow north to south or south to north. But Coriolis makes them blow northeast to southwest or the reverse in the Northern Hemisphere. The winds blow northwest to southeast or the reverse in the southern hemisphere.</code> |
  | <code>Changes from a less-ordered state to a more-ordered state (such as a liquid to a solid) are always what?</code> | <code>Summary Changes of state are examples of phase changes, or phase transitions. All phase changes are accompanied by changes in the energy of a system. Changes from a more-ordered state to a less-ordered state (such as a liquid to a gas) areendothermic. Changes from a less-ordered state to a more-ordered state (such as a liquid to a solid) are always exothermic. The conversion of a solid to a liquid is called fusion (or melting). The energy required to melt 1 mol of a substance is its enthalpy of fusion (ΔHfus). The energy change required to vaporize 1 mol of a substance is the enthalpy of vaporization (ΔHvap). The direct conversion of a solid to a gas is sublimation. The amount of energy needed to sublime 1 mol of a substance is its enthalpy of sublimation (ΔHsub) and is the sum of the enthalpies of fusion and vaporization. Plots of the temperature of a substance versus heat added or versus heating time at a constant rate of heating are calledheating curves. Heating curves relate temper...</code> |
* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
  ```json
  {
      "scale": 20.0,
      "similarity_fct": "cos_sim"
  }
  ```
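MultipleNegativesRankingLoss treats, for each anchor in a batch, its paired positive as the target and every other positive in the same batch as an in-batch negative: the scaled cosine-similarity matrix is scored with cross-entropy against the diagonal. A numpy sketch of that computation (random unit vectors stand in for real embeddings; `scale=20.0` matches the parameters above):

```python
import numpy as np

def mnr_loss(anchors: np.ndarray, positives: np.ndarray, scale: float = 20.0) -> float:
    """Multiple-negatives ranking loss: cross-entropy over scaled cosine
    similarities, where row i's correct "class" is positive i (the diagonal)."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = scale * (a @ p.T)  # (batch, batch) cosine-similarity matrix
    # Numerically stable log-softmax per row; the loss is -log p(diagonal)
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

rng = np.random.default_rng(0)
batch = 16
anchors = rng.normal(size=(batch, 384))
# Positives correlated with their anchors, so the diagonal should win
positives = anchors + 0.1 * rng.normal(size=(batch, 384))

print(mnr_loss(anchors, positives))  # near zero: each anchor ranks its own positive first
```

This is why larger batches generally help with this loss: each extra pair contributes an additional in-batch negative to every row.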

### Training Hyperparameters
#### Non-Default Hyperparameters

- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `num_train_epochs`: 1
- `warmup_ratio`: 0.1
- `batch_sampler`: no_duplicates

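With the `linear` scheduler, a base learning rate of 5e-05, `warmup_ratio` 0.1, and 656 total steps (10,481 samples / batch 16, 1 epoch), the learning rate ramps up over the first ~66 steps and then decays linearly to zero. A sketch of that schedule (the ceil-based warmup-step rounding is an assumption about how the trainer rounds `warmup_ratio`):

```python
import math

def linear_schedule_lr(step, base_lr=5e-05, total_steps=656, warmup_ratio=0.1):
    """Linear warmup then linear decay, as produced by lr_scheduler_type 'linear'."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)  # 66 here (assumed rounding)
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * (total_steps - step) / (total_steps - warmup_steps)

# These reproduce the learning rates recorded in trainer_state.json below:
print(linear_schedule_lr(100))  # ≈ 4.7119e-05
print(linear_schedule_lr(500))  # ≈ 1.3220e-05
```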
#### All Hyperparameters
<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: no
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 5e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 1
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.1
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`: 
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: False
- `prompts`: None
- `batch_sampler`: no_duplicates
- `multi_dataset_batch_sampler`: proportional

</details>

### Training Logs
| Epoch  | Step | Training Loss |
|:------:|:----:|:-------------:|
| 0.1524 | 100  | 0.1488        |
| 0.3049 | 200  | 0.0939        |
| 0.4573 | 300  | 0.0744        |
| 0.6098 | 400  | 0.1175        |
| 0.7622 | 500  | 0.0954        |
| 0.9146 | 600  | 0.0813        |


### Framework Versions
- Python: 3.12.8
- Sentence Transformers: 3.4.1
- Transformers: 4.48.2
- PyTorch: 2.5.1+cu124
- Accelerate: 1.3.0
- Datasets: 3.2.0
- Tokenizers: 0.21.0

## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### MultipleNegativesRankingLoss
```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```

<!--
## Glossary

*Clearly define terms in order to be accessible across audiences.*
-->

<!--
## Model Card Authors

*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->

<!--
## Model Card Contact

*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->
config.json ADDED
@@ -0,0 +1,31 @@
{
  "_name_or_path": "abhinand/MedEmbed-small-v0.1",
  "architectures": [
    "BertModel"
  ],
  "attention_probs_dropout_prob": 0.1,
  "classifier_dropout": null,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 384,
  "id2label": {
    "0": "LABEL_0"
  },
  "initializer_range": 0.02,
  "intermediate_size": 1536,
  "label2id": {
    "LABEL_0": 0
  },
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "position_embedding_type": "absolute",
  "torch_dtype": "float32",
  "transformers_version": "4.48.2",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 30522
}
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
{
  "__version__": {
    "sentence_transformers": "3.4.1",
    "transformers": "4.48.2",
    "pytorch": "2.5.1+cu124"
  },
  "prompts": {},
  "default_prompt_name": null,
  "similarity_fn_name": "cosine"
}
model.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:bd93c67e5180812b5f519b07e786b3dcabdd7b5251ef6da0152fec81e6293ac9
size 133462128
modules.json ADDED
@@ -0,0 +1,20 @@
[
  {
    "idx": 0,
    "name": "0",
    "path": "",
    "type": "sentence_transformers.models.Transformer"
  },
  {
    "idx": 1,
    "name": "1",
    "path": "1_Pooling",
    "type": "sentence_transformers.models.Pooling"
  },
  {
    "idx": 2,
    "name": "2",
    "path": "2_Normalize",
    "type": "sentence_transformers.models.Normalize"
  }
]
optimizer.pt ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ea8895839fee948355130c9bac8abb4119ea850d8c0fa07a7ce15d9cb12586cd
size 265862074
rng_state.pth ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9754ec1a1919cb1151786c3f9c4ab4243fe0e4448c3b4eafc5784e03179dd125
size 14244
scheduler.pt ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:0cc593fae2b0cf7cb58897213e39a3331453e7fa868bc01da02b7b6a82a7a48b
size 1064
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
{
  "max_seq_length": 512,
  "do_lower_case": false
}
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
{
  "cls_token": {
    "content": "[CLS]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "mask_token": {
    "content": "[MASK]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": {
    "content": "[PAD]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "sep_token": {
    "content": "[SEP]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "[UNK]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,58 @@
{
  "added_tokens_decoder": {
    "0": {
      "content": "[PAD]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "100": {
      "content": "[UNK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "101": {
      "content": "[CLS]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "102": {
      "content": "[SEP]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "103": {
      "content": "[MASK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "clean_up_tokenization_spaces": true,
  "cls_token": "[CLS]",
  "do_basic_tokenize": true,
  "do_lower_case": true,
  "extra_special_tokens": {},
  "mask_token": "[MASK]",
  "model_max_length": 512,
  "never_split": null,
  "pad_token": "[PAD]",
  "sep_token": "[SEP]",
  "strip_accents": null,
  "tokenize_chinese_chars": true,
  "tokenizer_class": "BertTokenizer",
  "unk_token": "[UNK]"
}
trainer_state.json ADDED
@@ -0,0 +1,75 @@
{
  "best_metric": null,
  "best_model_checkpoint": null,
  "epoch": 1.0,
  "eval_steps": 100,
  "global_step": 656,
  "is_hyper_param_search": false,
  "is_local_process_zero": true,
  "is_world_process_zero": true,
  "log_history": [
    {
      "epoch": 0.1524390243902439,
      "grad_norm": 4.109804630279541,
      "learning_rate": 4.711864406779661e-05,
      "loss": 0.1488,
      "step": 100
    },
    {
      "epoch": 0.3048780487804878,
      "grad_norm": 1.42342209815979,
      "learning_rate": 3.8644067796610175e-05,
      "loss": 0.0939,
      "step": 200
    },
    {
      "epoch": 0.4573170731707317,
      "grad_norm": 7.705954074859619,
      "learning_rate": 3.016949152542373e-05,
      "loss": 0.0744,
      "step": 300
    },
    {
      "epoch": 0.6097560975609756,
      "grad_norm": 5.876913070678711,
      "learning_rate": 2.1694915254237287e-05,
      "loss": 0.1175,
      "step": 400
    },
    {
      "epoch": 0.7621951219512195,
      "grad_norm": 4.6725382804870605,
      "learning_rate": 1.3220338983050848e-05,
      "loss": 0.0954,
      "step": 500
    },
    {
      "epoch": 0.9146341463414634,
      "grad_norm": 3.6604549884796143,
      "learning_rate": 4.745762711864407e-06,
      "loss": 0.0813,
      "step": 600
    }
  ],
  "logging_steps": 100,
  "max_steps": 656,
  "num_input_tokens_seen": 0,
  "num_train_epochs": 1,
  "save_steps": 100,
  "stateful_callbacks": {
    "TrainerControl": {
      "args": {
        "should_epoch_stop": false,
        "should_evaluate": false,
        "should_log": false,
        "should_save": true,
        "should_training_stop": true
      },
      "attributes": {}
    }
  },
  "total_flos": 0.0,
  "train_batch_size": 16,
  "trial_name": null,
  "trial_params": null
}
training_args.bin ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ea5823dff7189452598c441da7a116e6f51cd5a6383e1ed11b6f84803a52f239
size 5560
vocab.txt ADDED
The diff for this file is too large to render. See raw diff