flaviawallen committed
Commit e2738f1 · verified · 1 parent: 6b9088b

Upload 16 files
1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
{
  "word_embedding_dimension": 384,
  "pooling_mode_cls_token": true,
  "pooling_mode_mean_tokens": false,
  "pooling_mode_max_tokens": false,
  "pooling_mode_mean_sqrt_len_tokens": false,
  "pooling_mode_weightedmean_tokens": false,
  "pooling_mode_lasttoken": false,
  "include_prompt": true
}
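The pooling config above sets `pooling_mode_cls_token: true`: the sentence embedding is simply the hidden state of the first (`[CLS]`) token rather than a mean over all tokens. A minimal numpy sketch of the idea (shapes match `word_embedding_dimension: 384`; the values are illustrative, not from the model):

```python
import numpy as np

def cls_pooling(token_embeddings: np.ndarray) -> np.ndarray:
    """CLS pooling: take the first token's vector as the sentence embedding.

    token_embeddings: (seq_len, hidden_dim) array of per-token hidden states.
    """
    return token_embeddings[0]

# Illustrative input: 5 tokens, 384-dim hidden states
rng = np.random.default_rng(0)
hidden_states = rng.normal(size=(5, 384))

embedding = cls_pooling(hidden_states)
print(embedding.shape)  # (384,)
```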
README.md ADDED
@@ -0,0 +1,418 @@
---
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:10481
- loss:MultipleNegativesRankingLoss
base_model: abhinand/MedEmbed-small-v0.1
widget:
- source_sentence: In the chest, the trachea divides as it enters the lungs to form
    the right and left what?
  sentences:
  - Adulthood is divided into the stages of early, middle, and late adulthood.
  - Motor vehicles account for almost half of fossil fuel use. Most vehicles run on
    gasoline, which comes from petroleum.
  - In the chest, the trachea divides as it enters the lungs to form the right and
    left bronchi . The bronchi contain cartilage, which prevents them from collapsing.
    Mucus in the bronchi traps any remaining particles in air. Tiny, hair-like structures
    called cilia line the bronchi and sweep the particles and mucus toward the throat
    so they can be expelled from the body.
- source_sentence: What atmospheric layer lies above the highest altitude an airplane
    can go and below the lowest altitude a spacecraft can orbit?
  sentences:
  - Renal plasma flow equals the blood flow per minute times the hematocrit. If a
    person has a hematocrit of 45, then the renal plasma flow is 55 percent. 1050*0.55
    = 578 mL plasma/min.
  - Not so fast. The mesosphere is the least known layer of the atmosphere. The mesosphere
    lies above the highest altitude an airplane can go. It lies below the lowest altitude
    a spacecraft can orbit. Maybe that's just as well. If you were in the mesosphere
    without a space suit, your blood would boil! This is because the pressure is so
    low that liquids would boil at normal body temperature.
  - 'Cell division is just one of several stages that a cell goes through during its
    lifetime. The cell cycle is a repeating series of events that include growth,
    DNA synthesis, and cell division. The cell cycle in prokaryotes is quite simple:
    the cell grows, its DNA replicates, and the cell divides. In eukaryotes, the cell
    cycle is more complicated.'
- source_sentence: What distinctive dna shape forms when the two nucleotide chains
    wrap around the same axis?
  sentences:
  - Simple Model of DNA. In this simple model of DNA, each line represents a nucleotide
    chain. The double helix shape forms when the two chains wrap around the same axis.
  - Most biochemical molecules are macromolecules, meaning that they are very large.
    Some contain thousands of monomer molecules.
  - The continental slope lies between the continental shelf and the abyssal plain.
    It has a steep slope with a sharp drop to the deep ocean floor.
- source_sentence: Einstein’s equation helps scientists understand what happens in
    nuclear reactions and why they produce so much what?
  sentences:
  - Einstein’s equation helps scientists understand what happens in nuclear reactions
    and why they produce so much energy. When the nucleus of a radioisotope undergoes
    fission or fusion in a nuclear reaction, it loses a tiny amount of mass. What
    happens to the lost mass? It isn’t really lost at all. It is converted to energy.
    How much energy? E = mc 2 . The change in mass is tiny, but it results in a great
    deal of energy.
  - Water is the main ingredient of many solutions. A solution is a mixture of two
    or more substances that has the same composition throughout. Some solutions are
    acids and some are bases. To understand acids and bases, you need to know more
    about pure water. In pure water (such as distilled water), a tiny fraction of
    water molecules naturally breaks down to form ions. An ion is an electrically
    charged atom or molecule. The breakdown of water is represented by the chemical
    equation.
  - 'The muscular system consists of all the muscles of the body. Muscles are organs
    composed mainly of muscle cells, which are also called muscle fibers . Each muscle
    fiber is a very long, thin cell that can do something no other cell can do. It
    can contract, or shorten. Muscle contractions are responsible for virtually all
    the movements of the body, both inside and out. There are three types of muscle
    tissues in the human body: cardiac, smooth, and skeletal muscle tissues. They
    are shown in Figure below and described below.'
- source_sentence: Microfilaments are mostly concentrated just beneath what?
  sentences:
  - Vertebrates have a closed circulatory system with a heart. Blood is completely
    contained within blood vessels that carry the blood throughout the body. The heart
    is divided into chambers that work together to pump blood. There are between two
    and four chambers in the vertebrate heart. With more chambers, there is more oxygen
    in the blood and more vigorous pumping action.
  - Weight measures the force of gravity pulling on an object. The SI unit for weight
    is the Newton (N).
  - Microfilaments , shown as (b) in Figure below , are made of two thin actin chains
    that are twisted around one another. Microfilaments are mostly concentrated just
    beneath the cell membrane, where they support the cell and help the cell keep
    its shape. Microfilaments form cytoplasmatic extentions, such as pseudopodia and
    microvilli , which allow certain cells to move. The actin of the microfilaments
    interacts with the protein myosin to cause contraction in muscle cells. Microfilaments
    are found in almost every cell, and are numerous in muscle cells and in cells
    that move by changing shape, such as phagocytes (white blood cells that search
    the body for bacteria and other invaders).
datasets:
- flaviawallen/MNLP_M3_rag_embedding_training
pipeline_tag: sentence-similarity
library_name: sentence-transformers
---

# SentenceTransformer based on abhinand/MedEmbed-small-v0.1

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [abhinand/MedEmbed-small-v0.1](https://huggingface.co/abhinand/MedEmbed-small-v0.1) on the [train](https://huggingface.co/datasets/flaviawallen/MNLP_M3_rag_embedding_training) dataset. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [abhinand/MedEmbed-small-v0.1](https://huggingface.co/abhinand/MedEmbed-small-v0.1) <!-- at revision 40a5850d046cfdb56154e332b4d7099b63e8d50e -->
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 384 dimensions
- **Similarity Function:** Cosine Similarity
- **Training Dataset:**
    - [train](https://huggingface.co/datasets/flaviawallen/MNLP_M3_rag_embedding_training)
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```
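Because the final `Normalize()` module L2-normalizes every embedding, cosine similarity between the model's outputs reduces to a plain dot product. A small numpy sketch of that property (the vectors here are synthetic stand-ins, not real model outputs):

```python
import numpy as np

def normalize(v: np.ndarray) -> np.ndarray:
    """L2-normalize a batch of vectors along the last axis, like the Normalize() module."""
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

rng = np.random.default_rng(42)
raw = rng.normal(size=(3, 384))   # stand-in for pooled embeddings
emb = normalize(raw)

cosine = emb @ emb.T              # dot product of unit vectors = cosine similarity
norms = np.linalg.norm(emb, axis=1)

print(np.allclose(norms, 1.0))            # True: all embeddings are unit length
print(np.allclose(np.diag(cosine), 1.0))  # True: self-similarity is exactly 1
```

This is why the card lists Cosine Similarity as the similarity function while the similarity matrix can still be computed with a single matrix multiplication.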

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
    'Microfilaments are mostly concentrated just beneath what?',
    'Microfilaments , shown as (b) in Figure below , are made of two thin actin chains that are twisted around one another. Microfilaments are mostly concentrated just beneath the cell membrane, where they support the cell and help the cell keep its shape. Microfilaments form cytoplasmatic extentions, such as pseudopodia and microvilli , which allow certain cells to move. The actin of the microfilaments interacts with the protein myosin to cause contraction in muscle cells. Microfilaments are found in almost every cell, and are numerous in muscle cells and in cells that move by changing shape, such as phagocytes (white blood cells that search the body for bacteria and other invaders).',
    'Weight measures the force of gravity pulling on an object. The SI unit for weight is the Newton (N).',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```

<!--
### Direct Usage (Transformers)

<details><summary>Click to see the direct usage in Transformers</summary>

</details>
-->

<!--
### Downstream Usage (Sentence Transformers)

You can finetune this model on your own dataset.

<details><summary>Click to expand</summary>

</details>
-->

<!--
### Out-of-Scope Use

*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->

<!--
## Bias, Risks and Limitations

*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->

<!--
### Recommendations

*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->

## Training Details

### Training Dataset

#### train

* Dataset: [train](https://huggingface.co/datasets/flaviawallen/MNLP_M3_rag_embedding_training) at [0b344ac](https://huggingface.co/datasets/flaviawallen/MNLP_M3_rag_embedding_training/tree/0b344ac3e3513dac08101975f56504971505c425)
* Size: 10,481 training samples
* Columns: <code>anchor</code> and <code>positive</code>
* Approximate statistics based on the first 1000 samples:
  |         | anchor | positive |
  |:--------|:-------|:---------|
  | type    | string | string   |
  | details | <ul><li>min: 7 tokens</li><li>mean: 18.22 tokens</li><li>max: 63 tokens</li></ul> | <ul><li>min: 10 tokens</li><li>mean: 99.59 tokens</li><li>max: 512 tokens</li></ul> |
* Samples:
  | anchor | positive |
  |:-------|:---------|
  | <code>What type of organism is commonly used in preparation of foods such as cheese and yogurt?</code> | <code>Mesophiles grow best in moderate temperature, typically between 25°C and 40°C (77°F and 104°F). Mesophiles are often found living in or on the bodies of humans or other animals. The optimal growth temperature of many pathogenic mesophiles is 37°C (98°F), the normal human body temperature. Mesophilic organisms have important uses in food preparation, including cheese, yogurt, beer and wine.</code> |
  | <code>What phenomenon makes global winds blow northeast to southwest or the reverse in the northern hemisphere and northwest to southeast or the reverse in the southern hemisphere?</code> | <code>Without Coriolis Effect the global winds would blow north to south or south to north. But Coriolis makes them blow northeast to southwest or the reverse in the Northern Hemisphere. The winds blow northwest to southeast or the reverse in the southern hemisphere.</code> |
  | <code>Changes from a less-ordered state to a more-ordered state (such as a liquid to a solid) are always what?</code> | <code>Summary Changes of state are examples of phase changes, or phase transitions. All phase changes are accompanied by changes in the energy of a system. Changes from a more-ordered state to a less-ordered state (such as a liquid to a gas) areendothermic. Changes from a less-ordered state to a more-ordered state (such as a liquid to a solid) are always exothermic. The conversion of a solid to a liquid is called fusion (or melting). The energy required to melt 1 mol of a substance is its enthalpy of fusion (ΔHfus). The energy change required to vaporize 1 mol of a substance is the enthalpy of vaporization (ΔHvap). The direct conversion of a solid to a gas is sublimation. The amount of energy needed to sublime 1 mol of a substance is its enthalpy of sublimation (ΔHsub) and is the sum of the enthalpies of fusion and vaporization. Plots of the temperature of a substance versus heat added or versus heating time at a constant rate of heating are calledheating curves. Heating curves relate temper...</code> |
* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
  ```json
  {
      "scale": 20.0,
      "similarity_fct": "cos_sim"
  }
  ```
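MultipleNegativesRankingLoss treats, for each anchor in a batch, its paired positive as the target and every other positive in the same batch as an in-batch negative: the scaled cosine-similarity matrix is scored with cross-entropy against the diagonal. A numpy sketch of that computation (random unit vectors stand in for real embeddings; `scale=20.0` matches the parameters above):

```python
import numpy as np

def mnr_loss(anchors: np.ndarray, positives: np.ndarray, scale: float = 20.0) -> float:
    """Multiple-negatives ranking loss: cross-entropy over scaled cosine
    similarities, where row i's correct "class" is positive i (the diagonal)."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = scale * (a @ p.T)  # (batch, batch) cosine-similarity matrix
    # Numerically stable log-softmax per row; the loss is -log p(diagonal)
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

rng = np.random.default_rng(0)
batch = 16
anchors = rng.normal(size=(batch, 384))
# Positives correlated with their anchors, so the diagonal should win
positives = anchors + 0.1 * rng.normal(size=(batch, 384))

print(mnr_loss(anchors, positives))  # near zero: each anchor ranks its own positive first
```

This is why larger batches generally help with this loss: each extra pair contributes an additional in-batch negative to every row.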

### Training Hyperparameters
#### Non-Default Hyperparameters

- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `num_train_epochs`: 1
- `warmup_ratio`: 0.1
- `batch_sampler`: no_duplicates

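With the `linear` scheduler, a base learning rate of 5e-05, `warmup_ratio` 0.1, and 656 total steps (10,481 samples / batch 16, 1 epoch), the learning rate ramps up over the first ~66 steps and then decays linearly to zero. A sketch of that schedule (the ceil-based warmup-step rounding is an assumption about how the trainer rounds `warmup_ratio`):

```python
import math

def linear_schedule_lr(step, base_lr=5e-05, total_steps=656, warmup_ratio=0.1):
    """Linear warmup then linear decay, as produced by lr_scheduler_type 'linear'."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)  # 66 here (assumed rounding)
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * (total_steps - step) / (total_steps - warmup_steps)

# These reproduce the learning rates recorded in trainer_state.json below:
print(linear_schedule_lr(100))  # ≈ 4.7119e-05
print(linear_schedule_lr(500))  # ≈ 1.3220e-05
```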
#### All Hyperparameters
<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: no
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 5e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 1
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.1
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`: 
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: False
- `prompts`: None
- `batch_sampler`: no_duplicates
- `multi_dataset_batch_sampler`: proportional

</details>

### Training Logs
| Epoch  | Step | Training Loss |
|:------:|:----:|:-------------:|
| 0.1524 | 100  | 0.1488        |
| 0.3049 | 200  | 0.0939        |
| 0.4573 | 300  | 0.0744        |
| 0.6098 | 400  | 0.1175        |
| 0.7622 | 500  | 0.0954        |
| 0.9146 | 600  | 0.0813        |


### Framework Versions
- Python: 3.12.8
- Sentence Transformers: 3.4.1
- Transformers: 4.48.2
- PyTorch: 2.5.1+cu124
- Accelerate: 1.3.0
- Datasets: 3.2.0
- Tokenizers: 0.21.0

## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### MultipleNegativesRankingLoss
```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```

<!--
## Glossary

*Clearly define terms in order to be accessible across audiences.*
-->

<!--
## Model Card Authors

*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->

<!--
## Model Card Contact

*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->
config.json ADDED
@@ -0,0 +1,31 @@
{
  "_name_or_path": "abhinand/MedEmbed-small-v0.1",
  "architectures": [
    "BertModel"
  ],
  "attention_probs_dropout_prob": 0.1,
  "classifier_dropout": null,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 384,
  "id2label": {
    "0": "LABEL_0"
  },
  "initializer_range": 0.02,
  "intermediate_size": 1536,
  "label2id": {
    "LABEL_0": 0
  },
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "position_embedding_type": "absolute",
  "torch_dtype": "float32",
  "transformers_version": "4.48.2",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 30522
}
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
{
  "__version__": {
    "sentence_transformers": "3.4.1",
    "transformers": "4.48.2",
    "pytorch": "2.5.1+cu124"
  },
  "prompts": {},
  "default_prompt_name": null,
  "similarity_fn_name": "cosine"
}
model.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:bd93c67e5180812b5f519b07e786b3dcabdd7b5251ef6da0152fec81e6293ac9
size 133462128
modules.json ADDED
@@ -0,0 +1,20 @@
[
  {
    "idx": 0,
    "name": "0",
    "path": "",
    "type": "sentence_transformers.models.Transformer"
  },
  {
    "idx": 1,
    "name": "1",
    "path": "1_Pooling",
    "type": "sentence_transformers.models.Pooling"
  },
  {
    "idx": 2,
    "name": "2",
    "path": "2_Normalize",
    "type": "sentence_transformers.models.Normalize"
  }
]
optimizer.pt ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ea8895839fee948355130c9bac8abb4119ea850d8c0fa07a7ce15d9cb12586cd
size 265862074
rng_state.pth ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9754ec1a1919cb1151786c3f9c4ab4243fe0e4448c3b4eafc5784e03179dd125
size 14244
scheduler.pt ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:0cc593fae2b0cf7cb58897213e39a3331453e7fa868bc01da02b7b6a82a7a48b
size 1064
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
{
  "max_seq_length": 512,
  "do_lower_case": false
}
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
{
  "cls_token": {
    "content": "[CLS]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "mask_token": {
    "content": "[MASK]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": {
    "content": "[PAD]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "sep_token": {
    "content": "[SEP]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "[UNK]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,58 @@
{
  "added_tokens_decoder": {
    "0": {
      "content": "[PAD]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "100": {
      "content": "[UNK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "101": {
      "content": "[CLS]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "102": {
      "content": "[SEP]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "103": {
      "content": "[MASK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "clean_up_tokenization_spaces": true,
  "cls_token": "[CLS]",
  "do_basic_tokenize": true,
  "do_lower_case": true,
  "extra_special_tokens": {},
  "mask_token": "[MASK]",
  "model_max_length": 512,
  "never_split": null,
  "pad_token": "[PAD]",
  "sep_token": "[SEP]",
  "strip_accents": null,
  "tokenize_chinese_chars": true,
  "tokenizer_class": "BertTokenizer",
  "unk_token": "[UNK]"
}
trainer_state.json ADDED
@@ -0,0 +1,75 @@
{
  "best_metric": null,
  "best_model_checkpoint": null,
  "epoch": 1.0,
  "eval_steps": 100,
  "global_step": 656,
  "is_hyper_param_search": false,
  "is_local_process_zero": true,
  "is_world_process_zero": true,
  "log_history": [
    {
      "epoch": 0.1524390243902439,
      "grad_norm": 4.109804630279541,
      "learning_rate": 4.711864406779661e-05,
      "loss": 0.1488,
      "step": 100
    },
    {
      "epoch": 0.3048780487804878,
      "grad_norm": 1.42342209815979,
      "learning_rate": 3.8644067796610175e-05,
      "loss": 0.0939,
      "step": 200
    },
    {
      "epoch": 0.4573170731707317,
      "grad_norm": 7.705954074859619,
      "learning_rate": 3.016949152542373e-05,
      "loss": 0.0744,
      "step": 300
    },
    {
      "epoch": 0.6097560975609756,
      "grad_norm": 5.876913070678711,
      "learning_rate": 2.1694915254237287e-05,
      "loss": 0.1175,
      "step": 400
    },
    {
      "epoch": 0.7621951219512195,
      "grad_norm": 4.6725382804870605,
      "learning_rate": 1.3220338983050848e-05,
      "loss": 0.0954,
      "step": 500
    },
    {
      "epoch": 0.9146341463414634,
      "grad_norm": 3.6604549884796143,
      "learning_rate": 4.745762711864407e-06,
      "loss": 0.0813,
      "step": 600
    }
  ],
  "logging_steps": 100,
  "max_steps": 656,
  "num_input_tokens_seen": 0,
  "num_train_epochs": 1,
  "save_steps": 100,
  "stateful_callbacks": {
    "TrainerControl": {
      "args": {
        "should_epoch_stop": false,
        "should_evaluate": false,
        "should_log": false,
        "should_save": true,
        "should_training_stop": true
      },
      "attributes": {}
    }
  },
  "total_flos": 0.0,
  "train_batch_size": 16,
  "trial_name": null,
  "trial_params": null
}
training_args.bin ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ea5823dff7189452598c441da7a116e6f51cd5a6383e1ed11b6f84803a52f239
size 5560
vocab.txt ADDED
The diff for this file is too large to render. See raw diff