Andyrasika committed on Nov 23, 2023

Commit

08c9e8c

1 Parent(s): 5b5d276

Upload folder using huggingface_hub

This view is limited to 50 files because it contains too many changes.

Files changed (50)

checkpoint-100/README.md +220 -0
checkpoint-100/adapter_config.json +29 -0
checkpoint-100/adapter_model.safetensors +3 -0
checkpoint-100/optimizer.pt +3 -0
checkpoint-100/rng_state.pth +3 -0
checkpoint-100/scheduler.pt +3 -0
checkpoint-100/trainer_state.json +48 -0
checkpoint-100/training_args.bin +3 -0
checkpoint-1000/README.md +220 -0
checkpoint-1000/adapter_config.json +29 -0
checkpoint-1000/adapter_model.safetensors +3 -0
checkpoint-1000/optimizer.pt +3 -0
checkpoint-1000/rng_state.pth +3 -0
checkpoint-1000/scheduler.pt +3 -0
checkpoint-1000/trainer_state.json +300 -0
checkpoint-1000/training_args.bin +3 -0
checkpoint-150/README.md +220 -0
checkpoint-150/adapter_config.json +29 -0
checkpoint-150/adapter_model.safetensors +3 -0
checkpoint-150/optimizer.pt +3 -0
checkpoint-150/rng_state.pth +3 -0
checkpoint-150/scheduler.pt +3 -0
checkpoint-150/trainer_state.json +62 -0
checkpoint-150/training_args.bin +3 -0
checkpoint-200/README.md +220 -0
checkpoint-200/adapter_config.json +29 -0
checkpoint-200/adapter_model.safetensors +3 -0
checkpoint-200/optimizer.pt +3 -0
checkpoint-200/rng_state.pth +3 -0
checkpoint-200/scheduler.pt +3 -0
checkpoint-200/trainer_state.json +76 -0
checkpoint-200/training_args.bin +3 -0
checkpoint-250/README.md +220 -0
checkpoint-250/adapter_config.json +29 -0
checkpoint-250/adapter_model.safetensors +3 -0
checkpoint-250/optimizer.pt +3 -0
checkpoint-250/rng_state.pth +3 -0
checkpoint-250/scheduler.pt +3 -0
checkpoint-250/trainer_state.json +90 -0
checkpoint-250/training_args.bin +3 -0
checkpoint-300/README.md +220 -0
checkpoint-300/adapter_config.json +29 -0
checkpoint-300/adapter_model.safetensors +3 -0
checkpoint-300/optimizer.pt +3 -0
checkpoint-300/rng_state.pth +3 -0
checkpoint-300/scheduler.pt +3 -0
checkpoint-300/trainer_state.json +104 -0
checkpoint-300/training_args.bin +3 -0
checkpoint-350/README.md +220 -0
checkpoint-350/adapter_config.json +29 -0

checkpoint-100/README.md ADDED

	@@ -0,0 +1,220 @@

+---
+library_name: peft
+base_model: mistralai/Mistral-7B-v0.1
+---
+# Model Card for Model ID
+<!-- Provide a quick summary of what the model is/does. -->
+## Model Details
+### Model Description
+<!-- Provide a longer summary of what this model is. -->
+- **Developed by:** [More Information Needed]
+- **Funded by [optional]:** [More Information Needed]
+- **Shared by [optional]:** [More Information Needed]
+- **Model type:** [More Information Needed]
+- **Language(s) (NLP):** [More Information Needed]
+- **License:** [More Information Needed]
+- **Finetuned from model [optional]:** [More Information Needed]
+### Model Sources [optional]
+<!-- Provide the basic links for the model. -->
+- **Repository:** [More Information Needed]
+- **Paper [optional]:** [More Information Needed]
+- **Demo [optional]:** [More Information Needed]
+## Uses
+<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+### Direct Use
+<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
+[More Information Needed]
+### Downstream Use [optional]
+<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
+[More Information Needed]
+### Out-of-Scope Use
+<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
+[More Information Needed]
+## Bias, Risks, and Limitations
+<!-- This section is meant to convey both technical and sociotechnical limitations. -->
+[More Information Needed]
+### Recommendations
+<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
+Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
+## How to Get Started with the Model
+Use the code below to get started with the model.
+[More Information Needed]
+## Training Details
+### Training Data
+<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+[More Information Needed]
+### Training Procedure
+<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+#### Preprocessing [optional]
+[More Information Needed]
+#### Training Hyperparameters
+- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+#### Speeds, Sizes, Times [optional]
+<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
+[More Information Needed]
+## Evaluation
+<!-- This section describes the evaluation protocols and provides the results. -->
+### Testing Data, Factors & Metrics
+#### Testing Data
+<!-- This should link to a Dataset Card if possible. -->
+[More Information Needed]
+#### Factors
+<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
+[More Information Needed]
+#### Metrics
+<!-- These are the evaluation metrics being used, ideally with a description of why. -->
+[More Information Needed]
+### Results
+[More Information Needed]
+#### Summary
+## Model Examination [optional]
+<!-- Relevant interpretability work for the model goes here -->
+[More Information Needed]
+## Environmental Impact
+<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
+Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+- **Hardware Type:** [More Information Needed]
+- **Hours used:** [More Information Needed]
+- **Cloud Provider:** [More Information Needed]
+- **Compute Region:** [More Information Needed]
+- **Carbon Emitted:** [More Information Needed]
+## Technical Specifications [optional]
+### Model Architecture and Objective
+[More Information Needed]
+### Compute Infrastructure
+[More Information Needed]
+#### Hardware
+[More Information Needed]
+#### Software
+[More Information Needed]
+## Citation [optional]
+<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+**BibTeX:**
+[More Information Needed]
+**APA:**
+[More Information Needed]
+## Glossary [optional]
+<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
+[More Information Needed]
+## More Information [optional]
+[More Information Needed]
+## Model Card Authors [optional]
+[More Information Needed]
+## Model Card Contact
+[More Information Needed]
+## Training procedure
+The following `bitsandbytes` quantization config was used during training:
+- quant_method: bitsandbytes
+- load_in_8bit: False
+- load_in_4bit: True
+- llm_int8_threshold: 6.0
+- llm_int8_skip_modules: None
+- llm_int8_enable_fp32_cpu_offload: False
+- llm_int8_has_fp16_weight: False
+- bnb_4bit_quant_type: nf4
+- bnb_4bit_use_double_quant: True
+- bnb_4bit_compute_dtype: bfloat16
+### Framework versions
+- PEFT 0.6.3.dev0

checkpoint-100/adapter_config.json ADDED

	@@ -0,0 +1,29 @@

+{
+  "alpha_pattern": {},
+  "auto_mapping": null,
+  "base_model_name_or_path": "mistralai/Mistral-7B-v0.1",
+  "bias": "none",
+  "fan_in_fan_out": false,
+  "inference_mode": true,
+  "init_lora_weights": true,
+  "layers_pattern": null,
+  "layers_to_transform": null,
+  "lora_alpha": 16,
+  "lora_dropout": 0.05,
+  "modules_to_save": null,
+  "peft_type": "LORA",
+  "r": 8,
+  "rank_pattern": {},
+  "revision": null,
+  "target_modules": [
+    "k_proj",
+    "gate_proj",
+    "lm_head",
+    "v_proj",
+    "up_proj",
+    "down_proj",
+    "q_proj",
+    "o_proj"
+  ],
+  "task_type": "CAUSAL_LM"
+}
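
Note on the adapter config above: with `r = 8` and `lora_alpha = 16`, and assuming the plain `lora_alpha / r` scaling, the LoRA update is scaled by 2.0. The sketch below estimates the number of trainable adapter parameters for the eight `target_modules`. The layer shapes are assumptions taken from the published Mistral-7B-v0.1 architecture, not from this commit, so treat the result as a sanity check rather than an authoritative count.

```python
# Assumed Mistral-7B-v0.1 shapes (not stated anywhere in this commit).
HIDDEN = 4096
INTERMEDIATE = 14336
KV_DIM = 1024          # 8 key/value heads x 128 head dim
VOCAB = 32000
NUM_LAYERS = 32

# Values read from adapter_config.json.
R = 8
LORA_ALPHA = 16

# (in_features, out_features) for each per-layer target module.
PER_LAYER_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(in_features, out_features, r=R):
    """Parameters in one LoRA pair: A is (r x in), B is (out x r)."""
    return r * (in_features + out_features)


per_layer = sum(lora_params(i, o) for i, o in PER_LAYER_SHAPES.values())
# lm_head is a single module outside the decoder layers.
total = per_layer * NUM_LAYERS + lora_params(HIDDEN, VOCAB)
scaling = LORA_ALPHA / R
fp32_bytes = total * 4

print(f"trainable LoRA parameters: {total:,}")
print(f"LoRA scaling (lora_alpha / r): {scaling}")
print(f"size if stored as fp32: {fp32_bytes:,} bytes")
```

Under these assumptions the fp32 estimate lands within roughly 60 KB of the 85100592-byte size recorded for `adapter_model.safetensors` in this commit, which would be consistent with fp32 adapter weights plus a file header.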

checkpoint-100/adapter_model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:c69809f0040c8f92d4a238f6493d26dccf499247ceda24ca5edf1163b49e962e
+size 85100592
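
Note on the three added lines above: they are a Git LFS pointer, not the weights themselves. The pointer records the spec version, the SHA-256 of the real file, and its size in bytes. A minimal sketch of reading such a pointer follows; `parse_lfs_pointer` is a hypothetical helper written for this note, and it strips the leading `+` only because the text is copied from a diff.

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer (optionally copied from a diff) into a dict."""
    fields = {}
    for raw in text.splitlines():
        line = raw.lstrip("+").strip()   # drop diff "+" markers and padding
        if not line:
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


pointer_text = """\
+version https://git-lfs.github.com/spec/v1
+oid sha256:c69809f0040c8f92d4a238f6493d26dccf499247ceda24ca5edf1163b49e962e
+size 85100592
"""

pointer = parse_lfs_pointer(pointer_text)
print(pointer["hash_algo"], pointer["size"])
```

The same three-line shape applies to every `.safetensors`, `.pt`, `.pth`, and `.bin` entry in this commit; only the `oid` and `size` values differ.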

checkpoint-100/optimizer.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:9f73148243e73d765345c5789209c42faa666c876b06a6ceb5d4442ec1d88a3b
+size 43126684

checkpoint-100/rng_state.pth ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:2c01ab4c0f45976dd0b37a94c24d44ab3264195b7231e616864a83fc30f1669a
+size 14244

checkpoint-100/scheduler.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:1f46dc04db0a603406c597c113e229228b08858bb09b49bfebd3512f1a8f3306
+size 1064

checkpoint-100/trainer_state.json ADDED

	@@ -0,0 +1,48 @@

+{
+  "best_metric": null,
+  "best_model_checkpoint": null,
+  "epoch": 0.15673981191222572,
+  "eval_steps": 50,
+  "global_step": 100,
+  "is_hyper_param_search": false,
+  "is_local_process_zero": true,
+  "is_world_process_zero": true,
+  "log_history": [
+    {
+      "epoch": 0.08,
+      "learning_rate": 2.3869346733668342e-05,
+      "loss": 0.7797,
+      "step": 50
+    },
+    {
+      "epoch": 0.08,
+      "eval_loss": 0.2723180055618286,
+      "eval_runtime": 135.6616,
+      "eval_samples_per_second": 5.263,
+      "eval_steps_per_second": 0.663,
+      "step": 50
+    },
+    {
+      "epoch": 0.16,
+      "learning_rate": 2.2613065326633167e-05,
+      "loss": 0.2457,
+      "step": 100
+    },
+    {
+      "epoch": 0.16,
+      "eval_loss": 0.22004182636737823,
+      "eval_runtime": 136.2348,
+      "eval_samples_per_second": 5.241,
+      "eval_steps_per_second": 0.661,
+      "step": 100
+    }
+  ],
+  "logging_steps": 50,
+  "max_steps": 1000,
+  "num_input_tokens_seen": 0,
+  "num_train_epochs": 2,
+  "save_steps": 50,
+  "total_flos": 1.75274075357184e+16,
+  "trial_name": null,
+  "trial_params": null
+}
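
Note on the trainer state above: `epoch` and `global_step` together imply the number of optimizer steps per epoch, which in turn shows why a run configured with `num_train_epochs: 2` stops short of two epochs when `max_steps` is 1000. The sketch below does that arithmetic using only values from this file; the rounding to a whole number of steps is an assumption.

```python
# Values copied from checkpoint-100/trainer_state.json.
epoch = 0.15673981191222572
global_step = 100
max_steps = 1000
num_train_epochs = 2

# Steps per epoch, assuming the fractional epoch is global_step / steps_per_epoch.
steps_per_epoch = round(global_step / epoch)

# Steps a full two-epoch run would need, versus the configured cap.
steps_for_all_epochs = steps_per_epoch * num_train_epochs
final_epoch = max_steps / steps_per_epoch

print(f"steps per epoch: {steps_per_epoch}")
print(f"steps needed for {num_train_epochs} epochs: {steps_for_all_epochs}")
print(f"epoch reached at max_steps: {final_epoch:.4f}")
```

Since the step cap is lower than the steps needed for two epochs, training would be expected to end partway through the second epoch.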

checkpoint-100/training_args.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:9f208e3c6bbc0ff595dc52e32a7309c9e57d7d78823b465b2b38edcf101eb89a
+size 4600

checkpoint-1000/README.md ADDED

	@@ -0,0 +1,220 @@

+---
+library_name: peft
+base_model: mistralai/Mistral-7B-v0.1
+---
+# Model Card for Model ID
+<!-- Provide a quick summary of what the model is/does. -->
+## Model Details
+### Model Description
+<!-- Provide a longer summary of what this model is. -->
+- **Developed by:** [More Information Needed]
+- **Funded by [optional]:** [More Information Needed]
+- **Shared by [optional]:** [More Information Needed]
+- **Model type:** [More Information Needed]
+- **Language(s) (NLP):** [More Information Needed]
+- **License:** [More Information Needed]
+- **Finetuned from model [optional]:** [More Information Needed]
+### Model Sources [optional]
+<!-- Provide the basic links for the model. -->
+- **Repository:** [More Information Needed]
+- **Paper [optional]:** [More Information Needed]
+- **Demo [optional]:** [More Information Needed]
+## Uses
+<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+### Direct Use
+<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
+[More Information Needed]
+### Downstream Use [optional]
+<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
+[More Information Needed]
+### Out-of-Scope Use
+<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
+[More Information Needed]
+## Bias, Risks, and Limitations
+<!-- This section is meant to convey both technical and sociotechnical limitations. -->
+[More Information Needed]
+### Recommendations
+<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
+Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
+## How to Get Started with the Model
+Use the code below to get started with the model.
+[More Information Needed]
+## Training Details
+### Training Data
+<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+[More Information Needed]
+### Training Procedure
+<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+#### Preprocessing [optional]
+[More Information Needed]
+#### Training Hyperparameters
+- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+#### Speeds, Sizes, Times [optional]
+<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
+[More Information Needed]
+## Evaluation
+<!-- This section describes the evaluation protocols and provides the results. -->
+### Testing Data, Factors & Metrics
+#### Testing Data
+<!-- This should link to a Dataset Card if possible. -->
+[More Information Needed]
+#### Factors
+<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
+[More Information Needed]
+#### Metrics
+<!-- These are the evaluation metrics being used, ideally with a description of why. -->
+[More Information Needed]
+### Results
+[More Information Needed]
+#### Summary
+## Model Examination [optional]
+<!-- Relevant interpretability work for the model goes here -->
+[More Information Needed]
+## Environmental Impact
+<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
+Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+- **Hardware Type:** [More Information Needed]
+- **Hours used:** [More Information Needed]
+- **Cloud Provider:** [More Information Needed]
+- **Compute Region:** [More Information Needed]
+- **Carbon Emitted:** [More Information Needed]
+## Technical Specifications [optional]
+### Model Architecture and Objective
+[More Information Needed]
+### Compute Infrastructure
+[More Information Needed]
+#### Hardware
+[More Information Needed]
+#### Software
+[More Information Needed]
+## Citation [optional]
+<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+**BibTeX:**
+[More Information Needed]
+**APA:**
+[More Information Needed]
+## Glossary [optional]
+<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
+[More Information Needed]
+## More Information [optional]
+[More Information Needed]
+## Model Card Authors [optional]
+[More Information Needed]
+## Model Card Contact
+[More Information Needed]
+## Training procedure
+The following `bitsandbytes` quantization config was used during training:
+- quant_method: bitsandbytes
+- load_in_8bit: False
+- load_in_4bit: True
+- llm_int8_threshold: 6.0
+- llm_int8_skip_modules: None
+- llm_int8_enable_fp32_cpu_offload: False
+- llm_int8_has_fp16_weight: False
+- bnb_4bit_quant_type: nf4
+- bnb_4bit_use_double_quant: True
+- bnb_4bit_compute_dtype: bfloat16
+### Framework versions
+- PEFT 0.6.3.dev0

checkpoint-1000/adapter_config.json ADDED

	@@ -0,0 +1,29 @@

+{
+  "alpha_pattern": {},
+  "auto_mapping": null,
+  "base_model_name_or_path": "mistralai/Mistral-7B-v0.1",
+  "bias": "none",
+  "fan_in_fan_out": false,
+  "inference_mode": true,
+  "init_lora_weights": true,
+  "layers_pattern": null,
+  "layers_to_transform": null,
+  "lora_alpha": 16,
+  "lora_dropout": 0.05,
+  "modules_to_save": null,
+  "peft_type": "LORA",
+  "r": 8,
+  "rank_pattern": {},
+  "revision": null,
+  "target_modules": [
+    "k_proj",
+    "gate_proj",
+    "lm_head",
+    "v_proj",
+    "up_proj",
+    "down_proj",
+    "q_proj",
+    "o_proj"
+  ],
+  "task_type": "CAUSAL_LM"
+}

checkpoint-1000/adapter_model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:99bd78db9d9a54986e2c11ffced397ff7188be95a72fb1d58e4dbfc9a5b10756
+size 85100592

checkpoint-1000/optimizer.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:da86fb12bf85497d7d598e5053e8ac13fce7c88d2a2b25f9c6b8c2d69ef6e926
+size 43127132

checkpoint-1000/rng_state.pth ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:1575b6cd4b082a5f2959edf357f5bf17e65f7756a963eead9feaa93dfcf50805
+size 14244

checkpoint-1000/scheduler.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:b4d6d865d6518a82dd54bb09f8f02628ebe31ca8be097a65ef5c8faff7622969
+size 1064

checkpoint-1000/trainer_state.json ADDED

	@@ -0,0 +1,300 @@

+{
+  "best_metric": null,
+  "best_model_checkpoint": null,
+  "epoch": 1.567398119122257,
+  "eval_steps": 50,
+  "global_step": 1000,
+  "is_hyper_param_search": false,
+  "is_local_process_zero": true,
+  "is_world_process_zero": true,
+  "log_history": [
+    {
+      "epoch": 0.08,
+      "learning_rate": 2.3869346733668342e-05,
+      "loss": 0.7797,
+      "step": 50
+    },
+    {
+      "epoch": 0.08,
+      "eval_loss": 0.2723180055618286,
+      "eval_runtime": 135.6616,
+      "eval_samples_per_second": 5.263,
+      "eval_steps_per_second": 0.663,
+      "step": 50
+    },
+    {
+      "epoch": 0.16,
+      "learning_rate": 2.2613065326633167e-05,
+      "loss": 0.2457,
+      "step": 100
+    },
+    {
+      "epoch": 0.16,
+      "eval_loss": 0.22004182636737823,
+      "eval_runtime": 136.2348,
+      "eval_samples_per_second": 5.241,
+      "eval_steps_per_second": 0.661,
+      "step": 100
+    },
+    {
+      "epoch": 0.24,
+      "learning_rate": 2.135678391959799e-05,
+      "loss": 0.2088,
+      "step": 150
+    },
+    {
+      "epoch": 0.24,
+      "eval_loss": 0.19266444444656372,
+      "eval_runtime": 136.6465,
+      "eval_samples_per_second": 5.225,
+      "eval_steps_per_second": 0.659,
+      "step": 150
+    },
+    {
+      "epoch": 0.31,
+      "learning_rate": 2.0100502512562815e-05,
+      "loss": 0.1832,
+      "step": 200
+    },
+    {
+      "epoch": 0.31,
+      "eval_loss": 0.17922177910804749,
+      "eval_runtime": 136.7121,
+      "eval_samples_per_second": 5.223,
+      "eval_steps_per_second": 0.658,
+      "step": 200
+    },
+    {
+      "epoch": 0.39,
+      "learning_rate": 1.884422110552764e-05,
+      "loss": 0.1754,
+      "step": 250
+    },
+    {
+      "epoch": 0.39,
+      "eval_loss": 0.17311859130859375,
+      "eval_runtime": 136.3058,
+      "eval_samples_per_second": 5.238,
+      "eval_steps_per_second": 0.66,
+      "step": 250
+    },
+    {
+      "epoch": 0.47,
+      "learning_rate": 1.7587939698492464e-05,
+      "loss": 0.169,
+      "step": 300
+    },
+    {
+      "epoch": 0.47,
+      "eval_loss": 0.16897280514240265,
+      "eval_runtime": 136.923,
+      "eval_samples_per_second": 5.215,
+      "eval_steps_per_second": 0.657,
+      "step": 300
+    },
+    {
+      "epoch": 0.55,
+      "learning_rate": 1.6331658291457288e-05,
+      "loss": 0.166,
+      "step": 350
+    },
+    {
+      "epoch": 0.55,
+      "eval_loss": 0.1663457602262497,
+      "eval_runtime": 136.6033,
+      "eval_samples_per_second": 5.227,
+      "eval_steps_per_second": 0.659,
+      "step": 350
+    },
+    {
+      "epoch": 0.63,
+      "learning_rate": 1.507537688442211e-05,
+      "loss": 0.1682,
+      "step": 400
+    },
+    {
+      "epoch": 0.63,
+      "eval_loss": 0.16482460498809814,
+      "eval_runtime": 136.5801,
+      "eval_samples_per_second": 5.228,
+      "eval_steps_per_second": 0.659,
+      "step": 400
+    },
+    {
+      "epoch": 0.71,
+      "learning_rate": 1.3819095477386935e-05,
+      "loss": 0.1576,
+      "step": 450
+    },
+    {
+      "epoch": 0.71,
+      "eval_loss": 0.16245244443416595,
+      "eval_runtime": 136.7662,
+      "eval_samples_per_second": 5.221,
+      "eval_steps_per_second": 0.658,
+      "step": 450
+    },
+    {
+      "epoch": 0.78,
+      "learning_rate": 1.2562814070351759e-05,
+      "loss": 0.165,
+      "step": 500
+    },
+    {
+      "epoch": 0.78,
+      "eval_loss": 0.16068558394908905,
+      "eval_runtime": 136.6019,
+      "eval_samples_per_second": 5.227,
+      "eval_steps_per_second": 0.659,
+      "step": 500
+    },
+    {
+      "epoch": 0.86,
+      "learning_rate": 1.1306532663316583e-05,
+      "loss": 0.152,
+      "step": 550
+    },
+    {
+      "epoch": 0.86,
+      "eval_loss": 0.15984833240509033,
+      "eval_runtime": 136.8975,
+      "eval_samples_per_second": 5.216,
+      "eval_steps_per_second": 0.657,
+      "step": 550
+    },
+    {
+      "epoch": 0.94,
+      "learning_rate": 1.0050251256281408e-05,
+      "loss": 0.1563,
+      "step": 600
+    },
+    {
+      "epoch": 0.94,
+      "eval_loss": 0.15865428745746613,
+      "eval_runtime": 136.9521,
+      "eval_samples_per_second": 5.214,
+      "eval_steps_per_second": 0.657,
+      "step": 600
+    },
+    {
+      "epoch": 1.02,
+      "learning_rate": 8.793969849246232e-06,
+      "loss": 0.1477,
+      "step": 650
+    },
+    {
+      "epoch": 1.02,
+      "eval_loss": 0.1577940434217453,
+      "eval_runtime": 136.5669,
+      "eval_samples_per_second": 5.228,
+      "eval_steps_per_second": 0.659,
+      "step": 650
+    },
+    {
+      "epoch": 1.1,
+      "learning_rate": 7.537688442211055e-06,
+      "loss": 0.1491,
+      "step": 700
+    },
+    {
+      "epoch": 1.1,
+      "eval_loss": 0.157754048705101,
+      "eval_runtime": 136.107,
+      "eval_samples_per_second": 5.246,
+      "eval_steps_per_second": 0.661,
+      "step": 700
+    },
+    {
+      "epoch": 1.18,
+      "learning_rate": 6.2814070351758795e-06,
+      "loss": 0.1466,
+      "step": 750
+    },
+    {
+      "epoch": 1.18,
+      "eval_loss": 0.1569654941558838,
+      "eval_runtime": 137.1916,
+      "eval_samples_per_second": 5.204,
+      "eval_steps_per_second": 0.656,
+      "step": 750
+    },
+    {
+      "epoch": 1.25,
+      "learning_rate": 5.025125628140704e-06,
+      "loss": 0.1383,
+      "step": 800
+    },
+    {
+      "epoch": 1.25,
+      "eval_loss": 0.15617845952510834,
+      "eval_runtime": 136.7366,
+      "eval_samples_per_second": 5.222,
+      "eval_steps_per_second": 0.658,
+      "step": 800
+    },
+    {
+      "epoch": 1.33,
+      "learning_rate": 3.7688442211055276e-06,
+      "loss": 0.1417,
+      "step": 850
+    },
+    {
+      "epoch": 1.33,
+      "eval_loss": 0.15615858137607574,
+      "eval_runtime": 136.2828,
+      "eval_samples_per_second": 5.239,
+      "eval_steps_per_second": 0.66,
+      "step": 850
+    },
+    {
+      "epoch": 1.41,
+      "learning_rate": 2.512562814070352e-06,
+      "loss": 0.1374,
+      "step": 900
+    },
+    {
+      "epoch": 1.41,
+      "eval_loss": 0.155540332198143,
+      "eval_runtime": 137.0904,
+      "eval_samples_per_second": 5.208,
+      "eval_steps_per_second": 0.657,
+      "step": 900
+    },
+    {
+      "epoch": 1.49,
+      "learning_rate": 1.256281407035176e-06,
+      "loss": 0.147,
+      "step": 950
+    },
+    {
+      "epoch": 1.49,
+      "eval_loss": 0.15468443930149078,
+      "eval_runtime": 136.9218,
+      "eval_samples_per_second": 5.215,
+      "eval_steps_per_second": 0.657,
+      "step": 950
+    },
+    {
+      "epoch": 1.57,
+      "learning_rate": 0.0,
+      "loss": 0.1448,
+      "step": 1000
+    },
+    {
+      "epoch": 1.57,
+      "eval_loss": 0.15455935895442963,
+      "eval_runtime": 136.7415,
+      "eval_samples_per_second": 5.222,
+      "eval_steps_per_second": 0.658,
+      "step": 1000
+    }
+  ],
+  "logging_steps": 50,
+  "max_steps": 1000,
+  "num_input_tokens_seen": 0,
+  "num_train_epochs": 2,
+  "save_steps": 50,
+  "total_flos": 1.7525216609776435e+17,
+  "trial_name": null,
+  "trial_params": null
+}
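
Note on the `learning_rate` values logged above: they fall by a constant amount every 50 steps and reach 0.0 at step 1000, which looks like a linear decay schedule. The sketch below reproduces the logged values under two assumptions that are inferred from the numbers and not recorded anywhere in this commit: a peak learning rate of 2.5e-5 and 5 warmup steps. `linear_lr` is a hypothetical helper, not the scheduler used for the run.

```python
import math

BASE_LR = 2.5e-5     # assumption: inferred from the logged values
WARMUP_STEPS = 5     # assumption: inferred from the logged values
MAX_STEPS = 1000     # from trainer_state.json


def linear_lr(step):
    """Linear warmup to BASE_LR, then linear decay to zero at MAX_STEPS."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / WARMUP_STEPS
    return BASE_LR * max(0.0, (MAX_STEPS - step) / (MAX_STEPS - WARMUP_STEPS))


# A few (step, learning_rate) pairs copied from log_history.
logged = {
    50: 2.3869346733668342e-05,
    100: 2.2613065326633167e-05,
    600: 1.0050251256281408e-05,
    950: 1.256281407035176e-06,
    1000: 0.0,
}

for step, lr in logged.items():
    match = math.isclose(linear_lr(step), lr, rel_tol=1e-9, abs_tol=1e-15)
    print(f"step {step:4d}: logged={lr:.6e} model={linear_lr(step):.6e} match={match}")
```

Other combinations of peak rate and warmup could fit the same log, so the two constants should be read as one consistent explanation only.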

checkpoint-1000/training_args.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:9f208e3c6bbc0ff595dc52e32a7309c9e57d7d78823b465b2b38edcf101eb89a
+size 4600

checkpoint-150/README.md ADDED

	@@ -0,0 +1,220 @@

+---
+library_name: peft
+base_model: mistralai/Mistral-7B-v0.1
+---
+# Model Card for Model ID
+<!-- Provide a quick summary of what the model is/does. -->
+## Model Details
+### Model Description
+<!-- Provide a longer summary of what this model is. -->
+- **Developed by:** [More Information Needed]
+- **Funded by [optional]:** [More Information Needed]
+- **Shared by [optional]:** [More Information Needed]
+- **Model type:** [More Information Needed]
+- **Language(s) (NLP):** [More Information Needed]
+- **License:** [More Information Needed]
+- **Finetuned from model [optional]:** [More Information Needed]
+### Model Sources [optional]
+<!-- Provide the basic links for the model. -->
+- **Repository:** [More Information Needed]
+- **Paper [optional]:** [More Information Needed]
+- **Demo [optional]:** [More Information Needed]
+## Uses
+<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+### Direct Use
+<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
+[More Information Needed]
+### Downstream Use [optional]
+<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
+[More Information Needed]
+### Out-of-Scope Use
+<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
+[More Information Needed]
+## Bias, Risks, and Limitations
+<!-- This section is meant to convey both technical and sociotechnical limitations. -->
+[More Information Needed]
+### Recommendations
+<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
+Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
+## How to Get Started with the Model
+Use the code below to get started with the model.
+[More Information Needed]
+## Training Details
+### Training Data
+<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+[More Information Needed]
+### Training Procedure
+<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+#### Preprocessing [optional]
+[More Information Needed]
+#### Training Hyperparameters
+- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+#### Speeds, Sizes, Times [optional]
+<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
+[More Information Needed]
+## Evaluation
+<!-- This section describes the evaluation protocols and provides the results. -->
+### Testing Data, Factors & Metrics
+#### Testing Data
+<!-- This should link to a Dataset Card if possible. -->
+[More Information Needed]
+#### Factors
+<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
+[More Information Needed]
+#### Metrics
+<!-- These are the evaluation metrics being used, ideally with a description of why. -->
+[More Information Needed]
+### Results
+[More Information Needed]
+#### Summary
+## Model Examination [optional]
+<!-- Relevant interpretability work for the model goes here -->
+[More Information Needed]
+## Environmental Impact
+<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
+Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+- **Hardware Type:** [More Information Needed]
+- **Hours used:** [More Information Needed]
+- **Cloud Provider:** [More Information Needed]
+- **Compute Region:** [More Information Needed]
+- **Carbon Emitted:** [More Information Needed]
+## Technical Specifications [optional]
+### Model Architecture and Objective
+[More Information Needed]
+### Compute Infrastructure
+[More Information Needed]
+#### Hardware
+[More Information Needed]
+#### Software
+[More Information Needed]
+## Citation [optional]
+<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+**BibTeX:**
+[More Information Needed]
+**APA:**
+[More Information Needed]
+## Glossary [optional]
+<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
+[More Information Needed]
+## More Information [optional]
+[More Information Needed]
+## Model Card Authors [optional]
+[More Information Needed]
+## Model Card Contact
+[More Information Needed]
+## Training procedure
+The following `bitsandbytes` quantization config was used during training:
+- quant_method: bitsandbytes
+- load_in_8bit: False
+- load_in_4bit: True
+- llm_int8_threshold: 6.0
+- llm_int8_skip_modules: None
+- llm_int8_enable_fp32_cpu_offload: False
+- llm_int8_has_fp16_weight: False
+- bnb_4bit_quant_type: nf4
+- bnb_4bit_use_double_quant: True
+- bnb_4bit_compute_dtype: bfloat16
+### Framework versions
+- PEFT 0.6.3.dev0

checkpoint-150/adapter_config.json ADDED

	@@ -0,0 +1,29 @@

+{
+  "alpha_pattern": {},
+  "auto_mapping": null,
+  "base_model_name_or_path": "mistralai/Mistral-7B-v0.1",
+  "bias": "none",
+  "fan_in_fan_out": false,
+  "inference_mode": true,
+  "init_lora_weights": true,
+  "layers_pattern": null,
+  "layers_to_transform": null,
+  "lora_alpha": 16,
+  "lora_dropout": 0.05,
+  "modules_to_save": null,
+  "peft_type": "LORA",
+  "r": 8,
+  "rank_pattern": {},
+  "revision": null,
+  "target_modules": [
+    "k_proj",
+    "gate_proj",
+    "lm_head",
+    "v_proj",
+    "up_proj",
+    "down_proj",
+    "q_proj",
+    "o_proj"
+  ],
+  "task_type": "CAUSAL_LM"
+}

checkpoint-150/adapter_model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:62a12141bb43e830a8718d52bd0d32f4b487ea502c4972da2acf46e2ab4a1aff
+size 85100592

checkpoint-150/optimizer.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:13935a4d9f371ff3035fd3bf86cc3322a69a0f9c739f5dbef207611edaa9c922
+size 43126684

checkpoint-150/rng_state.pth ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:0962aa698e0e188a79f51f32c71fcc3e315e7f273b4ba096ed39831a26a8f47b
+size 14244

checkpoint-150/scheduler.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:87a0a7460dd8b31647fa0542d6e8cdd02c31293f0704d27ec57a49b4c476aa1c
+size 1064

checkpoint-150/trainer_state.json ADDED

	@@ -0,0 +1,62 @@

+{
+  "best_metric": null,
+  "best_model_checkpoint": null,
+  "epoch": 0.23510971786833856,
+  "eval_steps": 50,
+  "global_step": 150,
+  "is_hyper_param_search": false,
+  "is_local_process_zero": true,
+  "is_world_process_zero": true,
+  "log_history": [
+    {
+      "epoch": 0.08,
+      "learning_rate": 2.3869346733668342e-05,
+      "loss": 0.7797,
+      "step": 50
+    },
+    {
+      "epoch": 0.08,
+      "eval_loss": 0.2723180055618286,
+      "eval_runtime": 135.6616,
+      "eval_samples_per_second": 5.263,
+      "eval_steps_per_second": 0.663,
+      "step": 50
+    },
+    {
+      "epoch": 0.16,
+      "learning_rate": 2.2613065326633167e-05,
+      "loss": 0.2457,
+      "step": 100
+    },
+    {
+      "epoch": 0.16,
+      "eval_loss": 0.22004182636737823,
+      "eval_runtime": 136.2348,
+      "eval_samples_per_second": 5.241,
+      "eval_steps_per_second": 0.661,
+      "step": 100
+    },
+    {
+      "epoch": 0.24,
+      "learning_rate": 2.135678391959799e-05,
+      "loss": 0.2088,
+      "step": 150
+    },
+    {
+      "epoch": 0.24,
+      "eval_loss": 0.19266444444656372,
+      "eval_runtime": 136.6465,
+      "eval_samples_per_second": 5.225,
+      "eval_steps_per_second": 0.659,
+      "step": 150
+    }
+  ],
+  "logging_steps": 50,
+  "max_steps": 1000,
+  "num_input_tokens_seen": 0,
+  "num_train_epochs": 2,
+  "save_steps": 50,
+  "total_flos": 2.62911113035776e+16,
+  "trial_name": null,
+  "trial_params": null
+}
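
The three `learning_rate` values logged in this file fall on a straight line. The sketch below fits that line and extrapolates to the step where the rate would reach zero. The scheduler type is not stated in the commit (`training_args.bin` is a binary file), so reading this as a linear decay schedule is an inference from the numbers, and the names `fit_linear_decay` and `predicted_lr` are hypothetical.

```python
# (step, learning_rate) pairs copied from log_history in checkpoint-150/trainer_state.json.
LOGGED = [
    (50, 2.3869346733668342e-05),
    (100, 2.2613065326633167e-05),
    (150, 2.135678391959799e-05),
]


def fit_linear_decay(points):
    """Return (slope per step, step at which the fitted line reaches zero)."""
    (first_step, first_lr), (last_step, last_lr) = points[0], points[-1]
    slope = (last_lr - first_lr) / (last_step - first_step)
    zero_step = first_step - first_lr / slope
    return slope, zero_step


slope, zero_step = fit_linear_decay(LOGGED)


def predicted_lr(step):
    """Learning rate the fitted line predicts at a given step."""
    first_step, first_lr = LOGGED[0]
    return first_lr + slope * (step - first_step)


print(zero_step)          # compare with "max_steps" in the same file
print(predicted_lr(100))  # compare with the value logged at step 100
```

The fitted line reaches zero at step 1000, which matches `"max_steps": 1000`, so the log is at least consistent with a linear decay to zero over the full run.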

checkpoint-150/training_args.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:9f208e3c6bbc0ff595dc52e32a7309c9e57d7d78823b465b2b38edcf101eb89a
+size 4600
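
The binary files in this commit appear in the diff only as Git LFS pointers: three `key value` lines giving the spec version, the SHA-256 object id, and the size in bytes. A small sketch of how such a pointer could be parsed is shown below; `parse_lfs_pointer` is a hypothetical helper, not part of any Git LFS tooling.

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into its version, hash algorithm, digest and size."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")  # each line is "<key> <value>"
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer content copied from checkpoint-150/training_args.bin.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:9f208e3c6bbc0ff595dc52e32a7309c9e57d7d78823b465b2b38edcf101eb89a
size 4600
"""

info = parse_lfs_pointer(POINTER)
print(info["hash_algo"], info["size"])
```

Comparing digests across checkpoints is a quick way to spot unchanged files: `training_args.bin` carries the same `oid` in checkpoint-150, checkpoint-200, checkpoint-250 and checkpoint-300, so the training arguments file is byte-identical in all four.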

checkpoint-200/README.md ADDED

	@@ -0,0 +1,220 @@

+---
+library_name: peft
+base_model: mistralai/Mistral-7B-v0.1
+---
+# Model Card for Model ID
+<!-- Provide a quick summary of what the model is/does. -->
+## Model Details
+### Model Description
+<!-- Provide a longer summary of what this model is. -->
+- **Developed by:** [More Information Needed]
+- **Funded by [optional]:** [More Information Needed]
+- **Shared by [optional]:** [More Information Needed]
+- **Model type:** [More Information Needed]
+- **Language(s) (NLP):** [More Information Needed]
+- **License:** [More Information Needed]
+- **Finetuned from model [optional]:** [More Information Needed]
+### Model Sources [optional]
+<!-- Provide the basic links for the model. -->
+- **Repository:** [More Information Needed]
+- **Paper [optional]:** [More Information Needed]
+- **Demo [optional]:** [More Information Needed]
+## Uses
+<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+### Direct Use
+<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
+[More Information Needed]
+### Downstream Use [optional]
+<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
+[More Information Needed]
+### Out-of-Scope Use
+<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
+[More Information Needed]
+## Bias, Risks, and Limitations
+<!-- This section is meant to convey both technical and sociotechnical limitations. -->
+[More Information Needed]
+### Recommendations
+<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
+Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
+## How to Get Started with the Model
+Use the code below to get started with the model.
+[More Information Needed]
+## Training Details
+### Training Data
+<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+[More Information Needed]
+### Training Procedure
+<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+#### Preprocessing [optional]
+[More Information Needed]
+#### Training Hyperparameters
+- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+#### Speeds, Sizes, Times [optional]
+<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
+[More Information Needed]
+## Evaluation
+<!-- This section describes the evaluation protocols and provides the results. -->
+### Testing Data, Factors & Metrics
+#### Testing Data
+<!-- This should link to a Dataset Card if possible. -->
+[More Information Needed]
+#### Factors
+<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
+[More Information Needed]
+#### Metrics
+<!-- These are the evaluation metrics being used, ideally with a description of why. -->
+[More Information Needed]
+### Results
+[More Information Needed]
+#### Summary
+## Model Examination [optional]
+<!-- Relevant interpretability work for the model goes here -->
+[More Information Needed]
+## Environmental Impact
+<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
+Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+- **Hardware Type:** [More Information Needed]
+- **Hours used:** [More Information Needed]
+- **Cloud Provider:** [More Information Needed]
+- **Compute Region:** [More Information Needed]
+- **Carbon Emitted:** [More Information Needed]
+## Technical Specifications [optional]
+### Model Architecture and Objective
+[More Information Needed]
+### Compute Infrastructure
+[More Information Needed]
+#### Hardware
+[More Information Needed]
+#### Software
+[More Information Needed]
+## Citation [optional]
+<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+**BibTeX:**
+[More Information Needed]
+**APA:**
+[More Information Needed]
+## Glossary [optional]
+<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
+[More Information Needed]
+## More Information [optional]
+[More Information Needed]
+## Model Card Authors [optional]
+[More Information Needed]
+## Model Card Contact
+[More Information Needed]
+## Training procedure
+The following `bitsandbytes` quantization config was used during training:
+- quant_method: bitsandbytes
+- load_in_8bit: False
+- load_in_4bit: True
+- llm_int8_threshold: 6.0
+- llm_int8_skip_modules: None
+- llm_int8_enable_fp32_cpu_offload: False
+- llm_int8_has_fp16_weight: False
+- bnb_4bit_quant_type: nf4
+- bnb_4bit_use_double_quant: True
+- bnb_4bit_compute_dtype: bfloat16
+### Framework versions
+- PEFT 0.6.3.dev0

checkpoint-200/adapter_config.json ADDED

	@@ -0,0 +1,29 @@

+{
+  "alpha_pattern": {},
+  "auto_mapping": null,
+  "base_model_name_or_path": "mistralai/Mistral-7B-v0.1",
+  "bias": "none",
+  "fan_in_fan_out": false,
+  "inference_mode": true,
+  "init_lora_weights": true,
+  "layers_pattern": null,
+  "layers_to_transform": null,
+  "lora_alpha": 16,
+  "lora_dropout": 0.05,
+  "modules_to_save": null,
+  "peft_type": "LORA",
+  "r": 8,
+  "rank_pattern": {},
+  "revision": null,
+  "target_modules": [
+    "k_proj",
+    "gate_proj",
+    "lm_head",
+    "v_proj",
+    "up_proj",
+    "down_proj",
+    "q_proj",
+    "o_proj"
+  ],
+  "task_type": "CAUSAL_LM"
+}

checkpoint-200/adapter_model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:727ef48f57d38fda1a97e3cc9c25f9341f961bd8a996adc089592cc9835622bc
+size 85100592

checkpoint-200/optimizer.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:bff1bcaec0babeb8e55e682d9da623230c8e0c9aea5651775ad7240718d3d9c9
+size 43126684

checkpoint-200/rng_state.pth ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:f457ed62b714b4aba8d1b2432fdfc3a63a834912752b668d75a7da2e195a1587
+size 14244

checkpoint-200/scheduler.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:4f1b477d3bb44d9bf70633240462f7ac6e455d50eefacf5b2433c62e0cc9e80d
+size 1064

checkpoint-200/trainer_state.json ADDED

	@@ -0,0 +1,76 @@

+{
+  "best_metric": null,
+  "best_model_checkpoint": null,
+  "epoch": 0.31347962382445144,
+  "eval_steps": 50,
+  "global_step": 200,
+  "is_hyper_param_search": false,
+  "is_local_process_zero": true,
+  "is_world_process_zero": true,
+  "log_history": [
+    {
+      "epoch": 0.08,
+      "learning_rate": 2.3869346733668342e-05,
+      "loss": 0.7797,
+      "step": 50
+    },
+    {
+      "epoch": 0.08,
+      "eval_loss": 0.2723180055618286,
+      "eval_runtime": 135.6616,
+      "eval_samples_per_second": 5.263,
+      "eval_steps_per_second": 0.663,
+      "step": 50
+    },
+    {
+      "epoch": 0.16,
+      "learning_rate": 2.2613065326633167e-05,
+      "loss": 0.2457,
+      "step": 100
+    },
+    {
+      "epoch": 0.16,
+      "eval_loss": 0.22004182636737823,
+      "eval_runtime": 136.2348,
+      "eval_samples_per_second": 5.241,
+      "eval_steps_per_second": 0.661,
+      "step": 100
+    },
+    {
+      "epoch": 0.24,
+      "learning_rate": 2.135678391959799e-05,
+      "loss": 0.2088,
+      "step": 150
+    },
+    {
+      "epoch": 0.24,
+      "eval_loss": 0.19266444444656372,
+      "eval_runtime": 136.6465,
+      "eval_samples_per_second": 5.225,
+      "eval_steps_per_second": 0.659,
+      "step": 150
+    },
+    {
+      "epoch": 0.31,
+      "learning_rate": 2.0100502512562815e-05,
+      "loss": 0.1832,
+      "step": 200
+    },
+    {
+      "epoch": 0.31,
+      "eval_loss": 0.17922177910804749,
+      "eval_runtime": 136.7121,
+      "eval_samples_per_second": 5.223,
+      "eval_steps_per_second": 0.658,
+      "step": 200
+    }
+  ],
+  "logging_steps": 50,
+  "max_steps": 1000,
+  "num_input_tokens_seen": 0,
+  "num_train_epochs": 2,
+  "save_steps": 50,
+  "total_flos": 3.50548150714368e+16,
+  "trial_name": null,
+  "trial_params": null
+}
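
Several run-level quantities that the commit does not state directly can be recovered from the numbers in this file. The sketch below derives the number of optimizer steps per epoch from `epoch` and `global_step`, and the approximate evaluation set size from the step-200 evaluation entry. The variable names are illustrative, and the derived values are estimates based on rounding.

```python
# Values copied from checkpoint-200/trainer_state.json.
EPOCH = 0.31347962382445144
GLOBAL_STEP = 200
MAX_STEPS = 1000
NUM_TRAIN_EPOCHS = 2

# Step-200 evaluation entry from log_history.
EVAL_RUNTIME = 136.7121             # seconds
EVAL_SAMPLES_PER_SECOND = 5.223
EVAL_STEPS_PER_SECOND = 0.658

# epoch = global_step / steps_per_epoch, so invert it.
steps_per_epoch = round(GLOBAL_STEP / EPOCH)

# How far through the data the run gets before max_steps stops it.
epochs_at_max_steps = MAX_STEPS / steps_per_epoch

# Throughput times runtime gives the number of samples and batches evaluated.
eval_samples = round(EVAL_RUNTIME * EVAL_SAMPLES_PER_SECOND)
eval_batches = round(EVAL_RUNTIME * EVAL_STEPS_PER_SECOND)

print(steps_per_epoch, round(epochs_at_max_steps, 2), eval_samples, eval_batches)
```

This gives 638 steps per epoch, so the 1000-step cap ends the run after roughly 1.57 epochs, short of `num_train_epochs: 2`; in the `transformers` `Trainer`, a positive `max_steps` takes precedence over the epoch count. The evaluation figures suggest about 714 samples in 90 batches, which would fit an evaluation batch size of 8, though that is an inference.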

checkpoint-200/training_args.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:9f208e3c6bbc0ff595dc52e32a7309c9e57d7d78823b465b2b38edcf101eb89a
+size 4600

checkpoint-250/README.md ADDED

	@@ -0,0 +1,220 @@

+---
+library_name: peft
+base_model: mistralai/Mistral-7B-v0.1
+---
+# Model Card for Model ID
+<!-- Provide a quick summary of what the model is/does. -->
+## Model Details
+### Model Description
+<!-- Provide a longer summary of what this model is. -->
+- **Developed by:** [More Information Needed]
+- **Funded by [optional]:** [More Information Needed]
+- **Shared by [optional]:** [More Information Needed]
+- **Model type:** [More Information Needed]
+- **Language(s) (NLP):** [More Information Needed]
+- **License:** [More Information Needed]
+- **Finetuned from model [optional]:** [More Information Needed]
+### Model Sources [optional]
+<!-- Provide the basic links for the model. -->
+- **Repository:** [More Information Needed]
+- **Paper [optional]:** [More Information Needed]
+- **Demo [optional]:** [More Information Needed]
+## Uses
+<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+### Direct Use
+<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
+[More Information Needed]
+### Downstream Use [optional]
+<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
+[More Information Needed]
+### Out-of-Scope Use
+<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
+[More Information Needed]
+## Bias, Risks, and Limitations
+<!-- This section is meant to convey both technical and sociotechnical limitations. -->
+[More Information Needed]
+### Recommendations
+<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
+Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
+## How to Get Started with the Model
+Use the code below to get started with the model.
+[More Information Needed]
+## Training Details
+### Training Data
+<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+[More Information Needed]
+### Training Procedure
+<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+#### Preprocessing [optional]
+[More Information Needed]
+#### Training Hyperparameters
+- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+#### Speeds, Sizes, Times [optional]
+<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
+[More Information Needed]
+## Evaluation
+<!-- This section describes the evaluation protocols and provides the results. -->
+### Testing Data, Factors & Metrics
+#### Testing Data
+<!-- This should link to a Dataset Card if possible. -->
+[More Information Needed]
+#### Factors
+<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
+[More Information Needed]
+#### Metrics
+<!-- These are the evaluation metrics being used, ideally with a description of why. -->
+[More Information Needed]
+### Results
+[More Information Needed]
+#### Summary
+## Model Examination [optional]
+<!-- Relevant interpretability work for the model goes here -->
+[More Information Needed]
+## Environmental Impact
+<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
+Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+- **Hardware Type:** [More Information Needed]
+- **Hours used:** [More Information Needed]
+- **Cloud Provider:** [More Information Needed]
+- **Compute Region:** [More Information Needed]
+- **Carbon Emitted:** [More Information Needed]
+## Technical Specifications [optional]
+### Model Architecture and Objective
+[More Information Needed]
+### Compute Infrastructure
+[More Information Needed]
+#### Hardware
+[More Information Needed]
+#### Software
+[More Information Needed]
+## Citation [optional]
+<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+**BibTeX:**
+[More Information Needed]
+**APA:**
+[More Information Needed]
+## Glossary [optional]
+<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
+[More Information Needed]
+## More Information [optional]
+[More Information Needed]
+## Model Card Authors [optional]
+[More Information Needed]
+## Model Card Contact
+[More Information Needed]
+## Training procedure
+The following `bitsandbytes` quantization config was used during training:
+- quant_method: bitsandbytes
+- load_in_8bit: False
+- load_in_4bit: True
+- llm_int8_threshold: 6.0
+- llm_int8_skip_modules: None
+- llm_int8_enable_fp32_cpu_offload: False
+- llm_int8_has_fp16_weight: False
+- bnb_4bit_quant_type: nf4
+- bnb_4bit_use_double_quant: True
+- bnb_4bit_compute_dtype: bfloat16
+### Framework versions
+- PEFT 0.6.3.dev0

checkpoint-250/adapter_config.json ADDED

	@@ -0,0 +1,29 @@

+{
+  "alpha_pattern": {},
+  "auto_mapping": null,
+  "base_model_name_or_path": "mistralai/Mistral-7B-v0.1",
+  "bias": "none",
+  "fan_in_fan_out": false,
+  "inference_mode": true,
+  "init_lora_weights": true,
+  "layers_pattern": null,
+  "layers_to_transform": null,
+  "lora_alpha": 16,
+  "lora_dropout": 0.05,
+  "modules_to_save": null,
+  "peft_type": "LORA",
+  "r": 8,
+  "rank_pattern": {},
+  "revision": null,
+  "target_modules": [
+    "k_proj",
+    "gate_proj",
+    "lm_head",
+    "v_proj",
+    "up_proj",
+    "down_proj",
+    "q_proj",
+    "o_proj"
+  ],
+  "task_type": "CAUSAL_LM"
+}

checkpoint-250/adapter_model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:0ed08f577f810bfea2a625ae11c709cdfa654427fcce0a09e85e6fec516f73f5
+size 85100592

checkpoint-250/optimizer.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:f3bc2666ba548997464a7639f9dc6ecfd18172c99544643445cc9830bd28aa48
+size 43126684

checkpoint-250/rng_state.pth ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:3e90eec24f22ad8e38976f35fa28211eae70ff1aac715343277c0bc4b2839fa3
+size 14244

checkpoint-250/scheduler.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:09471b95cb193b326e2ae9278591cdf878ced8cb70ac85a4cb6b83f68d62fc51
+size 1064

checkpoint-250/trainer_state.json ADDED

	@@ -0,0 +1,90 @@

+{
+  "best_metric": null,
+  "best_model_checkpoint": null,
+  "epoch": 0.39184952978056425,
+  "eval_steps": 50,
+  "global_step": 250,
+  "is_hyper_param_search": false,
+  "is_local_process_zero": true,
+  "is_world_process_zero": true,
+  "log_history": [
+    {
+      "epoch": 0.08,
+      "learning_rate": 2.3869346733668342e-05,
+      "loss": 0.7797,
+      "step": 50
+    },
+    {
+      "epoch": 0.08,
+      "eval_loss": 0.2723180055618286,
+      "eval_runtime": 135.6616,
+      "eval_samples_per_second": 5.263,
+      "eval_steps_per_second": 0.663,
+      "step": 50
+    },
+    {
+      "epoch": 0.16,
+      "learning_rate": 2.2613065326633167e-05,
+      "loss": 0.2457,
+      "step": 100
+    },
+    {
+      "epoch": 0.16,
+      "eval_loss": 0.22004182636737823,
+      "eval_runtime": 136.2348,
+      "eval_samples_per_second": 5.241,
+      "eval_steps_per_second": 0.661,
+      "step": 100
+    },
+    {
+      "epoch": 0.24,
+      "learning_rate": 2.135678391959799e-05,
+      "loss": 0.2088,
+      "step": 150
+    },
+    {
+      "epoch": 0.24,
+      "eval_loss": 0.19266444444656372,
+      "eval_runtime": 136.6465,
+      "eval_samples_per_second": 5.225,
+      "eval_steps_per_second": 0.659,
+      "step": 150
+    },
+    {
+      "epoch": 0.31,
+      "learning_rate": 2.0100502512562815e-05,
+      "loss": 0.1832,
+      "step": 200
+    },
+    {
+      "epoch": 0.31,
+      "eval_loss": 0.17922177910804749,
+      "eval_runtime": 136.7121,
+      "eval_samples_per_second": 5.223,
+      "eval_steps_per_second": 0.658,
+      "step": 200
+    },
+    {
+      "epoch": 0.39,
+      "learning_rate": 1.884422110552764e-05,
+      "loss": 0.1754,
+      "step": 250
+    },
+    {
+      "epoch": 0.39,
+      "eval_loss": 0.17311859130859375,
+      "eval_runtime": 136.3058,
+      "eval_samples_per_second": 5.238,
+      "eval_steps_per_second": 0.66,
+      "step": 250
+    }
+  ],
+  "logging_steps": 50,
+  "max_steps": 1000,
+  "num_input_tokens_seen": 0,
+  "num_train_epochs": 2,
+  "save_steps": 50,
+  "total_flos": 4.3818518839296e+16,
+  "trial_name": null,
+  "trial_params": null
+}

checkpoint-250/training_args.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:9f208e3c6bbc0ff595dc52e32a7309c9e57d7d78823b465b2b38edcf101eb89a
+size 4600

checkpoint-300/README.md ADDED

	@@ -0,0 +1,220 @@

+---
+library_name: peft
+base_model: mistralai/Mistral-7B-v0.1
+---
+# Model Card for Model ID
+<!-- Provide a quick summary of what the model is/does. -->
+## Model Details
+### Model Description
+<!-- Provide a longer summary of what this model is. -->
+- **Developed by:** [More Information Needed]
+- **Funded by [optional]:** [More Information Needed]
+- **Shared by [optional]:** [More Information Needed]
+- **Model type:** [More Information Needed]
+- **Language(s) (NLP):** [More Information Needed]
+- **License:** [More Information Needed]
+- **Finetuned from model [optional]:** [More Information Needed]
+### Model Sources [optional]
+<!-- Provide the basic links for the model. -->
+- **Repository:** [More Information Needed]
+- **Paper [optional]:** [More Information Needed]
+- **Demo [optional]:** [More Information Needed]
+## Uses
+<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+### Direct Use
+<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
+[More Information Needed]
+### Downstream Use [optional]
+<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
+[More Information Needed]
+### Out-of-Scope Use
+<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
+[More Information Needed]
+## Bias, Risks, and Limitations
+<!-- This section is meant to convey both technical and sociotechnical limitations. -->
+[More Information Needed]
+### Recommendations
+<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
+Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
+## How to Get Started with the Model
+Use the code below to get started with the model.
+[More Information Needed]
+## Training Details
+### Training Data
+<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+[More Information Needed]
+### Training Procedure
+<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+#### Preprocessing [optional]
+[More Information Needed]
+#### Training Hyperparameters
+- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+#### Speeds, Sizes, Times [optional]
+<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
+[More Information Needed]
+## Evaluation
+<!-- This section describes the evaluation protocols and provides the results. -->
+### Testing Data, Factors & Metrics
+#### Testing Data
+<!-- This should link to a Dataset Card if possible. -->
+[More Information Needed]
+#### Factors
+<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
+[More Information Needed]
+#### Metrics
+<!-- These are the evaluation metrics being used, ideally with a description of why. -->
+[More Information Needed]
+### Results
+[More Information Needed]
+#### Summary
+## Model Examination [optional]
+<!-- Relevant interpretability work for the model goes here -->
+[More Information Needed]
+## Environmental Impact
+<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
+Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+- **Hardware Type:** [More Information Needed]
+- **Hours used:** [More Information Needed]
+- **Cloud Provider:** [More Information Needed]
+- **Compute Region:** [More Information Needed]
+- **Carbon Emitted:** [More Information Needed]
+## Technical Specifications [optional]
+### Model Architecture and Objective
+[More Information Needed]
+### Compute Infrastructure
+[More Information Needed]
+#### Hardware
+[More Information Needed]
+#### Software
+[More Information Needed]
+## Citation [optional]
+<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+**BibTeX:**
+[More Information Needed]
+**APA:**
+[More Information Needed]
+## Glossary [optional]
+<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
+[More Information Needed]
+## More Information [optional]
+[More Information Needed]
+## Model Card Authors [optional]
+[More Information Needed]
+## Model Card Contact
+[More Information Needed]
+## Training procedure
+The following `bitsandbytes` quantization config was used during training:
+- quant_method: bitsandbytes
+- load_in_8bit: False
+- load_in_4bit: True
+- llm_int8_threshold: 6.0
+- llm_int8_skip_modules: None
+- llm_int8_enable_fp32_cpu_offload: False
+- llm_int8_has_fp16_weight: False
+- bnb_4bit_quant_type: nf4
+- bnb_4bit_use_double_quant: True
+- bnb_4bit_compute_dtype: bfloat16
+### Framework versions
+- PEFT 0.6.3.dev0

checkpoint-300/adapter_config.json ADDED

	@@ -0,0 +1,29 @@

+{
+  "alpha_pattern": {},
+  "auto_mapping": null,
+  "base_model_name_or_path": "mistralai/Mistral-7B-v0.1",
+  "bias": "none",
+  "fan_in_fan_out": false,
+  "inference_mode": true,
+  "init_lora_weights": true,
+  "layers_pattern": null,
+  "layers_to_transform": null,
+  "lora_alpha": 16,
+  "lora_dropout": 0.05,
+  "modules_to_save": null,
+  "peft_type": "LORA",
+  "r": 8,
+  "rank_pattern": {},
+  "revision": null,
+  "target_modules": [
+    "k_proj",
+    "gate_proj",
+    "lm_head",
+    "v_proj",
+    "up_proj",
+    "down_proj",
+    "q_proj",
+    "o_proj"
+  ],
+  "task_type": "CAUSAL_LM"
+}

checkpoint-300/adapter_model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:e9dcad4faf20f41404b8cfead079476e1b9e12179561ce60578ab234a8eebc2d
+size 85100592

checkpoint-300/optimizer.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:3d36d7e08c0b28b1bf2a8b6580de32ebb04c5aa47ad21e5dc169f5b965a4ae42
+size 43127132

checkpoint-300/rng_state.pth ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:b609533938f675544d701f32c5dfd0943480eeae212bb01e28566ca924db586f
+size 14244

checkpoint-300/scheduler.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:25d0ec4220fe093365424ee63188b9cc5436640be7c2cb84202c87d53f32aeaf
+size 1064

checkpoint-300/trainer_state.json ADDED

	@@ -0,0 +1,104 @@

+{
+  "best_metric": null,
+  "best_model_checkpoint": null,
+  "epoch": 0.4702194357366771,
+  "eval_steps": 50,
+  "global_step": 300,
+  "is_hyper_param_search": false,
+  "is_local_process_zero": true,
+  "is_world_process_zero": true,
+  "log_history": [
+    {
+      "epoch": 0.08,
+      "learning_rate": 2.3869346733668342e-05,
+      "loss": 0.7797,
+      "step": 50
+    },
+    {
+      "epoch": 0.08,
+      "eval_loss": 0.2723180055618286,
+      "eval_runtime": 135.6616,
+      "eval_samples_per_second": 5.263,
+      "eval_steps_per_second": 0.663,
+      "step": 50
+    },
+    {
+      "epoch": 0.16,
+      "learning_rate": 2.2613065326633167e-05,
+      "loss": 0.2457,
+      "step": 100
+    },
+    {
+      "epoch": 0.16,
+      "eval_loss": 0.22004182636737823,
+      "eval_runtime": 136.2348,
+      "eval_samples_per_second": 5.241,
+      "eval_steps_per_second": 0.661,
+      "step": 100
+    },
+    {
+      "epoch": 0.24,
+      "learning_rate": 2.135678391959799e-05,
+      "loss": 0.2088,
+      "step": 150
+    },
+    {
+      "epoch": 0.24,
+      "eval_loss": 0.19266444444656372,
+      "eval_runtime": 136.6465,
+      "eval_samples_per_second": 5.225,
+      "eval_steps_per_second": 0.659,
+      "step": 150
+    },
+    {
+      "epoch": 0.31,
+      "learning_rate": 2.0100502512562815e-05,
+      "loss": 0.1832,
+      "step": 200
+    },
+    {
+      "epoch": 0.31,
+      "eval_loss": 0.17922177910804749,
+      "eval_runtime": 136.7121,
+      "eval_samples_per_second": 5.223,
+      "eval_steps_per_second": 0.658,
+      "step": 200
+    },
+    {
+      "epoch": 0.39,
+      "learning_rate": 1.884422110552764e-05,
+      "loss": 0.1754,
+      "step": 250
+    },
+    {
+      "epoch": 0.39,
+      "eval_loss": 0.17311859130859375,
+      "eval_runtime": 136.3058,
+      "eval_samples_per_second": 5.238,
+      "eval_steps_per_second": 0.66,
+      "step": 250
+    },
+    {
+      "epoch": 0.47,
+      "learning_rate": 1.7587939698492464e-05,
+      "loss": 0.169,
+      "step": 300
+    },
+    {
+      "epoch": 0.47,
+      "eval_loss": 0.16897280514240265,
+      "eval_runtime": 136.923,
+      "eval_samples_per_second": 5.215,
+      "eval_steps_per_second": 0.657,
+      "step": 300
+    }
+  ],
+  "logging_steps": 50,
+  "max_steps": 1000,
+  "num_input_tokens_seen": 0,
+  "num_train_epochs": 2,
+  "save_steps": 50,
+  "total_flos": 5.25822226071552e+16,
+  "trial_name": null,
+  "trial_params": null
+}
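
`log_history` interleaves two kinds of records: training entries carry a `loss` key and evaluation entries carry an `eval_loss` key, both indexed by `step`. The sketch below shows one way to separate them and pick the step with the lowest evaluation loss. The embedded data is a trimmed copy of the entries above (only `step`, `loss` and `eval_loss` are kept), and `split_log_history` is a hypothetical helper.

```python
import json

# Trimmed copy of log_history from checkpoint-300/trainer_state.json.
TRAINER_STATE = json.loads("""
{
  "log_history": [
    {"loss": 0.7797, "step": 50},
    {"eval_loss": 0.2723180055618286, "step": 50},
    {"loss": 0.2457, "step": 100},
    {"eval_loss": 0.22004182636737823, "step": 100},
    {"loss": 0.2088, "step": 150},
    {"eval_loss": 0.19266444444656372, "step": 150},
    {"loss": 0.1832, "step": 200},
    {"eval_loss": 0.17922177910804749, "step": 200},
    {"loss": 0.1754, "step": 250},
    {"eval_loss": 0.17311859130859375, "step": 250},
    {"loss": 0.169, "step": 300},
    {"eval_loss": 0.16897280514240265, "step": 300}
  ]
}
""")


def split_log_history(state):
    """Return two dicts keyed by step: training loss and evaluation loss."""
    train = {e["step"]: e["loss"] for e in state["log_history"] if "loss" in e}
    evals = {e["step"]: e["eval_loss"] for e in state["log_history"] if "eval_loss" in e}
    return train, evals


train_loss, eval_loss = split_log_history(TRAINER_STATE)
best_step = min(eval_loss, key=eval_loss.get)  # step with the lowest eval loss
print(best_step, eval_loss[best_step])
```

Up to step 300 the evaluation loss falls at every checkpoint, so the latest checkpoint is also the best one so far. The file itself reports `"best_metric": null`, presumably because best-model tracking was not enabled for the run.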

checkpoint-300/training_args.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:9f208e3c6bbc0ff595dc52e32a7309c9e57d7d78823b465b2b38edcf101eb89a
+size 4600

checkpoint-350/README.md ADDED

	@@ -0,0 +1,220 @@

+---
+library_name: peft
+base_model: mistralai/Mistral-7B-v0.1
+---
+# Model Card for Model ID
+<!-- Provide a quick summary of what the model is/does. -->
+## Model Details
+### Model Description
+<!-- Provide a longer summary of what this model is. -->
+- **Developed by:** [More Information Needed]
+- **Funded by [optional]:** [More Information Needed]
+- **Shared by [optional]:** [More Information Needed]
+- **Model type:** [More Information Needed]
+- **Language(s) (NLP):** [More Information Needed]
+- **License:** [More Information Needed]
+- **Finetuned from model [optional]:** [More Information Needed]
+### Model Sources [optional]
+<!-- Provide the basic links for the model. -->
+- **Repository:** [More Information Needed]
+- **Paper [optional]:** [More Information Needed]
+- **Demo [optional]:** [More Information Needed]
+## Uses
+<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+### Direct Use
+<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
+[More Information Needed]
+### Downstream Use [optional]
+<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
+[More Information Needed]
+### Out-of-Scope Use
+<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
+[More Information Needed]
+## Bias, Risks, and Limitations
+<!-- This section is meant to convey both technical and sociotechnical limitations. -->
+[More Information Needed]
+### Recommendations
+<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
+Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
+## How to Get Started with the Model
+Use the code below to get started with the model.
+[More Information Needed]
+## Training Details
+### Training Data
+<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+[More Information Needed]
+### Training Procedure
+<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+#### Preprocessing [optional]
+[More Information Needed]
+#### Training Hyperparameters
+- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+#### Speeds, Sizes, Times [optional]
+<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
+[More Information Needed]
+## Evaluation
+<!-- This section describes the evaluation protocols and provides the results. -->
+### Testing Data, Factors & Metrics
+#### Testing Data
+<!-- This should link to a Dataset Card if possible. -->
+[More Information Needed]
+#### Factors
+<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
+[More Information Needed]
+#### Metrics
+<!-- These are the evaluation metrics being used, ideally with a description of why. -->
+[More Information Needed]
+### Results
+[More Information Needed]
+#### Summary
+## Model Examination [optional]
+<!-- Relevant interpretability work for the model goes here -->
+[More Information Needed]
+## Environmental Impact
+<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
+Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+- **Hardware Type:** [More Information Needed]
+- **Hours used:** [More Information Needed]
+- **Cloud Provider:** [More Information Needed]
+- **Compute Region:** [More Information Needed]
+- **Carbon Emitted:** [More Information Needed]
+## Technical Specifications [optional]
+### Model Architecture and Objective
+[More Information Needed]
+### Compute Infrastructure
+[More Information Needed]
+#### Hardware
+[More Information Needed]
+#### Software
+[More Information Needed]
+## Citation [optional]
+<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+**BibTeX:**
+[More Information Needed]
+**APA:**
+[More Information Needed]
+## Glossary [optional]
+<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
+[More Information Needed]
+## More Information [optional]
+[More Information Needed]
+## Model Card Authors [optional]
+[More Information Needed]
+## Model Card Contact
+[More Information Needed]
+## Training procedure
+The following `bitsandbytes` quantization config was used during training:
+- quant_method: bitsandbytes
+- load_in_8bit: False
+- load_in_4bit: True
+- llm_int8_threshold: 6.0
+- llm_int8_skip_modules: None
+- llm_int8_enable_fp32_cpu_offload: False
+- llm_int8_has_fp16_weight: False
+- bnb_4bit_quant_type: nf4
+- bnb_4bit_use_double_quant: True
+- bnb_4bit_compute_dtype: bfloat16
+### Framework versions
+- PEFT 0.6.3.dev0
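
Note: the `bitsandbytes` settings listed in the README describe 4-bit NF4 quantization with double quantization and a bfloat16 compute dtype. The sketch below shows one way those values might be turned back into a `transformers.BitsAndBytesConfig`. It is an illustration under assumptions, not the author's training script: the names `QUANT_KWARGS` and `build_quant_config` are hypothetical, and the heavy imports are deferred so the settings can be inspected without `torch` or `transformers` installed.

```python
# Settings copied from the "Training procedure" list in the README diff above.
QUANT_KWARGS = {
    "load_in_8bit": False,
    "load_in_4bit": True,
    "llm_int8_threshold": 6.0,
    "llm_int8_skip_modules": None,
    "llm_int8_enable_fp32_cpu_offload": False,
    "llm_int8_has_fp16_weight": False,
    "bnb_4bit_quant_type": "nf4",
    "bnb_4bit_use_double_quant": True,
    "bnb_4bit_compute_dtype": "bfloat16",  # resolved to a torch dtype below
}


def build_quant_config():
    """Build a BitsAndBytesConfig from QUANT_KWARGS.

    Requires torch and transformers; they are imported lazily so that
    importing this module does not need them.
    """
    import torch
    from transformers import BitsAndBytesConfig

    kwargs = dict(QUANT_KWARGS)
    # Turn the dtype name from the README into the actual torch dtype object.
    kwargs["bnb_4bit_compute_dtype"] = getattr(torch, kwargs["bnb_4bit_compute_dtype"])
    return BitsAndBytesConfig(**kwargs)
```

The resulting object would typically be passed as `quantization_config=` when loading the base model named in the README front matter (`mistralai/Mistral-7B-v0.1`) before attaching the adapter.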

checkpoint-350/adapter_config.json ADDED

	@@ -0,0 +1,29 @@

+{
+  "alpha_pattern": {},
+  "auto_mapping": null,
+  "base_model_name_or_path": "mistralai/Mistral-7B-v0.1",
+  "bias": "none",
+  "fan_in_fan_out": false,
+  "inference_mode": true,
+  "init_lora_weights": true,
+  "layers_pattern": null,
+  "layers_to_transform": null,
+  "lora_alpha": 16,
+  "lora_dropout": 0.05,
+  "modules_to_save": null,
+  "peft_type": "LORA",
+  "r": 8,
+  "rank_pattern": {},
+  "revision": null,
+  "target_modules": [
+    "k_proj",
+    "gate_proj",
+    "lm_head",
+    "v_proj",
+    "up_proj",
+    "down_proj",
+    "q_proj",
+    "o_proj"
+  ],
+  "task_type": "CAUSAL_LM"
+}
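
Note: with `r` = 8 and `lora_alpha` = 16, the LoRA update is scaled by `lora_alpha / r` = 2.0, and each targeted linear layer gains `r * (in_features + out_features)` trainable parameters. The sketch below estimates the adapter size implied by this config. The layer shapes are an assumption taken from the published Mistral-7B-v0.1 architecture (hidden size 4096, MLP size 14336, 8 key/value heads of dimension 128, vocabulary 32000, 32 decoder layers); they do not appear in this diff. The function name `lora_param_count` is hypothetical.

```python
# Values copied from adapter_config.json above.
LORA_R = 8
LORA_ALPHA = 16
TARGET_MODULES = [
    "k_proj", "gate_proj", "lm_head", "v_proj",
    "up_proj", "down_proj", "q_proj", "o_proj",
]

# ASSUMPTION: shapes from the published Mistral-7B-v0.1 config, not from this diff.
HIDDEN = 4096
INTERMEDIATE = 14336
KV_DIM = 1024  # 8 key/value heads x 128 head dimension
VOCAB = 32000
NUM_LAYERS = 32

# (in_features, out_features) for each targeted linear layer.
SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
    "lm_head": (HIDDEN, VOCAB),
}
# Modules that occur once per model rather than once per decoder layer.
ONCE_PER_MODEL = {"lm_head"}


def lora_param_count(r: int, targets: list) -> int:
    """Count LoRA parameters: A is (r x in) and B is (out x r) per module."""
    total = 0
    for name in targets:
        fan_in, fan_out = SHAPES[name]
        copies = 1 if name in ONCE_PER_MODEL else NUM_LAYERS
        total += r * (fan_in + fan_out) * copies
    return total


scaling = LORA_ALPHA / LORA_R
trainable = lora_param_count(LORA_R, TARGET_MODULES)
print(f"scaling={scaling}, trainable LoRA parameters={trainable:,}")
```

Under these assumed shapes the estimate comes to roughly 21.3 million trainable parameters, a small fraction of the 7B base model, which is the usual motivation for targeting every projection plus `lm_head` at a low rank.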