ManjunathReddy/fine_tuning_llm_phi-2

ManjunathReddy committed on Dec 23, 2023

Commit 2a37b68

1 Parent(s): 58cf6ec

Delete checkpoint-960

Files changed (27)

checkpoint-960/checkpoint-960/README.md +0 -204
checkpoint-960/checkpoint-960/adapter_config.json +0 -28
checkpoint-960/checkpoint-960/adapter_config.json:Zone.Identifier +0 -0
checkpoint-960/checkpoint-960/adapter_model.safetensors +0 -3
checkpoint-960/checkpoint-960/adapter_model.safetensors:Zone.Identifier +0 -0
checkpoint-960/checkpoint-960/added_tokens.json +0 -40
checkpoint-960/checkpoint-960/added_tokens.json:Zone.Identifier +0 -0
checkpoint-960/checkpoint-960/merges.txt +0 -0
checkpoint-960/checkpoint-960/merges.txt:Zone.Identifier +0 -0
checkpoint-960/checkpoint-960/optimizer.pt +0 -3
checkpoint-960/checkpoint-960/optimizer.pt:Zone.Identifier +0 -0
checkpoint-960/checkpoint-960/rng_state.pth +0 -3
checkpoint-960/checkpoint-960/rng_state.pth:Zone.Identifier +0 -0
checkpoint-960/checkpoint-960/scheduler.pt +0 -3
checkpoint-960/checkpoint-960/scheduler.pt:Zone.Identifier +0 -0
checkpoint-960/checkpoint-960/special_tokens_map.json +0 -24
checkpoint-960/checkpoint-960/special_tokens_map.json:Zone.Identifier +0 -0
checkpoint-960/checkpoint-960/tokenizer.json +0 -0
checkpoint-960/checkpoint-960/tokenizer.json:Zone.Identifier +0 -0
checkpoint-960/checkpoint-960/tokenizer_config.json +0 -324
checkpoint-960/checkpoint-960/tokenizer_config.json:Zone.Identifier +0 -0
checkpoint-960/checkpoint-960/trainer_state.json +0 -597
checkpoint-960/checkpoint-960/trainer_state.json:Zone.Identifier +0 -0
checkpoint-960/checkpoint-960/training_args.bin +0 -3
checkpoint-960/checkpoint-960/training_args.bin:Zone.Identifier +0 -0
checkpoint-960/checkpoint-960/vocab.json +0 -0
checkpoint-960/checkpoint-960/vocab.json:Zone.Identifier +0 -0

checkpoint-960/checkpoint-960/README.md DELETED

@@ -1,204 +0,0 @@
----
-library_name: peft
-base_model: microsoft/phi-2
----
-# Model Card for Model ID
-<!-- Provide a quick summary of what the model is/does. -->
-## Model Details
-### Model Description
-<!-- Provide a longer summary of what this model is. -->
-- **Developed by:** [More Information Needed]
-- **Funded by [optional]:** [More Information Needed]
-- **Shared by [optional]:** [More Information Needed]
-- **Model type:** [More Information Needed]
-- **Language(s) (NLP):** [More Information Needed]
-- **License:** [More Information Needed]
-- **Finetuned from model [optional]:** [More Information Needed]
-### Model Sources [optional]
-<!-- Provide the basic links for the model. -->
-- **Repository:** [More Information Needed]
-- **Paper [optional]:** [More Information Needed]
-- **Demo [optional]:** [More Information Needed]
-## Uses
-<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
-### Direct Use
-<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
-[More Information Needed]
-### Downstream Use [optional]
-<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
-[More Information Needed]
-### Out-of-Scope Use
-<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
-[More Information Needed]
-## Bias, Risks, and Limitations
-<!-- This section is meant to convey both technical and sociotechnical limitations. -->
-[More Information Needed]
-### Recommendations
-<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
-Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
-## How to Get Started with the Model
-Use the code below to get started with the model.
-[More Information Needed]
-## Training Details
-### Training Data
-<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
-[More Information Needed]
-### Training Procedure
-<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-#### Preprocessing [optional]
-[More Information Needed]
-#### Training Hyperparameters
-- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
-#### Speeds, Sizes, Times [optional]
-<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
-[More Information Needed]
-## Evaluation
-<!-- This section describes the evaluation protocols and provides the results. -->
-### Testing Data, Factors & Metrics
-#### Testing Data
-<!-- This should link to a Dataset Card if possible. -->
-[More Information Needed]
-#### Factors
-<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
-[More Information Needed]
-#### Metrics
-<!-- These are the evaluation metrics being used, ideally with a description of why. -->
-[More Information Needed]
-### Results
-[More Information Needed]
-#### Summary
-## Model Examination [optional]
-<!-- Relevant interpretability work for the model goes here -->
-[More Information Needed]
-## Environmental Impact
-<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
-Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
-- **Hardware Type:** [More Information Needed]
-- **Hours used:** [More Information Needed]
-- **Cloud Provider:** [More Information Needed]
-- **Compute Region:** [More Information Needed]
-- **Carbon Emitted:** [More Information Needed]
-## Technical Specifications [optional]
-### Model Architecture and Objective
-[More Information Needed]
-### Compute Infrastructure
-[More Information Needed]
-#### Hardware
-[More Information Needed]
-#### Software
-[More Information Needed]
-## Citation [optional]
-<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
-**BibTeX:**
-[More Information Needed]
-**APA:**
-[More Information Needed]
-## Glossary [optional]
-<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
-[More Information Needed]
-## More Information [optional]
-[More Information Needed]
-## Model Card Authors [optional]
-[More Information Needed]
-## Model Card Contact
-[More Information Needed]
-### Framework versions
-- PEFT 0.7.2.dev0

checkpoint-960/checkpoint-960/adapter_config.json DELETED

@@ -1,28 +0,0 @@
-{
-  "alpha_pattern": {},
-  "auto_mapping": null,
-  "base_model_name_or_path": "microsoft/phi-2",
-  "bias": "none",
-  "fan_in_fan_out": false,
-  "inference_mode": true,
-  "init_lora_weights": true,
-  "layers_pattern": null,
-  "layers_to_transform": null,
-  "loftq_config": {},
-  "lora_alpha": 16,
-  "lora_dropout": 0.1,
-  "megatron_config": null,
-  "megatron_core": "megatron.core",
-  "modules_to_save": null,
-  "peft_type": "LORA",
-  "r": 64,
-  "rank_pattern": {},
-  "revision": null,
-  "target_modules": [
-    "Wqkv",
-    "fc2",
-    "fc1"
-  ],
-  "task_type": "CAUSAL_LM",
-  "use_rslora": false
-}
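
For orientation, the settings in this deleted config imply a LoRA scaling factor of `lora_alpha / r` (because `use_rslora` is false) and a trainable-parameter count that can be estimated from the three target modules. The sketch below is a back-of-the-envelope check and not part of the commit. The phi-2 layer shapes it uses (hidden size 2560, 32 blocks, fused `Wqkv` output of 7680, MLP width 10240) are assumptions based on the public phi-2 architecture, not values stated in this diff.

```python
# Values copied from the deleted adapter_config.json.
adapter_config = {
    "r": 64,
    "lora_alpha": 16,
    "use_rslora": False,
    "target_modules": ["Wqkv", "fc2", "fc1"],
}

# ASSUMPTION: phi-2 linear layer shapes as (in_features, out_features).
# These do not appear anywhere in this diff.
PHI2_SHAPES = {"Wqkv": (2560, 7680), "fc1": (2560, 10240), "fc2": (10240, 2560)}
PHI2_NUM_LAYERS = 32  # ASSUMPTION: number of transformer blocks in phi-2


def lora_scaling(cfg):
    """Standard LoRA scales the update by alpha / r; rsLoRA uses alpha / sqrt(r)."""
    r = cfg["r"]
    return cfg["lora_alpha"] / (r ** 0.5 if cfg["use_rslora"] else r)


def lora_param_count(cfg):
    """Each adapted linear layer adds r * (in_features + out_features) weights."""
    r = cfg["r"]
    per_layer = sum(
        r * (n_in + n_out)
        for n_in, n_out in (PHI2_SHAPES[name] for name in cfg["target_modules"])
    )
    return per_layer * PHI2_NUM_LAYERS


scaling = lora_scaling(adapter_config)
n_params = lora_param_count(adapter_config)
print(f"scaling={scaling}, params={n_params}, fp32 bytes={n_params * 4}")
```

Under these assumptions the adapter holds roughly 73.4 million parameters. At 4 bytes each that is close to the 293626136-byte size recorded for `adapter_model.safetensors`, which suggests (but does not prove) that the weights were saved in fp32.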

checkpoint-960/checkpoint-960/adapter_config.json:Zone.Identifier DELETED

File without changes

checkpoint-960/checkpoint-960/adapter_model.safetensors DELETED

@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:388ac43ab16da987a83f2d6668f704dd5f0abb0bf68a671c550aa7a66d09d14c
-size 293626136
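
The three removed lines are a Git LFS pointer, not the weights themselves: the repository stores only a spec version, a content hash, and a byte size. As a minimal sketch of reading such a pointer, assuming the simple `key value` line layout shown above (the helper name `parse_lfs_pointer` is hypothetical):

```python
def parse_lfs_pointer(text):
    """Split each 'key value' line of a Git LFS pointer into a small dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid field has the form '<hash algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer content copied from the diff, without the leading '-' markers.
pointer_text = """version https://git-lfs.github.com/spec/v1
oid sha256:388ac43ab16da987a83f2d6668f704dd5f0abb0bf68a671c550aa7a66d09d14c
size 293626136
"""

pointer = parse_lfs_pointer(pointer_text)
print(f"{pointer['hash_algo']} object, {pointer['size'] / 2**20:.1f} MiB")
```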

checkpoint-960/checkpoint-960/adapter_model.safetensors:Zone.Identifier DELETED

File without changes

checkpoint-960/checkpoint-960/added_tokens.json DELETED

@@ -1,40 +0,0 @@
-{
-  "\t\t": 50294,
-  "\t\t\t": 50293,
-  "\t\t\t\t": 50292,
-  "\t\t\t\t\t": 50291,
-  "\t\t\t\t\t\t": 50290,
-  "\t\t\t\t\t\t\t": 50289,
-  "\t\t\t\t\t\t\t\t": 50288,
-  "\t\t\t\t\t\t\t\t\t": 50287,
-  "  ": 50286,
-  "   ": 50285,
-  "    ": 50284,
-  "     ": 50283,
-  "      ": 50282,
-  "       ": 50281,
-  "        ": 50280,
-  "         ": 50279,
-  "          ": 50278,
-  "           ": 50277,
-  "            ": 50276,
-  "             ": 50275,
-  "              ": 50274,
-  "               ": 50273,
-  "                ": 50272,
-  "                 ": 50271,
-  "                  ": 50270,
-  "                   ": 50269,
-  "                    ": 50268,
-  "                     ": 50267,
-  "                      ": 50266,
-  "                       ": 50265,
-  "                        ": 50264,
-  "                         ": 50263,
-  "                          ": 50262,
-  "                           ": 50261,
-  "                            ": 50260,
-  "                             ": 50259,
-  "                              ": 50258,
-  "                               ": 50257
-}
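
The 38 entries in this file follow a regular pattern: runs of 2 to 9 tabs and runs of 2 to 31 spaces, each given its own token ID in the range 50257 to 50294. The sketch below regenerates the mapping from that pattern. The two formulas are inferred from the listed entries and are not taken from any tokenizer source code.

```python
def whitespace_tokens():
    """Rebuild the added whitespace tokens from the pattern seen in the diff."""
    tokens = {}
    # Runs of 2..31 spaces: 2 spaces -> 50286, down to 31 spaces -> 50257.
    for n in range(2, 32):
        tokens[" " * n] = 50288 - n
    # Runs of 2..9 tabs: 2 tabs -> 50294, down to 9 tabs -> 50287.
    for n in range(2, 10):
        tokens["\t" * n] = 50296 - n
    return tokens


added = whitespace_tokens()
print(len(added), min(added.values()), max(added.values()))
```

Dedicated tokens for long whitespace runs are typical of code-oriented tokenizers, since indentation would otherwise consume many tokens per line.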

checkpoint-960/checkpoint-960/added_tokens.json:Zone.Identifier DELETED

File without changes

checkpoint-960/checkpoint-960/merges.txt DELETED

The diff for this file is too large to render. See raw diff

checkpoint-960/checkpoint-960/merges.txt:Zone.Identifier DELETED

File without changes

checkpoint-960/checkpoint-960/optimizer.pt DELETED

@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:815b8fe252f9879b8df9e1742b840c9c07c9e01012c8d0bdd01dd42e0415f069
-size 587321658
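
As a quick piece of arithmetic, the optimizer state recorded here is almost exactly twice the size of the adapter weights (293626136 bytes in the `adapter_model.safetensors` pointer). That ratio would be consistent with an Adam-style optimizer keeping two moment tensors per trainable parameter in the same precision as the weights, but this is only an inference: the diff does not say which optimizer was used.

```python
# Byte sizes copied from the two Git LFS pointers in this diff.
adapter_bytes = 293626136    # adapter_model.safetensors
optimizer_bytes = 587321658  # optimizer.pt

ratio = optimizer_bytes / adapter_bytes
overhead_bytes = optimizer_bytes - 2 * adapter_bytes  # what is left beyond 2x

print(f"optimizer/adapter = {ratio:.4f}, extra beyond 2x = {overhead_bytes} bytes")
```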

checkpoint-960/checkpoint-960/optimizer.pt:Zone.Identifier DELETED

File without changes

checkpoint-960/checkpoint-960/rng_state.pth DELETED

@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:dc9ef34e1aecee7db389e914ca03447ab6ab633408231a36507f3dde62aa727b
-size 14244

checkpoint-960/checkpoint-960/rng_state.pth:Zone.Identifier DELETED

File without changes

checkpoint-960/checkpoint-960/scheduler.pt DELETED

@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:7df40c2372e995711cd2139259b0afbd1b583e873344e5a95064c06af0d88ab9
-size 1064

checkpoint-960/checkpoint-960/scheduler.pt:Zone.Identifier DELETED

File without changes

checkpoint-960/checkpoint-960/special_tokens_map.json DELETED

@@ -1,24 +0,0 @@
-{
-  "bos_token": {
-    "content": "<|endoftext|>",
-    "lstrip": false,
-    "normalized": false,
-    "rstrip": false,
-    "single_word": false
-  },
-  "eos_token": {
-    "content": "<|endoftext|>",
-    "lstrip": false,
-    "normalized": false,
-    "rstrip": false,
-    "single_word": false
-  },
-  "pad_token": "<|endoftext|>",
-  "unk_token": {
-    "content": "<|endoftext|>",
-    "lstrip": false,
-    "normalized": false,
-    "rstrip": false,
-    "single_word": false
-  }
-}
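
In this map the padding token is the same string as the end-of-sequence token, `<|endoftext|>` (ID 50256 according to `tokenizer_config.json` in the same checkpoint). One practical consequence is that padding cannot be told apart from a genuine end-of-sequence by token ID alone. The sketch below illustrates building loss labels from the attention mask instead. The token IDs other than 50256 are made up for illustration, and `-100` is the default `ignore_index` of PyTorch's cross-entropy loss.

```python
IGNORE_INDEX = -100  # positions with this label are skipped by the loss
EOS_ID = 50256       # <|endoftext|>, which is also the pad token here


def build_labels(input_ids, attention_mask):
    """Keep labels where the mask is 1 and ignore padded positions."""
    return [
        tok if keep == 1 else IGNORE_INDEX
        for tok, keep in zip(input_ids, attention_mask)
    ]


# Hypothetical batch row: two content tokens, a real EOS, then two pad tokens.
input_ids = [15496, 995, EOS_ID, EOS_ID, EOS_ID]
attention_mask = [1, 1, 1, 0, 0]

labels = build_labels(input_ids, attention_mask)

# Masking by token ID instead would also hide the real EOS from the loss.
naive_labels = [tok if tok != EOS_ID else IGNORE_INDEX for tok in input_ids]

print(labels)
print(naive_labels)
```

With ID-based masking the model never receives a training signal for emitting the end-of-sequence token, which can show up later as generations that do not stop.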

checkpoint-960/checkpoint-960/special_tokens_map.json:Zone.Identifier DELETED

File without changes

checkpoint-960/checkpoint-960/tokenizer.json DELETED

The diff for this file is too large to render. See raw diff

checkpoint-960/checkpoint-960/tokenizer.json:Zone.Identifier DELETED

File without changes

checkpoint-960/checkpoint-960/tokenizer_config.json DELETED

@@ -1,324 +0,0 @@
-{
-  "add_prefix_space": false,
-  "added_tokens_decoder": {
-    "50256": {
-      "content": "<|endoftext|>",
-      "lstrip": false,
-      "normalized": false,
-      "rstrip": false,
-      "single_word": false,
-      "special": true
-    },
-    "50257": {
-      "content": "                               ",
-      "lstrip": false,
-      "normalized": true,
-      "rstrip": false,
-      "single_word": false,
-      "special": false
-    },
-    "50258": {
-      "content": "                              ",
-      "lstrip": false,
-      "normalized": true,
-      "rstrip": false,
-      "single_word": false,
-      "special": false
-    },
-    "50259": {
-      "content": "                             ",
-      "lstrip": false,
-      "normalized": true,
-      "rstrip": false,
-      "single_word": false,
-      "special": false
-    },
-    "50260": {
-      "content": "                            ",
-      "lstrip": false,
-      "normalized": true,
-      "rstrip": false,
-      "single_word": false,
-      "special": false
-    },
-    "50261": {
-      "content": "                           ",
-      "lstrip": false,
-      "normalized": true,
-      "rstrip": false,
-      "single_word": false,
-      "special": false
-    },
-    "50262": {
-      "content": "                          ",
-      "lstrip": false,
-      "normalized": true,
-      "rstrip": false,
-      "single_word": false,
-      "special": false
-    },
-    "50263": {
-      "content": "                         ",
-      "lstrip": false,
-      "normalized": true,
-      "rstrip": false,
-      "single_word": false,
-      "special": false
-    },
-    "50264": {
-      "content": "                        ",
-      "lstrip": false,
-      "normalized": true,
-      "rstrip": false,
-      "single_word": false,
-      "special": false
-    },
-    "50265": {
-      "content": "                       ",
-      "lstrip": false,
-      "normalized": true,
-      "rstrip": false,
-      "single_word": false,
-      "special": false
-    },
-    "50266": {
-      "content": "                      ",
-      "lstrip": false,
-      "normalized": true,
-      "rstrip": false,
-      "single_word": false,
-      "special": false
-    },
-    "50267": {
-      "content": "                     ",
-      "lstrip": false,
-      "normalized": true,
-      "rstrip": false,
-      "single_word": false,
-      "special": false
-    },
-    "50268": {
-      "content": "                    ",
-      "lstrip": false,
-      "normalized": true,
-      "rstrip": false,
-      "single_word": false,
-      "special": false
-    },
-    "50269": {
-      "content": "                   ",
-      "lstrip": false,
-      "normalized": true,
-      "rstrip": false,
-      "single_word": false,
-      "special": false
-    },
-    "50270": {
-      "content": "                  ",
-      "lstrip": false,
-      "normalized": true,
-      "rstrip": false,
-      "single_word": false,
-      "special": false
-    },
-    "50271": {
-      "content": "                 ",
-      "lstrip": false,
-      "normalized": true,
-      "rstrip": false,
-      "single_word": false,
-      "special": false
-    },
-    "50272": {
-      "content": "                ",
-      "lstrip": false,
-      "normalized": true,
-      "rstrip": false,
-      "single_word": false,
-      "special": false
-    },
-    "50273": {
-      "content": "               ",
-      "lstrip": false,
-      "normalized": true,
-      "rstrip": false,
-      "single_word": false,
-      "special": false
-    },
-    "50274": {
-      "content": "              ",
-      "lstrip": false,
-      "normalized": true,
-      "rstrip": false,
-      "single_word": false,
-      "special": false
-    },
-    "50275": {
-      "content": "             ",
-      "lstrip": false,
-      "normalized": true,
-      "rstrip": false,
-      "single_word": false,
-      "special": false
-    },
-    "50276": {
-      "content": "            ",
-      "lstrip": false,
-      "normalized": true,
-      "rstrip": false,
-      "single_word": false,
-      "special": false
-    },
-    "50277": {
-      "content": "           ",
-      "lstrip": false,
-      "normalized": true,
-      "rstrip": false,
-      "single_word": false,
-      "special": false
-    },
-    "50278": {
-      "content": "          ",
-      "lstrip": false,
-      "normalized": true,
-      "rstrip": false,
-      "single_word": false,
-      "special": false
-    },
-    "50279": {
-      "content": "         ",
-      "lstrip": false,
-      "normalized": true,
-      "rstrip": false,
-      "single_word": false,
-      "special": false
-    },
-    "50280": {
-      "content": "        ",
-      "lstrip": false,
-      "normalized": true,
-      "rstrip": false,
-      "single_word": false,
-      "special": false
-    },
-    "50281": {
-      "content": "       ",
-      "lstrip": false,
-      "normalized": true,
-      "rstrip": false,
-      "single_word": false,
-      "special": false
-    },
-    "50282": {
-      "content": "      ",
-      "lstrip": false,
-      "normalized": true,
-      "rstrip": false,
-      "single_word": false,
-      "special": false
-    },
-    "50283": {
-      "content": "     ",
-      "lstrip": false,
-      "normalized": true,
-      "rstrip": false,
-      "single_word": false,
-      "special": false
-    },
-    "50284": {
-      "content": "    ",
-      "lstrip": false,
-      "normalized": true,
-      "rstrip": false,
-      "single_word": false,
-      "special": false
-    },
-    "50285": {
-      "content": "   ",
-      "lstrip": false,
-      "normalized": true,
-      "rstrip": false,
-      "single_word": false,
-      "special": false
-    },
-    "50286": {
-      "content": "  ",
-      "lstrip": false,
-      "normalized": true,
-      "rstrip": false,
-      "single_word": false,
-      "special": false
-    },
-    "50287": {
-      "content": "\t\t\t\t\t\t\t\t\t",
-      "lstrip": false,
-      "normalized": true,
-      "rstrip": false,
-      "single_word": false,
-      "special": false
-    },
-    "50288": {
-      "content": "\t\t\t\t\t\t\t\t",
-      "lstrip": false,
-      "normalized": true,
-      "rstrip": false,
-      "single_word": false,
-      "special": false
-    },
-    "50289": {
-      "content": "\t\t\t\t\t\t\t",
-      "lstrip": false,
-      "normalized": true,
-      "rstrip": false,
-      "single_word": false,
-      "special": false
-    },
-    "50290": {
-      "content": "\t\t\t\t\t\t",
-      "lstrip": false,
-      "normalized": true,
-      "rstrip": false,
-      "single_word": false,
-      "special": false
-    },
-    "50291": {
-      "content": "\t\t\t\t\t",
-      "lstrip": false,
-      "normalized": true,
-      "rstrip": false,
-      "single_word": false,
-      "special": false
-    },
-    "50292": {
-      "content": "\t\t\t\t",
-      "lstrip": false,
-      "normalized": true,
-      "rstrip": false,
-      "single_word": false,
-      "special": false
-    },
-    "50293": {
-      "content": "\t\t\t",
-      "lstrip": false,
-      "normalized": true,
-      "rstrip": false,
-      "single_word": false,
-      "special": false
-    },
-    "50294": {
-      "content": "\t\t",
-      "lstrip": false,
-      "normalized": true,
-      "rstrip": false,
-      "single_word": false,
-      "special": false
-    }
-  },
-  "bos_token": "<|endoftext|>",
-  "clean_up_tokenization_spaces": true,
-  "eos_token": "<|endoftext|>",
-  "model_max_length": 2048,
-  "pad_token": "<|endoftext|>",
-  "tokenizer_class": "CodeGenTokenizer",
-  "unk_token": "<|endoftext|>"
-}

checkpoint-960/checkpoint-960/tokenizer_config.json:Zone.Identifier DELETED

File without changes

checkpoint-960/checkpoint-960/trainer_state.json DELETED

@@ -1,597 +0,0 @@
-{
-  "best_metric": null,
-  "best_model_checkpoint": null,
-  "epoch": 0.8546628088137103,
-  "eval_steps": 500,
-  "global_step": 960,
-  "is_hyper_param_search": false,
-  "is_local_process_zero": true,
-  "is_world_process_zero": true,
-  "log_history": [
-    {
-      "epoch": 0.01,
-      "learning_rate": 0.0002,
-      "loss": 1.727,
-      "step": 10
-    },
-    {
-      "epoch": 0.02,
-      "learning_rate": 0.0002,
-      "loss": 1.7292,
-      "step": 20
-    },
-    {
-      "epoch": 0.03,
-      "learning_rate": 0.0002,
-      "loss": 1.7416,
-      "step": 30
-    },
-    {
-      "epoch": 0.04,
-      "learning_rate": 0.0002,
-      "loss": 2.0158,
-      "step": 40
-    },
-    {
-      "epoch": 0.04,
-      "learning_rate": 0.0002,
-      "loss": 2.143,
-      "step": 50
-    },
-    {
-      "epoch": 0.05,
-      "learning_rate": 0.0002,
-      "loss": 1.6469,
-      "step": 60
-    },
-    {
-      "epoch": 0.06,
-      "learning_rate": 0.0002,
-      "loss": 1.6423,
-      "step": 70
-    },
-    {
-      "epoch": 0.07,
-      "learning_rate": 0.0002,
-      "loss": 1.6669,
-      "step": 80
-    },
-    {
-      "epoch": 0.08,
-      "learning_rate": 0.0002,
-      "loss": 1.919,
-      "step": 90
-    },
-    {
-      "epoch": 0.09,
-      "learning_rate": 0.0002,
-      "loss": 2.1557,
-      "step": 100
-    },
-    {
-      "epoch": 0.1,
-      "learning_rate": 0.0002,
-      "loss": 1.5989,
-      "step": 110
-    },
-    {
-      "epoch": 0.11,
-      "learning_rate": 0.0002,
-      "loss": 1.6912,
-      "step": 120
-    },
-    {
-      "epoch": 0.12,
-      "learning_rate": 0.0002,
-      "loss": 1.565,
-      "step": 130
-    },
-    {
-      "epoch": 0.12,
-      "learning_rate": 0.0002,
-      "loss": 1.9146,
-      "step": 140
-    },
-    {
-      "epoch": 0.13,
-      "learning_rate": 0.0002,
-      "loss": 2.089,
-      "step": 150
-    },
-    {
-      "epoch": 0.14,
-      "learning_rate": 0.0002,
-      "loss": 1.5365,
-      "step": 160
-    },
-    {
-      "epoch": 0.15,
-      "learning_rate": 0.0002,
-      "loss": 1.6268,
-      "step": 170
-    },
-    {
-      "epoch": 0.16,
-      "learning_rate": 0.0002,
-      "loss": 1.6888,
-      "step": 180
-    },
-    {
-      "epoch": 0.17,
-      "learning_rate": 0.0002,
-      "loss": 1.9569,
-      "step": 190
-    },
-    {
-      "epoch": 0.18,
-      "learning_rate": 0.0002,
-      "loss": 2.1153,
-      "step": 200
-    },
-    {
-      "epoch": 0.19,
-      "learning_rate": 0.0002,
-      "loss": 1.5511,
-      "step": 210
-    },
-    {
-      "epoch": 0.2,
-      "learning_rate": 0.0002,
-      "loss": 1.6054,
-      "step": 220
-    },
-    {
-      "epoch": 0.2,
-      "learning_rate": 0.0002,
-      "loss": 1.5985,
-      "step": 230
-    },
-    {
-      "epoch": 0.21,
-      "learning_rate": 0.0002,
-      "loss": 1.8473,
-      "step": 240
-    },
-    {
-      "epoch": 0.22,
-      "learning_rate": 0.0002,
-      "loss": 1.9949,
-      "step": 250
-    },
-    {
-      "epoch": 0.23,
-      "learning_rate": 0.0002,
-      "loss": 1.5932,
-      "step": 260
-    },
-    {
-      "epoch": 0.24,
-      "learning_rate": 0.0002,
-      "loss": 1.5968,
-      "step": 270
-    },
-    {
-      "epoch": 0.25,
-      "learning_rate": 0.0002,
-      "loss": 1.5577,
-      "step": 280
-    },
-    {
-      "epoch": 0.26,
-      "learning_rate": 0.0002,
-      "loss": 1.943,
-      "step": 290
-    },
-    {
-      "epoch": 0.27,
-      "learning_rate": 0.0002,
-      "loss": 2.1429,
-      "step": 300
-    },
-    {
-      "epoch": 0.28,
-      "learning_rate": 0.0002,
-      "loss": 1.5053,
-      "step": 310
-    },
-    {
-      "epoch": 0.28,
-      "learning_rate": 0.0002,
-      "loss": 1.6697,
-      "step": 320
-    },
-    {
-      "epoch": 0.29,
-      "learning_rate": 0.0002,
-      "loss": 1.567,
-      "step": 330
-    },
-    {
-      "epoch": 0.3,
-      "learning_rate": 0.0002,
-      "loss": 1.8498,
-      "step": 340
-    },
-    {
-      "epoch": 0.31,
-      "learning_rate": 0.0002,
-      "loss": 2.0758,
-      "step": 350
-    },
-    {
-      "epoch": 0.32,
-      "learning_rate": 0.0002,
-      "loss": 1.5627,
-      "step": 360
-    },
-    {
-      "epoch": 0.33,
-      "learning_rate": 0.0002,
-      "loss": 1.5358,
-      "step": 370
-    },
-    {
-      "epoch": 0.34,
-      "learning_rate": 0.0002,
-      "loss": 1.7167,
-      "step": 380
-    },
-    {
-      "epoch": 0.35,
-      "learning_rate": 0.0002,
-      "loss": 1.9932,
-      "step": 390
-    },
-    {
-      "epoch": 0.36,
-      "learning_rate": 0.0002,
-      "loss": 2.0632,
-      "step": 400
-    },
-    {
-      "epoch": 0.37,
-      "learning_rate": 0.0002,
-      "loss": 1.5227,
-      "step": 410
-    },
-    {
-      "epoch": 0.37,
-      "learning_rate": 0.0002,
-      "loss": 1.6192,
-      "step": 420
-    },
-    {
-      "epoch": 0.38,
-      "learning_rate": 0.0002,
-      "loss": 1.6087,
-      "step": 430
-    },
-    {
-      "epoch": 0.39,
-      "learning_rate": 0.0002,
-      "loss": 1.9396,
-      "step": 440
-    },
-    {
-      "epoch": 0.4,
-      "learning_rate": 0.0002,
-      "loss": 2.1026,
-      "step": 450
-    },
-    {
-      "epoch": 0.41,
-      "learning_rate": 0.0002,
-      "loss": 1.625,
-      "step": 460
-    },
-    {
-      "epoch": 0.42,
-      "learning_rate": 0.0002,
-      "loss": 1.5609,
-      "step": 470
-    },
-    {
-      "epoch": 0.43,
-      "learning_rate": 0.0002,
-      "loss": 1.542,
-      "step": 480
-    },
-    {
-      "epoch": 0.44,
-      "learning_rate": 0.0002,
-      "loss": 1.9439,
-      "step": 490
-    },
-    {
-      "epoch": 0.45,
-      "learning_rate": 0.0002,
-      "loss": 2.0576,
-      "step": 500
-    },
-    {
-      "epoch": 0.45,
-      "learning_rate": 0.0002,
-      "loss": 1.5826,
-      "step": 510
-    },
-    {
-      "epoch": 0.46,
-      "learning_rate": 0.0002,
-      "loss": 1.6084,
-      "step": 520
-    },
-    {
-      "epoch": 0.47,
-      "learning_rate": 0.0002,
-      "loss": 1.6451,
-      "step": 530
-    },
-    {
-      "epoch": 0.48,
-      "learning_rate": 0.0002,
-      "loss": 1.8193,
-      "step": 540
-    },
-    {
-      "epoch": 0.49,
-      "learning_rate": 0.0002,
-      "loss": 2.0495,
-      "step": 550
-    },
-    {
-      "epoch": 0.5,
-      "learning_rate": 0.0002,
-      "loss": 1.6288,
-      "step": 560
-    },
-    {
-      "epoch": 0.51,
-      "learning_rate": 0.0002,
-      "loss": 1.5786,
-      "step": 570
-    },
-    {
-      "epoch": 0.52,
-      "learning_rate": 0.0002,
-      "loss": 1.6585,
-      "step": 580
-    },
-    {
-      "epoch": 0.53,
-      "learning_rate": 0.0002,
-      "loss": 1.9045,
-      "step": 590
-    },
-    {
-      "epoch": 0.53,
-      "learning_rate": 0.0002,
-      "loss": 2.0053,
-      "step": 600
-    },
-    {
-      "epoch": 0.54,
-      "learning_rate": 0.0002,
-      "loss": 1.5491,
-      "step": 610
-    },
-    {
-      "epoch": 0.55,
-      "learning_rate": 0.0002,
-      "loss": 1.6077,
-      "step": 620
-    },
-    {
-      "epoch": 0.56,
-      "learning_rate": 0.0002,
-      "loss": 1.6346,
-      "step": 630
-    },
-    {
-      "epoch": 0.57,
-      "learning_rate": 0.0002,
-      "loss": 1.9414,
-      "step": 640
-    },
-    {
-      "epoch": 0.58,
-      "learning_rate": 0.0002,
-      "loss": 2.1672,
-      "step": 650
-    },
-    {
-      "epoch": 0.59,
-      "learning_rate": 0.0002,
-      "loss": 1.5987,
-      "step": 660
-    },
-    {
-      "epoch": 0.6,
-      "learning_rate": 0.0002,
-      "loss": 1.5833,
-      "step": 670
-    },
-    {
-      "epoch": 0.61,
-      "learning_rate": 0.0002,
-      "loss": 1.5564,
-      "step": 680
-    },
-    {
-      "epoch": 0.61,
-      "learning_rate": 0.0002,
-      "loss": 1.8669,
-      "step": 690
-    },
-    {
-      "epoch": 0.62,
-      "learning_rate": 0.0002,
-      "loss": 2.1056,
-      "step": 700
-    },
-    {
-      "epoch": 0.63,
-      "learning_rate": 0.0002,
-      "loss": 1.5699,
-      "step": 710
-    },
-    {
-      "epoch": 0.64,
-      "learning_rate": 0.0002,
-      "loss": 1.6089,
-      "step": 720
-    },
-    {
-      "epoch": 0.65,
-      "learning_rate": 0.0002,
-      "loss": 1.6524,
-      "step": 730
-    },
-    {
-      "epoch": 0.66,
-      "learning_rate": 0.0002,
-      "loss": 1.821,
-      "step": 740
-    },
-    {
-      "epoch": 0.67,
-      "learning_rate": 0.0002,
-      "loss": 2.0973,
-      "step": 750
-    },
-    {
-      "epoch": 0.68,
-      "learning_rate": 0.0002,
-      "loss": 1.5607,
-      "step": 760
-    },
-    {
-      "epoch": 0.69,
-      "learning_rate": 0.0002,
-      "loss": 1.5428,
-      "step": 770
-    },
-    {
-      "epoch": 0.69,
-      "learning_rate": 0.0002,
-      "loss": 1.5836,
-      "step": 780
-    },
-    {
-      "epoch": 0.7,
-      "learning_rate": 0.0002,
-      "loss": 1.9913,
-      "step": 790
-    },
-    {
-      "epoch": 0.71,
-      "learning_rate": 0.0002,
-      "loss": 1.9762,
-      "step": 800
-    },
-    {
-      "epoch": 0.72,
-      "learning_rate": 0.0002,
-      "loss": 1.5603,
-      "step": 810
-    },
-    {
-      "epoch": 0.73,
-      "learning_rate": 0.0002,
-      "loss": 1.591,
-      "step": 820
-    },
-    {
-      "epoch": 0.74,
-      "learning_rate": 0.0002,
-      "loss": 1.486,
-      "step": 830
-    },
-    {
-      "epoch": 0.75,
-      "learning_rate": 0.0002,
-      "loss": 1.8912,
-      "step": 840
-    },
-    {
-      "epoch": 0.76,
-      "learning_rate": 0.0002,
-      "loss": 1.9843,
-      "step": 850
-    },
-    {
-      "epoch": 0.77,
-      "learning_rate": 0.0002,
-      "loss": 1.6012,
-      "step": 860
-    },
-    {
-      "epoch": 0.77,
-      "learning_rate": 0.0002,
-      "loss": 1.6078,
-      "step": 870
-    },
-    {
-      "epoch": 0.78,
-      "learning_rate": 0.0002,
-      "loss": 1.587,
-      "step": 880
-    },
-    {
-      "epoch": 0.79,
-      "learning_rate": 0.0002,
-      "loss": 1.9116,
-      "step": 890
-    },
-    {
-      "epoch": 0.8,
-      "learning_rate": 0.0002,
-      "loss": 2.0363,
-      "step": 900
-    },
-    {
-      "epoch": 0.81,
-      "learning_rate": 0.0002,
-      "loss": 1.5466,
-      "step": 910
-    },
-    {
-      "epoch": 0.82,
-      "learning_rate": 0.0002,
-      "loss": 1.604,
-      "step": 920
-    },
-    {
-      "epoch": 0.83,
-      "learning_rate": 0.0002,
-      "loss": 1.6372,
-      "step": 930
-    },
-    {
-      "epoch": 0.84,
-      "learning_rate": 0.0002,
-      "loss": 1.9387,
-      "step": 940
-    },
-    {
-      "epoch": 0.85,
-      "learning_rate": 0.0002,
-      "loss": 1.9897,
-      "step": 950
-    },
-    {
-      "epoch": 0.85,
-      "learning_rate": 0.0002,
-      "loss": 1.5116,
-      "step": 960
-    }
-  ],
-  "logging_steps": 10,
-  "max_steps": 1000,
-  "num_input_tokens_seen": 0,
-  "num_train_epochs": 1,
-  "save_steps": 10,
-  "total_flos": 4.998476659212288e+16,
-  "train_batch_size": 4,
-  "trial_name": null,
-  "trial_params": null
-}
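
Two things can be read off this trainer state with simple arithmetic: how many optimizer steps make up one epoch, and the fact that the logged loss rises and falls on a 50-step cycle. The sketch below works both out from values copied from the file. It uses only the first ten log entries for brevity, and it makes no claim about why the cycle occurs, since the diff gives no information about data ordering.

```python
from collections import defaultdict

# Scalar fields copied from the deleted trainer_state.json.
state = {"epoch": 0.8546628088137103, "global_step": 960, "max_steps": 1000}

# Steps per epoch implied by the step/epoch pair at this checkpoint.
steps_per_epoch = state["global_step"] / state["epoch"]
remaining_steps = state["max_steps"] - state["global_step"]

# First ten (step, loss) pairs from log_history.
log_history = [
    (10, 1.727), (20, 1.7292), (30, 1.7416), (40, 2.0158), (50, 2.143),
    (60, 1.6469), (70, 1.6423), (80, 1.6669), (90, 1.919), (100, 2.1557),
]

# Group the losses by their position inside a 50-step window.
by_phase = defaultdict(list)
for step, loss in log_history:
    by_phase[step % 50].append(loss)

phase_means = {phase: sum(v) / len(v) for phase, v in by_phase.items()}
worst_phase = max(phase_means, key=phase_means.get)

print(f"steps per epoch ~ {steps_per_epoch:.2f}, remaining steps = {remaining_steps}")
for phase in sorted(phase_means):
    print(f"step % 50 == {phase:2d}: mean loss {phase_means[phase]:.4f}")
```

Because the learning rate is constant at 0.0002 throughout the log, the cycle is not explained by the schedule. Comparing checkpoints by a single logged loss value would therefore be misleading without averaging over at least one full 50-step window.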

checkpoint-960/checkpoint-960/trainer_state.json:Zone.Identifier DELETED

File without changes

checkpoint-960/checkpoint-960/training_args.bin DELETED

@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:4176ab35784e945d30c8bfe54747313b50577ba0295d87426a328dc2f10a584d
-size 4728

checkpoint-960/checkpoint-960/training_args.bin:Zone.Identifier DELETED

File without changes

checkpoint-960/checkpoint-960/vocab.json DELETED

The diff for this file is too large to render. See raw diff

checkpoint-960/checkpoint-960/vocab.json:Zone.Identifier DELETED

File without changes