Model save

Files changed (9)

README.md +136 -0
best/config.json +17 -0
best/generation_config.json +4 -0
best/model.safetensors +3 -0
best/training_args.bin +3 -0
config.json +17 -0
generation_config.json +4 -0
model.safetensors +3 -0
training_args.bin +3 -0

README.md ADDED

	@@ -0,0 +1,136 @@

+---
+tags:
+- generated_from_trainer
+model-index:
+- name: junk
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# junk
+This model is a fine-tuned version of [](https://huggingface.co/) on the None dataset.
+It achieves the following results on the evaluation set:
+- Loss: 8.1252
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 0.0001
+- train_batch_size: 32
+- eval_batch_size: 32
+- seed: 42
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- lr_scheduler_warmup_steps: 30
+- num_epochs: 100
+- mixed_precision_training: Native AMP
+### Training results
+| Training Loss | Epoch | Step | Validation Loss |
+|:-------------:|:-----:|:----:|:---------------:|
+| 10.42         | 1.25  | 5    | 10.1940         |
+| 10.1087       | 2.5   | 10   | 9.7539          |
+| 9.7572        | 3.75  | 15   | 9.4707          |
+| 9.5321        | 5.0   | 20   | 9.2852          |
+| 9.13          | 6.25  | 25   | 9.1155          |
+| 8.9989        | 7.5   | 30   | 8.9138          |
+| 8.7422        | 8.75  | 35   | 8.7181          |
+| 8.5133        | 10.0  | 40   | 8.5220          |
+| 8.0836        | 11.25 | 45   | 8.3687          |
+| 7.8212        | 12.5  | 50   | 8.2344          |
+| 7.6616        | 13.75 | 55   | 8.1437          |
+| 7.4743        | 15.0  | 60   | 8.0750          |
+| 7.1668        | 16.25 | 65   | 8.0275          |
+| 7.0485        | 17.5  | 70   | 7.9937          |
+| 6.9619        | 18.75 | 75   | 7.9525          |
+| 6.8705        | 20.0  | 80   | 7.9584          |
+| 6.6232        | 21.25 | 85   | 7.9238          |
+| 6.6423        | 22.5  | 90   | 7.9155          |
+| 6.5876        | 23.75 | 95   | 7.9088          |
+| 6.5075        | 25.0  | 100  | 7.9154          |
+| 6.4218        | 26.25 | 105  | 7.8957          |
+| 6.2857        | 27.5  | 110  | 7.9040          |
+| 6.1833        | 28.75 | 115  | 7.9092          |
+| 6.1263        | 30.0  | 120  | 7.9198          |
+| 6.0123        | 31.25 | 125  | 7.9103          |
+| 5.9111        | 32.5  | 130  | 7.9150          |
+| 5.9157        | 33.75 | 135  | 7.9178          |
+| 5.8237        | 35.0  | 140  | 7.9479          |
+| 5.6626        | 36.25 | 145  | 7.9358          |
+| 5.657         | 37.5  | 150  | 7.9548          |
+| 5.5894        | 38.75 | 155  | 7.9572          |
+| 5.5157        | 40.0  | 160  | 7.9800          |
+| 5.4606        | 41.25 | 165  | 7.9481          |
+| 5.2962        | 42.5  | 170  | 7.9568          |
+| 5.2877        | 43.75 | 175  | 7.9720          |
+| 5.2395        | 45.0  | 180  | 7.9709          |
+| 5.1394        | 46.25 | 185  | 7.9900          |
+| 5.0096        | 47.5  | 190  | 8.0010          |
+| 4.9646        | 48.75 | 195  | 8.0105          |
+| 4.973         | 50.0  | 200  | 8.0182          |
+| 4.866         | 51.25 | 205  | 8.0310          |
+| 4.8044        | 52.5  | 210  | 8.0372          |
+| 4.7804        | 53.75 | 215  | 8.0387          |
+| 4.7187        | 55.0  | 220  | 8.0166          |
+| 4.6399        | 56.25 | 225  | 8.0598          |
+| 4.6644        | 57.5  | 230  | 8.0465          |
+| 4.5318        | 58.75 | 235  | 8.0482          |
+| 4.4451        | 60.0  | 240  | 8.0538          |
+| 4.4442        | 61.25 | 245  | 8.0473          |
+| 4.3778        | 62.5  | 250  | 8.0517          |
+| 4.4453        | 63.75 | 255  | 8.0740          |
+| 4.3813        | 65.0  | 260  | 8.0658          |
+| 4.2654        | 66.25 | 265  | 8.0764          |
+| 4.2278        | 67.5  | 270  | 8.0737          |
+| 4.2212        | 68.75 | 275  | 8.0952          |
+| 4.1481        | 70.0  | 280  | 8.0877          |
+| 4.162         | 71.25 | 285  | 8.0882          |
+| 4.077         | 72.5  | 290  | 8.0813          |
+| 4.0134        | 73.75 | 295  | 8.0862          |
+| 3.9975        | 75.0  | 300  | 8.0980          |
+| 3.9174        | 76.25 | 305  | 8.0989          |
+| 3.9748        | 77.5  | 310  | 8.0903          |
+| 3.9362        | 78.75 | 315  | 8.1109          |
+| 3.8585        | 80.0  | 320  | 8.1049          |
+| 3.8832        | 81.25 | 325  | 8.1076          |
+| 3.8799        | 82.5  | 330  | 8.1078          |
+| 3.8354        | 83.75 | 335  | 8.1073          |
+| 3.8073        | 85.0  | 340  | 8.1182          |
+| 3.8701        | 86.25 | 345  | 8.1179          |
+| 3.7696        | 87.5  | 350  | 8.1204          |
+| 3.7907        | 88.75 | 355  | 8.1187          |
+| 3.7428        | 90.0  | 360  | 8.1172          |
+| 3.7048        | 91.25 | 365  | 8.1201          |
+| 3.724         | 92.5  | 370  | 8.1205          |
+| 3.7308        | 93.75 | 375  | 8.1191          |
+| 3.7665        | 95.0  | 380  | 8.1211          |
+| 3.6804        | 96.25 | 385  | 8.1244          |
+| 3.6001        | 97.5  | 390  | 8.1220          |
+| 3.6411        | 98.75 | 395  | 8.1245          |
+| 3.6321        | 100.0 | 400  | 8.1252          |
+### Framework versions
+- Transformers 4.40.2
+- Pytorch 2.3.0
+- Datasets 2.19.1
+- Tokenizers 0.19.1
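The training results table in the README above can be summarized with a few lines of Python. The sketch below copies the 80 validation losses from the table, assumes one evaluation every 5 steps (as the Step column shows), and derives the step with the lowest validation loss and the number of optimizer steps per epoch. The variable names are illustrative and not part of the commit.

```python
# Validation losses copied from the README table, in row order (steps 5, 10, ..., 400).
val_losses = [
    10.1940, 9.7539, 9.4707, 9.2852, 9.1155, 8.9138, 8.7181, 8.5220,
    8.3687, 8.2344, 8.1437, 8.0750, 8.0275, 7.9937, 7.9525, 7.9584,
    7.9238, 7.9155, 7.9088, 7.9154, 7.8957, 7.9040, 7.9092, 7.9198,
    7.9103, 7.9150, 7.9178, 7.9479, 7.9358, 7.9548, 7.9572, 7.9800,
    7.9481, 7.9568, 7.9720, 7.9709, 7.9900, 8.0010, 8.0105, 8.0182,
    8.0310, 8.0372, 8.0387, 8.0166, 8.0598, 8.0465, 8.0482, 8.0538,
    8.0473, 8.0517, 8.0740, 8.0658, 8.0764, 8.0737, 8.0952, 8.0877,
    8.0882, 8.0813, 8.0862, 8.0980, 8.0989, 8.0903, 8.1109, 8.1049,
    8.1076, 8.1078, 8.1073, 8.1182, 8.1179, 8.1204, 8.1187, 8.1172,
    8.1201, 8.1205, 8.1191, 8.1211, 8.1244, 8.1220, 8.1245, 8.1252,
]

EVAL_EVERY = 5  # the table logs one row every 5 steps
steps = [EVAL_EVERY * (i + 1) for i in range(len(val_losses))]

# Index of the row with the lowest validation loss.
best_idx = min(range(len(val_losses)), key=val_losses.__getitem__)
best_step = steps[best_idx]
best_loss = val_losses[best_idx]
final_loss = val_losses[-1]

# The first row pairs step 5 with epoch 1.25, so one epoch is 4 steps.
steps_per_epoch = 5 / 1.25
best_epoch = best_step / steps_per_epoch

print(f"best: step {best_step} (epoch {best_epoch}), val loss {best_loss}")
print(f"final: step {steps[-1]}, val loss {final_loss}")
```

The validation loss bottoms out well before the end of the run and then drifts upward while the training loss keeps falling, which is the usual signature of overfitting. With 4 steps per epoch and `train_batch_size: 32`, the training set would hold at most 128 examples, assuming a single device and no gradient accumulation (the README lists neither).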

best/config.json ADDED

	@@ -0,0 +1,17 @@

+{
+  "architectures": [
+    "Transformer"
+  ],
+  "d_model": 512,
+  "dim_feedforward": 2048,
+  "dropout": 0.1,
+  "input_dim": 30000,
+  "max_seq_len": 2000,
+  "model_type": "transformer",
+  "nhead": 8,
+  "num_decoder_layers": 6,
+  "num_encoder_layers": 6,
+  "output_dim": 30000,
+  "torch_dtype": "float32",
+  "transformers_version": "4.40.2"
+}
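The key names in this config (`d_model`, `nhead`, `num_encoder_layers`, `num_decoder_layers`, `dim_feedforward`) match the constructor arguments of `torch.nn.Transformer`. As a rough sketch, and assuming the model's encoder-decoder stack follows that standard layout (biased projections, two LayerNorms per encoder layer, three per decoder layer, and one final LayerNorm on each side), the parameter count of the stack can be worked out by hand. The model class itself is not part of this commit, so the layout is an assumption, and the helper names below are hypothetical.

```python
# Values copied from best/config.json.
config = {
    "d_model": 512,
    "dim_feedforward": 2048,
    "nhead": 8,
    "num_encoder_layers": 6,
    "num_decoder_layers": 6,
    "input_dim": 30000,
    "output_dim": 30000,
}


def attention_params(d_model: int) -> int:
    # Query, key, value and output projections: four d_model x d_model matrices with biases.
    return 4 * (d_model * d_model + d_model)


def feedforward_params(d_model: int, dim_ff: int) -> int:
    # Two linear layers: d_model -> dim_ff and dim_ff -> d_model, each with a bias.
    return (d_model * dim_ff + dim_ff) + (dim_ff * d_model + d_model)


def layernorm_params(d_model: int) -> int:
    # One weight vector and one bias vector.
    return 2 * d_model


def stack_params(cfg: dict) -> int:
    """Parameters in the encoder-decoder stack, ASSUMING the standard layout."""
    d, ff = cfg["d_model"], cfg["dim_feedforward"]
    encoder_layer = attention_params(d) + feedforward_params(d, ff) + 2 * layernorm_params(d)
    # A decoder layer has self-attention plus cross-attention, and a third LayerNorm.
    decoder_layer = 2 * attention_params(d) + feedforward_params(d, ff) + 3 * layernorm_params(d)
    final_norms = 2 * layernorm_params(d)  # one after the encoder, one after the decoder
    return (
        cfg["num_encoder_layers"] * encoder_layer
        + cfg["num_decoder_layers"] * decoder_layer
        + final_norms
    )


stack = stack_params(config)
# Size of a single vocabulary-sized embedding table (input_dim x d_model).
embedding_table = config["input_dim"] * config["d_model"]

print(f"encoder-decoder stack: {stack:,} parameters")
print(f"one embedding table:   {embedding_table:,} parameters")
```

Dropout adds no parameters, and `nhead` does not change the count because the heads split the same `d_model`-wide projections. Embeddings, any positional table, and the output projection come on top of the stack figure, and how many of those the model has cannot be read from the config alone.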

best/generation_config.json ADDED

	@@ -0,0 +1,4 @@

+{
+  "_from_model_config": true,
+  "transformers_version": "4.40.2"
+}

best/model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:8ee9037c31b021b86986350df09da26fe152902095d49e0fbbefbadcb70b0deb
+size 303680184
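The three lines above are a Git LFS pointer, not the weights themselves: the repository stores only the spec version, the SHA-256 of the real file, and its size in bytes. A minimal sketch of reading such a pointer and turning the size into an upper bound on the parameter count follows. It relies on the `"torch_dtype": "float32"` entry in the config, which means 4 bytes per parameter, and `parse_lfs_pointer` is a hypothetical helper name.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    fields["size"] = int(fields["size"])  # size is given in bytes
    return fields


pointer = parse_lfs_pointer(
    """version https://git-lfs.github.com/spec/v1
oid sha256:8ee9037c31b021b86986350df09da26fe152902095d49e0fbbefbadcb70b0deb
size 303680184
"""
)

BYTES_PER_FLOAT32 = 4
# Upper bound: the file also contains a small safetensors header, so the
# real parameter count is slightly lower than this figure.
max_float32_params = pointer["size"] // BYTES_PER_FLOAT32

print(f"file size: {pointer['size']:,} bytes")
print(f"at most {max_float32_params:,} float32 parameters")
```

The bound comes out at roughly 75.9 million parameters, or about 304 MB on disk.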

best/training_args.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:59f5bc51a6f9c3d21eb1957d594b098aa6c474540f980f553c7d3f1713f307b3
+size 5112

config.json ADDED

	@@ -0,0 +1,17 @@

+{
+  "architectures": [
+    "Transformer"
+  ],
+  "d_model": 512,
+  "dim_feedforward": 2048,
+  "dropout": 0.1,
+  "input_dim": 30000,
+  "max_seq_len": 2000,
+  "model_type": "transformer",
+  "nhead": 8,
+  "num_decoder_layers": 6,
+  "num_encoder_layers": 6,
+  "output_dim": 30000,
+  "torch_dtype": "float32",
+  "transformers_version": "4.40.2"
+}

generation_config.json ADDED

	@@ -0,0 +1,4 @@

+{
+  "_from_model_config": true,
+  "transformers_version": "4.40.2"
+}

model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:8ee9037c31b021b86986350df09da26fe152902095d49e0fbbefbadcb70b0deb
+size 303680184

training_args.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:59f5bc51a6f9c3d21eb1957d594b098aa6c474540f980f553c7d3f1713f307b3
+size 5112
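Because a Git LFS `oid` is the SHA-256 of the file contents, two pointers with the same `oid` refer to byte-identical files. The short sketch below groups the four LFS-tracked files in this commit by `oid` to show which ones are duplicates. The dictionary is transcribed by hand from the pointers above.

```python
from collections import defaultdict

# path -> oid, copied from the LFS pointers in this commit.
pointers = {
    "best/model.safetensors": "sha256:8ee9037c31b021b86986350df09da26fe152902095d49e0fbbefbadcb70b0deb",
    "best/training_args.bin": "sha256:59f5bc51a6f9c3d21eb1957d594b098aa6c474540f980f553c7d3f1713f307b3",
    "model.safetensors": "sha256:8ee9037c31b021b86986350df09da26fe152902095d49e0fbbefbadcb70b0deb",
    "training_args.bin": "sha256:59f5bc51a6f9c3d21eb1957d594b098aa6c474540f980f553c7d3f1713f307b3",
}

# Group paths that share the same content hash.
paths_by_oid = defaultdict(list)
for path, oid in pointers.items():
    paths_by_oid[oid].append(path)

duplicates = [sorted(paths) for paths in paths_by_oid.values() if len(paths) > 1]

for group in duplicates:
    print("identical content:", ", ".join(group))
```

Both pairs match, so the weights under `best/` are the same bytes as the weights at the repository root. The diff alone does not say which training step those weights come from. The README reports the final-step loss of 8.1252, while the lowest validation loss in its table is 7.8957 at step 105.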