End of training
README.md
CHANGED
@@ -18,11 +18,11 @@ should probably proofread and complete it, then remove this comment. -->

 This model is a fine-tuned version of [google-t5/t5-small](https://huggingface.co/google-t5/t5-small) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 2.
-- Rouge1: 0.
-- Rouge2: 0.
-- Rougel: 0.
-- Rougelsum: 0.
 - Gen Len: 20.0

 ## Model description
@@ -43,22 +43,48 @@ More information needed

 The following hyperparameters were used during training:
 - learning_rate: 2e-05
-- train_batch_size:
-- eval_batch_size:
 - seed: 42
 - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
-- num_epochs:
 - mixed_precision_training: Native AMP

 ### Training results

 | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
 |:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|
-| No log | 1.0 |
-| No log | 2.0 |
-| No log | 3.0 |
-| No log | 4.0 |


 ### Framework versions

 This model is a fine-tuned version of [google-t5/t5-small](https://huggingface.co/google-t5/t5-small) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 2.2923
+- Rouge1: 0.1987
+- Rouge2: 0.0971
+- Rougel: 0.1702
+- Rougelsum: 0.1701
 - Gen Len: 20.0

 ## Model description

 The following hyperparameters were used during training:
 - learning_rate: 2e-05
+- train_batch_size: 32
+- eval_batch_size: 32
 - seed: 42
 - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
+- num_epochs: 30
 - mixed_precision_training: Native AMP
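
To make the `lr_scheduler_type: linear` entry concrete, the sketch below computes the learning rate a linear-decay schedule would give at a chosen optimizer step. It assumes zero warmup steps (the card lists none) and takes the total of 930 steps from the last row of the training results in this commit. `linear_lr` is a hypothetical helper written for illustration, not the Trainer's own code.

```python
def linear_lr(step: int,
              base_lr: float = 2e-5,      # learning_rate from the card
              total_steps: int = 930,     # final Step value in the results table
              warmup_steps: int = 0) -> float:  # assumption: no warmup
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        # Ramp up from 0 to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly so the rate reaches 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 465, 930):
        print(step, linear_lr(step))
```

Under these assumptions the rate starts at the full 2e-05, is halved at the midpoint of the run (step 465, the end of epoch 15), and reaches zero at step 930.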

 ### Training results

 | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
 |:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|
+| No log | 1.0 | 31 | 2.5664 | 0.1535 | 0.0599 | 0.1259 | 0.126 | 20.0 |
+| No log | 2.0 | 62 | 2.5187 | 0.1742 | 0.0706 | 0.1446 | 0.1446 | 20.0 |
+| No log | 3.0 | 93 | 2.4849 | 0.1909 | 0.0835 | 0.1607 | 0.1606 | 20.0 |
+| No log | 4.0 | 124 | 2.4579 | 0.197 | 0.0876 | 0.1651 | 0.1651 | 20.0 |
+| No log | 5.0 | 155 | 2.4365 | 0.1955 | 0.086 | 0.1636 | 0.1634 | 20.0 |
+| No log | 6.0 | 186 | 2.4185 | 0.1969 | 0.0877 | 0.1655 | 0.1654 | 20.0 |
+| No log | 7.0 | 217 | 2.4042 | 0.1975 | 0.0894 | 0.1669 | 0.1667 | 20.0 |
+| No log | 8.0 | 248 | 2.3883 | 0.1967 | 0.089 | 0.1665 | 0.1664 | 20.0 |
+| No log | 9.0 | 279 | 2.3775 | 0.1969 | 0.0903 | 0.1672 | 0.1671 | 20.0 |
+| No log | 10.0 | 310 | 2.3660 | 0.1977 | 0.0913 | 0.1683 | 0.1684 | 20.0 |
+| No log | 11.0 | 341 | 2.3555 | 0.1976 | 0.0919 | 0.1687 | 0.1687 | 20.0 |
+| No log | 12.0 | 372 | 2.3491 | 0.198 | 0.092 | 0.1682 | 0.1682 | 20.0 |
+| No log | 13.0 | 403 | 2.3410 | 0.1987 | 0.0943 | 0.1692 | 0.1691 | 20.0 |
+| No log | 14.0 | 434 | 2.3360 | 0.1998 | 0.0957 | 0.1703 | 0.1702 | 20.0 |
+| No log | 15.0 | 465 | 2.3286 | 0.1998 | 0.0952 | 0.1706 | 0.1706 | 20.0 |
+| No log | 16.0 | 496 | 2.3226 | 0.1993 | 0.095 | 0.1703 | 0.1704 | 20.0 |
+| 2.4711 | 17.0 | 527 | 2.3194 | 0.1992 | 0.0959 | 0.1707 | 0.1707 | 20.0 |
+| 2.4711 | 18.0 | 558 | 2.3147 | 0.199 | 0.0958 | 0.1708 | 0.1708 | 20.0 |
+| 2.4711 | 19.0 | 589 | 2.3114 | 0.1987 | 0.0962 | 0.1707 | 0.1708 | 20.0 |
+| 2.4711 | 20.0 | 620 | 2.3076 | 0.199 | 0.0956 | 0.1704 | 0.1703 | 20.0 |
+| 2.4711 | 21.0 | 651 | 2.3041 | 0.1986 | 0.0963 | 0.1698 | 0.1698 | 20.0 |
+| 2.4711 | 22.0 | 682 | 2.3012 | 0.1993 | 0.0969 | 0.1707 | 0.1706 | 20.0 |
+| 2.4711 | 23.0 | 713 | 2.2982 | 0.1993 | 0.0968 | 0.1704 | 0.1704 | 20.0 |
+| 2.4711 | 24.0 | 744 | 2.2975 | 0.1991 | 0.0965 | 0.1704 | 0.1704 | 20.0 |
+| 2.4711 | 25.0 | 775 | 2.2968 | 0.1988 | 0.0965 | 0.1701 | 0.17 | 20.0 |
+| 2.4711 | 26.0 | 806 | 2.2951 | 0.1983 | 0.0965 | 0.1701 | 0.1699 | 20.0 |
+| 2.4711 | 27.0 | 837 | 2.2935 | 0.1986 | 0.0973 | 0.1704 | 0.1702 | 20.0 |
+| 2.4711 | 28.0 | 868 | 2.2927 | 0.1987 | 0.0971 | 0.1703 | 0.1702 | 20.0 |
+| 2.4711 | 29.0 | 899 | 2.2925 | 0.1987 | 0.0971 | 0.1702 | 0.1701 | 20.0 |
+| 2.4711 | 30.0 | 930 | 2.2923 | 0.1987 | 0.0971 | 0.1702 | 0.1701 | 20.0 |
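
The Step column above can be cross-checked against the hyperparameters with a little arithmetic. The sketch below is one reading of the table, not something the card states: it assumes a single device, no gradient accumulation, and a final partial batch that is kept, and it assumes the training loss is logged every 500 steps (`logging_steps` is not listed in the card, so that value is a guess based on the Trainer default).

```python
import math

train_batch_size = 32   # from the card
num_epochs = 30         # from the card
steps_per_epoch = 31    # Step value at epoch 1.0 in the table
logging_steps = 500     # assumption: not stated in the card

# Total optimizer steps over the whole run.
total_steps = steps_per_epoch * num_epochs

# 31 batches of 32 means 30 full batches plus 1 to 32 further examples.
min_examples = (steps_per_epoch - 1) * train_batch_size + 1
max_examples = steps_per_epoch * train_batch_size

# First epoch-end evaluation that falls after the first logged training loss.
first_logged_epoch = math.ceil(logging_steps / steps_per_epoch)

print(total_steps, min_examples, max_examples, first_logged_epoch)
# prints: 930 961 992 17
```

Under these assumptions the training split holds between 961 and 992 examples, and the "No log" entries through epoch 16 (step 496) are consistent with the first training-loss value being recorded at step 500. The same 2.4711 then repeats in every later row because a second log at step 1000 would fall after the run ends at step 930.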


 ### Framework versions
runs/Feb14_20-38-08_cbe6401c379d/events.out.tfevents.1739565491.cbe6401c379d.3312.5
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
 version https://git-lfs.github.com/spec/v1
+oid sha256:a51e6fa417b4d5b1c429b545d12b6e5fe90d3769199c8482e754abb704f1a822
+size 22278
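
The TensorBoard event file is stored through Git LFS, so the diff shows only a three-line pointer (spec version, content hash, size in bytes) rather than the file itself. As a minimal sketch of how such a pointer can be read, the hypothetical helper `parse_lfs_pointer` below splits each line into a key and a value; it handles only the three keys shown in this diff.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        # Each line is a key, one space, then the value.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form '<hash algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file in bytes
    }


pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:a51e6fa417b4d5b1c429b545d12b6e5fe90d3769199c8482e754abb704f1a822
size 22278
"""

info = parse_lfs_pointer(pointer)
print(info["hash_algo"], info["size"])
```

The parsed size (22278 bytes) describes the event file that the pointer stands in for, not the pointer text itself.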