Update README.md
README.md
CHANGED
@@ -25,14 +25,13 @@ model-index:
 - name: Rouge1
   type: rouge
   value: 42.3629
+library_name: transformers
+pipeline_tag: summarization
 ---
 
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
-
 # tFINE-base-300m-samsum
 
-
+An example fine-tune of [pszemraj/tFINE-base-300m](https://hf.co/pszemraj/tFINE-base-300m) for summarization using the samsum dataset.
 It achieves the following results on the evaluation set:
 - Loss: 1.9820
 - Rouge1: 42.3629
@@ -41,17 +40,8 @@ It achieves the following results on the evaluation set:
 - Rougelsum: 38.7792
 - Gen Len: 27.8033
 
-
-
-More information needed
-
-## Intended uses & limitations
-
-More information needed
-
-## Training and evaluation data
-
-More information needed
+> [!NOTE]
+> The base model was pre-trained with CTX 1024 and fine-tuned on samsum with 1024 CTX inputs.
 
 ## Training procedure
 
@@ -71,6 +61,9 @@ The following hyperparameters were used during training:
 
 ### Training results
 
+> keep epoch 3 checkpt as final
+
+
 | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
 |:-------------:|:------:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
 | 1.9528 | 0.9989 | 115 | 1.9189 | 40.093 | 18.2018 | 33.9749 | 36.9071 | 29.3333 |
@@ -78,10 +71,3 @@ The following hyperparameters were used during training:
 | 1.1696 | 2.9967 | 345 | 1.9820 | 42.3629 | 18.4285 | 34.6339 | 38.7792 | 27.8033 |
 | 0.9359 | 3.9957 | 460 | 2.1588 | 41.2237 | 17.8161 | 33.7101 | 37.9569 | 30.18 |
 
-
-### Framework versions
-
-- Transformers 4.44.0
-- Pytorch 2.2.0+cu121
-- Datasets 2.20.0
-- Tokenizers 0.19.1
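The added note keeps the epoch 3 checkpoint as final, but the card does not say which criterion drove that choice. As a rough sketch, the rows visible in this diff can be ranked by metric to see which criterion would agree with it. The helper `best_checkpoint` and the dictionary keys are hypothetical names introduced for illustration, and the table row that the diff elides between steps 115 and 345 is left out rather than guessed.

```python
# Rows copied from the "Training results" table as shown in this diff.
# The row between steps 115 and 345 is elided in the diff, so it is absent here too.
rows = [
    {"epoch": 0.9989, "step": 115, "val_loss": 1.9189, "rouge1": 40.093},
    {"epoch": 2.9967, "step": 345, "val_loss": 1.9820, "rouge1": 42.3629},
    {"epoch": 3.9957, "step": 460, "val_loss": 2.1588, "rouge1": 41.2237},
]


def best_checkpoint(rows, metric, higher_is_better=True):
    """Return the row with the best value of `metric` (hypothetical helper)."""
    pick = max if higher_is_better else min
    return pick(rows, key=lambda row: row[metric])


# Assumption: ROUGE-1 is the selection metric; the card does not state this.
by_rouge1 = best_checkpoint(rows, "rouge1")
by_loss = best_checkpoint(rows, "val_loss", higher_is_better=False)

print(by_rouge1["step"], by_loss["step"])  # prints "345 115"
```

Among the visible rows, ranking by Rouge1 selects step 345 (epoch 2.9967), which matches both the note and the headline metrics in the card, while ranking by validation loss would select step 115 instead.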
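The card's note on the 1024-token context suggests truncating inputs to that length at inference time. The following is a minimal sketch of how that might look with `transformers`, not the author's documented usage. The repository id `pszemraj/tFINE-base-300m-samsum` is an assumption inferred from the card title and the base model's namespace, and `max_new_tokens=64` is an arbitrary illustrative value.

```python
# Assumed repo id, inferred from the card title; not confirmed by the diff.
MODEL_ID = "pszemraj/tFINE-base-300m-samsum"
# Context length stated in the card's note.
MAX_INPUT_TOKENS = 1024


def summarize(dialogue: str, max_new_tokens: int = 64) -> str:
    """Summarize one dialogue, truncating the input to the stated context length."""
    # Imported lazily so this module loads even where transformers is not installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Truncate to the context length the model was trained and fine-tuned with.
    inputs = tokenizer(
        dialogue,
        truncation=True,
        max_length=MAX_INPUT_TOKENS,
        return_tensors="pt",
    )
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `summarize("Anna: Lunch at noon?\nBen: Sure, see you there.")` would download the weights on first use, so it needs network access and a PyTorch install.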