nbroad committed
Commit e9d8acf · 1 Parent(s): eafa309

Update README.md

Files changed (1)
  1. README.md +11 -4
README.md CHANGED
@@ -7,14 +7,21 @@ metrics:
  model-index:
  - name: fix_punct_cased_t5_small
  results: []
  ---

- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->

- # fix_punct_cased_t5_small

- This model is a fine-tuned version of [google/t5-v1_1-small](https://huggingface.co/google/t5-v1_1-small) on the None dataset.
  It achieves the following results on the evaluation set:
  - Loss: 0.2744
  - Rouge1: 93.3712

  model-index:
  - name: fix_punct_cased_t5_small
  results: []
+ datasets:
+ - https://huggingface.co/datasets/nbroad/fix_punctuation
+ widget:
+ - text: This is, a sentence. with odd punctuation to show off what, the model. can do
+ - text: What, should the proper. punctuation. in. this sentence be?
+ - text: Where are. we? What, is, the meaning, of this?
  ---
+ # fix_punct_cased_t5_small
+ This model is a fine-tuned version of [google/t5-v1_1-small](https://huggingface.co/google/t5-v1_1-small) on the [NPR utterances dataset](https://www.kaggle.com/datasets/shuyangli94/interview-npr-media-dialog-transcripts?select=utterances.csv).


+ ## Dataset
+ The model was trained on 80k rows from the above dataset, which consists of NPR radio transcripts. Commas, periods, and semicolons were removed from the text, and then commas, periods, and semicolons were inserted at random positions. The model was trained to restore those three punctuation marks to their correct locations. The casing of the text was not modified during training.
+
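The corruption step described in the Dataset paragraph can be sketched roughly as follows. This is only an illustration, not the author's preprocessing code: the helper names (`strip_punct`, `add_random_punct`, `make_pair`), the per-word insertion probability `rate`, and the choice to attach each random mark to the end of a word are all assumptions.

```python
import random
import re
from typing import Optional, Tuple

PUNCT = ",.;"  # the three marks named in the model card


def strip_punct(text: str) -> str:
    """Remove every comma, period, and semicolon from the text."""
    return re.sub(r"[,.;]", "", text)


def add_random_punct(text: str, rate: float = 0.2,
                     rng: Optional[random.Random] = None) -> str:
    """Append a random mark to each word with probability `rate` (assumed value)."""
    rng = rng or random.Random()
    words = []
    for word in text.split():
        if rng.random() < rate:
            word += rng.choice(PUNCT)
        words.append(word)
    return " ".join(words)


def make_pair(clean: str, rate: float = 0.2,
              seed: Optional[int] = None) -> Tuple[str, str]:
    """Return (noisy_input, clean_target); casing is left untouched."""
    rng = random.Random(seed)
    noisy = add_random_punct(strip_punct(clean), rate, rng)
    return noisy, clean
```

Calling `make_pair(sentence, seed=0)` yields an input with misplaced punctuation and the original sentence as the target, which matches the shape of the widget examples in the card metadata.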
 
 
  It achieves the following results on the evaluation set:
  - Loss: 0.2744
  - Rouge1: 93.3712
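For readers unfamiliar with the Rouge1 figure, it is a unigram-overlap score. A minimal sketch of a ROUGE-1 F1 on a 0 to 100 scale is shown here, assuming lowercasing and whitespace tokenization. The card does not say which ROUGE implementation or tokenizer produced the reported value, so the function `rouge1_f1` is hypothetical and will not necessarily reproduce it.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Unigram-overlap F1 scaled to 0-100 (whitespace tokens, an assumption)."""
    pred = Counter(prediction.lower().split())
    ref = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((pred & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred.values())
    recall = overlap / sum(ref.values())
    return 100 * 2 * precision * recall / (precision + recall)
```

With whitespace tokens, a word with a misplaced comma attached counts as a different unigram from the bare word, so this variant is sensitive to punctuation errors; other tokenizers may behave differently.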