Update pipeline tag and add library name

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +17 -15
README.md CHANGED
@@ -1,11 +1,13 @@
  ---
+ base_model: google/byt5-small
  language: cs
  license: cc-by-nc-sa-4.0
  tags:
  - Czech
  - GEC
  - GECCC dataset
- base_model: google/byt5-small
+ pipeline_tag: text-generation
+ library_name: transformers
  ---

  # Model Card for byt5-small-geccc-mate
@@ -18,20 +20,20 @@ the MATE method and the [GECCC dataset](https://hdl.handle.net/11234/1-4861).

  ## Model Description

- - **Developed by:** [Seznam.cz](https://seznam.cz) and [Charles University, MFF, ÚFAL](https://ufal.mff.cuni.cz/)
- - **Language(s) (NLP):** Czech
- - **Model type:** character-based encoder-decoder Transformer model
- - **Finetuned from model:** `google/byt5-small`
- - **Finetuned on:**
- - first synthetic errors generated by the MATE method (see [the paper](https://arxiv.org/abs/2506.22402))
- - then the [GECCC dataset](https://hdl.handle.net/11234/1-4861)
- - **License:** CC BY-NC-SA 4.0
+ - **Developed by:** [Seznam.cz](https://seznam.cz) and [Charles University, MFF, ÚFAL](https://ufal.mff.cuni.cz/)
+ - **Language(s) (NLP):** Czech
+ - **Model type:** character-based encoder-decoder Transformer model
+ - **Finetuned from model:** `google/byt5-small`
+ - **Finetuned on:**
+ - first synthetic errors generated by the MATE method (see [the paper](https://arxiv.org/abs/2506.22402))
+ - then the [GECCC dataset](https://hdl.handle.net/11234/1-4861)
+ - **License:** CC BY-NC-SA 4.0

  ## Model Sources

- - **Repository:** https://github.com/ufal/tsd2025-gec
- - **Paper:** [Refining Czech GEC: Insights from a Multi-Experiment Approach](https://arxiv.org/abs/2506.22402)
- - **Dataset:** [GECCC dataset](https://hdl.handle.net/11234/1-4861)
+ - **Repository:** https://github.com/ufal/tsd2025-gec
+ - **Paper:** [Refining Czech GEC: Insights from a Multi-Experiment Approach](https://arxiv.org/abs/2506.22402)
+ - **Dataset:** [GECCC dataset](https://hdl.handle.net/11234/1-4861)

  ## Evaluation

@@ -69,8 +71,8 @@ print(tokenizer.batch_decode(outputs, skip_special_tokens=True))

  ```
  @InProceedings{10.1007/978-3-032-02551-7_7,
- author="Pechman, Petr and Straka, Milan and Strakov{\'a}, Jana and N{\'a}plava, Jakub",
- editor="Ek{\v{s}}tein, Kamil and Konop{\'i}k, Miloslav and Pra{\v{z}}{\'a}k, Ond{\v{r}}ej and P{\'a}rtl, Franti{\v{s}}ek",
+ author="Pechman, Petr and Straka, Milan and Strakov{\'a}, Jana and Náplava, Jakub",
+ editor="Ek{\v{s}}tein, Kamil and Konopík, Miloslav and Pražák, Ondřej and Pártl, František",
  title="Refining Czech GEC: Insights from a Multi-experiment Approach",
  booktitle="Text, Speech, and Dialogue",
  year="2026",
@@ -80,4 +82,4 @@ print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
  isbn="978-3-032-02551-7",
  doi="10.1007/978-3-032-02551-7_7"
  }
- ```
+ ```
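
The metadata change in the first hunk can be sanity-checked with a sketch like the one below. It is only an illustration: it uses a hand-rolled reader that handles flat `key: value` pairs and `- item` lists rather than a real YAML parser, and the names `NEW_FRONT_MATTER` and `parse_front_matter` are hypothetical, not part of this repository.

```python
# Minimal front-matter reader for flat "key: value" pairs and "- item" lists.
# This is NOT a full YAML parser; it only covers the shapes used in this card.

# Front matter as it appears on the "+" side of the first hunk (assumed copy).
NEW_FRONT_MATTER = """\
---
base_model: google/byt5-small
language: cs
license: cc-by-nc-sa-4.0
tags:
- Czech
- GEC
- GECCC dataset
pipeline_tag: text-generation
library_name: transformers
---
"""


def parse_front_matter(text):
    """Return the metadata between the first two '---' lines as a dict."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    current_key = None  # key whose value is a list still being collected
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":  # closing delimiter ends the front matter
            break
        if not stripped:
            continue
        if stripped.startswith("- ") and current_key is not None:
            # List item belonging to the most recent key with an empty value.
            meta[current_key].append(stripped[2:].strip())
            continue
        key, _, value = stripped.partition(":")
        key, value = key.strip(), value.strip()
        if value:
            meta[key] = value
            current_key = None
        else:
            meta[key] = []
            current_key = key
    return meta


metadata = parse_front_matter(NEW_FRONT_MATTER)
# Keys this PR moves or adds; an empty list means none are missing.
missing = [k for k in ("base_model", "pipeline_tag", "library_name") if k not in metadata]
print(metadata["pipeline_tag"], metadata["library_name"], missing)
```

An empty `missing` list means the three keys touched by this PR are present. In practice a proper YAML library would be the safer choice for anything beyond this flat layout.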