nielsr HF Staff committed on
Commit
405534c
·
verified ·
1 Parent(s): ae05d29

Add pipeline tag, library name and link to Github repo

Browse files

This PR adds the `pipeline_tag` and `library_name` to the model card metadata.
The `pipeline_tag` is set to `automatic-speech-recognition` as this model performs Automatic Speech Recognition.
The `library_name` is set to `transformers` because the model is readily usable with the Hugging Face Transformers library.
I also added a link to the Github repository in the overview.
This addition improves discoverability of the model on the Hugging Face Hub.
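As a hedged sketch of what "readily usable with Transformers" means in practice: with `pipeline_tag` set to `automatic-speech-recognition`, the model can be loaded through the matching `pipeline` task. The model id `FBK-MT/fama-small-asr` is assumed from the card title, and the audio path is a placeholder:

```python
def transcribe(path: str) -> str:
    """Sketch: transcribe an audio file with the task declared in `pipeline_tag`."""
    # Lazy import: requires `pip install transformers` plus torch and an audio backend.
    from transformers import pipeline

    # The task string matches the `pipeline_tag` this PR adds to the metadata;
    # the model id is assumed from the card title.
    asr = pipeline("automatic-speech-recognition", model="FBK-MT/fama-small-asr")
    return asr(path)["text"]
```

Calling `transcribe("sample.wav")` would download the checkpoint on first use.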

Files changed (1)
  1. README.md +7 -9
README.md CHANGED
@@ -1,19 +1,21 @@
 ---
-license: cc-by-4.0
-language:
-- en
-- it
 datasets:
 - FBK-MT/mosel
 - facebook/covost2
 - openslr/librispeech_asr
 - facebook/voxpopuli
+language:
+- en
+- it
+license: cc-by-4.0
 metrics:
 - wer
 tags:
 - speech
 - speech recognition
 - ASR
+library_name: transformers
+pipeline_tag: automatic-speech-recognition
 ---
 
 # FAMA-small-asr
@@ -40,7 +42,6 @@ All the artifacts used for realizing FAMA models, including codebase, datasets,
 themself are [released under OS-compliant licenses](#license), promoting a more
 responsible creation of models in our community.
 
-
 It is available in 2 sizes, with 2 variants for ASR only:
 
 - [FAMA-small](https://huggingface.co/FBK-MT/fama-small) - 475 million parameters
@@ -49,7 +50,7 @@ It is available in 2 sizes, with 2 variants for ASR only:
 - [FAMA-medium-asr](https://huggingface.co/FBK-MT/fama-medium-asr) - 878 million parameters
 
 For more information about FAMA, please check our [blog post](https://huggingface.co/blog/FAMA/release) and the [arXiv](https://arxiv.org/abs/2505.22759) preprint.
-
+The code is available in the [Github repository](https://github.com/hlt-mt/FBK-fairseq).
 
 ## Usage
 
@@ -124,7 +125,6 @@ We also benchmark FAMA in terms of computational time and maximum batch size sup
 - FAMA achieves up to 4.2 WER improvement on average across languages compared to OWSM v3.1
 - FAMA is up to 8 times faster than Whisper large-v3 while achieving comparable performance
 
-
 ### Automatic Speech Recogniton (ASR)
 | ***Model/Dataset WER (↓)*** | **CommonVoice**-*en* | **CommonVoice**-*it* | **MLS**-*en* | **MLS**-*it* | **VoxPopuli**-*en* | **VoxPopuli**-*it* | **AVG**-*en* | **AVG**-*it* |
 |-----------------------------------------|---------|---------|---------|---------|---------|----------|---------|----------|
@@ -138,7 +138,6 @@ We also benchmark FAMA in terms of computational time and maximum batch size sup
 | FAMA *small* | 13.7 | 8.6 | 5.8 | 12.8 | 7.3 | **15.6** | 8.9 | 12.3 |
 | FAMA *medium* | 11.5 | 7.0 | 5.2 | 13.9 | 7.2 | 15.9 | 8.0 | 12.3 |
 
-
 ### Computational Time and Maximum Batch Size
 
 | ***Model*** | ***Batch Size*** | ***xRTF en (↑)*** | ***xRTF it (↑)*** | ***xRTF AVG (↑)*** |
@@ -150,7 +149,6 @@ We also benchmark FAMA in terms of computational time and maximum batch size sup
 | FAMA *small* | 16 | **57.4** | **56.0** | **56.7** |
 | FAMA *medium* | 8 | 39.5 | 41.2 | 40.4 |
 
-
 ## License
 
 We release the FAMA model weights, and training data under the CC-BY 4.0 license.