fgaim committed
Commit 56143a6 · 1 Parent(s): d4d974d

Update README

Files changed (1): README.md (+5 -2)
# RoBERTa Pretrained for Tigrinya Language

We pretrain a RoBERTa Base model on a relatively small dataset for Tigrinya (34M tokens) for 18 epochs.

This card contains a PyTorch model exported from the original model, which was trained with Flax on a TPU v3-8.
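As a sketch of how the exported PyTorch checkpoint might be loaded with Hugging Face Transformers (the repository ID below is an assumed placeholder, not stated in this card):

```python
# Hypothetical usage sketch: load the exported PyTorch checkpoint via
# Transformers. The model ID is an assumed placeholder, not confirmed here.
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_id = "fgaim/roberta-base-tigrinya"  # placeholder repository ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

# A fill-mask pipeline over Tigrinya text could then be built with:
# from transformers import pipeline
# fill = pipeline("fill-mask", model=model, tokenizer=tokenizer)
```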

## Hyperparameters

The hyperparameters corresponding to the model size mentioned above are as follows:

| Model Size | L  | AH | HS  | FFN  | P    |
|------------|----|----|-----|------|------|
| BASE       | 12 | 12 | 768 | 3072 | 125M |

(L = number of layers; AH = number of attention heads; HS = hidden size; FFN = feedforward network dimension; P = number of parameters.)
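The 125M figure can be sanity-checked from the other columns. The sketch below assumes RoBERTa's default vocabulary size of 50,265 and 514 position embeddings, neither of which this card states; the Tigrinya tokenizer's actual vocabulary may differ.

```python
# Approximate parameter count for the BASE row of the table above.
# VOCAB and MAX_POS are assumptions (RoBERTa defaults), not from this card.
L, AH, HS, FFN = 12, 12, 768, 3072
VOCAB, MAX_POS = 50_265, 514

embeddings = VOCAB * HS + MAX_POS * HS          # token + position embeddings
per_layer = (
    4 * (HS * HS + HS)       # Q, K, V, and attention output projections
    + (HS * FFN + FFN)       # feed-forward up-projection
    + (FFN * HS + HS)        # feed-forward down-projection
    + 4 * HS                 # two LayerNorms (weight + bias each)
)
total = embeddings + L * per_layer
print(f"{total / 1e6:.0f}M parameters")  # roughly 124M, in line with the 125M above
```

Small remaining terms (embedding LayerNorm, LM head bias, pooler) account for the gap to the quoted 125M.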