---
license: apache-2.0
base_model: distilbert-base-multilingual-cased
model-index:
- name: zero-shot-cross-lingual-transfer-demo-masked
  results: []
---

Masked word prediction in 103 languages. Give it a sentence in any of them with one word replaced by "[MASK]", and the model fills in the blank. It works for English too, of course, but that defeats the point of the demo.
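A minimal usage sketch with the `transformers` fill-mask pipeline. The model id below is a placeholder for this repository's actual path on the Hub, and the German example sentence is illustrative:

```python
from transformers import pipeline

# Placeholder model id -- substitute the actual Hub repository path.
fill_mask = pipeline(
    "fill-mask",
    model="your-username/zero-shot-cross-lingual-transfer-demo-masked",
)

# A German sentence with one word replaced by the mask token.
sentence = "Der Himmel ist heute sehr [MASK]."
for prediction in fill_mask(sentence, top_k=3):
    print(prediction["token_str"], round(prediction["score"], 3))
```

The mask token is "[MASK]" because the base model uses a BERT-style tokenizer.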

distilbert-base-multilingual-cased fine-tuned for masked language modelling on 50,000 English examples from the r/explainlikeimfive subset of the ELI5 dataset. All knowledge of the target languages comes from pretraining alone; the fine-tuning data is English only.

Hyperparameters:
- epochs: 3
- learning rate: 2e-5
- batch size: 8
- weight decay: 0.01
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
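The hyperparameters above can be sketched as a `transformers` `TrainingArguments` configuration; the `output_dir` name is illustrative, and the Adam betas and epsilon shown are also the library defaults:

```python
from transformers import TrainingArguments

# Configuration mirroring the hyperparameters listed above.
training_args = TrainingArguments(
    output_dir="zero-shot-cross-lingual-transfer-demo-masked",  # illustrative
    num_train_epochs=3,
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    weight_decay=0.01,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```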

Final model perplexity: 10.22.
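Perplexity is the exponential of the mean per-token cross-entropy loss (in nats), so the reported 10.22 corresponds to a final loss of about 2.32:

```python
import math

# Perplexity = exp(mean cross-entropy loss), so loss = ln(perplexity).
perplexity = 10.22
loss = math.log(perplexity)
print(round(loss, 2))  # 2.32
```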