---
datasets:
- oscar
- srwac
- leipzig

language:
- hr
- sr
- multilingual

tags:
- masked-lm

widget:
- text: "Ovo je početak <mask>."

license: apache-2.0

---
# Transformer language model for Croatian and Serbian
Trained for two epochs (500k steps) on 6 GB of Croatian and Serbian text from the Leipzig, OSCAR, and srWac datasets.

| Model | #params | Arch. | Training data |
|-------|---------|-------|---------------|
| `Andrija/SRoBERTa-L` | 80M | Third | Leipzig Corpus, OSCAR and srWac (6 GB of text) |
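
Since the card tags the model as `masked-lm`, one way to try it is the `fill-mask` pipeline from the `transformers` library. The sketch below is illustrative rather than an official usage recipe: it assumes `transformers` and a backend such as PyTorch are installed, and that the tokenizer uses `<mask>` as its mask token, as the widget sentence in the card metadata suggests. The helper names `load_fill_mask` and `top_predictions` are hypothetical and not part of any library.

```python
from typing import Callable, List

MODEL_ID = "Andrija/SRoBERTa-L"  # model name as listed in the table above


def load_fill_mask():
    """Build a fill-mask pipeline; downloads the model on first use."""
    # Imported lazily so top_predictions can be used without transformers installed.
    from transformers import pipeline

    return pipeline("fill-mask", model=MODEL_ID)


def top_predictions(fill_mask: Callable, text: str, k: int = 3) -> List[str]:
    """Return the k highest-scoring fillers for the single <mask> in `text`."""
    # Assumption: "<mask>" is the mask token, as in the card's widget example.
    if text.count("<mask>") != 1:
        raise ValueError("expected exactly one <mask> token in the input")
    results = fill_mask(text, top_k=k)
    # For a single mask, each result is a dict with "token_str", "score" and "sequence".
    return [r["token_str"].strip() for r in results]


# Example (requires network access to fetch the model):
# fill_mask = load_fill_mask()
# print(top_predictions(fill_mask, "Ovo je početak <mask>."))
```

Keeping the pipeline construction separate from the prediction helper means the model is loaded once and can be reused across many sentences.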