---
license: cc-by-nc-3.0
language:
  - da
tags:
  - word embeddings
  - Danish
---

# Danish medical word embeddings

MeDa-We was trained on a Danish medical corpus of 123M tokens. The word embeddings are 300-dimensional and were trained using FastText.

The embeddings were trained for 10 epochs using a window size of 5 and 10 negative samples.
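Once downloaded, FastText-style word vectors in the standard `.vec` text format can be loaded with plain Python. This is a minimal sketch: the file name `meda_we.vec` is an assumption, and the actual distribution format of the embeddings may differ (the toy file below stands in for the real 300-dimensional vectors).

```python
def load_vectors(path):
    """Parse a .vec file: first line is 'vocab_size dim',
    then one word per line followed by its vector components."""
    vectors = {}
    with open(path, encoding="utf-8") as f:
        vocab_size, dim = map(int, f.readline().split())
        for line in f:
            parts = line.rstrip().split(" ")
            vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors

# Tiny toy file standing in for the real embeddings file
# (hypothetical name: meda_we.vec)
with open("toy.vec", "w", encoding="utf-8") as f:
    f.write("2 3\n")
    f.write("laege 0.1 0.2 0.3\n")
    f.write("patient 0.4 0.5 0.6\n")

vecs = load_vectors("toy.vec")
print(len(vecs["laege"]))  # → 3
```

For subword-aware lookups of out-of-vocabulary words, loading the binary model with the `fasttext` or `gensim` libraries is the usual route.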

The development of the corpus and word embeddings is described further in our paper.

We also trained a transformer model on the developed corpus which can be found here.

## Citing

```bibtex
@article{
}
```