metadata
license: mit
language:
  - en
  - nl
tags:
  - machine-translation
  - low-resource
  - creativity
library_name: transformers
pipeline_tag: translation
model-index:
  - name: EN-DE ➜ EN-NL Creative
    results:
      - task:
          type: machine-translation
          name: Translation
        dataset:
          name: Dutch Parallel Corpus + OpenSubtitles (creative subset)
          type: Helsinki-NLP/open_subtitles
          split: test
        metrics:
          - type: sacrebleu
            name: SacreBLEU
            value: 18.35
            greater_is_better: true

EN-DE parent ➜ EN-NL fine-tuned on creative corpus

Author: Niek Holter
Thesis: “Transferring Creativity”

Summary

This model starts from Helsinki-NLP’s MarianMT checkpoint opus-mt-en-de and is fine-tuned on a 10k-sentence creative English–Dutch corpus (fiction and subtitles).
It is one of four systems trained for my bachelor’s thesis, which studies how transfer-learning settings affect MT creativity.

| Parent model | Fine-tune data | BLEU | COMET | Transformed Creativity Score |
|---|---|---|---|---|
| en-de | Creative | 18.4 | 0.662 | 0.42 |

Intended use

  • Research on creative MT and low-resource transfer learning
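For research use, loading a MarianMT checkpoint and translating a few sentences could look roughly like the sketch below. It assumes the standard transformers MarianMT classes. The repository id of this fine-tuned model is not stated here, so the example defaults to the parent checkpoint (which produces German); the beam size and length limit are illustrative choices, not settings from the thesis.

```python
# Parent checkpoint named in the summary. Swap in the fine-tuned
# repository id to get Dutch output instead of German.
PARENT_CHECKPOINT = "Helsinki-NLP/opus-mt-en-de"


def translate(sentences, model_id=PARENT_CHECKPOINT):
    """Translate a list of English sentences with a MarianMT checkpoint."""
    # Imported lazily so this module can be imported without transformers.
    from transformers import MarianMTModel, MarianTokenizer

    tokenizer = MarianTokenizer.from_pretrained(model_id)
    model = MarianMTModel.from_pretrained(model_id)

    # Pad to the longest sentence so the batch forms a single tensor.
    batch = tokenizer(sentences, return_tensors="pt", padding=True, truncation=True)

    # Beam size and length limit are assumptions made for illustration.
    generated = model.generate(**batch, num_beams=4, max_new_tokens=128)
    return tokenizer.batch_decode(generated, skip_special_tokens=True)


if __name__ == "__main__":
    for line in translate(["The moon hung low over the canal."]):
        print(line)
```

Running the file downloads the checkpoint on first use, so it needs network access and the transformers, torch, and sentencepiece packages.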

Training details

  • Hardware: NVIDIA GTX 1070 (CUDA 12.1)
  • Epochs: up to 200, with early stopping (patience 5)
  • LR / batch size: 2e-5 / 16
  • Script: finetuning.py
  • Env: environment.yml
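The stopping rule listed above (at most 200 epochs, patience 5) can be illustrated with a small, library-free sketch. This is not the logic of finetuning.py: the function name is hypothetical, and it assumes the monitored quantity is a validation loss where lower is better, which the training details do not specify.

```python
def run_with_early_stopping(val_losses, max_epochs=200, patience=5):
    """Return (epoch at which training stops, best validation loss seen).

    val_losses holds one validation loss per epoch, in order.
    Assumption: lower is better; the actual monitored metric may differ.
    """
    best = float("inf")
    epochs_without_improvement = 0
    last_epoch = 0

    for epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        last_epoch = epoch
        if loss < best:
            # New best: remember it and reset the patience counter.
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                # No improvement for `patience` consecutive epochs: stop.
                break

    return last_epoch, best


if __name__ == "__main__":
    losses = [3.0, 2.5, 2.4, 2.45, 2.5, 2.6, 2.7, 2.8, 2.9]
    print(run_with_early_stopping(losses))
```

With these losses the best value is reached at epoch 3, and the fifth consecutive epoch without improvement is epoch 8, so training stops there well before the 200-epoch cap.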

Data