---
license: mit
language:
- en
- nl
tags:
- machine-translation
- low-resource
- creativity
library_name: transformers
pipeline_tag: translation
model-index:
- name: EN-FR → EN-NL • Creative
  results:
  - task:
      type: machine-translation
      name: Translation
    dataset:
      name: Dutch Parallel Corpus + OpenSubtitles (creative subset)
      type: Helsinki-NLP/open_subtitles
      split: test
    metrics:
    - type: sacrebleu
      name: SacreBLEU
      value: 17.748
      greater_is_better: true
---

# EN-FR parent ➜ EN-NL fine-tuned on creative corpus

**Author:** Niek Holter

**Thesis:** “Transferring Creativity”

## Summary

This model starts from Helsinki-NLP’s MarianMT `opus-mt-en-fr` and is fine-tuned on a 10k-sentence **creative** English–Dutch corpus (fiction + subtitles). It is one of four systems trained for my bachelor’s thesis to study how transfer-learning settings affect MT creativity.

| Parent model | Fine-tune data | BLEU | COMET | Transformed Creativity Score |
|--------------|----------------|------|-------|------------------------------|
| en-fr        | Creative       | 17.7 | 0.615 | 0.37                         |

## Intended use

* Research on creative MT and low-resource transfer learning

## Training details

* Hardware: NVIDIA GTX 1070 (CUDA 12.1)
* Epochs: up to 200, with early stopping (patience 5)
* LR / batch size: 2e-5 / 16
* Script: [`finetuning.py`](./finetuning.py)
* Env: [`environment.yml`](./environment.yml)

## Data

* **Creative corpus**: 7.6k fiction sentences from the Dutch Parallel Corpus (DPC) + 2.4k sentences from OpenSubtitles.
* Sentence-level 1:1 alignments; deduplicated to avoid leakage.

See https://github.com/muniekstache/Transfer-Creativity.git for the full pipeline.
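
The deduplication step mentioned under Data can be illustrated with a minimal sketch. It is not the code from the repository: the function names (`normalize`, `dedup_pairs`, `remove_leakage`) are hypothetical, the normalization (Unicode NFKC, case-folding, whitespace collapsing) is an assumed choice, and "leakage" is assumed here to mean source sentences shared between the training and test splits.

```python
import unicodedata


def normalize(text: str) -> str:
    """Apply NFKC, case-fold, and collapse whitespace (assumed normalization)."""
    return " ".join(unicodedata.normalize("NFKC", text).casefold().split())


def dedup_pairs(pairs):
    """Keep the first occurrence of each (source, target) pair after normalization."""
    seen = set()
    unique = []
    for src, tgt in pairs:
        key = (normalize(src), normalize(tgt))
        if key not in seen:
            seen.add(key)
            unique.append((src, tgt))
    return unique


def remove_leakage(train, test):
    """Drop training pairs whose normalized source sentence also occurs in the test set."""
    test_sources = {normalize(src) for src, _ in test}
    return [(src, tgt) for src, tgt in train if normalize(src) not in test_sources]


# Toy data for illustration only; not taken from the actual corpus.
train = [
    ("The rain fell softly.", "De regen viel zacht."),
    ("the rain  fell softly.", "De regen viel zacht."),  # duplicate up to case and spacing
    ("Where are you going?", "Waar ga je heen?"),
]
test = [("Where are you going?", "Waar ga je naartoe?")]

clean_train = remove_leakage(dedup_pairs(train), test)
print(clean_train)
```

Filtering on the source side only is the stricter option: it removes a training pair even when its Dutch translation differs from the one in the test set, so the model never sees a test input during fine-tuning.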