EN-FR parent ➜ EN-NL fine-tuned on creative corpus
Author: Niek Holter
Thesis: “Transferring Creativity”
Summary
This model starts from the Helsinki-NLP MarianMT parent opus-mt-en-fr and is fine-tuned on a 10k-sentence creative English–Dutch corpus (fiction + subtitles).
It is one of four systems trained for my bachelor’s thesis to study how transfer-learning settings affect MT creativity.
| Parent model | Fine-tune data | BLEU | COMET | Transformed Creativity Score |
|---|---|---|---|---|
| en-fr | Creative | 17.7 | 0.615 | 0.37 |
Intended use
- Research on creative MT and low-resource transfer learning (a minimal inference sketch follows below)
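For quick experimentation, the model can be loaded with the transformers library. This is a minimal sketch: the repo id below is a placeholder, since this card does not state the model's actual Hub id, so substitute the real one before running.

```python
# Minimal inference sketch using Hugging Face transformers.
# NOTE: "muniekstache/en-nl-creative" is a placeholder repo id, not the
# model's confirmed Hub name; substitute the actual id before running.
from transformers import MarianMTModel, MarianTokenizer

model_id = "muniekstache/en-nl-creative"  # placeholder
tokenizer = MarianTokenizer.from_pretrained(model_id)
model = MarianMTModel.from_pretrained(model_id)

# Tokenize a batch of English sentences and generate Dutch translations.
batch = tokenizer(
    ["The moon hung low over the quiet canal."],
    return_tensors="pt",
    padding=True,
)
generated = model.generate(**batch)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```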
Training details
- Hardware: NVIDIA GTX 1070 (CUDA 12.1)
- Epochs: early-stopped, ≤ 200 (patience 5)
- LR / batch size: 2e-5 / 16
- Script: finetuning.py (sketched below)
- Env: environment.yml
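The exact procedure lives in finetuning.py in the repository; the following is only a hedged Seq2SeqTrainer sketch of the hyperparameters listed above. The toy one-pair dataset and column names are assumptions for illustration, not the thesis pipeline.

```python
# Hedged sketch of the fine-tuning setup described above; finetuning.py in the
# repository is authoritative. The one-pair toy dataset is purely illustrative.
from datasets import Dataset
from transformers import (
    DataCollatorForSeq2Seq,
    EarlyStoppingCallback,
    MarianMTModel,
    MarianTokenizer,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

parent_id = "Helsinki-NLP/opus-mt-en-fr"
tokenizer = MarianTokenizer.from_pretrained(parent_id)
model = MarianMTModel.from_pretrained(parent_id)

# Toy EN-NL pair standing in for the 10k-sentence creative corpus.
raw = Dataset.from_dict({
    "en": ["The moon hung low over the quiet canal."],
    "nl": ["De maan hing laag boven de stille gracht."],
})

def preprocess(example):
    enc = tokenizer(example["en"], truncation=True, max_length=128)
    enc["labels"] = tokenizer(
        text_target=example["nl"], truncation=True, max_length=128
    )["input_ids"]
    return enc

train_ds = raw.map(preprocess, remove_columns=["en", "nl"])

args = Seq2SeqTrainingArguments(
    output_dir="en-nl-creative",
    learning_rate=2e-5,              # LR from the list above
    per_device_train_batch_size=16,  # batch size from the list above
    num_train_epochs=200,            # upper bound; early stopping ends sooner
    eval_strategy="epoch",           # "evaluation_strategy" in older versions
    save_strategy="epoch",
    load_best_model_at_end=True,     # required by EarlyStoppingCallback
    metric_for_best_model="eval_loss",
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=train_ds,
    eval_dataset=train_ds,           # reuse toy data; use a real dev set
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
    callbacks=[EarlyStoppingCallback(early_stopping_patience=5)],
)
trainer.train()
```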
Data
- Creative corpus: 7.6k fiction sentences from the Dutch Parallel Corpus (DPC) + 2.4k from OpenSubtitles.
- Sentence-level 1:1 alignments; deduplicated to avoid train/test leakage (see the sketch below).
- See https://github.com/muniekstache/Transfer-Creativity.git for the full pipeline.
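Deduplication here means keeping one copy of each sentence pair so that identical pairs cannot end up in both the training and test splits. A minimal sketch, with made-up example pairs rather than the actual pipeline:

```python
# Minimal deduplication sketch; the repository pipeline is authoritative.
def dedupe(pairs):
    """Keep the first occurrence of each (source, target) sentence pair."""
    seen = set()
    unique = []
    for src, tgt in pairs:
        key = (src.strip().lower(), tgt.strip().lower())
        if key not in seen:
            seen.add(key)
            unique.append((src, tgt))
    return unique

corpus = [
    ("The moon hung low.", "De maan hing laag."),
    ("The moon hung low.", "De maan hing laag."),  # exact duplicate, dropped
]
print(dedupe(corpus))  # -> one unique pair remains
```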
Evaluation results
- SacreBLEU (self-reported) on the Dutch Parallel Corpus + OpenSubtitles (creative subset) test set: 17.748
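The BLEU figure above can be reproduced with the sacrebleu package; a minimal scoring sketch, with illustrative sentences rather than the thesis test set:

```python
# Minimal SacreBLEU scoring sketch; sentences are illustrative only.
import sacrebleu

hypotheses = ["De maan hing laag boven het stille kanaal."]
references = [["De maan hing laag boven de stille gracht."]]  # one ref stream

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU = {bleu.score:.3f}")
```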