Automatic word-level translation alignment of Ancient Greek and Portuguese parallel texts

Translation alignment is the process of mapping translation equivalents between two texts in different languages. It can be performed at different levels of granularity: document, paragraph, sentence, and word/phrase.
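
To make the word-level case concrete, here is a minimal sketch of one way to represent an alignment: as a set of (source index, target index) pairs. The sentence pair is an illustrative example (John 1:1) and is not taken from the project's data; the helper name `aligned_pairs` is hypothetical.

```python
# Illustrative sentence pair, NOT from the project's training or evaluation data.
greek = ["ἐν", "ἀρχῇ", "ἦν", "ὁ", "λόγος"]
portuguese = ["No", "princípio", "era", "o", "Verbo"]

# A word-level alignment stored as (Greek index, Portuguese index) pairs.
# One-to-many and many-to-one links would simply add more pairs.
alignment = {(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)}


def aligned_pairs(src, tgt, links):
    """Return the aligned word pairs, ordered by source position (hypothetical helper)."""
    return [(src[i], tgt[j]) for i, j in sorted(links)]


for g, p in aligned_pairs(greek, portuguese, alignment):
    print(f"{g} -> {p}")
```

Storing index pairs rather than word strings keeps repeated words unambiguous and lets unaligned tokens stay out of the set.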

This model was developed as part of the project Linking Ancient Languages to Portuguese and Improving an Automatic Translation Alignment Model (Letras clássicas digitais: interligando línguas antigas ao português e aprimorando um modelo automático de alinhamento de tradução).

Funding agency: FAPESP proc.# 2022/09490-0
Host Institution: Faculty of Sciences and Letters, UNESP/Araraquara
Grant within the eScience Technological Innovation Programs - Research Program in eScience and Data Science / Research Project - Regular

The main objective of the project was to develop an automatic translation aligner specifically for Ancient Greek and Brazilian Portuguese. This involved two tasks: (a) manually aligning Ancient Greek texts with their parallel Brazilian Portuguese translations, following the gold standard guidelines, to produce aligned pairs; and (b) using those pairs for the supervised training of automatic alignment models for the two languages. The motivation for an automatic Greek-Portuguese aligner includes producing a larger body of aligned texts to help Portuguese speakers read Ancient Greek, expanding the dynamic lexicon linking the two languages, and building a digital corpus that can be used in other services.

Usage

This Colab notebook contains the code needed to use the model.
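
For readers who want a sense of what a word-alignment step can look like before opening the notebook, the sketch below shows one common extraction strategy for embedding-based aligners: link two words only when each is the other's most similar counterpart (mutual argmax over cosine similarity). This is an assumption for illustration, not a description of the notebook's code. The toy 2-dimensional vectors stand in for contextual embeddings that would in practice come from the model, and the function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mutual_argmax(src_vecs, tgt_vecs):
    """Link (i, j) only if j is i's best target AND i is j's best source."""
    sim = [[cosine(s, t) for t in tgt_vecs] for s in src_vecs]
    n_src, n_tgt = len(src_vecs), len(tgt_vecs)
    best_tgt = [max(range(n_tgt), key=lambda j: row[j]) for row in sim]
    best_src = [max(range(n_src), key=lambda i: sim[i][j]) for j in range(n_tgt)]
    return [(i, j) for i, j in enumerate(best_tgt) if best_src[j] == i]


# Toy vectors (assumed values, for illustration only): one per word.
src_vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
tgt_vecs = [[0.9, 0.1], [0.7, 0.8], [0.1, 0.9]]

links = mutual_argmax(src_vecs, tgt_vecs)
print(links)  # [(0, 0), (1, 2), (2, 1)]
```

Requiring agreement in both directions favors precision: words with no clear counterpart are left unaligned rather than forced into a link.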

Citation:

@inproceedings{youseftranslation,
  title={Translation Alignment for Ancient Greek and Portuguese},
  author={Yousef, Tariq and Ferreira, Anise D’Orange and dos Reis, Michel F and Palladino, Chiara},
  booktitle={DH2024 Book of Abstracts},
  year={2024},
  pages={754--756}
}

Model size: 270M params (Safetensors, tensor type F32)