We have successfully trained a FastText-based Word2Vec model on our dataset, with an embedding size of 100 dimensions.
This model is designed to generate vector representations for individual words and sub-words, allowing it to effectively capture semantic and morphological relationships within the text.
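As a rough illustration, a model of this kind can be trained with gensim's `FastText` class. In the sketch below, only the 100-dimensional embedding size comes from the description above; the toy corpus and the remaining hyperparameters (window size, minimum count, epochs) are placeholder assumptions, not the model's actual settings:

```python
from gensim.models import FastText

# Illustrative toy corpus; in practice this would be the tokenized dataset.
corpus = [
    ["the", "model", "captures", "semantic", "relationships"],
    ["subword", "information", "helps", "with", "rare", "words"],
]

# vector_size=100 matches the embedding size stated above; the other
# hyperparameters are placeholder values.
model = FastText(
    sentences=corpus,
    vector_size=100,
    window=5,
    min_count=1,
    epochs=10,
)

# FastText composes character n-gram vectors, so even words unseen
# during training receive an embedding.
print(model.wv["semantics"].shape)  # (100,)
```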
To obtain representations at the sentence level, we compute embeddings for all constituent words and sub-words in a given text and then apply averaging.
This approach ensures that the resulting sentence embedding encapsulates the overall meaning of the text while preserving contextual nuances.
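A minimal sketch of this averaging step, assuming the trained gensim model from above; the whitespace tokenization here is an assumption, as the actual preprocessing pipeline is not specified:

```python
import numpy as np

def sentence_embedding(model, text):
    # Simple whitespace tokenization; the real pipeline may differ.
    tokens = text.lower().split()
    # FastText provides a vector for every token, including
    # out-of-vocabulary words, via its character n-gram vectors.
    vectors = [model.wv[token] for token in tokens]
    # Average the word/sub-word vectors into a single 100-dimensional
    # sentence-level representation.
    return np.mean(vectors, axis=0)

embedding = sentence_embedding(model, "subword models handle rare words well")
print(embedding.shape)  # (100,)
```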