---
language: dje
tags:
- fasttext
- word-embeddings
- zarma
- nlp
license: apache-2.0
datasets:
- 27Group/noisy_zarma
---

## Description

This repository contains a pre-trained FastText model for the Zarma language. The model produces static word embeddings for Zarma text that capture distributional semantic information for use in downstream NLP tasks.

## Tasks

- **Word Embeddings**: Generate vector representations for Zarma words.
- **Part-of-Speech (POS) Tagging**: Provide input features for POS tagging models.
- **Text Classification**: Use the embeddings as features for sentiment analysis or topic classification.
- **Semantic Similarity**: Compute similarity between Zarma words or phrases.

## Usage Examples

### 1. Word Embeddings

Load the FastText model and look up the embedding of a Zarma word.

```python
import fasttext

# Load the binary model downloaded from this repository
model = fasttext.load_model('zarma_fasttext.bin')

word = "ay"
embedding = model.get_word_vector(word)  # NumPy array of floats

# Print only the first five dimensions to keep the output short
print(f"Embedding for '{word}': {embedding[:5]}...")
```

### 2. Semantic Similarity

Compute the cosine similarity between the embeddings of two Zarma words.

```python
import fasttext
import numpy as np

model = fasttext.load_model('zarma_fasttext.bin')

word1 = "ay"
word2 = "ni"
vec1 = model.get_word_vector(word1)
vec2 = model.get_word_vector(word2)

# Cosine similarity; the small epsilon guards against division by zero
similarity = np.dot(vec1, vec2) / (np.linalg.norm(vec1) * np.linalg.norm(vec2) + 1e-8)
print(f"Similarity between '{word1}' and '{word2}': {similarity:.4f}")
```

## How to Use

- Install FastText: **pip install fasttext**
- Download **zarma_fasttext.bin** from this repository.
- Use the code snippets above to integrate the model into your NLP pipeline.

## How to cite

If you use this model in your work, please cite:

```
@misc{zarma_fasttext,
  title = {Pre-trained FastText Embeddings for Zarma},
  author = {Mamadou K. Keita and Christopher Homan},
  year = {2025},
  howpublished = {\url{https://huggingface.co/27Group/zarma_fasttext}}
}
```
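
## Example: Phrase Similarity (sketch)

The Tasks list mentions similarity between phrases, while the usage examples only compare single words. The following is a minimal sketch of one common approach, assuming that a phrase can be represented by the plain average of its word vectors after whitespace tokenization. The helper names (`average_vectors`, `cosine_similarity`, `phrase_vector`, `phrase_similarity`) are hypothetical and are not part of this repository or of the FastText API. The helpers use only the standard library and accept any sequence of floats, including the arrays returned by `get_word_vector`.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def average_vectors(vectors: Sequence[Vector]) -> List[float]:
    """Return the element-wise mean of equally sized vectors."""
    if not vectors:
        raise ValueError("at least one vector is required")
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def cosine_similarity(a: Vector, b: Vector, eps: float = 1e-8) -> float:
    """Cosine similarity; eps guards against division by zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + eps)


def phrase_vector(phrase: str, get_vector: Callable[[str], Vector]) -> List[float]:
    """Average the vectors of the whitespace-separated tokens of a phrase.

    Assumption: simple whitespace tokenization is adequate for the input.
    """
    tokens = phrase.split()
    if not tokens:
        raise ValueError("phrase must contain at least one token")
    return average_vectors([get_vector(token) for token in tokens])


def phrase_similarity(
    phrase1: str, phrase2: str, get_vector: Callable[[str], Vector]
) -> float:
    """Cosine similarity between the averaged vectors of two phrases."""
    return cosine_similarity(
        phrase_vector(phrase1, get_vector),
        phrase_vector(phrase2, get_vector),
    )


# Usage with the model from this repository (requires zarma_fasttext.bin):
#   import fasttext
#   model = fasttext.load_model('zarma_fasttext.bin')
#   print(phrase_similarity("ay", "ni", model.get_word_vector))
```

Passing the lookup function as an argument keeps the helpers independent of the model file, so they can be exercised with a small dictionary of toy vectors. The FastText Python bindings also provide `get_sentence_vector`, which may be a suitable alternative to the manual averaging shown here.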