BiniyamAjaw/amharic_tokenizer


Amharic Tokenizer

Model Details

Vocabulary Size: 100,000
Tokenizer Type: Byte-Pair Encoding (BPE)

Model Description

Developed by: Biniyam Ajaw
Language(s) (NLP): Amharic and Amharic-Driven Languages
License: MIT

Model Sources

Repository: https://github.com/biniyam69/Amharic-LLM-Finetuning/

Uses

The tokenizer can be loaded with the AutoTokenizer class from the transformers package and used to tokenize Amharic text.
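
A minimal sketch of loading the tokenizer follows, assuming the Hub repository ID is BiniyamAjaw/amharic_tokenizer (taken from this page) and that the repository ships files AutoTokenizer can read. The helper names load_tokenizer and roundtrip, and the sample sentence, are illustrative and not part of the original card.

```python
# Illustrative helpers; the names below are assumptions, not part of the model card.
REPO_ID = "BiniyamAjaw/amharic_tokenizer"  # repository ID as shown on this page


def load_tokenizer(repo_id: str = REPO_ID):
    """Download (or load from the local cache) the tokenizer from the Hub."""
    # Imported here so the helpers can be defined even before the
    # transformers package is needed; loading requires network access
    # the first time it runs.
    from transformers import AutoTokenizer

    return AutoTokenizer.from_pretrained(repo_id)


def roundtrip(tokenizer, text: str):
    """Return the subword tokens, their IDs, and the text decoded from the IDs."""
    tokens = tokenizer.tokenize(text)
    # add_special_tokens=False keeps the IDs limited to the text itself.
    ids = tokenizer.encode(text, add_special_tokens=False)
    decoded = tokenizer.decode(ids)
    return tokens, ids, decoded
```

For example, `roundtrip(load_tokenizer(), "ሰላም ዓለም")` would return the subword pieces, their integer IDs, and the decoded string for a short Amharic greeting. The exact pieces depend on the trained vocabulary, so no expected output is shown here.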

