---
language:
- et
base_model:
- EMBEDDIA/est-roberta
pipeline_tag: token-classification
library_name: transformers
tags:
- NER
license: cc-by-4.0
---

# est-roberta-ud-ner

### Model Description

est-roberta-ud-ner is an [Est-RoBERTa](https://huggingface.co/EMBEDDIA/est-roberta)-based model fine-tuned for named entity recognition in Estonian on the [EDT](https://github.com/UniversalDependencies/UD_Estonian-EDT) and [EWT](https://github.com/UniversalDependencies/UD_Estonian-EWT) datasets.

### How to use

The model can be used with the Transformers pipeline for NER. Try it in Google Colab, where the Transformers library is pre-installed, or on your local machine (preferably in a virtual environment, see Virtual environment setup below), where you can install the library with ```pip install transformers```.

```
from transformers import pipeline

# Load the fine-tuned model into a token-classification (NER) pipeline
ner = pipeline("ner", model="vbius01/est-roberta-ud-ner")

text = "Eesti kuulub erinevalt Lätist ja Leedust kahtlemata Põhjamaade kultuuriruumi."

# Each result is a dict with the predicted tag, score, token and character offsets
results = ner(text)
print(results)
```

```
[{'entity': 'B-GEP', 'score': np.float32(0.99339926), 'index': 1, 'word': '▁Eesti', 'start': 0, 'end': 5},
 {'entity': 'B-GEP', 'score': np.float32(0.9923631), 'index': 4, 'word': '▁Lätist', 'start': 22, 'end': 29},
 {'entity': 'B-GEP', 'score': np.float32(0.990756), 'index': 6, 'word': '▁Leedust', 'start': 32, 'end': 40},
 {'entity': 'B-LOC', 'score': np.float32(0.61792), 'index': 8, 'word': '▁Põhjamaade', 'start': 51, 'end': 62}]
```

- **Repository:** [github.com/martinkivisikk/ner_thesis](https://github.com/martinkivisikk/ner_thesis)
- **Paper:** [Developing a NER Model Based on Treebank Corpora]()

### Virtual environment setup

Create and activate a virtual environment in your project directory with venv. The activation command shown here is for Linux and macOS shells.

```
python -m venv .env
source .env/bin/activate
```

## Uses

This model can be used to find named entities in Estonian texts.
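
The token-level results shown under How to use can be turned into plain entity strings by slicing the input text with the `start` and `end` offsets. The following is a minimal sketch and not part of the model's repository: `extract_entities` is a hypothetical helper, and it assumes the result format from the example output. Merging on an `I-` prefix is also an assumption, since only `B-` tags appear in that sample. The results are hard-coded from the example (scores rounded) so that the snippet runs without downloading the model.

```python
def extract_entities(text, results):
    """Turn token-level NER results into (label, surface string) pairs.

    Hypothetical helper: assumes each result has 'entity', 'start' and 'end'
    keys, as in the example output of the pipeline.
    """
    spans = []  # each entry: [label, start, end]
    for item in results:
        prefix, _, label = item["entity"].partition("-")
        if not label:
            # Tags without a B-/I- prefix (e.g. "O") are skipped
            continue
        if prefix == "I" and spans and spans[-1][0] == label:
            # Assumed convention: an I- tag extends the previous entity
            spans[-1][2] = item["end"]
        else:
            spans.append([label, item["start"], item["end"]])
    # Offsets may include the preceding space, so strip whitespace
    return [(label, text[start:end].strip()) for label, start, end in spans]


text = "Eesti kuulub erinevalt Lätist ja Leedust kahtlemata Põhjamaade kultuuriruumi."

# Copied from the example output, with scores rounded
results = [
    {"entity": "B-GEP", "score": 0.993, "index": 1, "word": "▁Eesti", "start": 0, "end": 5},
    {"entity": "B-GEP", "score": 0.992, "index": 4, "word": "▁Lätist", "start": 22, "end": 29},
    {"entity": "B-GEP", "score": 0.991, "index": 6, "word": "▁Leedust", "start": 32, "end": 40},
    {"entity": "B-LOC", "score": 0.618, "index": 8, "word": "▁Põhjamaade", "start": 51, "end": 62},
]

entities = extract_entities(text, results)
print(entities)
# [('GEP', 'Eesti'), ('GEP', 'Lätist'), ('GEP', 'Leedust'), ('LOC', 'Põhjamaade')]
```

With a live pipeline, the same function can be called as `extract_entities(text, ner(text))`. Low-confidence predictions, such as the `B-LOC` tag above, can be filtered on `score` before calling it.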