---
license: apache-2.0
datasets:
- Arailym-aitu/small_kazakh_corpus
language:
- kk
metrics:
- accuracy
- f1
base_model:
- nur-dev/roberta-kaz-large
new_version: nur-dev/roberta-kaz-large
pipeline_tag: text-classification
---

# Model Card for Model ID

This model performs text classification in the Kazakh language. It is based on the RoBERTa architecture (base model `nur-dev/roberta-kaz-large`) and was fine-tuned on the Small Kazakh Corpus dataset (`Arailym-aitu/small_kazakh_corpus`).

## Model Details

### Model Description

The model aims to improve natural language processing (NLP) capabilities for the Kazakh language, particularly in text classification tasks.

- **Developed by:** Tleubayeva Arailym, Tabuldin Aisultan, Aubakirov Sultan
- **Model type:** Transformer-based (RoBERTa)
- **Language(s) (NLP):** Kazakh (kk)
- **License:** apache-2.0

### Results

Fine-tuning improved both accuracy and F1-score over the base model:

- **Base model:** Accuracy 50.30%, F1-score 48.89%
- **Fine-tuned model:** Accuracy 55.51% (+5.21 percentage points, about 10% relative), F1-score 54.83% (+5.94 percentage points, about 12% relative)

## Citation

A citation will be added later.

## Model Card Authors

- Tleubayeva Arailym, PhD student at Astana IT University
- Tabuldin Aisultan, third-year student at Astana IT University
- Aubakirov Sultan, third-year student at Astana IT University
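
## Checking the Reported Gains

The gain from fine-tuning can be expressed either as an absolute difference in percentage points or as a change relative to the base score, and the two are easy to confuse. The sketch below recomputes both from the accuracy and F1 values reported under Results. The helper name `gains` is an assumption made for illustration and is not part of the model's code.

```python
from typing import Tuple


def gains(base: float, tuned: float) -> Tuple[float, float]:
    """Return (absolute gain in percentage points, relative gain in percent)."""
    absolute = tuned - base
    relative = absolute / base * 100
    return absolute, relative


# Scores in percent, copied from the Results section: (base, fine-tuned).
reported = {
    "accuracy": (50.30, 55.51),
    "f1": (48.89, 54.83),
}

for name, (base, tuned) in reported.items():
    absolute, relative = gains(base, tuned)
    print(f"{name}: +{absolute:.2f} points, +{relative:.1f}% relative")
```

Stating which of the two measures is meant avoids ambiguity when the scores are compared with other models.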