Preliminary Testing of the First Fine-Tuned Sinhala AI Model

#2
by chamud - opened

Thanks for the attempt to create the first model trained specifically for the Sinhala language. In the Sri Lankan market, AI solutions tend to be rule-based, template-based, or GPT-backed, and the GPT-backed ones rely heavily on translation services or multilingual support, which typically works poorly for Sinhala. I tested the model on a mid-range GPU (an NVIDIA GTX 1660 or equivalent) with default batch sizes and made several attempts at chatting with it, but it rarely handles prompts well. While a few outputs show some creative effort, most repeat words excessively, go off topic, or fail to produce coherent and reasonable responses. Adjusting sampling parameters such as temperature, top_k, and top_p improved output quality only marginally. Overall, the model has potential as a proof of concept for Sinhala language modeling, but it would need major improvements in coherence, context understanding, and creativity to be practically useful.
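
For anyone trying to reproduce this, here is a minimal sketch of the kind of generation loop I mean, assuming the repo ships a PEFT adapter over a causal LM base. The model IDs below are placeholders, not the actual repo names, and `repetition_penalty` is an extra knob I would suggest against the word repetition:

```python
# Minimal sketch: load a PEFT adapter over a causal LM base and sample.
# "base-model-id" and "adapter-repo-id" are hypothetical placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "base-model-id"        # placeholder: the base checkpoint
adapter_id = "adapter-repo-id"   # placeholder: this repo's adapter

tokenizer = AutoTokenizer.from_pretrained(adapter_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.float16,  # fits more easily on a 6 GB GTX 1660
    device_map="auto",          # requires the accelerate package
)
model = PeftModel.from_pretrained(model, adapter_id)

prompt = "..."  # a Sinhala prompt goes here
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,
    temperature=0.7,         # the sampling knobs mentioned above
    top_k=50,
    top_p=0.9,
    repetition_penalty=1.2,  # mitigates the excessive word repetition
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```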

[Attached screenshots of the test results: Result_page-0001.jpg to Result_page-0004.jpg]

Polyglots FYP org

This is not an instruct-tuned model; it is a base model. To do what you are attempting in your example, you need to instruction-tune it first. Here is an explanation: https://builder.aws.com/content/2ZVa61RxToXUFzcuY8Hbut6L150/what-is-an-instruct-model-instruction-and-chat-fine-tuning
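
For illustration, a minimal sketch of what instruction tuning on top of a base model like this could look like. The dataset file, the Alpaca-style prompt template, and the LoRA settings below are hypothetical examples, not part of this repo:

```python
# Minimal sketch: supervised instruction tuning of a base causal LM.
# "base-model-id" and "sinhala_instructions.json" are hypothetical; the
# dataset is assumed to have "instruction" and "response" fields.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_id = "base-model-id"  # placeholder: the base checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.pad_token or tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_id)

# LoRA keeps the tuning cheap; target module names depend on the base
# architecture (these suit LLaMA-style models).
model = get_peft_model(model, LoraConfig(
    task_type="CAUSAL_LM", r=8, lora_alpha=16,
    target_modules=["q_proj", "v_proj"]))

dataset = load_dataset("json", data_files="sinhala_instructions.json")["train"]

def format_example(example):
    # Alpaca-style template; any consistent template works, but inference
    # prompts must then follow the same format.
    text = (f"### Instruction:\n{example['instruction']}\n\n"
            f"### Response:\n{example['response']}{tokenizer.eos_token}")
    return tokenizer(text, truncation=True, max_length=512)

tokenized = dataset.map(format_example, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="sinhala-instruct",
                           per_device_train_batch_size=2,
                           num_train_epochs=3,
                           learning_rate=2e-5),
    train_dataset=tokenized,
    # mlm=False makes the collator build causal-LM labels from input_ids
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

After tuning, inference prompts should use the same template as training so the model recognizes where the instruction ends and the response begins.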
