Preliminary Testing of the First Fine-Tuned Sinhala AI Model

#2
by chamud - opened

Thanks for the attempt to create the first model trained specifically for the Sinhala language. In the Sri Lankan market, AI solutions tend to be rule-based, template-based, or GPT-backed, and the GPT-backed ones rely heavily on translation services or multilingual support, which typically works poorly for Sinhala. I tested the model on a mid-range GPU (an NVIDIA GTX 1660 or equivalent) with default batch sizes and made several attempts at chatting with it, but it rarely handles prompts well. While a few outputs show some creative effort, most repeat words excessively, go off topic, or fail to produce coherent and reasonable responses. Adjusting sampling parameters such as temperature, top_k, and top_p improved output quality only marginally. Overall, the model has potential as a proof of concept for Sinhala language modeling, but it would need major improvements in coherence, context understanding, and creativity to be practically useful.
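
For anyone trying to reproduce this, here is a minimal sketch of the kind of generation loop I mean, assuming the repo ships a PEFT adapter over a causal LM base. The model IDs below are placeholders, not the actual repo names, and `repetition_penalty` is an extra knob I would suggest against the word repetition:

```python
# Minimal sketch: load a PEFT adapter over a causal LM base and sample.
# "base-model-id" and "adapter-repo-id" are hypothetical placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "base-model-id"        # placeholder: the base checkpoint
adapter_id = "adapter-repo-id"   # placeholder: this repo's adapter

tokenizer = AutoTokenizer.from_pretrained(adapter_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.float16,  # fits more easily on a 6 GB GTX 1660
    device_map="auto",          # requires the accelerate package
)
model = PeftModel.from_pretrained(model, adapter_id)

prompt = "..."  # a Sinhala prompt goes here
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,
    temperature=0.7,         # the sampling knobs mentioned above
    top_k=50,
    top_p=0.9,
    repetition_penalty=1.2,  # mitigates the excessive word repetition
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```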

[Attached screenshots of the test results: Result_page-0001.jpg to Result_page-0004.jpg]

Polyglots FYP org

This is not an instruct-tuned model; it is a base model. To do what you are attempting in your example, you need to instruction-tune it first. Here is an explanation: https://builder.aws.com/content/2ZVa61RxToXUFzcuY8Hbut6L150/what-is-an-instruct-model-instruction-and-chat-fine-tuning
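
For illustration, a minimal sketch of what instruction tuning on top of a base model like this could look like. The dataset file, the Alpaca-style prompt template, and the LoRA settings below are hypothetical examples, not part of this repo:

```python
# Minimal sketch: supervised instruction tuning of a base causal LM.
# "base-model-id" and "sinhala_instructions.json" are hypothetical; the
# dataset is assumed to have "instruction" and "response" fields.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_id = "base-model-id"  # placeholder: the base checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.pad_token or tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_id)

# LoRA keeps the tuning cheap; target module names depend on the base
# architecture (these suit LLaMA-style models).
model = get_peft_model(model, LoraConfig(
    task_type="CAUSAL_LM", r=8, lora_alpha=16,
    target_modules=["q_proj", "v_proj"]))

dataset = load_dataset("json", data_files="sinhala_instructions.json")["train"]

def format_example(example):
    # Alpaca-style template; any consistent template works, but inference
    # prompts must then follow the same format.
    text = (f"### Instruction:\n{example['instruction']}\n\n"
            f"### Response:\n{example['response']}{tokenizer.eos_token}")
    return tokenizer(text, truncation=True, max_length=512)

tokenized = dataset.map(format_example, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="sinhala-instruct",
                           per_device_train_batch_size=2,
                           num_train_epochs=3,
                           learning_rate=2e-5),
    train_dataset=tokenized,
    # mlm=False makes the collator build causal-LM labels from input_ids
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

After tuning, inference prompts should use the same template as training so the model recognizes where the instruction ends and the response begins.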
