Sorachio 1B - Conversational AI Model

Overview

Sorachio is a conversational AI model fine-tuned from the Gemma 3 architecture and optimized for roleplay and general conversation. It was trained with Supervised Fine-Tuning (SFT) using QLoRA (Quantized Low-Rank Adaptation), which keeps parameter updates efficient while preserving response quality.

Dataset

The model was trained on a custom, curated dataset focused on conversational and roleplay scenarios.

Quick Start

Installation

pip install transformers torch accelerate

Inference Example

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "IzzulGod/sorachio-1b-8192-2e-4-it-v1"
tokenizer = AutoTokenizer.from_pretrained(model_id)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",               # place layers automatically on available devices
    torch_dtype=torch.float16,       # load weights in half precision
    attn_implementation="eager"      # use the eager attention implementation
).eval()

messages = [
    {"role": "user", "content": "Apa itu Machine Learning?"}  # Indonesian: "What is Machine Learning?"
]

input_ids = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

with torch.no_grad():
    outputs = model.generate(
        input_ids=input_ids,
        attention_mask=torch.ones_like(input_ids),  # single unpadded sequence, so attend to all positions
        max_new_tokens=512,
        do_sample=True,
        top_p=0.95,
        temperature=0.7,
        pad_token_id=tokenizer.eos_token_id
    )

output_text = tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True)
print("Sorachio:", output_text.strip())

Model Downloads

For efficient inference on various hardware configurations:

  • F16 GGUF - Full-precision (16-bit) GGUF export
  • Q8_0 GGUF - 8-bit quantized GGUF for lower memory use
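
The GGUF files can be run outside of transformers, for example with llama.cpp bindings. The sketch below uses llama-cpp-python (pip install llama-cpp-python) and assumes the Q8_0 file has already been downloaded; the local file name is a placeholder, so substitute the actual file from the repository.

from llama_cpp import Llama

# Load the quantized model (placeholder file name; point this at the GGUF file you downloaded).
llm = Llama(
    model_path="./sorachio-1b-q8_0.gguf",
    n_ctx=4096,   # context window; adjust to the model's maximum and available memory
)

response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Apa itu Machine Learning?"}],
    max_tokens=512,
    temperature=0.7,
    top_p=0.95,
)
print(response["choices"][0]["message"]["content"])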

Training Details

Technical Specifications

  • Architecture: Transformer-based language model
  • Fine-tuning Method: Supervised Fine-Tuning (SFT) + Quantized Low-Rank Adaptation (QLoRA)
  • LoRA Rank (r): 8
  • LoRA Alpha: 16
  • LoRA Dropout: 0.05
  • Target Modules:
    • Attention: q_proj, k_proj, v_proj, o_proj
    • MLP: gate_proj, up_proj, down_proj
  • Quantization: 4-bit (NF4 via bitsandbytes)
  • Optimizer: AdamW 8-bit
  • Learning Rate: 2e-4 with cosine scheduling
  • Batch Size: 2 (per device) × 4 (gradient accumulation) = 8 effective
  • Training Epochs: 3
  • Precision: FP16 for efficient training
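
The configuration above roughly maps onto a peft/bitsandbytes setup like the following minimal sketch. The actual training script, dataset handling, and trainer wiring are not included in this card, so treat the names and defaults here as illustrative rather than as the exact recipe.

import torch
from transformers import BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig

# 4-bit NF4 quantization of the base model (the "Q" in QLoRA).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# LoRA adapters on the attention and MLP projections listed above.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",
)

# Optimizer, schedule, and batch settings matching the specification above.
training_args = TrainingArguments(
    output_dir="./sorachio-sft",        # placeholder output path
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,      # effective batch size of 8
    num_train_epochs=3,
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    optim="adamw_bnb_8bit",             # 8-bit AdamW via bitsandbytes
    fp16=True,
)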

Training Results

The model achieved consistent loss reduction during training:

[192/192 09:45, Epoch 3/3]

Step    Training Loss
  20    4.942400
  40    2.978700
  60    2.624300
  80    2.247000
 100    2.157000
 120    2.083200
 140    2.010100
 160    1.916800
 180    1.848900
Final Training Loss: 2.489
Training Runtime: 594.31 seconds
Training Samples/Second: 2.569

At an effective batch size of 8, the 192 optimizer steps correspond to roughly 1,500 training examples processed across the 3 epochs (about 500 per epoch).

Use Cases

  • Conversational AI: General purpose chatbot applications
  • Roleplay: Interactive storytelling and character-based conversations
  • Indonesian Language Tasks: Optimized for Indonesian language understanding
  • Educational Applications: Q&A systems and tutoring applications

Limitations

  • Performance may vary for highly specialized technical domains
  • As a 1B-parameter model, its knowledge coverage and reasoning depth are more limited than those of larger models
  • Responses should be validated for factual accuracy in critical applications

License

This model is released under the Apache 2.0 License. Please refer to the license file for more details.

Note: The entire project, from dataset preprocessing through fine-tuning and evaluation, was completed on a free Google Colab T4 GPU, demonstrating that the full workflow is feasible on limited computing resources.
