metadata
base_model: unsloth/Qwen3-1.7B
library_name: peft
license: mit
datasets:
  - az-llm/az_academic_qa-v1.0
language:
  - az
pipeline_tag: text-generation
tags:
  - transformers
  - unsloth
  - trl
  - sft
  - lora

Nizami-1.7B

A Lightweight Language Model

Model Description 📝

Nizami-1.7B is a fine-tuned version of Qwen3-1.7B for academic-style comprehension and reasoning in Azerbaijani. It was trained on a curated dataset of 7,000 examples from historical, legal, philosophical, and social science texts.

Key Features ✨

  • Architecture: Transformer-based language model 🏗️
  • Developed by: Rustam Shiriyev
  • Language(s): Azerbaijani
  • License: MIT
  • Fine-Tuning Method: Supervised fine-tuning (SFT) with LoRA adapters (PEFT)
  • Domain: Academic texts (History, Law, Philosophy, Social Sciences) 📚
  • Finetuned from model: unsloth/Qwen3-1.7B
  • Dataset: az-llm/az_academic_qa-v1.0

Intended Use

  • Academic research assistance in Azerbaijani 🏆
  • Question answering on humanities/social science topics 🎯
  • Knowledge exploration in Azerbaijani ⚡

Limitations ⚠️

  • Outputs are not fact-checked: verify any factual statement before relying on it.
  • Limited dataset size (7,000 examples): the model may not generalize well outside its training domains.
  • The model may hallucinate when asked for specific factual details.

How to Get Started with the Model 💻

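from huggingface_hub import login
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Optional: log in only if your environment needs authentication for these repositories.
# login(token="<your Hugging Face access token>")

# Load the base model and its tokenizer.
tokenizer = AutoTokenizer.from_pretrained("unsloth/Qwen3-1.7B")
base_model = AutoModelForCausalLM.from_pretrained(
    "unsloth/Qwen3-1.7B",
    device_map={"": 0},  # place the whole model on GPU 0
)

# Attach the Nizami LoRA adapter to the base model.
model = PeftModel.from_pretrained(base_model, "Rustamshry/Nizami-1.7B")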

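# English gloss of the Azerbaijani prompt below:
# "Based on the archaeological excavation materials obtained, which specific objects
# related to the first use of metal in Azerbaijan during the Eneolithic period have been
# found, and how did they affect the development of society's social structure at that
# time? In addition, what can you say about the economic and cultural aspects of the
# development of metallurgy and metalworking in that period?"
question = """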
Əldə olunan arxeoloji qazıntı materiallarına əsasən, Eneolit dövründə Azərbaycanda metalın ilk istifadəsi ilə bağlı hansı konkret obyektlər tapılmışdır və bu obyektlər həmin dövrdə cəmiyyətin sosial strukturunun inkişafına necə təsir etmişdir? Əlavə olaraq, həmin dövrdə metallurgiya və metalişləmə sənətkarlığının inkişafının iqtisadi və mədəni aspektləri haqqında nə deyə bilərsiniz?
"""

messages = [
    {"role" : "user", "content" : question}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize = False,
    add_generation_prompt = True, 
    enable_thinking = False,
)

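from transformers import TextStreamer

# Tokenize the prompt and move it to the same device as the model.
inputs = tokenizer(text, return_tensors = "pt").to(model.device)

# Stream the answer to stdout as it is generated.
_ = model.generate(
    **inputs,
    max_new_tokens = 1800,
    do_sample = True,  # make sampling explicit so temperature/top_p/top_k are used
    temperature = 0.7,
    top_p = 0.8,
    top_k = 20,
    streamer = TextStreamer(tokenizer, skip_prompt = True),
)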
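
The example above streams the answer to the console. If you need the answer as a string instead (for logging or evaluation, say), the same steps can be wrapped in a small helper. This is a minimal sketch, not part of the original card: the name `generate_answer` and the default of 512 new tokens are assumptions, while the sampling values are copied from the example above. It expects the `model` and `tokenizer` objects loaded earlier.

```python
def generate_answer(model, tokenizer, question, max_new_tokens=512):
    """Return the model's answer to `question` as a plain string.

    Hypothetical helper: the name and the default token budget are
    illustrative assumptions, not something defined by the model card.
    """
    messages = [{"role": "user", "content": question}]
    # Build the chat prompt with thinking mode disabled, as in the example above.
    text = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True,
        enable_thinking=False,
    )
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        temperature=0.7,
        top_p=0.8,
        top_k=20,
    )
    # generate() returns the prompt followed by the completion;
    # slice off the prompt so only the newly generated tokens are decoded.
    prompt_length = inputs["input_ids"].shape[1]
    new_tokens = output_ids[0][prompt_length:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
```

Usage would then be `answer = generate_answer(model, tokenizer, question)`, followed by `print(answer)`.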

Training Data

Dataset: az-llm/az_academic_qa-v1.0

Description: A 7,000-example dataset for academic-style comprehension and reasoning in Azerbaijani. Each example contains a long chunk_text, a high-complexity question, a detailed structured answer, and a tone tag (e.g., Formal, Open-ended). Sourced from historical, legal, philosophical, and social science texts.

Fields:

  • chunk_text: Source paragraph or multi-sentence input
  • question: Open-ended or context-based question
  • answer: Long-form response
  • tone: Answer style (e.g., formal, informal)
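
The card does not state how these fields were combined into training prompts. As an illustration only, the sketch below shows one plausible way to turn a record into chat messages, with the source passage placed before the question. The function name `record_to_messages`, the `include_context` flag, and the passage-then-question layout are all assumptions, and the sample record is a made-up placeholder rather than real dataset content.

```python
def record_to_messages(record, include_context=True):
    """Convert one dataset record into a list of chat messages.

    Assumed layout (not confirmed by the model card): the user turn holds
    the source passage followed by the question; the assistant turn holds
    the answer. The `tone` field is left unused in this sketch.
    """
    question = record["question"].strip()
    if include_context:
        user_content = f"{record['chunk_text'].strip()}\n\n{question}"
    else:
        user_content = question
    return [
        {"role": "user", "content": user_content},
        {"role": "assistant", "content": record["answer"].strip()},
    ]


# Placeholder record that only mirrors the field names listed above.
example = {
    "chunk_text": "Passage text.",
    "question": "Question text?",
    "answer": "Answer text.",
    "tone": "Formal",
}
messages = record_to_messages(example)
```

A list in this shape can be passed to `tokenizer.apply_chat_template`, as in the getting-started example, to produce text for supervised fine-tuning.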

License: CC-BY 4.0

Size: 7,000 entries

Avg. tokens per entry: ~400

Version: 1.0

Language: Azerbaijani

Framework versions

  • PEFT 0.14.0