Model Card for maximilianshwarzmullers/tm-hukuk
Model Description
This is a Turkmen LLM fine-tuned on Turkmen law data.
- Developed by: [Annamyrat Saparow]
- Date created: [30.04.2025]
- Date shared: [07.05.2025]
- Model type: [Instruction model]
- Language(s) (NLP): [Turkmen, English]
- License: [MIT]
- Finetuned from model [optional]: [Llama 3.1 8B Instruct]
Model Sources [optional]
- Repository: [maximilianshwarzmullers/tm-hukuk]
Training Data
The training data is maximilianshwarzmullers/hukukchy, a Turkmen law dataset that I created myself.
Training Procedure
QLoRA (LoRA fine-tuning on a 4-bit quantized base model)
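The card does not show how `model`, `tokenizer`, and `max_seq_length` (used in the trainer below) were created. A typical Unsloth QLoRA setup might look like the following sketch; the model name, sequence length, LoRA rank, and all other values here are assumptions, not taken from the card:

```python
from unsloth import FastLanguageModel

max_seq_length = 2048  # assumption: actual value not stated in the card

# Load the base model with 4-bit quantized weights (the "Q" in QLoRA).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit",  # assumption
    max_seq_length = max_seq_length,
    load_in_4bit = True,
)

# Attach trainable low-rank adapters on top of the frozen 4-bit base.
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,  # assumption: LoRA rank not stated in the card
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
    lora_alpha = 16,
    lora_dropout = 0,
    bias = "none",
    use_gradient_checkpointing = "unsloth",
    random_state = 3407,
)
```

Only the small adapter matrices receive gradients, which is what makes fine-tuning an 8B model feasible on a single consumer GPU.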
Preprocessing [optional]
alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{}

### Input:
{}

### Response:
{}"""

EOS_TOKEN = tokenizer.eos_token  # Must add EOS_TOKEN

def formatting_prompts_func(examples):
    instructions = examples["instruction"]
    inputs = examples["question"]
    outputs = examples["answer"]
    texts = []
    for instruction, input, output in zip(instructions, inputs, outputs):
        # Must add EOS_TOKEN, otherwise your generation will go on forever!
        text = alpaca_prompt.format(instruction, input, output) + EOS_TOKEN
        texts.append(text)
    return {"text": texts}
from datasets import load_dataset

dataset = load_dataset("maximilianshwarzmullers/hukukchy", split = "train")
dataset = dataset.map(formatting_prompts_func, batched = True)
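As a quick sanity check, the formatting step can be exercised on a toy batch without loading the real dataset. The EOS token string and the example below are assumptions for illustration, not taken from the dataset:

```python
# Self-contained copy of the preprocessing logic for testing.
alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{}

### Input:
{}

### Response:
{}"""

EOS_TOKEN = "<|end_of_text|>"  # assumption: stands in for tokenizer.eos_token

def formatting_prompts_func(examples):
    texts = []
    for instruction, inp, out in zip(
        examples["instruction"], examples["question"], examples["answer"]
    ):
        texts.append(alpaca_prompt.format(instruction, inp, out) + EOS_TOKEN)
    return {"text": texts}

# Hypothetical batch in the dataset's column layout.
batch = {
    "instruction": ["Answer the legal question."],
    "question": ["What does the law regulate?"],
    "answer": ["It regulates civil relations."],
}
formatted = formatting_prompts_func(batch)
print(formatted["text"][0].endswith(EOS_TOKEN))  # True
```

Each training example becomes one string, so `dataset.map(..., batched=True)` produces a `"text"` column the trainer can consume directly.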
Training Hyperparameters
from trl import SFTTrainer
from transformers import TrainingArguments
from unsloth import is_bfloat16_supported

trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    dataset_text_field = "text",
    max_seq_length = max_seq_length,
    dataset_num_proc = 2,
    packing = False,  # Can make training 5x faster for short sequences.
    args = TrainingArguments(
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 4,
        warmup_steps = 5,
        # num_train_epochs = 1,  # Set this for 1 full training run.
        max_steps = 750,
        learning_rate = 3e-5,
        fp16 = not is_bfloat16_supported(),
        bf16 = is_bfloat16_supported(),
        logging_steps = 1,
        optim = "adamw_8bit",
        weight_decay = 0.01,
        lr_scheduler_type = "linear",
        seed = 3407,
        output_dir = "outputs",
        report_to = "none",  # Use this for WandB etc
    ),
)
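At inference time the model expects the same Alpaca template it was trained on, with the response slot left empty so generation continues after the final header. A sketch of building such a prompt (the question is a hypothetical example, not from the dataset):

```python
alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{}

### Input:
{}

### Response:
{}"""

prompt = alpaca_prompt.format(
    "Answer the legal question.",       # instruction
    "What does the civil code cover?",  # input
    "",                                 # response left empty for generation
)
print(prompt.rstrip().endswith("### Response:"))  # True
```

The resulting string would then be tokenized and passed to `model.generate`, with the model completing the text after `### Response:`.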
Model tree for maximilianshwarzmullers/tm-hukuk
Base model
meta-llama/Llama-3.1-8B