---
base_model: openai/gpt-oss-20b
library_name: peft
license: apache-2.0
tags:
- trl
- sft
- lora
- reasoning
- multilingual
model_type: lora
---
# gpt-oss-20b-multilingual-reasoner

This is a LoRA (Low-Rank Adaptation) adapter fine-tuned from openai/gpt-oss-20b.
## Model Details
- Base Model: openai/gpt-oss-20b
- Fine-tuning Method: LoRA (Low-Rank Adaptation)
- Training Framework: TRL (Transformer Reinforcement Learning)
- LoRA Rank: 8
- LoRA Alpha: 16
- Target Modules: q_proj, o_proj, v_proj, k_proj (a matching `LoraConfig` sketch follows below)
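To make these hyperparameters concrete, here is a sketch of a `peft.LoraConfig` that matches them; `lora_dropout`, `bias`, and `task_type` are assumptions not stated on this card, and the authoritative values are recorded in `adapter_config.json`:

```python
from peft import LoraConfig

# Sketch of the adapter configuration implied by the details above.
# r, lora_alpha, and target_modules come from this card; the rest are assumptions.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "o_proj", "v_proj", "k_proj"],
    lora_dropout=0.05,   # assumption, not stated in this card
    bias="none",         # assumption
    task_type="CAUSAL_LM",
)
```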
## Usage
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "openai/gpt-oss-20b",
    torch_dtype=torch.float16,
    device_map="auto",
)

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "yiwenX/gpt-oss-20b-multilingual-reasoner")

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("yiwenX/gpt-oss-20b-multilingual-reasoner")

# Generate text (move inputs to the model's device)
inputs = tokenizer("Hello, how are you?", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
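Since the repository ships a `chat_template.jinja`, conversational prompts can also be formatted with the tokenizer's `apply_chat_template`. This is a minimal sketch that reuses the `model` and `tokenizer` loaded above and assumes the standard messages format:

```python
# Build a conversation and format it with the bundled chat template.
messages = [
    {"role": "user", "content": "Explain why the sky is blue, in French."},
]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=200)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```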
## Training Details
This model was fine-tuned using:
- Framework: TRL (Transformer Reinforcement Learning)
- Method: Supervised Fine-Tuning (SFT); see the sketch after this list
- PEFT Type: LoRA
- Transformers Version: 4.56.0
- PyTorch Version: 2.8.0+cu128
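A minimal sketch of how such a run could be launched with TRL's `SFTTrainer` is shown below; the dataset id and all training arguments are placeholders, not the values actually used to produce this adapter:

```python
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# Placeholder dataset id: substitute the multilingual reasoning dataset you intend to use.
dataset = load_dataset("username/multilingual-reasoning-sft", split="train")

trainer = SFTTrainer(
    model="openai/gpt-oss-20b",
    train_dataset=dataset,
    args=SFTConfig(
        output_dir="gpt-oss-20b-multilingual-reasoner",
        per_device_train_batch_size=1,   # placeholder
        gradient_accumulation_steps=8,   # placeholder
        learning_rate=2e-4,              # placeholder
        num_train_epochs=1,              # placeholder
    ),
    # LoRA settings taken from the Model Details section above.
    peft_config=LoraConfig(
        r=8,
        lora_alpha=16,
        target_modules=["q_proj", "o_proj", "v_proj", "k_proj"],
        task_type="CAUSAL_LM",
    ),
)
trainer.train()
```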
## Model Files
- `adapter_config.json`: LoRA configuration
- `adapter_model.safetensors`: LoRA weights
- `tokenizer.json`: Tokenizer vocabulary
- `tokenizer_config.json`: Tokenizer configuration
- `special_tokens_map.json`: Special tokens mapping
- `chat_template.jinja`: Chat template for conversation format
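If a standalone checkpoint is preferred over attaching the adapter at runtime, the LoRA weights can be merged into the base model with PEFT's `merge_and_unload`. This sketch continues from the Usage snippet above; the output directory is a placeholder:

```python
# Merge the LoRA weights into the base model, then save the result.
merged_model = model.merge_and_unload()
merged_model.save_pretrained("gpt-oss-20b-multilingual-reasoner-merged")  # placeholder path
tokenizer.save_pretrained("gpt-oss-20b-multilingual-reasoner-merged")
```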