---
base_model: openai/gpt-oss-20b
library_name: peft
license: apache-2.0
tags:
- trl
- sft
- lora
- reasoning
- multilingual
model_type: lora
---

# gpt-oss-20b-multilingual-reasoner

This is a LoRA (Low-Rank Adaptation) fine-tuned model based on [openai/gpt-oss-20b](https://huggingface.co/openai/gpt-oss-20b).

## Model Details

- **Base Model**: openai/gpt-oss-20b
- **Fine-tuning Method**: LoRA (Low-Rank Adaptation)
- **Training Framework**: TRL (Transformer Reinforcement Learning)
- **LoRA Rank**: 8
- **LoRA Alpha**: 16
- **Target Modules**: q_proj, o_proj, v_proj, k_proj

## Usage

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load the base model
base_model = AutoModelForCausalLM.from_pretrained(
    "openai/gpt-oss-20b",
    torch_dtype=torch.float16,
    device_map="auto",
)

# Attach the LoRA adapter
model = PeftModel.from_pretrained(base_model, "yiwenX/gpt-oss-20b-multilingual-reasoner")

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained("yiwenX/gpt-oss-20b-multilingual-reasoner")

# Generate text
inputs = tokenizer("Hello, how are you?", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```

## Training Details

This model was fine-tuned using:

- **Framework**: TRL (Transformer Reinforcement Learning)
- **Method**: Supervised Fine-Tuning (SFT)
- **PEFT Type**: LoRA
- **Transformers Version**: 4.56.0
- **PyTorch Version**: 2.8.0+cu128

## Model Files

- `adapter_config.json`: LoRA configuration
- `adapter_model.safetensors`: LoRA weights
- `tokenizer.json`: Tokenizer vocabulary
- `tokenizer_config.json`: Tokenizer configuration
- `special_tokens_map.json`: Special tokens mapping
- `chat_template.jinja`: Chat template for conversation format
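
## Chat Format

Because the adapter ships with `chat_template.jinja`, conversational prompts can be formatted with `tokenizer.apply_chat_template` instead of raw strings. The sketch below reuses the loading steps from the Usage section; the example prompt and generation settings are illustrative, not taken from the training data.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load the adapted model and tokenizer as in the Usage section
base_model = AutoModelForCausalLM.from_pretrained(
    "openai/gpt-oss-20b",
    torch_dtype=torch.float16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, "yiwenX/gpt-oss-20b-multilingual-reasoner")
tokenizer = AutoTokenizer.from_pretrained("yiwenX/gpt-oss-20b-multilingual-reasoner")

# Format a conversation with the bundled chat template
# (the user message below is an illustrative example)
messages = [
    {"role": "user", "content": "Explain why the sky is blue. Answer in French."},
]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,  # open an assistant turn so the model continues as the assistant
    return_tensors="pt",
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=200)
# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```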