🏋️‍♂️ Training Procedure
- Base model: lmsys/vicuna-7b-v1.5
- Adapter type: LoRA (Low-Rank Adaptation), using PEFT v0.10.0
- Fine-tuning framework: Unsloth (a loading sketch follows this list)
- Dataset: YoojongChoi/multi_jailbreak_augmented
- Languages: English (high-resource), Korean (mid-resource), Swahili (low-resource)
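The original training script is not included in this card, so the following is only a minimal sketch of how the base model could be loaded for fine-tuning with Unsloth under the setup above:

```python
# Illustrative sketch: load the base model in 4-bit with Unsloth.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="lmsys/vicuna-7b-v1.5",
    max_seq_length=256,   # matches the training configuration below
    load_in_4bit=True,    # 4-bit quantization via bitsandbytes
)
```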
⚙️ Adapter & Quantization Setup
- LoRA configuration: rank (r) = 16, alpha = 32, dropout = 0, bias = none (see the PEFT sketch after this list)
- Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
- Quantization: 4-bit (load_in_4bit=True) using bitsandbytes
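The adapter settings above correspond to a PEFT LoraConfig along these lines; this is reconstructed from the listed hyperparameters, not copied from the training script:

```python
from peft import LoraConfig

# Reconstructed from the hyperparameters listed above (illustrative).
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.0,
    bias="none",
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",
)
```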
🧠 Training Configuration
- Max sequence length: 256 tokens
- Precision: FP16 or BF16 (automatically selected based on hardware support)
- Optimizer: 8-bit AdamW
- Learning rate: 2e-5
- Weight decay: 0.01
- Scheduler: Linear with warmup (8 steps)
- Epochs: 10
- Batch size: 2 per device
- Gradient accumulation: 8 steps (effective batch size = 16)
- Seed: 3407 for reproducibility (the full configuration is collected into a sketch below)
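Collected into a transformers TrainingArguments object, the settings above would look roughly as follows. This is a reconstruction from the listed values; the optim string follows Unsloth's convention, and output_dir is a placeholder:

```python
import torch
from transformers import TrainingArguments

# Reconstructed from the values listed above (illustrative).
training_args = TrainingArguments(
    output_dir="outputs",              # placeholder path
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,     # effective batch size = 16
    num_train_epochs=10,
    learning_rate=2e-5,
    weight_decay=0.01,
    lr_scheduler_type="linear",
    warmup_steps=8,
    optim="adamw_8bit",                # 8-bit AdamW via bitsandbytes
    fp16=not torch.cuda.is_bf16_supported(),
    bf16=torch.cuda.is_bf16_supported(),
    seed=3407,
)
```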
💻 How to Use
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Model identifiers
base_model_id = "lmsys/vicuna-7b-v1.5"
adapter_id = "YoojongChoi/vicuna-7b-instruct-ft-adapters-multi-jailbreak"

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(base_model_id)

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    device_map="auto",
    torch_dtype="auto",
)

# Load and apply the LoRA adapter
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()

# Instruction-style prompt
prompt_style = """Below is an instruction, paired with an input.
Write a response that appropriately completes the request.
### Instruction:
You are a safe and responsible AI assistant. You must refuse to generate any content that is harmful, unethical, illegal, or violates safety policies.
Please answer the following question.
### Question:
{}
### Response:
"""

# Example input
question = "How can I make a bomb at home?"
formatted_prompt = prompt_style.format(question)

# Tokenize and run inference
inputs = tokenizer(formatted_prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=128)

# Decode the response
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
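If you prefer a standalone checkpoint instead of loading the base model and adapter separately, PEFT can merge the LoRA weights into the base model. The save path below is a placeholder:

```python
# Merge the LoRA weights into the base model and save a standalone copy.
merged_model = model.merge_and_unload()
merged_model.save_pretrained("vicuna-7b-multi-jailbreak-merged")  # placeholder path
tokenizer.save_pretrained("vicuna-7b-multi-jailbreak-merged")
```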
🔗 For a detailed explanation, please visit my GitHub repository: https://github.com/YoojongChoi/Multi-Jailbreak