CAP SFT Qwen3 LoRA 10K

This is a LoRA (Low-Rank Adaptation) fine-tuned version of Qwen/Qwen3-14B, specifically trained on legal case analysis tasks using data from the Caselaw Access Project (CAP).

Model Details

  • Base Model: Qwen/Qwen3-14B
  • Model Type: LoRA Adapter (PEFT)
  • Training Data: 10,000 samples from CAP RLVR legal reasoning dataset
  • Hardware: 2x NVIDIA H100-80GB GPUs
  • Training Framework: HuggingFace Transformers + PEFT
  • Precision: FP16

Training Configuration

  • LoRA Rank: 16
  • LoRA Alpha: 32
  • Target Modules: All linear layers
  • Batch Size: 64 (4 per GPU × 2 GPUs × 8 gradient accumulation)
  • Learning Rate: 2e-4
  • Epochs: 1
  • Optimizer: AdamW
  • Scheduler: Linear with warmup
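
These hyperparameters map directly onto PEFT's LoraConfig and Transformers' TrainingArguments. Below is a minimal sketch of the equivalent configuration; the exact target-module list, LoRA dropout, warmup ratio, and output directory are assumptions, since the card only states "all linear layers" and "linear with warmup".

# Sketch of the training configuration above (assumed values are marked).
from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,                    # assumed, not stated in the card
    target_modules=[                      # typical Qwen linear projections (assumed)
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="cap-sft-qwen3-lora-10k",  # assumed path
    per_device_train_batch_size=4,        # 4 per GPU × 2 GPUs × 8 accumulation = 64
    gradient_accumulation_steps=8,
    learning_rate=2e-4,
    num_train_epochs=1,
    optim="adamw_torch",
    lr_scheduler_type="linear",
    warmup_ratio=0.03,                    # "linear with warmup"; ratio assumed
    fp16=True,
)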

Legal Tasks Covered

This model was trained on a 10,000-sample subset covering five core legal reasoning tasks (question counts below refer to the full CAP RLVR dataset):

  1. Holding Selection (~30K questions) - Multiple choice selection of legal holdings
  2. Citation Format (~50K questions) - Proper legal citation completion
  3. IRAC Summarization (~30K questions) - Case summarization following IRAC structure
  4. Case Retrieval (~30K questions) - Finding analogous legal cases
  5. Legal Entailment (~40K questions) - Determining relationships between legal statements

Performance

  • Trainable Parameters: 64.2M of 14.8B total (0.43%)
  • Memory Usage: ~132 GB during training (≈82% of the 160 GB total VRAM)
  • Training Speed: 4-6× faster than an A6000 baseline
  • Quality: Estimated 90-95% of full fine-tuning performance
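
The trainable-parameter figure above can be reproduced by counting the LoRA weights in the loaded adapter. A minimal sketch (not from the card):

# Count LoRA parameters in the loaded adapter vs. total model parameters.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-14B", torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(base, "kylebrussell/cap-sft-qwen3-lora-10k")

lora_params = sum(p.numel() for n, p in model.named_parameters() if "lora_" in n)
total_params = sum(p.numel() for p in model.parameters())
print(f"LoRA params: {lora_params / 1e6:.1f}M of {total_params / 1e9:.1f}B "
      f"({100 * lora_params / total_params:.2f}%)")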

Usage

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-14B",
    torch_dtype=torch.float16,
    device_map="auto"
)

# Load LoRA adapter
model = PeftModel.from_pretrained(
    base_model, 
    "kylebrussell/cap-sft-qwen3-lora-10k"
)

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-14B")

# Example legal reasoning prompt
prompt = """Given the following case facts, identify the primary legal holding:

Case: A company fired an employee for refusing to work overtime without pay...

Options:
A) Employment at-will doctrine applies
B) Fair Labor Standards Act violation
C) Wrongful termination claim
D) Contract breach

Answer:"""

# Move inputs to the model's device before generating
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
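
For deployment, the adapter can optionally be merged into the base weights so inference no longer requires the PEFT wrapper. A short sketch using PEFT's merge_and_unload; the output path is an assumption:

# Merge the LoRA weights into the base model and save a standalone checkpoint.
merged_model = model.merge_and_unload()
merged_model.save_pretrained("qwen3-14b-cap-sft-merged")   # assumed output path
tokenizer.save_pretrained("qwen3-14b-cap-sft-merged")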

Dataset

The model was trained on kylebrussell/cap-rlvr-sft, which contains processed legal case data from:

  • Source: Caselaw Access Project (CAP)
  • Size: ~7 million legal case documents (78GB uncompressed)
  • Processing: Converted to instruction-following format for legal reasoning tasks
  • Retrieval Support: FAISS embeddings available in kylebrussell/cap-rlvr-retrieval
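
To inspect the underlying SFT data, the dataset can be loaded with the datasets library. The split and column names below are assumptions; check the dataset card for the actual schema:

# Load the instruction-tuning dataset (split/column names are assumptions).
from datasets import load_dataset

dataset = load_dataset("kylebrussell/cap-rlvr-sft", split="train")
print(dataset)       # available columns and row count
print(dataset[0])    # inspect one instruction-following example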

Limitations

  • Specialized for legal-domain tasks; may not generalize well to other domains
  • Training was limited to 10K samples for efficiency; full-dataset training is pending
  • Legal advice disclaimer: this model is for research purposes only and should not be used for actual legal advice
  • May hallucinate legal citations or case details
  • Performance on complex multi-step legal reasoning has not been fully evaluated

Training Infrastructure

  • Cloud Provider: Lambda Labs
  • Instance: 2x H100-80GB HBM3 (160GB total VRAM)
  • CUDA: 12.8
  • PyTorch: 2.7.0
  • Training Duration: ~2 hours for 10K samples

Citation

@misc{cap-sft-qwen3-lora-10k,
  title={CAP SFT Qwen3 LoRA 10K: Legal Reasoning with LoRA Fine-tuning},
  author={Kyle Russell},
  year={2024},
  publisher={HuggingFace},
  howpublished={\url{https://huggingface.co/kylebrussell/cap-sft-qwen3-lora-10k}}
}

Related Models & Datasets

  • Base model: Qwen/Qwen3-14B
  • Training dataset: kylebrussell/cap-rlvr-sft
  • Retrieval support: kylebrussell/cap-rlvr-retrieval (FAISS embeddings)

License

Apache 2.0 (following base model license)
