CAP SFT Qwen3 LoRA 10K

This is a LoRA (Low-Rank Adaptation) fine-tuned version of Qwen/Qwen3-14B, specifically trained on legal case analysis tasks using data from the Caselaw Access Project (CAP).

Model Details

  • Base Model: Qwen/Qwen3-14B
  • Model Type: LoRA Adapter (PEFT)
  • Training Data: 10,000 samples from CAP RLVR legal reasoning dataset
  • Hardware: 2x NVIDIA H100-80GB GPUs
  • Training Framework: HuggingFace Transformers + PEFT
  • Precision: FP16

Training Configuration

  • LoRA Rank: 16
  • LoRA Alpha: 32
  • Target Modules: All linear layers
  • Batch Size: 64 (4 per GPU × 2 GPUs × 8 gradient accumulation)
  • Learning Rate: 2e-4
  • Epochs: 1
  • Optimizer: AdamW
  • Scheduler: Linear with warmup
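
These hyperparameters map directly onto PEFT's LoraConfig and Transformers' TrainingArguments. Below is a minimal sketch of the equivalent configuration; the exact target-module list, LoRA dropout, warmup ratio, and output directory are assumptions, since the card only states "all linear layers" and "linear with warmup".

# Sketch of the training configuration above (assumed values are marked).
from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,                    # assumed, not stated in the card
    target_modules=[                      # typical Qwen linear projections (assumed)
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="cap-sft-qwen3-lora-10k",  # assumed path
    per_device_train_batch_size=4,        # 4 per GPU × 2 GPUs × 8 accumulation = 64
    gradient_accumulation_steps=8,
    learning_rate=2e-4,
    num_train_epochs=1,
    optim="adamw_torch",
    lr_scheduler_type="linear",
    warmup_ratio=0.03,                    # "linear with warmup"; ratio assumed
    fp16=True,
)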

Legal Tasks Covered

This model was trained on a 10,000-sample subset covering five core legal reasoning tasks (question counts below refer to the full CAP RLVR dataset):

  1. Holding Selection (~30K questions) - Multiple choice selection of legal holdings
  2. Citation Format (~50K questions) - Proper legal citation completion
  3. IRAC Summarization (~30K questions) - Case summarization following IRAC structure
  4. Case Retrieval (~30K questions) - Finding analogous legal cases
  5. Legal Entailment (~40K questions) - Determining relationships between legal statements

Performance

  • Trainable Parameters: 64.2M of 14.8B total (0.43%)
  • Memory Usage: ~132 GB during training (≈82% of the 160 GB total VRAM)
  • Training Speed: 4-6× faster than an A6000 baseline
  • Quality: Estimated 90-95% of full fine-tuning performance
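
The trainable-parameter figure above can be reproduced by counting the LoRA weights in the loaded adapter. A minimal sketch (not from the card):

# Count LoRA parameters in the loaded adapter vs. total model parameters.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-14B", torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(base, "kylebrussell/cap-sft-qwen3-lora-10k")

lora_params = sum(p.numel() for n, p in model.named_parameters() if "lora_" in n)
total_params = sum(p.numel() for p in model.parameters())
print(f"LoRA params: {lora_params / 1e6:.1f}M of {total_params / 1e9:.1f}B "
      f"({100 * lora_params / total_params:.2f}%)")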

Usage

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-14B",
    torch_dtype=torch.float16,
    device_map="auto"
)

# Load LoRA adapter
model = PeftModel.from_pretrained(
    base_model, 
    "kylebrussell/cap-sft-qwen3-lora-10k"
)

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-14B")

# Example legal reasoning prompt
prompt = """Given the following case facts, identify the primary legal holding:

Case: A company fired an employee for refusing to work overtime without pay...

Options:
A) Employment at-will doctrine applies
B) Fair Labor Standards Act violation
C) Wrongful termination claim
D) Contract breach

Answer:"""

# Move inputs to the model's device before generating
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
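
For deployment, the adapter can optionally be merged into the base weights so inference no longer requires the PEFT wrapper. A short sketch using PEFT's merge_and_unload; the output path is an assumption:

# Merge the LoRA weights into the base model and save a standalone checkpoint.
merged_model = model.merge_and_unload()
merged_model.save_pretrained("qwen3-14b-cap-sft-merged")   # assumed output path
tokenizer.save_pretrained("qwen3-14b-cap-sft-merged")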

Dataset

The model was trained on kylebrussell/cap-rlvr-sft, which contains processed legal case data from:

  • Source: Caselaw Access Project (CAP)
  • Size: ~7 million legal case documents (78GB uncompressed)
  • Processing: Converted to instruction-following format for legal reasoning tasks
  • Retrieval Support: FAISS embeddings available in kylebrussell/cap-rlvr-retrieval
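
To inspect the underlying SFT data, the dataset can be loaded with the datasets library. The split and column names below are assumptions; check the dataset card for the actual schema:

# Load the instruction-tuning dataset (split/column names are assumptions).
from datasets import load_dataset

dataset = load_dataset("kylebrussell/cap-rlvr-sft", split="train")
print(dataset)       # available columns and row count
print(dataset[0])    # inspect one instruction-following example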

Limitations

  • Specialized for legal-domain tasks; may not generalize well to other domains
  • Training was limited to 10K samples for efficiency; full-dataset training is pending
  • Legal advice disclaimer: this model is for research purposes only and should not be used for actual legal advice
  • May hallucinate legal citations or case details
  • Performance on complex multi-step legal reasoning has not been fully evaluated

Training Infrastructure

  • Cloud Provider: Lambda Labs
  • Instance: 2x H100-80GB HBM3 (160GB total VRAM)
  • CUDA: 12.8
  • PyTorch: 2.7.0
  • Training Duration: ~2 hours for 10K samples

Citation

@misc{cap-sft-qwen3-lora-10k,
  title={CAP SFT Qwen3 LoRA 10K: Legal Reasoning with LoRA Fine-tuning},
  author={Kyle Russell},
  year={2024},
  publisher={HuggingFace},
  howpublished={\url{https://huggingface.co/kylebrussell/cap-sft-qwen3-lora-10k}}
}

Related Models & Datasets

  • Base model: Qwen/Qwen3-14B
  • Training dataset: kylebrussell/cap-rlvr-sft
  • Retrieval support: kylebrussell/cap-rlvr-retrieval (FAISS embeddings)

License

Apache 2.0 (following base model license)
