UniLoRA

Uni-LoRA is a PEFT method that shares one compact trainable vector, theta_d, across all low-rank adapter weights. Instead of learning every LoRA matrix element independently, Uni-LoRA maps each element to an entry of theta_d through a fixed, deterministic projection, so the adapter update is built entirely from the shared parameters that are learned.

Quick Start

from peft import UniLoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-3B")

config = UniLoraConfig(
    r=32,
    theta_d_length=256,
    proj_seed=42,
    target_modules=["q_proj", "v_proj"],
    unilora_dropout=0.0,
    init_weights=True,
    task_type="CAUSAL_LM",
)

peft_model = get_peft_model(model, config)
peft_model.print_trainable_parameters()

Important Parameters

r controls the low-rank adapter dimension. Larger values increase adapter capacity and memory use.

theta_d_length controls the length of the shared UniLoRA vector bank. This is the main trainable storage shared by the projected adapter entries.
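
To get a feel for how strongly theta_d_length compresses the adapter, the arithmetic below compares a flattened LoRA parameter count with the length of the shared vector. The layer shapes and the layer count are hypothetical values chosen for illustration, not read from any model; only r=32 and theta_d_length=256 are taken from the Quick Start.

```python
# Rough parameter-count comparison under assumed (hypothetical) layer shapes.
r = 32                 # LoRA rank, as in the Quick Start
theta_d_length = 256   # shared vector length, as in the Quick Start

# Hypothetical (in_features, out_features) for each targeted projection.
target_shapes = {"q_proj": (3072, 3072), "v_proj": (3072, 1024)}
num_layers = 28        # assumed number of transformer blocks

# A rank-r LoRA pair (A: r x in, B: out x r) holds r * (in + out) entries.
per_layer = sum(r * (fan_in + fan_out) for fan_in, fan_out in target_shapes.values())
lora_param_count = per_layer * num_layers

print(f"flattened LoRA entries: {lora_param_count:,}")
print(f"shared theta_d entries: {theta_d_length:,}")
print(f"LoRA entries per theta_d entry: {lora_param_count / theta_d_length:,.0f}")
```

Under these assumptions, the number of trained adapter values is set by theta_d_length, while r mainly changes how many LoRA entries are spread over them.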

proj_seed controls deterministic index generation for the fixed projections into theta_d. Reusing the same seed and configuration makes the generated adapter indices reproducible.
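
As a rough illustration of what a fixed, seeded projection into theta_d can look like, the NumPy sketch below expands a small shared vector into two LoRA factors. It is not the PEFT implementation: the helper make_projection, the layer shape, and the 1/sqrt(count) scaling are assumptions made for this example.

```python
import numpy as np

def make_projection(lora_param_count, theta_d_length, proj_seed):
    """Return (indices, scale) for a fixed projection (illustrative only)."""
    rng = np.random.default_rng(proj_seed)
    # Balanced assignment: cycle through theta_d slots, then shuffle positions.
    indices = rng.permutation(np.arange(lora_param_count) % theta_d_length)
    # Assumed normalization: 1 / sqrt(number of entries sharing the slot).
    counts = np.bincount(indices, minlength=theta_d_length)
    scale = 1.0 / np.sqrt(counts[indices])
    return indices, scale

r, fan_in, fan_out = 4, 16, 8               # hypothetical layer shape
lora_param_count = r * (fan_in + fan_out)   # entries in A (r x in) and B (out x r)
theta_d = np.random.default_rng(0).uniform(-0.02, 0.02, size=32)

indices, scale = make_projection(lora_param_count, theta_d.size, proj_seed=42)
flat = theta_d[indices] * scale             # expand theta_d into LoRA entries
lora_A = flat[: r * fan_in].reshape(r, fan_in)
lora_B = flat[r * fan_in :].reshape(fan_out, r)

# Same seed and sizes give the same assignment.
indices_again, _ = make_projection(lora_param_count, theta_d.size, proj_seed=42)
print(np.array_equal(indices, indices_again))
```

Because the indices depend only on the seed and the sizes, a sketch like this would only need to store theta_d; the index and scale arrays can be rebuilt on load.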

target_modules selects which modules receive UniLoRA adapters. Use module suffixes such as ["q_proj", "v_proj"], a regex string, or "all-linear" when supported by the model architecture.

unilora_dropout applies dropout inside UniLoRA adapter layers during training.

init_weights controls UniLoRA parameter initialization. Set it to False to keep a random theta_d initialization when you need to manage initialization manually.

save_indices controls whether UniLoRA checkpoints save the generated index and scale tensors together with the shared theta_d parameters. Keeping this disabled gives smaller checkpoints and regenerates indices from proj_seed; enabling it makes saved adapters independent from future index-generation changes.

API

UniLoraConfig

class peft.UniLoraConfig

(
    task_type: Optional[Union[str, TaskType]] = None,
    peft_type: Optional[Union[str, PeftType]] = None,
    auto_mapping: Optional[dict] = None,
    peft_version: Optional[str] = None,
    base_model_name_or_path: Optional[str] = None,
    revision: Optional[str] = None,
    inference_mode: bool = False,
    r: int = 4,
    proj_seed: int = 42,
    theta_d_length: int = 256,
    target_modules: typing.Union[str, list[str], NoneType] = None,
    unilora_dropout: float = 0.0,
    fan_in_fan_out: bool = False,
    bias: str = 'none',
    modules_to_save: typing.Optional[list[str]] = None,
    init_theta_d_bound: float = 0.02,
    init_weights: bool = True,
    save_indices: bool = False,
    layers_to_transform: typing.Union[list[int], int, NoneType] = None,
    layers_pattern: typing.Union[str, list[str], NoneType] = None
)

Parameters

r (int) — Rank of the low-rank adaptation. This controls the expressive capacity of the UniLora update.
proj_seed (int) — Random seed used to generate the fixed index assignment. This ensures reproducibility across runs.
theta_d_length (int) — Length of the shared UniLora vector theta_d.
target_modules (Union[list[str], str], optional) — Names or patterns of modules to which UniLora adapters are applied.
- If a string is provided, it is treated as a regular expression.
- If a list is provided, modules are matched by exact name or suffix.
- The special value 'all-linear' applies UniLora to all Linear / Conv1D layers except the output layer.
If not specified, modules are inferred from the model architecture. An error is raised if the architecture is unsupported.
unilora_dropout (float) — Dropout probability applied within UniLora layers.
fan_in_fan_out (bool) — Whether the replaced layer stores weights in (fan_in, fan_out) format. This should be set to True for models such as GPT-2 that use Conv1D layers.
bias (str) — Specifies which bias terms are trainable:
- 'none': no bias parameters are updated
- 'all': all bias parameters are updated
- 'unilora_only': only biases inside UniLora layers are updated
Note: enabling bias updates changes model outputs even when adapters are disabled.
modules_to_save (list[str], optional) — Additional modules (outside UniLora layers) that should remain trainable and be saved in the final checkpoint. This is commonly used for task-specific heads such as classifiers.
init_theta_d_bound (float) — Initialization bound for the UniLora vector bank. Vectors are sampled uniformly from [-init_theta_d_bound, init_theta_d_bound]. Initializing with zeros is avoided to prevent vanishing gradients. Small values (e.g., 0.02) are recommended for stable training.
init_weights (bool) — Whether to initialize theta_d with the default UniLora initialization. If set to False, theta_d keeps a random initialization.
save_indices (bool) — Whether to save the generated UniLora index and scale buffers alongside theta_d. This increases checkpoint size, but makes saved adapters independent from future changes to the index generation routine.
layers_to_transform (Union[list[int], int], optional) — Indices of transformer layers to which UniLora is applied. If specified, only these layers are modified. This option is valid only when target_modules is a list.
layers_pattern (Union[list[str], str], optional) — Custom layer name pattern used together with layers_to_transform when the model does not follow standard layer naming conventions. This option is valid only when target_modules is a list.

Configuration class for UniLora adapters.

This class defines all hyperparameters required to initialize and apply UniLora layers within the PEFT framework. The configuration is intentionally minimal and only includes parameters that are actively used by the current UniLora implementation.

Reference: Uni-LoRA: One Vector Is All You Need https://arxiv.org/abs/2506.00799
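
The init_theta_d_bound description notes that zero initialization is avoided to prevent vanishing gradients. One way to see why: if both low-rank factors are read from theta_d, an all-zero theta_d makes both factors zero, and the gradient of each factor is proportional to the other one. The sketch below checks this with hand-derived gradients for y = B (A x). The shapes and random inputs are hypothetical, and the factors are filled directly rather than through the UniLora projection.

```python
import numpy as np

def factor_grads(A, B, x, grad_y):
    """Gradients of a loss w.r.t. A and B for y = B @ (A @ x)."""
    grad_B = np.outer(grad_y, A @ x)      # dL/dB = grad_y (A x)^T
    grad_A = np.outer(B.T @ grad_y, x)    # dL/dA = (B^T grad_y) x^T
    return grad_A, grad_B

rng = np.random.default_rng(0)
r, fan_in, fan_out = 4, 16, 8             # hypothetical layer shape
x = rng.normal(size=fan_in)               # stand-in input activation
grad_y = rng.normal(size=fan_out)         # stand-in upstream gradient

# All-zero factors: every gradient vanishes, so training cannot start.
zero_A, zero_B = np.zeros((r, fan_in)), np.zeros((fan_out, r))
zero_grads = factor_grads(zero_A, zero_B, x, grad_y)

# Small uniform values in [-bound, bound]: gradients are non-zero.
bound = 0.02                              # default init_theta_d_bound
init_A = rng.uniform(-bound, bound, size=(r, fan_in))
init_B = rng.uniform(-bound, bound, size=(fan_out, r))
init_grads = factor_grads(init_A, init_B, x, grad_y)

print("zero init grad norms:", [float(np.linalg.norm(g)) for g in zero_grads])
print("uniform init grad norms:", [float(np.linalg.norm(g)) for g in init_grads])
```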

UniLoraModel

class peft.UniLoraModel

( model, config, adapter_name, low_cpu_mem_usage: bool = False, state_dict: dict[str, torch.Tensor] | None = None )

Creates a UniLora adapter around a pretrained model.

generate_index

( lora_param_count: int, theta_d_length: int, proj_seed: int )

Assign deterministic theta_d indices to the flattened UniLora parameter space.

A plain np.random.choice(np.arange(theta_d_length), size=lora_param_count) samples each position independently, which can leave some theta_d entries unused for smaller adapters. UniLora instead uses a balanced deterministic assignment: each index appears either floor(D / d) or ceil(D / d) times, where D is the flattened LoRA parameter count and d is theta_d_length. This keeps per-index normalization stable while still shuffling the assignment with proj_seed.
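
The difference between independent sampling and a balanced assignment can be sketched in a few lines of NumPy. The function balanced_indices below is a hypothetical stand-in that satisfies the floor/ceil property described above; it is not the library's generate_index, and the sizes D and d are arbitrary.

```python
import numpy as np

def balanced_indices(lora_param_count, theta_d_length, proj_seed):
    """Shuffled assignment where each index appears floor(D/d) or ceil(D/d) times."""
    rng = np.random.default_rng(proj_seed)
    # Cycling through 0..d-1 gives balanced counts; the shuffle depends on the seed.
    base = np.arange(lora_param_count) % theta_d_length
    return rng.permutation(base)

D, d = 1000, 256  # arbitrary illustrative sizes

# Independent sampling: each position picks an index on its own.
independent = np.random.default_rng(42).choice(np.arange(d), size=D)
balanced = balanced_indices(D, d, proj_seed=42)

ind_counts = np.bincount(independent, minlength=d)
bal_counts = np.bincount(balanced, minlength=d)
print("independent: min", ind_counts.min(), "max", ind_counts.max())
print("balanced:    min", bal_counts.min(), "max", bal_counts.max())
```

With the balanced variant, the usage counts differ by at most one, so any normalization based on those counts varies very little across theta_d entries.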