What is the best way to run inference after LoRA fine-tuning with the PEFT approach?
Here is the SFTTrainer setup I used for fine-tuning Mistral:
from trl import SFTTrainer

trainer = SFTTrainer(
    model=peft_model,
    train_dataset=data,
    peft_config=peft_config,
    dataset_text_field="column name",  # text column of your dataset
    max_seq_length=3000,
    tokenizer=tokenizer,
    args=training_arguments,
    packing=packing,
)
trainer.train()
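For reference, training_arguments here is a standard transformers TrainingArguments object. A minimal sketch of what it might contain (all values below are placeholders, not the settings I actually used):

from transformers import TrainingArguments

training_arguments = TrainingArguments(
    output_dir="./results",            # checkpoints are written here
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    max_steps=1000,
    logging_steps=50,
    save_steps=250,                    # periodic adapter checkpoints
    optim="paged_adamw_8bit",          # paged optimizer, common for 4-bit LoRA
    fp16=True,
)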
I found several different mechanisms for running inference with the fine-tuned model after PEFT-based LoRA fine-tuning.
Method - 1
Save the adapter after training completes, merge it with the base model, then use the merged model for inference:
trainer.model.save_pretrained("new_adapter_path")

import torch
from peft import PeftModel

finetuned_model = PeftModel.from_pretrained(
    base_model,  # the base model, loaded separately
    "new_adapter_path",
    torch_dtype=torch.float16,
    is_trainable=False,
    device_map="auto",
)
finetuned_model = finetuned_model.merge_and_unload()
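After merging, inference is just plain transformers generation. A minimal sketch (the prompt and generation settings are purely illustrative):

# generate with the merged model; prompt is a placeholder
inputs = tokenizer("Your prompt here", return_tensors="pt").to(finetuned_model.device)
outputs = finetuned_model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))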
Method - 2
Save checkpoints during training, then load the checkpoint with the lowest loss:
from peft import PeftModel

finetuned_model = PeftModel.from_pretrained(
    base_model,
    "least loss checkpoint path",
    torch_dtype=torch.float16,
    is_trainable=False,
    device_map="auto",
)
finetuned_model = finetuned_model.merge_and_unload()
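To find that checkpoint programmatically, you can scan the trainer_state.json that the Trainer writes into each checkpoint-<step> folder. A sketch, assuming training loss was logged and with a placeholder output directory:

import glob
import json
import os

best_path, best_loss = None, float("inf")
for ckpt in glob.glob(os.path.join("output_dir", "checkpoint-*")):
    state_file = os.path.join(ckpt, "trainer_state.json")
    if not os.path.exists(state_file):
        continue
    with open(state_file) as f:
        state = json.load(f)
    # log_history holds entries like {"loss": ..., "step": ...};
    # the last one reflects the loss around the time this checkpoint was saved
    losses = [entry["loss"] for entry in state["log_history"] if "loss" in entry]
    if losses and losses[-1] < best_loss:
        best_loss, best_path = losses[-1], ckpt
print(best_path, best_loss)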
Method - 3
The same approach, but with the AutoPeftModelForCausalLM class, which loads the base model and attaches the adapter in one call:
from peft import AutoPeftModelForCausalLM

finetuned_model = AutoPeftModelForCausalLM.from_pretrained(
    "output directory checkpoint path",
    low_cpu_mem_usage=True,
    return_dict=True,
    torch_dtype=torch.float16,
    device_map="cuda",
)
finetuned_model = finetuned_model.merge_and_unload()
Method - 4
AutoPeftModelForCausalLM pointed at the output folder itself, without picking a specific checkpoint:
instruction_tuned_model = AutoPeftModelForCausalLM.from_pretrained(
    training_args.output_dir,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
finetuned_model = instruction_tuned_model.merge_and_unload()
Method - 5
Any of the above methods, but skipping the merge step:
# finetuned_model = finetuned_model.merge_and_unload()
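An unmerged PeftModel can generate directly; the adapter is applied on the fly, which adds a little latency but keeps the base weights untouched and lets you swap or disable adapters. A sketch (prompt is a placeholder):

# generate with the adapter still attached (no merge)
inputs = tokenizer("Your prompt here", return_tensors="pt").to(finetuned_model.device)
outputs = finetuned_model.generate(**inputs, max_new_tokens=200)

# temporarily bypass the adapter, e.g. to compare against the base model
with finetuned_model.disable_adapter():
    base_outputs = finetuned_model.generate(**inputs, max_new_tokens=200)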
Which of these is the method I should actually follow for inference?
And when should I prefer one method over another?
I use "Method 1" and it works fine always. Better to save adapter checkpoints which are smaller in size and merge for once with base model rather than saving entire base model checkpoints everytime.
By the way, can you share a sample notebook for fine-tuning? I was using this one: https://colab.research.google.com/drive/1VDa0lIfqiwm16hBlIlEaabGVTNB3dN1A?usp=sharing
But my training loss starts to increase after 1000 steps for some reason. Any ideas? I'm running on a custom dataset. I tried both the Alpaca and Mistral templates, although that shouldn't matter much for fine-tuning, I guess.
@sumegh try a lower learning rate, which should bring the loss down. Do you have any idea how to select the max_steps parameter?
That's only needed if training a full epoch is not feasible for you; otherwise just set num_train_epochs = 1. If you do use max_steps, work out the total number of steps in a single epoch from your batch size and set max_steps below that, as sketched below.
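A quick sketch of that calculation (dataset size and batch settings are made-up numbers):

import math

num_examples = 20_000                  # size of your training set (placeholder)
per_device_batch_size = 4
gradient_accumulation_steps = 4
num_devices = 1

effective_batch_size = per_device_batch_size * gradient_accumulation_steps * num_devices
steps_per_epoch = math.ceil(num_examples / effective_batch_size)   # 1250 in this example
# pick max_steps < steps_per_epoch to train a partial epoch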
Can you share your fine-tuning notebook for reference?
Sharing the notebook isn't possible for security reasons; it's confidential at my organization's level.
Okay, no issues. Also, what optimizer are you using? I was doing 4-bit LoRA fine-tuning, using the paged_adamw_8bit optimizer from the Hugging Face training config.
I am using paged_adamw_32bit.