OutOfMemoryError: CUDA out of memory.
I have two GPUs:
[0] NVIDIA GeForce RTX 3090
[1] NVIDIA GeForce RTX 3090
but when I try to load the model
from transformers import AutoModelForCausalLM

model_name = 'meta-llama/Meta-Llama-3-8B'
model = AutoModelForCausalLM.from_pretrained(model_name, token=access_token)
I get an out-of-memory error.
Did you check CUDA availability and that the model is properly loaded onto the GPU?
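For example, a quick check with PyTorch (just a sketch; it assumes torch is installed and model is the object loaded above):

import torch

print(torch.cuda.is_available())   # should print True
print(torch.cuda.device_count())   # should report both GPUs

# from_pretrained without device_map keeps the weights on the CPU,
# so also check where the parameters actually ended up:
print(next(model.parameters()).device)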
I'm having the same problem.
System: two RTX 4090Ti
Which model? The 8b?
llama3.2 3b and llama3 8b
Try the nvidia-smi command. There you can see whether the GPU RAM is actually being used.
I have used this command. The output just shows the out-of-memory condition. I don't know how to resolve it, but I know the model should not need that much memory.
Did you try lower-precision weights, maybe float8 or lower?
I'm trying to fine-tune the model. Could you explain the approach more clearly? I'm new to large models, so I don't quite understand what you mean.
You have a few options to test:
Reduce the batch size.
Reduce the precision, e.g. half precision (FP16) instead of single precision (FP32); you can go even lower. For scale, an 8B model in FP32 needs roughly 32 GB just for the weights, which is already more than one 24 GB card holds.
Both of these reduce the memory load.
You can also lower the maximum sequence length (seq_len) to reduce how much is fed to the model at once. See the sketch below.
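A minimal sketch of the first two options (assuming transformers and accelerate are installed, with model_name and access_token as in the first post; the output path is a placeholder):

import torch
from transformers import AutoModelForCausalLM, TrainingArguments

# Load the weights in half precision (~2 bytes per parameter instead of 4)
# and let accelerate place them across both GPUs.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    token=access_token,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Keep the per-device batch small; gradient accumulation preserves the
# effective batch size without the memory cost of a large batch.
training_args = TrainingArguments(
    output_dir="out",               # placeholder path
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    bf16=True,
)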
Do you mean modifying these configuration parameters?
--model_name_or_path llama3_8b
--tokenizer_name_or_path llama3_8b
--dataset_dir
--per_device_train_batch_size 1
--per_device_eval_batch_size 1
--do_train 1
--do_eval 1
--seed 42
--bf16 1
--num_train_epochs 3
--lr_scheduler_type cosine
--learning_rate 1e-4
--warmup_ratio 0.05
--weight_decay 0.1
--logging_strategy steps
--logging_steps 10
--save_strategy steps
--save_total_limit 3
--evaluation_strategy steps
--eval_steps 100
--save_steps 200
--gradient_accumulation_steps 8
--preprocessing_num_workers 8
--max_seq_length 1024
--output_dir
--overwrite_output_dir 1
--ddp_timeout 30000
--logging_first_step True
--lora_rank 64
--lora_alpha 128
--trainable "q_proj,v_proj,k_proj,o_proj,gate_proj,down_proj,up_proj"
--lora_dropout 0.05
--modules_to_save "embed_tokens,lm_head"
--torch_dtype bfloat16
--validation_file
--load_in_kbits 16
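For reference, those LoRA flags correspond roughly to this PEFT setup (a sketch only; it assumes the training script uses the peft library and that model is the base model loaded in half precision):

from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj", "k_proj", "o_proj",
                    "gate_proj", "down_proj", "up_proj"],
    modules_to_save=["embed_tokens", "lm_head"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()

One thing to watch: modules_to_save keeps full trainable copies of embed_tokens and lm_head, which on an 8B model adds roughly a billion trainable parameters plus their optimizer state. If memory stays tight, it may be worth dropping that option or lowering --load_in_kbits from 16 to 8 or 4.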