---
library_name: peft
base_model: chihoonlee10/T3Q-Mistral-Orca-Math-DPO
---

# Model Card for Model ID

Inference parameters worth noting:

```
params = {
    'temperature': 0.85,
    'top_p': 0.95,
    'top_k': 7,
    'repetition_penalty': 1.18,
    'max_tokens': 500,
    'stop': [],
    'typical_p': 0.95,
    'n': 1,
}
```

Prompt template format:

```
### Instruction:
<prompt> (without the <>)

### Response:
```

Training parameters (trained with Llama-Factory):

```
- learning_rate: 5e-05
- lr_scheduler_type: cosine
- per_device_train_batch_size: 1
- per_device_eval_batch_size: 1
- gradient_accumulation_steps: 2
- warmup_steps: 24
- num_train_epochs: 2
- template: alpaca
- cutoff_len: 4608
- finetuning_type: lora
- lora_target: q_proj,v_proj,o_proj,k_proj
- quantization_bit: 4
- lora_rank: 64
- lora_alpha: 16
- bf16: True
- logging_steps: 20
- val_size: 4
- save_steps: 200
```
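As a minimal sketch, the Alpaca-style prompt template above can be assembled programmatically. `build_prompt` is a hypothetical helper (not part of this repository), and the sampling dict simply mirrors the parameters listed in this card; key names follow the card as written, so rename them to match your inference backend (e.g. `max_new_tokens` instead of `max_tokens` in 🤗 Transformers).

```python
def build_prompt(instruction: str) -> str:
    """Format a user instruction with the Alpaca template this adapter was trained on."""
    return (
        "### Instruction:\n"
        f"{instruction}\n\n"
        "### Response:\n"
    )

# Sampling parameters copied verbatim from the card; adapt key names
# to your serving framework before use.
params = {
    "temperature": 0.85,
    "top_p": 0.95,
    "top_k": 7,
    "repetition_penalty": 1.18,
    "max_tokens": 500,
    "stop": [],
    "typical_p": 0.95,
    "n": 1,
}

prompt = build_prompt("What is 7 * 8?")
```

Passing `prompt` to the model with these sampling settings should reproduce the intended inference setup.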