Uploaded model
- Developed by: mimi1998
- License: apache-2.0
- Finetuned from model : mimi1998/Qwen2.5-7B-Instruct-GRPO-LoRA
This qwen2 model was trained 2x faster with Unsloth and Huggingface's TRL library.
Inference Providers
NEW
This model isn't deployed by any Inference Provider.
๐
Ask for provider support
Model tree for mimi1998/Qwen2.5-7B-SFT-LoRA-V4
Base model
Qwen/Qwen2.5-7B
Finetuned
Qwen/Qwen2.5-7B-Instruct
Finetuned
unsloth/Qwen2.5-7B-Instruct
Finetuned
mimi1998/Qwen2.5-7B-Instruct-SFT-LoRA-V3
Finetuned
mimi1998/Qwen2.5-7B-Instruct-GRPO-LoRA