Do you have numbers showing that method 2 works, and works better? @prithivMLmods
Xiaoyun Wu
CUIGuy
·
AI & ML interests
conversational user interface
Recent Activity
commented on an article 2 days ago
Fine-tuning SmolLM with Group Relative Policy Optimization (GRPO) by following the Methodologies
commented on an article 3 days ago
Fine-tuning SmolLM with Group Relative Policy Optimization (GRPO) by following the Methodologies
commented on an article 3 days ago
Fine-tuning SmolLM with Group Relative Policy Optimization (GRPO) by following the Methodologies
Organizations
CUIGuy's activity
commented on
Fine-tuning SmolLM with Group Relative Policy Optimization (GRPO) by following the Methodologies
2 days ago
commented on
Fine-tuning SmolLM with Group Relative Policy Optimization (GRPO) by following the Methodologies
3 days ago
Method 2 has vLLM in it, but disabled; method 1 does not have vLLM in it. Am I missing something? So is method 2 the one to use if we enable GRPO?
commented on
Fine-tuning SmolLM with Group Relative Policy Optimization (GRPO) by following the Methodologies
3 days ago
Which one is vLLM-based? How can one tell? Can you mention it in the article?
Also, do you happen to be aware of work on using GRPO to improve MMLU (or some task inside it) with models like Qwen 2.5 7B or even smaller?
@prithivMLmods
Thanks.
commented on
Fine-tuning SmolLM with Group Relative Policy Optimization (GRPO) by following the Methodologies
3 days ago
Why are there 2 methods?
on what hardware did you fine-tune this?
#1 opened about 1 year ago by CUIGuy
can you share the script?
1
#2 opened about 1 year ago by CUIGuy
the same fine-tuning code works with flan t5, but not with this.
#3 opened about 1 year ago by CUIGuy
can you share the code for fine-tuning ?
1
#2 opened about 1 year ago by CUIGuy
when will there be a ggml version?
8
#3 opened over 1 year ago by CUIGuy
cannot install rotary
1
#4 opened over 1 year ago by CUIGuy
ValueError: expected sequence of length 35 at dim 1 (got 22)
1
#3 opened over 1 year ago by CUIGuy
what is the difference between v2 and v3?
1
#2 opened over 1 year ago by CUIGuy
where do we find the definition for the converted field?
1
#7 opened over 1 year ago by CUIGuy
why can't we make this fully HF ready?
8
#11 opened over 1 year ago by CUIGuy