Just included example scripts for aligning models using GSPO (including a VLM example).

GSPO is the latest RL alignment algo by @Alibaba_Qwen, and it's already supported in the latest TRL v0.20 release.

Super-easy-to-get-started example scripts below, go run them!

Script: https://github.com/huggingface/trl/blob/main/examples/scripts/gspo.py
VLM script: https://github.com/huggingface/trl/blob/main/examples/scripts/gspo_vlm.py
More TRL examples: https://huggingface.co/docs/trl/main/en/example_overview
GSPO paper: Group Sequence Policy Optimization (2507.18071)
A Survey of Context Engineering for Large Language Models
Paper • 2507.13334 • Published Jul 17 • 245