
alwadifa

vnrom

AI & ML interests

Alouadifa.ma is a Moroccan digital platform dedicated to job recruitment, professional training, and business networking. Designed to bridge gaps between employers and talent, and between students and institutions, the platform offers solutions tailored to Morocco's dynamic market. With a user-friendly interface, advanced search tools, and a commitment to excellence, Alouadifa.ma helps users find jobs, upskill, and grow their businesses. The platform provides trusted, localized resources to help you thrive in today's competitive landscape.

Recent Activity

replied to anakin87's post 8 days ago
I trained a Language Model to schedule events with GRPO!

Blog post: https://huggingface.co/blog/anakin87/qwen-scheduler-grpo

I experimented with GRPO lately. I am fascinated by models learning from prompts and rewards - no example answers needed like in Supervised Fine-Tuning. After the DeepSeek boom, everyone is trying GRPO with GSM8K or the Countdown Game... I wanted a different challenge, like teaching a model to create a schedule from a list of events and priorities.

Choosing an original problem forced me to:
Think about the problem setting
Generate data
Choose the right base model
Design reward functions (and experiencing reward hacking)
Run multiple rounds of training, hoping that my model would learn something.

A fun and rewarding experience. I learned a lot of things, that I want to share with you.

Code: https://github.com/anakin87/qwen-scheduler-grpo
Hugging Face collection (dataset and model): https://huggingface.co/collections/anakin87/qwen-scheduler-grpo-680bcc583e817390525a8837

Organizations

None yet

models 0

None public yet

datasets 0

None public yet