AI & ML interests
Large Language Models

ayut updated a dataset 4 months ago

ariG23498 updated a dataset 4 months ago

ariG23498 published a dataset 4 months ago
Post
2835
Tried my hand at simplifying the derivations of Direct Preference Optimization.
I cover how one can reformulate RLHF into DPO. The idea of implicit reward modeling is chef's kiss.
Blog: https://huggingface.co/blog/ariG23498/rlhf-to-dpo
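For readers who want the "implicit reward" idea in concrete form, here is a minimal sketch of the standard DPO loss for a single preference pair. It is not taken from the blog post; the function name `dpo_loss`, its argument names, and the default `beta=0.1` are assumptions chosen for illustration. The inputs are summed log-probabilities of the chosen and rejected responses under the policy and under a frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair (illustrative sketch, hypothetical names)."""
    # Implicit rewards: beta times the log-ratio of policy to reference.
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    # Loss is -log(sigmoid(margin)), computed in a numerically stable way.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# When the policy equals the reference, the margin is zero and the loss is log(2).
baseline = dpo_loss(-2.0, -2.0, -2.0, -2.0)
# Raising the chosen log-prob and lowering the rejected one reduces the loss.
improved = dpo_loss(-1.0, -3.0, -2.0, -2.0)
```

No separate reward model appears anywhere in the computation: the reward is read off the policy's own log-probabilities relative to the reference, which is the reformulation the post refers to.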
Post
2038
Timm ❤️ Transformers
With the latest version of transformers, you can now use any timm model with the familiar transformers API.
Blog Post: https://huggingface.co/blog/timm-transformers
Repository with examples: https://github.com/ariG23498/timm-wrapper-examples
Collection: ariG23498/timmwrapper-6777b85f1e8d085d3f1374a1

ariG23498 updated a Space 7 months ago
Post
1454
We are blessed with another iteration of PaliGemma: Google has launched PaliGemma 2.
google/paligemma-2-release-67500e1e1dbfdd4dee27ba48
merve/paligemma2-vqav2
Post
1611
Cohere drops two new multilingual models!
https://huggingface.co/CohereForAI/aya-expanse-8b
https://huggingface.co/CohereForAI/aya-expanse-32b
Try them out here
https://huggingface.co/spaces/CohereForAI/aya_expanse
Post
1642
You can now use DoRA for your embedding layers!
PR: https://github.com/huggingface/peft/pull/2006
I have documented my journey with this PR in a blog post for everyone to read. The highlight was when the first author of DoRA reviewed my code.
Blog Post: https://huggingface.co/blog/ariG23498/peft-dora
Huge thanks to @BenjaminB for all the help I needed.

ariG23498 authored a paper almost 2 years ago