Burning ray

adarksky
AI & ML interests

None yet

Recent Activity

liked a model 12 days ago
nvidia/Llama-3_1-Nemotron-Ultra-253B-v1
upvoted a collection 15 days ago
Llama 4
liked a dataset about 2 months ago
ylecun/mnist

Organizations

fast.ai community, Hugging Face Discord Community

adarksky's activity

reacted to merve's post with 🔥 5 months ago
small but mighty 🔥
You can fine-tune SmolVLM on an L4 with a batch size of 4, and it only takes 16.4 GB of VRAM 🫰🏻 With gradient accumulation, the simulated (effective) batch size is 16 ✨
I made a notebook that includes all the goodies: QLoRA, gradient accumulation, and gradient checkpointing, with explanations of how they work 💝 https://github.com/huggingface/smollm/blob/main/finetuning/Smol_VLM_FT.ipynb
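
To make the "simulated batch size" idea concrete, here is a minimal pure-Python sketch of gradient accumulation. It is not code from the notebook: the toy model, the data, and the helper `grad_mse` are hypothetical, and the value of 4 accumulation steps is an assumption inferred from 16 / 4 (the post only states the two batch sizes). It shows that averaging the gradients of 4 micro-batches of 4 samples gives the same gradient as one batch of 16.

```python
# Gradient accumulation sketch: 4 micro-batches of 4 samples act like one batch of 16.
MICRO_BATCH_SIZE = 4      # per-step batch size mentioned in the post
ACCUMULATION_STEPS = 4    # assumption: inferred as 16 / 4, not stated in the post
EFFECTIVE_BATCH_SIZE = MICRO_BATCH_SIZE * ACCUMULATION_STEPS


def grad_mse(w, xs, ys):
    """Gradient of mean squared error w.r.t. w for the toy model y_hat = w * x."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


# Hypothetical toy data: 16 samples of y = 3x, and a starting weight.
xs = [float(i) for i in range(1, 17)]
ys = [3.0 * x for x in xs]
w = 0.5

# Reference: gradient computed over all 16 samples at once.
full_grad = grad_mse(w, xs, ys)

# Accumulation: process one micro-batch at a time, scale each gradient by
# 1 / ACCUMULATION_STEPS, and sum. Only one micro-batch is "in memory" per step.
accumulated_grad = 0.0
for step in range(ACCUMULATION_STEPS):
    start = step * MICRO_BATCH_SIZE
    mb_x = xs[start:start + MICRO_BATCH_SIZE]
    mb_y = ys[start:start + MICRO_BATCH_SIZE]
    accumulated_grad += grad_mse(w, mb_x, mb_y) / ACCUMULATION_STEPS

print("effective batch size:", EFFECTIVE_BATCH_SIZE)
print("gradients match:", abs(full_grad - accumulated_grad) < 1e-9)
```

In a real training loop the optimizer step would run once after the last micro-batch, so memory use follows the micro-batch size while the update follows the effective batch size.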