Aritra Roy Gosthipaty PRO
ariG23498
AI & ML interests
Deep Representation Learning
Recent Activity
updated a dataset about 20 hours ago: model-metadata/trending_models
commented on their article 1 day ago: KV Cache from scratch in nanoVLM
Organizations
ariG23498's activity

posted an update 1 day ago

reacted to danielhanchen's post with 🔥 2 days ago
Post
New DeepSeek-R1-0528 1.65-bit Dynamic GGUF!
Run the model locally even more easily! It will fit on a 192 GB MacBook and run at 7 tokens/s.
DeepSeek-R1-0528 GGUFs: unsloth/DeepSeek-R1-0528-GGUF
Qwen3-8B DeepSeek-R1-0528 GGUFs: unsloth/DeepSeek-R1-0528-Qwen3-8B-GGUF
And read our Guide: https://docs.unsloth.ai/basics/deepseek-r1-0528
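If you want to try it locally, here is a minimal sketch using huggingface_hub and llama-cpp-python; the quant filename pattern is an assumption, and the Unsloth guide above has the exact filenames and llama.cpp flags.

```python
# Sketch: fetch one of the dynamic GGUF quants and load it with llama-cpp-python.
# The allow_patterns value is an assumed filename pattern for the ~1.65-bit dynamic quant;
# check the Unsloth guide / repo file list for the real quant names.
import glob
from huggingface_hub import snapshot_download
from llama_cpp import Llama

local_dir = snapshot_download(
    repo_id="unsloth/DeepSeek-R1-0528-GGUF",
    allow_patterns=["*UD-IQ1_M*"],  # assumed pattern, download only the chosen quant
)

# Multi-part GGUFs are loaded from the first shard.
first_shard = sorted(glob.glob(f"{local_dir}/**/*.gguf", recursive=True))[0]
llm = Llama(model_path=first_shard, n_ctx=8192, n_gpu_layers=-1)

out = llm("Explain KV caching in one paragraph.", max_tokens=200)
print(out["choices"][0]["text"])
```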

reacted to hesamation's post with 🔥 9 days ago
Post
I really like how this seven-stage pipeline was laid out in the Ultimate Guide to Fine-Tuning book.
It gives an overview, then goes into detail for each stage, even providing best practices.
It's 115 pages on arXiv, definitely worth a read.
Check it out: https://arxiv.org/abs/2408.13296

reacted to merve's post with 🔥 24 days ago
Post
VLMS 2025 UPDATE 🔥
We just shipped a blog on all the latest in vision language models, including
- GUI agents, agentic VLMs, omni models
- multimodal RAG
- video LMs
- smol models
...and more! https://huggingface.co/blog/vlms-2025

reacted to merve's post about 1 month ago
Post
A real-time object detector that is much faster and more accurate than YOLO, with an Apache 2.0 license, just landed in Hugging Face transformers 🔥
D-FINE is the SOTA real-time object detector and runs on a T4 (free Colab) 🤩
> Collection with all checkpoints and demo: ustc-community/d-fine-68109b427cbe6ee36b4e7352
Notebooks:
> Tracking https://github.com/qubvel/transformers-notebooks/blob/main/notebooks/DFine_tracking.ipynb
> Inference https://github.com/qubvel/transformers-notebooks/blob/main/notebooks/DFine_inference.ipynb
> Fine-tuning https://github.com/qubvel/transformers-notebooks/blob/main/notebooks/DFine_finetune_on_a_custom_dataset.ipynb
h/t @vladislavbro @qubvel-hf @ariG23498 and the authors of the paper
Regular object detectors predict bounding boxes as pixel-perfect (x, y, w, h) coordinates, which is rigid and hard to optimize 🥲☹️
D-FINE instead formulates object detection as a distribution over bounding box coordinates and refines it iteratively, which is more accurate 🤩
Another core idea behind this model is Global Optimal Localization Self-Distillation:
the model uses the final layer's distribution output (sort of like a teacher) and distills it into earlier layers to make them more performant.
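For reference, a minimal inference sketch with the generic transformers detection API; the checkpoint id below is a placeholder, the exact ids are in the collection and inference notebook above.

```python
# Sketch: D-FINE inference via the generic transformers object-detection classes.
# The checkpoint id is an assumption -- pick a real one from the ustc-community collection.
import requests
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModelForObjectDetection

checkpoint = "ustc-community/dfine-medium-coco"  # placeholder id
processor = AutoImageProcessor.from_pretrained(checkpoint)
model = AutoModelForObjectDetection.from_pretrained(checkpoint)

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Map the predicted distributions back to pixel boxes and keep confident detections.
results = processor.post_process_object_detection(
    outputs, target_sizes=[image.size[::-1]], threshold=0.5
)[0]
for score, label, box in zip(results["scores"], results["labels"], results["boxes"]):
    print(model.config.id2label[label.item()], round(score.item(), 2), box.tolist())
```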

reacted to burtenshaw's post 4 months ago
Post
🚧 Work in Progress! 🚧
👷 We're working hard on getting the official agents course ready for the 50,000 students who have signed up.
If you want to contribute to the discussion, I started these community posts. Looking forward to hearing from you:
- smolagents unit in the agents course - agents-course/README#7
- LlamaIndex unit in the agents course - agents-course/README#6
- LangChain and LangGraph unit in the agents course - agents-course/README#5
- Real-world use cases in the agents course - agents-course/README#8

posted an update 5 months ago
Post
Tried my hand at simplifying the derivations of Direct Preference Optimization.
I cover how one can reformulate RLHF into DPO. The idea of implicit reward modeling is chef's kiss.
Blog: https://huggingface.co/blog/ariG23498/rlhf-to-dpo
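For readers who want the punchline up front, this is the standard result the derivation arrives at (a sketch in the usual notation, not quoted from the blog): the RLHF-optimal policy implies an implicit reward, and plugging it into the Bradley-Terry preference model gives the DPO loss.

```latex
% Implicit reward of the RLHF-optimal policy; Z(x) cancels in the pairwise comparison
r_\theta(x, y) = \beta \log \frac{\pi_\theta(y \mid x)}{\pi_{\mathrm{ref}}(y \mid x)} + \beta \log Z(x)

% Resulting DPO objective over preference pairs (y_w preferred over y_l)
\mathcal{L}_{\mathrm{DPO}}(\theta)
  = -\,\mathbb{E}_{(x, y_w, y_l) \sim \mathcal{D}}
    \left[ \log \sigma\!\left(
      \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
      - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
    \right) \right]
```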

posted an update 5 months ago
Post
Timm ❤️ Transformers
With the latest version of transformers, you can now use any timm model with the familiar transformers API.
Blog Post: https://huggingface.co/blog/timm-transformers
Repository with examples: https://github.com/ariG23498/timm-wrapper-examples
Collection: ariG23498/timmwrapper-6777b85f1e8d085d3f1374a1
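A minimal sketch of what this looks like in practice; the checkpoint id is just one example timm hub repo, and a recent transformers release is assumed.

```python
# Sketch: load a timm checkpoint through the transformers TimmWrapper integration.
# Requires a recent transformers + timm; "timm/resnet50.a1_in1k" is one example checkpoint.
import requests
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModelForImageClassification

checkpoint = "timm/resnet50.a1_in1k"
processor = AutoImageProcessor.from_pretrained(checkpoint)
model = AutoModelForImageClassification.from_pretrained(checkpoint)

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Top-1 class index; map to names via model.config.id2label if the checkpoint provides them.
print(logits.argmax(-1).item())
```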

reacted to burtenshaw's post with 🔥 5 months ago
Post
We're launching a FREE and CERTIFIED course on Agents!
We're thrilled to announce the launch of the Hugging Face Agents course on Learn! This interactive, certified course will guide you through building and deploying your own AI agents.
Here's what you'll learn:
- Understanding Agents: We'll break down the fundamentals of AI agents, showing you how they use LLMs to perceive their environment (observations), reason about it (thoughts), and take actions. Think of a smart assistant that can book appointments, answer emails, or even write code based on your instructions.
- Building with Frameworks: You'll dive into popular agent frameworks like LangChain, LlamaIndex and smolagents. These tools provide the building blocks for creating complex agent behaviors.
- Real-World Applications: See how agents are used in practice, from automating SQL queries to generating code and summarizing complex documents.
- Certification: Earn a certification by completing the course modules, implementing a use case, and passing a benchmark assessment. This proves your skills in building and deploying AI agents.
Audience
This course is designed for anyone interested in the future of AI. Whether you're a developer, data scientist, or simply curious about AI, this course will equip you with the knowledge and skills to build your own intelligent agents.
Enroll today and start building the next generation of AI agent applications!
https://bit.ly/hf-learn-agents

reacted to burtenshaw's post with 🔥 6 months ago
Post
Quick update from week 1 of the smol course. The community is taking the driving seat and using the material for their own projects. If you want to do the same, join in!
- we have ongoing translation projects in Korean, Vietnamese, Portuguese, and Spanish
- 3 chapters are ready for students, on topics like instruction tuning, preference alignment, and parameter-efficient fine-tuning
- 3 chapters are in progress on evaluation, vision language models, and synthetic data
- around 780 people have forked the repo to use it for learning, teaching, and sharing
Next step is to support people who want to use the course for teaching, content creation, internal knowledge sharing, or anything else. If you're into this, drop an issue or PR.
Repo: https://buff.ly/3ZCMKX2
Discord channel: https://buff.ly/4f9F8jA

posted an update 6 months ago
Post
We are blessed with another iteration of PaliGemma: Google launches PaliGemma 2.
google/paligemma-2-release-67500e1e1dbfdd4dee27ba48
merve/paligemma2-vqav2
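A minimal inference sketch with transformers; the checkpoint id, prompt format, and gated-access details are assumptions, so check the model cards in the release collection above.

```python
# Sketch: caption an image with a PaliGemma 2 checkpoint via transformers.
# The checkpoint id is an assumed variant from the release collection (gated, requires access);
# the "caption en" task prefix follows the PaliGemma convention -- see the model card for specifics.
import requests
import torch
from PIL import Image
from transformers import AutoProcessor, PaliGemmaForConditionalGeneration

checkpoint = "google/paligemma2-3b-pt-224"  # assumed variant
processor = AutoProcessor.from_pretrained(checkpoint)
model = PaliGemmaForConditionalGeneration.from_pretrained(checkpoint, torch_dtype=torch.bfloat16)

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

inputs = processor(text="caption en", images=image, return_tensors="pt")
with torch.no_grad():
    generated = model.generate(**inputs, max_new_tokens=30)

# Decode only the newly generated tokens (everything after the prompt + image tokens).
print(processor.decode(generated[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```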

reacted to rwightman's post 6 months ago
Post
There's a new timm release, v1.0.12, with a focus on optimizers. The optimizer factory has been refactored, there's now a timm.optim.list_optimizers() and a new way to register optimizers and their attributes. As always you can use a timm optimizer like a torch one, just replace torch.optim with timm.optim
New optimizers include:
* AdafactorBigVision - adfactorbv
* ADOPT - adopt / adoptw (decoupled decay)
* MARS - mars
* LaProp - laprop
* Cautious Optimizers - a modification to all of the above, prefix with c, e.g. cadamw, cnadamw, csgdw, clamb, crmsproptf
I shared some caution comparisons in this model repo: rwightman/timm-optim-caution
For details, references, see the code: https://github.com/huggingface/pytorch-image-models/tree/main/timm/optim
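A minimal sketch of the two usage routes described above: the refactored factory by optimizer name, and listing what is registered. The toy model and hyperparameters are just for illustration.

```python
# Sketch: timm >= 1.0.12 optimizer factory usage, per the release notes above.
import torch
import timm

# Enumerate the registered optimizer names (includes adopt, mars, laprop, and the c* variants).
print(timm.optim.list_optimizers()[:10])

model = torch.nn.Linear(10, 2)  # toy model for illustration

# Build by name through the factory; "cadamw" is the cautious AdamW variant named in the post.
optimizer = timm.optim.create_optimizer_v2(model, opt="cadamw", lr=1e-3, weight_decay=0.05)

# As noted above, timm.optim classes can also be used as drop-in replacements for torch.optim.
```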

reacted to davidberenstein1957's post 6 months ago
Post
The Data Is Better Together community is set to release the first Apache 2.0-licensed image preference dataset!
Great work and let's give this a final push :)
@aashish1904 congrats on your month of HF pro. There is more to win during this sprint!
@aashish1904 @AnyaDesdein @davidberenstein1957 @Malalatiana @beta3 @fffiloni @munish0838 @Reza2kn @bbunzeck @Creazycreator @andrei-saceleanu @jafhaponiuk @rca-etl @kf120 @burtenshaw @mmhamdy @grib0ed0v @Doopus @AnyaDes @ttkap @Xceron @Lewox @davanstrien @Azazelle @adirik @Ashish08 @AntonVic @kenantang @sdiazlor @g-ronimo @dennis-rall @prithivMLmods @girtss3 @flozi00 @WaveCut @Taylor658 @Wildminder @Sara9999 @phaelishall @sararob @dvilasuero @pgabrys @plaguss @CDS899 @timajwilliams @rudzinskimaciej @pavel-ai @aggr8 @ignacioct @MouseAI @Leeps @MaksKul @NicolasDmln @Muinez @kusht55 @caiolang @Jakub-Brand24 @loamy @Demijan @eliab96 @Viewegger @JosephCatrambone @p1atdev @mrshu @o639 @Targezed @Aviv-anthonnyolime @thliang01 @Ahmed-Amine @glards @pranaykoppula @nataliaElv @MaPirlet @alvarobartt @gabrielmbmb @zlicastro @Jaydip @Chouettecheveche @lilcheaty @ruyrdiaz @robintema @fdaudens @ggcristian @a-r-r-o-w @pates @joheras @stopsatgreen @bezo97 @chachi902 @iamyann @liamcripwell @dmb23 @korbih @anonymous7743 @akbdx18 @OVAWARE @severo @akontra @lichorosario @lhoestq @SebastianBodza @Vishnou @ameerazam08 @appoose @Mukei @mearco @joaquincabezas @Fizzarolli @thomastraum @igortopolski @OxxoCodes @patrickfleith @asoria @bn22 @sitammeur @Krodolf @bergr7f @Sbxxn @wietsevenema @sugatoray @Iamladi @MikeTrizna @feveromo @mokady @Bolero @prath @Dowwie @kfahn @decodingchris @alili2050 @RahulRaman @yzimmermann @Ameeeee @ecyht2 @MattMC001 @hemanthkumarak @Thegorgibus @akos2 @LawRun @ramithuh @SuperMuel @sjans @peterizsak @mosama @Eyel @mtr3 @cfahlgren1 @legentil @clem @Citaman @Aurelien-Morgan @AntoineBourgois @TotoB12 @Stanmey @osanseviero @multimodalart @maxiw @ariG23498 @ngk89 @femboysLover @dvs @tacohiddink @blanchon @DavidJimenez

reacted to clem's post 6 months ago
Post
Six predictions for AI in 2025 (and a review of how my 2024 predictions turned out):
- There will be the first major public protest related to AI
- A big company will see its market cap divided by two or more because of AI
- At least 100,000 personal AI robots will be pre-ordered
- China will start to lead the AI race (as a consequence of leading the open-source AI race).
- There will be big breakthroughs in AI for biology and chemistry.
- We will begin to see the economic and employment growth potential of AI, with 15M AI builders on Hugging Face.
How my predictions for 2024 turned out:
- A hyped AI company will go bankrupt or get acquired for a ridiculously low price
✅ (Inflection, Adept, ...)
- Open-source LLMs will reach the level of the best closed-source LLMs
✅ with QwQ and dozens of others
- Big breakthroughs in AI for video, time-series, biology and chemistry
✅ for video, 🔴 for time-series, biology and chemistry
- We will talk much more about the cost (monetary and environmental) of AI
✅ Monetary, 🔴 Environmental (🟢)
- A popular media will be mostly AI-generated
✅ with NotebookLM by Google
- 10 million AI builders on Hugging Face leading to no increase of unemployment
currently 7M AI builders on Hugging Face

reacted to merve's post with 🔥 6 months ago
Post
small but mighty 🔥
you can fine-tune SmolVLM on an L4 with a batch size of 4 and it will only take 16.4 GB VRAM; with gradient accumulation the simulated batch size is 16 ✨
I made a notebook that includes all the goodies: QLoRA, gradient accumulation, and gradient checkpointing, with explanations of how they work: https://github.com/huggingface/smollm/blob/main/finetuning/Smol_VLM_FT.ipynb
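A condensed sketch of the recipe that notebook uses (4-bit QLoRA, LoRA adapters, gradient accumulation, and gradient checkpointing); the checkpoint id, LoRA target modules, and hyperparameters here are assumptions, the notebook has the exact values.

```python
# Sketch: QLoRA-style SmolVLM fine-tuning setup, condensed from the ideas above.
# Checkpoint id, LoRA targets, and hyperparameters are assumptions -- see the notebook for the real ones.
import torch
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForVision2Seq, AutoProcessor,
                          BitsAndBytesConfig, TrainingArguments)

checkpoint = "HuggingFaceTB/SmolVLM-Instruct"  # assumed model id

bnb_config = BitsAndBytesConfig(      # 4-bit quantization, the "Q" in QLoRA
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForVision2Seq.from_pretrained(checkpoint, quantization_config=bnb_config)
processor = AutoProcessor.from_pretrained(checkpoint)

lora_config = LoraConfig(             # small trainable adapters on the attention projections
    r=8, lora_alpha=16, lora_dropout=0.1,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed module names
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

args = TrainingArguments(             # batch 4 x grad accumulation 4 = simulated batch size 16
    output_dir="smolvlm-ft",
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    gradient_checkpointing=True,
    bf16=True,
    learning_rate=1e-4,
    num_train_epochs=1,
)
# Pass `model`, `args`, and a vision-text dataset/collator to a Trainer to run the fine-tune.
```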