LLaVA-OneVision-1.5: Fully Open Framework for Democratized Multimodal Training • arXiv:2509.23661 • Published Sep 28, 2025
SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features • arXiv:2502.14786 • Published Feb 20, 2025
Building and Better Understanding Vision-Language Models: Insights and Future Directions • arXiv:2408.12637 • Published Aug 22, 2024
mPLUG-Owl2: Revolutionizing Multi-modal Large Language Model with Modality Collaboration • arXiv:2311.04257 • Published Nov 7, 2023