Haotian Zhang

haotiz
  • HaotianZhang4AI
  • Haotian-Zhang

AI & ML interests

Vision and Language

Recent Activity

liked a model 23 days ago
reducto/RolmOCR
upvoted a paper 3 months ago
DiT-Air: Revisiting the Efficiency of Diffusion Model Architecture Design in Text to Image Generation
upvoted a paper 6 months ago
STIV: Scalable Text and Image Conditioned Video Generation

Organizations

  • Microsoft
  • CVPR Demo Track
  • Gradio-Blocks-Party
  • Apple
  • University of Washington
  • GLIPModel

haotiz's activity

commented 3 papers 8 months ago

MM-Ego: Towards Building Egocentric Multimodal LLMs

Paper • 2410.07177 • Published Oct 9, 2024 • 22 •
3

Contrastive Localized Language-Image Pre-Training

Paper • 2410.02746 • Published Oct 3, 2024 • 38 •
3

MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning

Paper • 2409.20566 • Published Sep 30, 2024 • 57 •
3
commented a paper 12 months ago

What If We Recaption Billions of Web Images with LLaMA-3?

Paper • 2406.08478 • Published Jun 12, 2024 • 42 •
1