pangzs's Collections

Papers

updated 3 days ago

  • Perception, Reason, Think, and Plan: A Survey on Large Multimodal Reasoning Models

    Paper • 2505.04921 • Published May 8 • 185

  • On Path to Multimodal Generalist: General-Level and General-Bench

    Paper • 2505.04620 • Published May 7 • 83

  • StreamBridge: Turning Your Offline Video Large Language Model into a Proactive Streaming Assistant

    Paper • 2505.05467 • Published May 8 • 14

  • Adapting Vision-Language Models Without Labels: A Comprehensive Survey

    Paper • 2508.05547 • Published 17 days ago • 11

  • VLM4D: Towards Spatiotemporal Awareness in Vision Language Models

    Paper • 2508.02095 • Published 20 days ago • 6

  • Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL

    Paper • 2508.13167 • Published 18 days ago • 103

  • Describe What You See with Multimodal Large Language Models to Enhance Video Recommendations

    Paper • 2508.09789 • Published 11 days ago • 5

  • MedSAMix: A Training-Free Model Merging Approach for Medical Image Segmentation

    Paper • 2508.11032 • Published 9 days ago • 2

  • Seeing, Listening, Remembering, and Reasoning: A Multimodal Agent with Long-Term Memory

    Paper • 2508.09736 • Published 11 days ago • 50

  • Ovis2.5 Technical Report

    Paper • 2508.11737 • Published 9 days ago • 99