Min-Hung Chen

cmhungsteve

https://minhungchen.netlify.app/

AI & ML interests

Multimodal AI, Transfer Learning, Unsupervised Learning, Video Understanding, Vision Transformer, Computer Vision, Deep Learning

Recent Activity

upvoted 2 papers about 20 hours ago

Unified Reinforcement and Imitation Learning for Vision-Language Models

Paper • 2510.19307 • Published Oct 22 • 30

Masking Teacher and Reinforcing Student for Distilling Vision-Language Models

Paper • 2512.22238 • Published 7 days ago • 13

upvoted 2 papers 6 days ago

Nemotron 3 Nano: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning

Paper • 2512.20848 • Published 7 days ago • 28

NVIDIA Nemotron 3: Efficient and Open Intelligence

Paper • 2512.20856 • Published 7 days ago • 27

upvoted a paper 7 days ago

FoundationMotion: Auto-Labeling and Reasoning about Spatial Movement in Videos

Paper • 2512.10927 • Published 19 days ago • 5

upvoted 2 papers 9 days ago

Generative Refocusing: Flexible Defocus Control from a Single Image

Paper • 2512.16923 • Published 12 days ago • 36

4D-RGPT: Toward Region-level 4D Understanding via Perceptual Distillation

Paper • 2512.17012 • Published 12 days ago • 42

upvoted a paper 14 days ago

Zoom-Zero: Reinforced Coarse-to-Fine Video Understanding via Temporal Zoom-in

Paper • 2512.14273 • Published 15 days ago • 7

upvoted a collection 21 days ago

Cosmos-Reason2

Collection

Cosmos Reason 2 is an open, customizable, reasoning vision language model (VLM) for physical AI and robotics • 14 items • Updated 5 days ago • 6

upvoted a paper 26 days ago

BlurDM: A Blur Diffusion Model for Image Deblurring

Paper • 2512.03979 • Published 27 days ago • 3

upvoted a paper about 2 months ago

VADER: Towards Causal Video Anomaly Understanding with Relation-Aware Large Language Models

Paper • 2511.07299 • Published Nov 10 • 5

upvoted a collection 2 months ago

Reasoning Efficiency Research

Collection

Ultra-efficient reasoning model! SOTA Accuracy / CoT Length trade-offs • 3 items • Updated 7 days ago • 11

upvoted a paper 2 months ago

DLER: Doing Length pEnalty Right - Incentivizing More Intelligence per Token via Reinforcement Learning

Paper • 2510.15110 • Published Oct 16 • 15

upvoted 3 papers 3 months ago

TC-LoRA: Temporally Modulated Conditional LoRA for Adaptive Diffusion Control

Paper • 2510.09561 • Published Oct 10 • 7

Temporal Prompting Matters: Rethinking Referring Video Object Segmentation

Paper • 2510.07319 • Published Oct 8 • 2

LEAML: Label-Efficient Adaptation to Out-of-Distribution Visual Tasks for Multimodal Large Language Models

Paper • 2510.03232 • Published Oct 3 • 1

upvoted a collection 3 months ago

NVILA (HuggingFace)

Collection

HuggingFace Transformers can load us. • 5 items • Updated Sep 13 • 5

upvoted 2 papers 3 months ago

ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models

Paper • 2505.24864 • Published May 30 • 143

V2V-GoT: Vehicle-to-Vehicle Cooperative Autonomous Driving with Multimodal Large Language Models and Graph-of-Thoughts

Paper • 2509.18053 • Published Sep 22 • 3

upvoted a paper 4 months ago

MovieCORE: COgnitive REasoning in Movies

Paper • 2508.19026 • Published Aug 26 • 6
