- Caption Anything: Interactive Image Description with Diverse Multimodal Controls — arXiv:2305.02677, published May 4, 2023
- Video Understanding with Large Language Models: A Survey — arXiv:2312.17432, published Dec 29, 2023
- Set-level Guidance Attack: Boosting Adversarial Transferability of Vision-Language Pre-training Models — arXiv:2307.14061, published Jul 26, 2023
- Transferable Decoding with Visual Entities for Zero-Shot Image Captioning — arXiv:2307.16525, published Jul 31, 2023
- Knowledge-Aware Prompt Tuning for Generalizable Vision-Language Models — arXiv:2308.11186, published Aug 22, 2023
- LongVALE: Vision-Audio-Language-Event Benchmark Towards Time-Aware Omni-Modal Perception of Long Videos — arXiv:2411.19772, published Nov 29, 2024
- GenHancer: Imperfect Generative Models are Secretly Strong Vision-Centric Enhancers — arXiv:2503.19480, published Mar 25, 2025
- TokLIP: Marry Visual Tokens to CLIP for Multimodal Comprehension and Generation — arXiv:2505.05422, published May 8, 2025
- Video-Holmes: Can MLLM Think Like Holmes for Complex Video Reasoning? — arXiv:2505.21374, published May 27, 2025