Jaehyun Jun

btjhjeon

https://btjhjeon.github.io/

AI & ML interests

Multimodal

Recent Activity

upvoted a paper 2 days ago

Audio Flamingo 2: An Audio-Language Model with Long-Audio Understanding and Expert Reasoning Abilities

updated a collection 2 days ago

btjhjeon's activity

upvoted 2 papers 2 days ago

Audio Flamingo 2: An Audio-Language Model with Long-Audio Understanding and Expert Reasoning Abilities

Paper • 2503.03983 • Published 4 days ago • 18

Token-Efficient Long Video Understanding for Multimodal LLMs

Paper • 2503.04130 • Published 4 days ago • 64

upvoted 3 papers 3 days ago

Enhancing Abnormality Grounding for Vision Language Models with Knowledge Descriptions

Paper • 2503.03278 • Published 5 days ago • 12

KodCode: A Diverse, Challenging, and Verifiable Synthetic Dataset for Coding

Paper • 2503.02951 • Published 5 days ago • 25

ABC: Achieving Better Control of Multimodal Embeddings using VLMs

Paper • 2503.00329 • Published 9 days ago • 18

upvoted a paper 5 days ago

Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs

Paper • 2503.01743 • Published 6 days ago • 65

upvoted a paper 9 days ago

MedVLM-R1: Incentivizing Medical Reasoning Capability of Vision-Language Models (VLMs) via Reinforcement Learning

Paper • 2502.19634 • Published 11 days ago • 56

upvoted 3 papers 11 days ago

Kanana: Compute-efficient Bilingual Language Models

Paper • 2502.18934 • Published 12 days ago • 60

Introducing Visual Perception Token into Multimodal Large Language Model

Paper • 2502.17425 • Published 13 days ago • 14

OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference

Paper • 2502.18411 • Published 12 days ago • 69

upvoted a paper 12 days ago

Multimodal Inconsistency Reasoning (MMIR): A New Benchmark for Multimodal Reasoning Models

Paper • 2502.16033 • Published 16 days ago • 16

upvoted 2 papers 13 days ago

Evaluating Multimodal Generative AI with Korean Educational Standards

Paper • 2502.15422 • Published 16 days ago • 9

VLM^2-Bench: A Closer Look at How Well VLMs Implicitly Link Explicit Matching Visual Cues

Paper • 2502.12084 • Published 20 days ago • 29

upvoted a paper 16 days ago

SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features

Paper • 2502.14786 • Published 17 days ago • 128

upvoted 2 papers 17 days ago

GIMMICK -- Globally Inclusive Multimodal Multitask Cultural Knowledge Benchmarking

Paper • 2502.13766 • Published 18 days ago • 3

InfiR : Crafting Effective Small Language Models and Multimodal Small Language Models in Reasoning

Paper • 2502.11573 • Published 21 days ago • 9

upvoted 4 papers 18 days ago

Qwen2.5-VL Technical Report

Paper • 2502.13923 • Published 18 days ago • 157

HealthGPT: A Medical Large Vision-Language Model for Unifying Comprehension and Generation via Heterogeneous Knowledge Adaptation

Paper • 2502.09838 • Published 24 days ago • 10

RealSyn: An Effective and Scalable Multimodal Interleaved Document Transformation Paradigm

Paper • 2502.12513 • Published 20 days ago • 15

Magma: A Foundation Model for Multimodal AI Agents

Paper • 2502.13130 • Published 19 days ago • 55