linbin

LanguageBind

https://linb203.github.io

Recent Activity

liked a model 6 days ago

NCSOFT/VARCO-VISION-2.0-14B

liked a dataset 8 days ago

LucasFang/FLUX-Reason-6M

commented on a paper 9 days ago

Can Understanding and Generation Truly Benefit Together -- or Just Coexist?

authored 7 papers 4 months ago

Next Patch Prediction for Autoregressive Visual Generation

Paper • 2412.15321 • Published Dec 19, 2024 • 1

DreamDance: Animating Human Images by Enriching 3D Geometry Cues from 2D Poses

Paper • 2412.00397 • Published Nov 30, 2024

WISE: A World Knowledge-Informed Semantic Evaluation for Text-to-Image Generation

Paper • 2503.07265 • Published Mar 10, 2025 • 4

SwapAnyone: Consistent and Realistic Video Synthesis for Swapping Any Person into Any Video

Paper • 2503.09154 • Published Mar 12, 2025

OpenS2V-Nexus: A Detailed Benchmark and Million-Scale Dataset for Subject-to-Video Generation

Paper • 2505.20292 • Published May 26, 2025 • 53

ImgEdit: A Unified Image Editing Dataset and Benchmark

Paper • 2505.20275 • Published May 26, 2025 • 18

UniWorld: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation

Paper • 2506.03147 • Published Jun 3, 2025 • 58

authored 4 papers 10 months ago

Cycle3D: High-quality and Consistent Image-to-3D Generation via Generation-Reconstruction Cycle

Paper • 2407.19548 • Published Jul 28, 2024 • 27

OD-VAE: An Omni-dimensional Video Compressor for Improving Latent Video Diffusion Model

Paper • 2409.01199 • Published Sep 2, 2024 • 14

WF-VAE: Enhancing Video VAE by Wavelet-Driven Energy Flow for Latent Video Diffusion Model

Paper • 2411.17459 • Published Nov 26, 2024 • 11

Open-Sora Plan: Open-Source Large Video Generation Model

Paper • 2412.00131 • Published Nov 28, 2024 • 33

authored 3 papers over 1 year ago

ShareGPT4Video: Improving Video Understanding and Generation with Better Captions

Paper • 2406.04325 • Published Jun 6, 2024 • 75

MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators

Paper • 2404.05014 • Published Apr 7, 2024 • 34

MoE-LLaVA: Mixture of Experts for Large Vision-Language Models

Paper • 2401.15947 • Published Jan 29, 2024 • 53

authored 3 papers almost 2 years ago

Video-Bench: A Comprehensive Benchmark and Toolkit for Evaluating Video-based Large Language Models

Paper • 2311.16103 • Published Nov 27, 2023 • 1

LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment

Paper • 2310.01852 • Published Oct 3, 2023 • 2

Video-LLaVA: Learning United Visual Representation by Alignment Before Projection

Paper • 2311.10122 • Published Nov 16, 2023 • 27
