haisong

rabbit19731

AI & ML interests

None yet

rabbit19731's activity

upvoted a paper about 1 month ago

Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities

Paper • 2505.02567 • Published May 5 • 75

upvoted a paper 3 months ago

Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention

Paper • 2404.07143 • Published Apr 10, 2024 • 110

upvoted an article 4 months ago

SigLIP 2: A better multilingual vision language encoder

Article • and 2 others • Feb 21 • 165

liked a Space 4 months ago

119

Ovis2 16B

🦫

See, read, and reason—better together.

updated a Space 4 months ago

119

Ovis2 16B

🦫

See, read, and reason—better together.

upvoted a collection 4 months ago

Ovis2

Collection

Our latest advancement in multi-modal large language models (MLLMs) • 15 items • Updated Mar 25 • 63

liked 6 models 4 months ago

liked a model 7 months ago

AIDC-AI/Ovis1.6-Gemma2-27B

Image-Text-to-Text • Updated Feb 26 • 67 • 62

liked a model 8 months ago

AIDC-AI/Ovis1.6-Llama3.2-3B

Image-Text-to-Text • Updated Feb 26 • 10.6k • 47

liked a model 9 months ago

AIDC-AI/Ovis1.6-Gemma2-9B

Image-Text-to-Text • Updated Feb 26 • 9.06k • 270

liked 2 models 10 months ago

AIDC-AI/Ovis1.5-Gemma2-9B

Image-Text-to-Text • Updated Feb 26 • 59 • 19

AIDC-AI/Ovis1.5-Llama3-8B

Image-Text-to-Text • Updated Feb 26 • 51 • 27

liked a model 12 months ago

AIDC-AI/Ovis-Clip-Llama3-8B

Image-Text-to-Text • Updated Jun 14, 2024 • 30 • 7

upvoted 2 papers about 1 year ago

Parrot: Multilingual Visual Instruction Tuning

Paper • 2406.02539 • Published Jun 4, 2024 • 39

Ovis: Structural Embedding Alignment for Multimodal Large Language Model

Paper • 2405.20797 • Published May 31, 2024 • 29