Towards Affordance-Aware Robotic Dexterous Grasping with Human-like Priors Paper • 2508.08896 • Published 11 days ago • 10
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models Paper • 2508.06471 • Published 15 days ago • 155
Article RynnVLA-001: Using Human Demonstrations to Improve Robot Manipulation By Alibaba-DAMO-Academy and 9 others • 12 days ago • 25
ZeroSearch: Incentivize the Search Capability of LLMs without Searching Paper • 2505.04588 • Published May 7 • 66
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models Paper • 2504.10479 • Published Apr 14 • 280
MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency Paper • 2502.09621 • Published Feb 13 • 28
VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding Paper • 2501.13106 • Published Jan 22 • 90
Qwen2-VL Collection Vision-language model series based on Qwen2 • 16 items • Updated Jul 21 • 224
EgoVid-5M: A Large-Scale Video-Action Dataset for Egocentric Video Generation Paper • 2411.08380 • Published Nov 13, 2024 • 27
Chain of Ideas: Revolutionizing Research in Novel Idea Development with LLM Agents Paper • 2410.13185 • Published Oct 17, 2024 • 6
LLaVA-Video Collection Models focused on video understanding (previously known as LLaVA-NeXT-Video). • 8 items • Updated Feb 21 • 62
LLaVA-Critic Collection A general evaluator for assessing model performance • 6 items • Updated Oct 6, 2024 • 10
Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss Paper • 2410.17243 • Published Oct 22, 2024 • 95
LLaVA-Critic: Learning to Evaluate Multimodal Models Paper • 2410.02712 • Published Oct 3, 2024 • 38
Article ColPali: Efficient Document Retrieval with Vision Language Models 👀 By manu • Jul 5, 2024 • 288
SeaLLMs 3: Open Foundation and Chat Multilingual Large Language Models for Southeast Asian Languages Paper • 2407.19672 • Published Jul 29, 2024 • 59