An Empirical Study of Autoregressive Pre-training from Videos Paper • 2501.05453 • Published Jan 9 • 37
InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions Paper • 2412.09596 • Published Dec 12, 2024 • 94
VisualLens: Personalization through Visual History Paper • 2411.16034 • Published Nov 25, 2024 • 18
Is Your LLM Secretly a World Model of the Internet? Model-Based Planning for Web Agents Paper • 2411.06559 • Published Nov 10, 2024 • 13
Sharingan: Extract User Action Sequence from Desktop Recordings Paper • 2411.08768 • Published Nov 13, 2024 • 10
Thanos: Enhancing Conversational Agents with Skill-of-Mind-Infused Large Language Model Paper • 2411.04496 • Published Nov 7, 2024 • 23
Personalization of Large Language Models: A Survey Paper • 2411.00027 • Published Oct 29, 2024 • 32
Survey of User Interface Design and Interaction Techniques in Generative AI Applications Paper • 2410.22370 • Published Oct 28, 2024 • 12
Navigating the Unknown: A Chat-Based Collaborative Interface for Personalized Exploratory Tasks Paper • 2410.24032 • Published Oct 31, 2024 • 10 • 2
Unbounded: A Generative Infinite Game of Character Life Simulation Paper • 2410.18975 • Published Oct 24, 2024 • 36
Tracking Universal Features Through Fine-Tuning and Model Merging Paper • 2410.12391 • Published Oct 16, 2024 • 5
HumanEval-V: Benchmarking High-Level Visual Reasoning with Complex Diagrams in Coding Tasks Paper • 2410.12381 • Published Oct 16, 2024 • 44
Llmlingua 2 💻 Compress lengthy prompts into shorter versions while preserving key information Space • Running • 103
Agent S: An Open Agentic Framework that Uses Computers Like a Human Paper • 2410.08164 • Published Oct 10, 2024 • 24