Stoney Kang

sikang99

AI & ML interests

Remote Control based on Vision

Recent Activity

upvoted 3 papers 2 days ago

Nemotron 3 Super: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning

Paper • 2604.12374 • Published 3 days ago • 26

Lyra 2.0: Explorable Generative 3D Worlds

Paper • 2604.13036 • Published 3 days ago • 29

Habitat-GS: A High-Fidelity Navigation Simulator with Dynamic Gaussian Splatting

Paper • 2604.12626 • Published 3 days ago • 13

upvoted 3 papers 3 days ago

Audio Flamingo Next: Next-Generation Open Audio-Language Models for Speech, Sound, and Music

Paper • 2604.10905 • Published 4 days ago • 25

ClawGUI: A Unified Framework for Training, Evaluating, and Deploying GUI Agents

Paper • 2604.11784 • Published 4 days ago • 129

Attention Sink in Transformers: A Survey on Utilization, Interpretation, and Mitigation

Paper • 2604.10098 • Published 6 days ago • 72

upvoted 5 papers 4 days ago

GameWorld: Towards Standardized and Verifiable Evaluation of Multimodal Game Agents

Paper • 2604.07429 • Published 9 days ago • 105

WildDet3D: Scaling Promptable 3D Detection in the Wild

Paper • 2604.08626 • Published 8 days ago • 234

ELT: Elastic Looped Transformers for Visual Generation

Paper • 2604.09168 • Published 7 days ago • 19

Matrix-Game 3.0: Real-Time and Streaming Interactive World Model with Long-Horizon Memory

Paper • 2604.08995 • Published 7 days ago • 44

Small Vision-Language Models are Smart Compressors for Long Video Understanding

Paper • 2604.08120 • Published 8 days ago • 20

upvoted a paper 5 days ago

HY-Embodied-0.5: Embodied Foundation Models for Real-World Agents

Paper • 2604.07430 • Published 9 days ago • 181

liked a model 6 days ago

google/tipsv2-b14

Zero-Shot Image Classification • Updated 2 days ago • 4.12k • 48

upvoted 2 papers 6 days ago

Structured Distillation of Web Agent Capabilities Enables Generalization

Paper • 2604.07776 • Published 8 days ago • 20

ClawBench: Can AI Agents Complete Everyday Online Tasks?

Paper • 2604.08523 • Published 8 days ago • 255

upvoted 5 papers 7 days ago

OpenSpatial: A Principled Data Engine for Empowering Spatial Intelligence

Paper • 2604.07296 • Published 9 days ago • 39

Flux Attention: Context-Aware Hybrid Attention for Efficient LLMs Inference

Paper • 2604.07394 • Published 9 days ago • 16

SkillClaw: Let Skills Evolve Collectively with Agentic Evolver

Paper • 2604.08377 • Published 8 days ago • 276

MolmoWeb: Open Visual Web Agent and Open Data for the Open Web

Paper • 2604.08516 • Published 8 days ago • 41

SEVerA: Verified Synthesis of Self-Evolving Agents

Paper • 2603.25111 • Published 22 days ago • 31