Zhaoye Fei

ngc7293

https://ngc7292.github.io/

AI & ML interests

NLP & Ro.

Recent Activity

liked a Space 18 days ago

OpenMOSS-Team/MOSS-TTS

authored a paper 21 days ago

Towards More Effective and Economic Sparsely-Activated Model

authored a paper 21 days ago

Query of CC: Unearthing Large Scale Domain-Specific Knowledge from Public Corpora

upvoted a paper 21 days ago

MOSS-Audio-Tokenizer: Scaling Audio Tokenizers for Future Audio Foundation Models

Paper • 2602.10934 • Published 22 days ago • 49

upvoted 2 papers 22 days ago

Agent World Model: Infinity Synthetic Environments for Agentic Reinforcement Learning

Paper • 2602.10090 • Published 23 days ago • 51

Prism: Spectral-Aware Block-Sparse Attention

Paper • 2602.08426 • Published 25 days ago • 36

upvoted a paper 24 days ago

MOVA: Towards Scalable and Synchronized Video-Audio Generation

Paper • 2602.08794 • Published 24 days ago • 154

upvoted 3 papers about 1 month ago

HERMES: KV Cache as Hierarchical Memory for Efficient Streaming Video Understanding

Paper • 2601.14724 • Published Jan 21 • 74

ABC-Bench: Benchmarking Agentic Backend Coding in Real-World Development

Paper • 2601.11077 • Published Jan 16 • 65

FutureOmni: Evaluating Future Forecasting from Omni-Modal Context for Multimodal LLMs

Paper • 2601.13836 • Published Jan 20 • 35

upvoted a paper about 2 months ago

MOSS Transcribe Diarize: Accurate Transcription with Speaker Diarization

Paper • 2601.01554 • Published Jan 4 • 57

upvoted 2 papers 2 months ago

DiRL: An Efficient Post-Training Framework for Diffusion Language Models

Paper • 2512.22234 • Published Dec 23, 2025 • 22

LiveTalk: Real-Time Multimodal Interactive Video Diffusion via Improved On-Policy Distillation

Paper • 2512.23576 • Published Dec 29, 2025 • 65

upvoted 2 papers 3 months ago

Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm

Paper • 2511.04570 • Published Nov 6, 2025 • 240

SRPO: Self-Referential Policy Optimization for Vision-Language-Action Models

Paper • 2511.15605 • Published Nov 19, 2025 • 24

upvoted 2 papers 4 months ago

Emu3.5: Native Multimodal Models are World Learners

Paper • 2510.26583 • Published Oct 30, 2025 • 112

RoboOmni: Proactive Robot Manipulation in Omni-modal Context

Paper • 2510.23763 • Published Oct 27, 2025 • 56

upvoted 3 papers 5 months ago

PhysMaster: Mastering Physical Representation for Video Generation via Reinforcement Learning

Paper • 2510.13809 • Published Oct 15, 2025 • 38

LIBERO-Plus: In-depth Robustness Analysis of Vision-Language-Action Models

Paper • 2510.13626 • Published Oct 15, 2025 • 46

MOSS-Speech: Towards True Speech-to-Speech Models Without Text Guidance

Paper • 2510.00499 • Published Oct 1, 2025 • 20

upvoted a paper 6 months ago

ScaleCUA: Scaling Open-Source Computer Use Agents with Cross-Platform Data

Paper • 2509.15221 • Published Sep 18, 2025 • 111

upvoted a paper 7 months ago

WideSearch: Benchmarking Agentic Broad Info-Seeking

Paper • 2508.07999 • Published Aug 11, 2025 • 111

upvoted a paper 8 months ago

Unleashing Embodied Task Planning Ability in LLMs via Reinforcement Learning

Paper • 2506.23127 • Published Jun 29, 2025 • 1
