Zhijiang

Zeee

https://cartus.github.io/

AI & ML interests

Large Language Models

Recent Activity

upvoted a paper about 10 hours ago

Video2GUI: Synthesizing Large-Scale Interaction Trajectories for Generalized GUI Agent Pretraining

Paper • 2605.14747 • Published 8 days ago • 86

upvoted 2 papers about 20 hours ago

OScaR: The Occam's Razor for Extreme KV Cache Quantization in LLMs and Beyond

Paper • 2605.19660 • Published 3 days ago • 36

ThoughtTrace: Understanding User Thoughts in Real-World LLM Interactions

Paper • 2605.20087 • Published 3 days ago • 11

upvoted a collection 2 days ago

EnvFactory

Collection

These are the checkpoints and dataset for EnvFactory: Scaling Tool-Use Agents via Executable Environments Synthesis and Robust RL. • 7 items • Updated 2 days ago • 1

upvoted a paper 2 days ago

EnvFactory: Scaling Tool-Use Agents via Executable Environments Synthesis and Robust RL

Paper • 2605.18703 • Published 4 days ago • 44

upvoted 3 papers 3 months ago

upvoted a collection 3 months ago

CodeScaler

Collection

5 items • Updated Mar 2 • 6

upvoted 5 papers 3 months ago

CodeScaler: Scaling Code LLM Training and Test-Time Inference via Execution-Free Reward Models

Paper • 2602.17684 • Published Feb 4 • 22

Probability-Entropy Calibration: An Elastic Indicator for Adaptive Fine-tuning

Paper • 2602.01745 • Published Feb 2 • 7

Improving Data and Reward Design for Scientific Reasoning in Large Language Models

Paper • 2602.08321 • Published Feb 9 • 44

LOCA-bench: Benchmarking Language Agents Under Controllable and Extreme Context Growth

Paper • 2602.07962 • Published Feb 8 • 24

MSign: An Optimizer Preventing Training Instability in Large Language Models via Stable Rank Restoration

Paper • 2602.01734 • Published Feb 2 • 32

upvoted a paper 4 months ago

MMDeepResearch-Bench: A Benchmark for Multimodal Deep Research Agents

Paper • 2601.12346 • Published Jan 18 • 52

upvoted 5 papers 7 months ago

OS-Sentinel: Towards Safety-Enhanced Mobile GUI Agents via Hybrid Validation in Realistic Workflows

Paper • 2510.24411 • Published Oct 28, 2025 • 73

JanusCoder: Towards a Foundational Visual-Programmatic Interface for Code Intelligence

Paper • 2510.23538 • Published Oct 27, 2025 • 98

QueST: Incentivizing LLMs to Generate Difficult Problems

Paper • 2510.17715 • Published Oct 20, 2025 • 35

Scaling Language-Centric Omnimodal Representation Learning

Paper • 2510.11693 • Published Oct 13, 2025 • 108

TIME: A Multi-level Benchmark for Temporal Reasoning of LLMs in Real-World Scenarios

Paper • 2505.12891 • Published May 19, 2025 • 10
