Motoki Wu PRO

tokestermw

https://motoki.co

Recent Activity

upvoted a collection 9 days ago

Qwen3-Omni

Collection

5 items • Updated 10 days ago • 145

upvoted a paper 25 days ago

Why Language Models Hallucinate

Paper • 2509.04664 • Published 28 days ago • 183

upvoted a paper 26 days ago

The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Paper • 2509.02547 • Published about 1 month ago • 212

upvoted 3 papers about 1 month ago

R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs via Bi-Mode Annealing and Reinforce Learning

Paper • 2508.21113 • Published Aug 28 • 108

Breaking the Exploration Bottleneck: Rubric-Scaffolded Reinforcement Learning for General LLM Reasoning

Paper • 2508.16949 • Published Aug 23 • 22

AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs

Paper • 2508.16153 • Published Aug 22 • 149

liked a Space about 1 month ago

580

Sheets

🗂

Create and enrich datasets using AI

liked a model about 1 month ago

xai-org/grok-2

Updated Aug 24 • 2.47k • 959

liked a Space about 1 month ago

232

Jupyter Agent 2

🏃

Run code and analyze data in a Jupyter notebook

liked a model about 1 month ago

stepfun-ai/NextStep-1-Large-Edit

Image-to-Image • 15B • Updated Aug 19 • 91 • 47

upvoted a collection about 1 month ago

NVIDIA Nemotron

Collection

Open, Production-ready Enterprise Models. Nvidia Open Model license. • 4 items • Updated 6 days ago • 62

liked 2 models about 1 month ago

deepseek-ai/DeepSeek-V3.1-Base

Text Generation • 685B • Updated Aug 26 • 16.2k • 1k

nvidia/NVIDIA-Nemotron-Nano-9B-v2

Text Generation • 9B • Updated 1 day ago • 216k • 399

upvoted a paper about 2 months ago

SSRL: Self-Search Reinforcement Learning

Paper • 2508.10874 • Published Aug 14 • 93

liked a model about 2 months ago

mistralai/Mistral-Small-3.2-24B-Instruct-2506

24B • Updated Aug 21 • 85.7k • 474

upvoted a paper about 2 months ago

Pass@k Training for Adaptively Balancing Exploration and Exploitation of Large Reasoning Models

Paper • 2508.10751 • Published Aug 14 • 27

liked 4 models about 2 months ago