Andrew W

Andre3000

AI & ML interests

None yet

Recent Activity

liked a Space about 2 months ago

HuggingFaceTB/smol-training-playbook

liked a Space about 2 months ago

The Smol Training Playbook

📚

2.72k

The secrets to building world-class LLMs

upvoted 3 papers 3 months ago

Multiplayer Nash Preference Optimization

Paper • 2509.23102 • Published Sep 27 • 62

Assisting in Writing Wikipedia-like Articles From Scratch with Large Language Models

Paper • 2402.14207 • Published Feb 22, 2024 • 10

VCRL: Variance-based Curriculum Reinforcement Learning for Large Language Models

Paper • 2509.19803 • Published Sep 24 • 120

upvoted an article 3 months ago

Article

SmolLM3: smol, multilingual, long-context reasoner

Jul 8 • 740

upvoted 7 papers 3 months ago

GTA: A Benchmark for General Tool Agents

Paper • 2407.08713 • Published Jul 11, 2024 • 17

MultiAgentBench: Evaluating the Collaboration and Competition of LLM agents

Paper • 2503.01935 • Published Mar 3 • 29

You Have Thirteen Hours in Which to Solve the Labyrinth: Enhancing AI Game Masters with Function Calling

Paper • 2409.06949 • Published Sep 11, 2024 • 1

Instruction-Driven Game Engine: A Poker Case Study

Paper • 2410.13441 • Published Oct 17, 2024 • 2

Generative Agents: Interactive Simulacra of Human Behavior

Paper • 2304.03442 • Published Apr 7, 2023 • 14

AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agents

Paper • 2407.18901 • Published Jul 26, 2024 • 35

WebGames: Challenging General-Purpose Web-Browsing AI Agents

Paper • 2502.18356 • Published Feb 25 • 14

liked a dataset 4 months ago

nvidia/Nemotron-CC-v2

Viewer • Updated 7 days ago • 8.79B • 35.8k • 96

upvoted 2 papers 5 months ago

SPIN-Bench: How Well Do LLMs Plan Strategically and Reason Socially?

Paper • 2503.12349 • Published Mar 16 • 44

Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens

Paper • 2508.01191 • Published Aug 2 • 238

liked a Space 5 months ago

DeepSite v3

🐳

16.2k

Generate any application by Vibe Coding

upvoted 3 papers 5 months ago

ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs

Paper • 2307.16789 • Published Jul 31, 2023 • 101

SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning

Paper • 2506.24119 • Published Jun 30 • 50

Kimi-VL Technical Report

Paper • 2504.07491 • Published Apr 10 • 133

upvoted an article 6 months ago