Building on HF

1 26 11

ℏεsam

hesamation

AI & ML interests

post-training / reasonign models / RAG

Recent Activity

upvoted an article about 1 month ago

We Got Claude to Fine-Tune an Open Source LLM

upvoted a paper about 1 month ago

DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models

upvoted a paper about 1 month ago

Stabilizing Reinforcement Learning with LLMs: Formulation and Practices

View all activity

Organizations

upvoted an article about 1 month ago

Article

We Got Claude to Fine-Tune an Open Source LLM

Dec 4, 2025

•

564

upvoted 3 papers about 1 month ago

DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models

Paper • 2512.02556 • Published Dec 2, 2025 • 244

Stabilizing Reinforcement Learning with LLMs: Formulation and Practices

Paper • 2512.01374 • Published Dec 1, 2025 • 96

From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence

Paper • 2511.18538 • Published Nov 23, 2025 • 282

upvoted a paper 2 months ago

Scaling Latent Reasoning via Looped Language Models

Paper • 2510.25741 • Published Oct 29, 2025 • 221

upvoted a paper 3 months ago

Agent Learning via Early Experience

Paper • 2510.08558 • Published Oct 9, 2025 • 270

upvoted 2 papers 4 months ago

Loong: Synthesize Long Chain-of-Thoughts at Scale through Verifiers

Paper • 2509.03059 • Published Sep 3, 2025 • 24

AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs

Paper • 2508.16153 • Published Aug 22, 2025 • 160

upvoted a paper 6 months ago

A Survey of Context Engineering for Large Language Models

Paper • 2507.13334 • Published Jul 17, 2025 • 259

upvoted a paper 7 months ago

Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective

Paper • 2506.14965 • Published Jun 17, 2025 • 49

upvoted 2 papers 8 months ago

CipherBank: Exploring the Boundary of LLM Reasoning Capabilities through Cryptography Challenges

Paper • 2504.19093 • Published Apr 27, 2025 • 18

Skywork R1V2: Multimodal Hybrid Reinforcement Learning for Reasoning

Paper • 2504.16656 • Published Apr 23, 2025 • 57

upvoted 7 papers 9 months ago

Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?

Paper • 2504.13837 • Published Apr 18, 2025 • 139

WORLDMEM: Long-term Consistent World Simulation with Memory

Paper • 2504.12369 • Published Apr 16, 2025 • 35

Genius: A Generalizable and Purely Unsupervised Self-Training Framework For Advanced Reasoning

Paper • 2504.08672 • Published Apr 11, 2025 • 55

InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models

Paper • 2504.10479 • Published Apr 14, 2025 • 306

Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems

Paper • 2504.01990 • Published Mar 31, 2025 • 301

Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model

Paper • 2503.24290 • Published Mar 31, 2025 • 62

A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond

Paper • 2503.21614 • Published Mar 27, 2025 • 42

upvoted a collection 9 months ago

🤔 Reasoning about Reasoning

Collection

papers and articles about reasoning LLMs • 12 items • Updated Jun 22, 2025 • 6

ℏεsam

AI & ML interests

Recent Activity

Organizations

hesamation's activity

We Got Claude to Fine-Tune an Open Source LLM