Denis Akhiyarov

dtanow

AI & ML interests

AI Code Generation with LLMs

Recent Activity

upvoted a paper 1 day ago

Beyond Pass@1: Self-Play with Variational Problem Synthesis Sustains RLVR

Paper • 2508.14029 • Published 8 days ago • 109

upvoted a paper 2 days ago

Intern-S1: A Scientific Multimodal Foundation Model

Paper • 2508.15763 • Published 6 days ago • 236

upvoted a paper 5 days ago

DuPO: Enabling Reliable LLM Self-Verification via Dual Preference Optimization

Paper • 2508.14460 • Published 7 days ago • 78

upvoted 2 papers 7 days ago

SSRL: Self-Search Reinforcement Learning

Paper • 2508.10874 • Published 13 days ago • 88

Reasoning Language Models: A Blueprint

Paper • 2501.11223 • Published Jan 20 • 34

upvoted 2 papers 15 days ago

Training Long-Context, Multi-Turn Software Engineering Agents with Reinforcement Learning

Paper • 2508.03501 • Published 22 days ago • 53

On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification

Paper • 2508.05629 • Published 20 days ago • 165

upvoted a paper 28 days ago

A Survey of Self-Evolving Agents: On Path to Artificial Super Intelligence

Paper • 2507.21046 • Published 30 days ago • 79

upvoted a paper 29 days ago

CLEAR: Error Analysis via LLM-as-a-Judge Made Easy

Paper • 2507.18392 • Published Jul 24 • 19

upvoted 2 papers about 1 month ago

Beyond Context Limits: Subconscious Threads for Long-Horizon Reasoning

Paper • 2507.16784 • Published Jul 22 • 118

Qwen3 Embedding: Advancing Text Embedding and Reranking Through Foundation Models

Paper • 2506.05176 • Published Jun 5 • 68

upvoted an article about 1 month ago

SmolLM3: smol, multilingual, long-context reasoner

Article • and 22 others • Jul 8 • 639

upvoted 4 papers 3 months ago

upvoted 4 papers 4 months ago

Generating Physically Stable and Buildable LEGO Designs from Text

Paper • 2505.05469 • Published May 8 • 28

ReasonIR: Training Retrievers for Reasoning Tasks

Paper • 2504.20595 • Published Apr 29 • 55

The Leaderboard Illusion

Paper • 2504.20879 • Published Apr 29 • 70

TTRL: Test-Time Reinforcement Learning

Paper • 2504.16084 • Published Apr 22 • 120
