In a Training Loop 🔄

Asankhaya Sharma

codelion

http://asankhaya.github.io/

AI & ML interests

Creator of OptiLLM, OpenEvolve, Adaptive Classifier, and Ellora. Pioneering a new category in AI infrastructure: inference-time compute for LLMs.

Recent Activity

liked a model 9 days ago

zai-org/GLM-4.7-Flash

liked a Space 12 days ago

YinmingHuang/StableAvatar

upvoted a paper 15 days ago

PaperBanana: Automating Academic Illustration for AI Scientists

upvoted a paper 15 days ago

PaperBanana: Automating Academic Illustration for AI Scientists

Paper • 2601.23265 • Published 19 days ago • 186

upvoted an article 26 days ago

Article

Reverse Engineering a $500M Mystery: From HashHop to Memory-Augmented Language Models

26 days ago

upvoted an article about 2 months ago

Article

The Optimal Architecture for Small Language Models

Dec 26, 2025 • 117

upvoted a paper 2 months ago

Universal Reasoning Model

Paper • 2512.14693 • Published Dec 16, 2025 • 43

upvoted an article 3 months ago

Article

Ellora: Enhancing LLMs with LoRA - Standardized Recipes for Capability Enhancement

Dec 3, 2025

upvoted a paper 3 months ago

Budget-Aware Tool-Use Enables Effective Agent Scaling

Paper • 2511.17006 • Published Nov 21, 2025 • 32

upvoted 2 articles 4 months ago

Article

The 1 Billion Token Challenge: Finding the Perfect Pre-training Mix

Nov 3, 2025

Article

Python Is All You Need? Introducing Dria-Agent-α

Jan 10, 2025

upvoted a collection 4 months ago

Dhara Foundational Models

Collection

Diffusion Language Models combining deep narrow networks, Canon layers (depthwise causal convolutions), and WSD (Warmup-Stable-Decay) training. • 1 item • Updated Dec 26, 2025 • 2

upvoted a paper 4 months ago

Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published Oct 6, 2025 • 508

upvoted an article 4 months ago

Article

mem-agent: Equipping LLM Agents with Memory Using RL

Oct 9, 2025

upvoted a collection 5 months ago

Mem-Agent

Collection

Small sized agents from Dria trained on interacting with an obsidian-like memory system using python tools. Trained on Qwen3-4B-Thinking-2507. • 4 items • Updated Sep 5, 2025 • 5

upvoted a paper 5 months ago

BeyondWeb: Lessons from Scaling Synthetic Data for Trillion-scale Pretraining

Paper • 2508.10975 • Published Aug 14, 2025 • 60

upvoted an article 5 months ago

Article

mem-agent: Persistent, Human Readable Memory Agent Trained with Online RL

Sep 11, 2025

upvoted a collection 6 months ago

Nemotron-Pre-Training-Datasets

Collection

Large scale pre-training datasets used in the Nemotron family of models. • 11 items • Updated 14 days ago • 97

upvoted an article 6 months ago

Article

Building Enterprise-Ready Text Classifiers in Minutes with Adaptive Learning

Aug 9, 2025

upvoted 2 papers 6 months ago

On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification

Paper • 2508.05629 • Published Aug 7, 2025 • 183

R-Zero: Self-Evolving Reasoning LLM from Zero Data

Paper • 2508.05004 • Published Aug 7, 2025 • 130

upvoted 2 articles 7 months ago

Article

Towards Open Evolutionary Agents

Aug 4, 2025

Article

Unsupervised Model Improvement via Internal Coherence Maximization: Outperforming Human-Supervised Methods Through Self-Elicitation

Aug 3, 2025
