Michal Valko
misovalko
AI & ML interests
large language models, reasoning, fine-tuning, test-time computation, reinforcement learning with human feedback, world models
Recent Activity
upvoted a paper 11 days ago
Accelerating Nash Learning from Human Feedback via Mirror Prox
authored a paper 7 months ago
The Llama 3 Herd of Models
new activity 11 months ago
paris-ai-running-club/README: next run wen?
Organizations
misovalko's activity
next run wen?
#3 opened 11 months ago by julien-c

FOMO
#1 opened about 1 year ago by osanseviero
