Oliver Stanley

OllieStanley

https://olliestanley.github.io

olliestanley

AI & ML interests

Language models, AI alignment, computer vision

Recent Activity

updated a model about 1 month ago

OllieStanley/Qwen2.5-3B-Instruct-RG-Algorithmic

updated a model about 1 month ago

OllieStanley/Qwen2.5-3B-Instruct-RG-Logic

updated a model about 1 month ago

OllieStanley/Qwen2.5-3B-Instruct-RG-Games

Organizations

updated 4 models about 1 month ago

published 4 models about 1 month ago

OllieStanley/Qwen2.5-3B-Instruct-RG-Games

Text Generation • 3B • Updated Jun 6 • 10

OllieStanley/Qwen2.5-3B-Instruct-RG-Logic

Text Generation • 3B • Updated Jun 6 • 4

OllieStanley/Qwen2.5-3B-Instruct-RG-Algorithmic

Text Generation • 3B • Updated Jun 6 • 5

OllieStanley/Qwen2.5-3B-Instruct-RG-Algebra

Text Generation • 3B • Updated Jun 6 • 5

upvoted 3 papers about 1 month ago

ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models

Paper • 2505.24864 • Published May 30 • 133

Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning

Paper • 2506.01939 • Published Jun 2 • 168

Reasoning Like an Economist: Post-Training on Economic Problems Induces Strategic Generalization in LLMs

Paper • 2506.00577 • Published May 31 • 11

New activity in OllieStanley/bert-base-ner-nc1 about 1 month ago

Adding `safetensors` variant of this model

#1 opened over 1 year ago by

SFconvertbot

New activity in OllieStanley/distilbert-customer-classifier about 1 month ago

Adding `safetensors` variant of this model

#1 opened over 1 year ago by

SFconvertbot

liked a dataset about 1 month ago

OpenAssistant/oasst1

Viewer • Updated May 2, 2023 • 88.8k • 6.41k • 1.41k

authored a paper about 1 month ago

REASONING GYM: Reasoning Environments for Reinforcement Learning with Verifiable Rewards

Paper • 2505.24760 • Published May 30 • 64

upvoted a paper about 1 month ago

REASONING GYM: Reasoning Environments for Reinforcement Learning with Verifiable Rewards

Paper • 2505.24760 • Published May 30 • 64

updated a model over 1 year ago

OllieStanley/bert-base-ner-nc1

Token Classification • 0.1B • Updated Jun 3 • 24

New activity in OpenAssistant/oasst1 over 1 year ago

Where is the missing data?

#18 opened about 2 years ago by

avacaondata

updated a model over 1 year ago

OllieStanley/distilbert-customer-classifier

Text Classification • 0.1B • Updated Jun 3 • 31 • 1

New activity in togethercomputer/RedPajama-INCITE-7B-Instruct about 2 years ago

Prohibited misuse

#7 opened about 2 years ago by

Aspie96
