Robin Williams
bfuzzy1
AI & ML interests
None yet
Organizations
None yet
RL
- RobustFT: Robust Supervised Fine-tuning for Large Language Models under Noisy Response
Paper • 2412.14922 • Published • 88
- B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners
Paper • 2412.17256 • Published • 47
- Deliberation in Latent Space via Differentiable Cache Augmentation
Paper • 2412.17747 • Published • 32
- Outcome-Refining Process Supervision for Code Generation
Paper • 2412.15118 • Published • 19
acheron-m
models
10
bfuzzy1/acheron-m1a-llama • Text Generation • Updated
bfuzzy1/acheron-m • Text Generation • 0.5B • Updated • 1
bfuzzy1/acheron-d • 0.5B • Updated • 2
bfuzzy1/llambses-1 • Text Generation • 7B • Updated • 3
bfuzzy1/acheron-o9 • 0.8B • Updated
bfuzzy1/acheron • 0.5B • Updated • 3
bfuzzy1/acheron-c • 0.5B • Updated • 1
bfuzzy1/Gunny • Text Generation • 3B • Updated • 17
bfuzzy1/llambses-1_4bit • 7B • Updated • 13
bfuzzy1/acheron-x • Text Generation • 0.7B • Updated