ykarout/RPT-DeepSeek-R1-0528-Qwen3-8B
Tags: Text Generation, Transformers, Safetensors, 4 languages, qwen3, Generated from Trainer, trl, grpo, rpt, conversational, text-generation-inference, arxiv:2402.03300, arxiv:2506.08007
License: apache-2.0
main
1 contributor
History: 5 commits
Latest commit: ykarout, "Update README.md" (6c94f95, verified), 17 days ago
File | Scan | Size | Storage | Last commit | Updated
.gitattributes | Safe | 1.57 kB | | Upload tokenizer | 18 days ago
README.md | | 1.84 kB | | Update README.md | 17 days ago
chat_template.jinja | Safe | 3.13 kB | | Upload tokenizer | 18 days ago
config.json | Safe | 861 Bytes | | GRPO fine-tuned DeepSeek-R1-Qwen3-8B for next token prediction according to paper https://huggingface.co/papers/2506.08007 | 18 days ago
generation_config.json | Safe | 143 Bytes | | GRPO fine-tuned DeepSeek-R1-Qwen3-8B for next token prediction according to paper https://huggingface.co/papers/2506.08007 | 18 days ago
model-00001-of-00004.safetensors | Safe | 4.9 GB | xet | GRPO fine-tuned DeepSeek-R1-Qwen3-8B for next token prediction according to paper https://huggingface.co/papers/2506.08007 | 18 days ago
model-00002-of-00004.safetensors | Safe | 4.92 GB | xet | GRPO fine-tuned DeepSeek-R1-Qwen3-8B for next token prediction according to paper https://huggingface.co/papers/2506.08007 | 18 days ago
model-00003-of-00004.safetensors | Safe | 4.98 GB | xet | GRPO fine-tuned DeepSeek-R1-Qwen3-8B for next token prediction according to paper https://huggingface.co/papers/2506.08007 | 18 days ago
model-00004-of-00004.safetensors | Safe | 1.58 GB | xet | GRPO fine-tuned DeepSeek-R1-Qwen3-8B for next token prediction according to paper https://huggingface.co/papers/2506.08007 | 18 days ago
model.safetensors.index.json | Safe | 32.9 kB | | GRPO fine-tuned DeepSeek-R1-Qwen3-8B for next token prediction according to paper https://huggingface.co/papers/2506.08007 | 18 days ago
special_tokens_map.json | Safe | 371 Bytes | | Upload tokenizer | 18 days ago
tokenizer.json | Safe | 11.4 MB | xet | Upload tokenizer | 18 days ago
tokenizer_config.json | Safe | 5.59 kB | | Upload tokenizer | 18 days ago
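As a rough check on download size, the four weight shards in the listing can be summed. This is a minimal sketch that uses only the sizes displayed on the page; the name `shard_sizes_gb` is illustrative, and because the displayed sizes are rounded, the total is approximate.

```python
# Shard sizes in GB, copied from the file listing (rounded by the web UI).
shard_sizes_gb = {
    "model-00001-of-00004.safetensors": 4.90,
    "model-00002-of-00004.safetensors": 4.92,
    "model-00003-of-00004.safetensors": 4.98,
    "model-00004-of-00004.safetensors": 1.58,
}

# Approximate total size of the model weights across all shards.
total_gb = sum(shard_sizes_gb.values())

print(f"{len(shard_sizes_gb)} shards, about {total_gb:.2f} GB in total")
```

The total comes to about 16.38 GB, which is roughly what one would expect for an 8B-parameter model stored at 2 bytes per parameter, although the listing itself does not state the weight dtype.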