MoeReward / rl_checkpoints

Safetensors
rl_checkpoints / qwen1.5_base_rule_base_math_heavy_drgrpo_reward_func
  • 1 contributor
History: 1 commit
Latest commit: shengyi-qian, "drgrpo checkpoints" (2581e08), about 2 months ago
  • added_tokens.json (80 Bytes, LFS): drgrpo checkpoints, about 2 months ago
  • config.json (1.01 kB, LFS): drgrpo checkpoints, about 2 months ago
  • generation_config.json (139 Bytes, LFS): drgrpo checkpoints, about 2 months ago
  • merges.txt (1.67 MB): drgrpo checkpoints, about 2 months ago
  • model-00001-of-00006.safetensors (5 GB, LFS): drgrpo checkpoints, about 2 months ago
  • model-00002-of-00006.safetensors (5 GB, LFS): drgrpo checkpoints, about 2 months ago
  • model-00003-of-00006.safetensors (5 GB, LFS): drgrpo checkpoints, about 2 months ago
  • model-00004-of-00006.safetensors (4.99 GB, LFS): drgrpo checkpoints, about 2 months ago
  • model-00005-of-00006.safetensors (5 GB, LFS): drgrpo checkpoints, about 2 months ago
  • model-00006-of-00006.safetensors (3.66 GB, LFS): drgrpo checkpoints, about 2 months ago
  • model.safetensors.index.json (416 kB, LFS): drgrpo checkpoints, about 2 months ago
  • special_tokens_map.json (370 Bytes, LFS): drgrpo checkpoints, about 2 months ago
  • tokenizer.json (11.4 MB, LFS): drgrpo checkpoints, about 2 months ago
  • tokenizer_config.json (1.33 kB, LFS): drgrpo checkpoints, about 2 months ago
  • vocab.json (2.78 MB, LFS): drgrpo checkpoints, about 2 months ago