Jiang
Dongwei
6 followers · 1 following
Some-random
AI & ML interests: None yet
Recent Activity
upvoted a paper 1 day ago
The Alignment Waltz: Jointly Training Agents to Collaborate for Safety
upvoted a paper 8 days ago
IA2: Alignment with ICL Activations Improves Supervised Fine-Tuning
upvoted an article 17 days ago
SmolLM3: smol, multilingual, long-context reasoner
Organizations
Papers (4)
arxiv:2506.11930
arxiv:2410.01044
arxiv:2409.12183
arxiv:2407.09007
Models (17)
Dongwei/Qwen-2.5-7B_Base_Math_smalllr_newdata • Text Generation • 8B • Updated Feb 13
Dongwei/Qwen-2.5-7B_Base_Math_smalllr_longer • Text Generation • 8B • Updated Feb 11 • 1
Dongwei/Qwen-2.5-7B_Base_Math_smallestlr • Text Generation • 8B • Updated Feb 11 • 2
Dongwei/Qwen-2.5-7B_Base_Math_smallestlr_newdata • Text Generation • 8B • Updated Feb 5 • 3
Dongwei/Qwen-2.5-7B_Base_Math_smalllr • Text Generation • 8B • Updated Feb 5 • 1 • 6
Dongwei/DeepSeek-R1-Distill-Qwen-7B-GRPO_Math_lowlr • Text Generation • 8B • Updated Feb 4 • 3
Dongwei/DeepSeek-R1-Distill-Qwen-1.5B-GRPO_Math_smalllr • Text Generation • 2B • Updated Feb 4 • 3
Dongwei/Qwen2.5-1.5B-Open-R1-GRPO_Math_smalllr • Text Generation • 2B • Updated Feb 4 • 2
Dongwei/Qwen-2.5-7B_Math_smalllr • Text Generation • 8B • Updated Feb 4
Dongwei/DeepSeek-R1-Distill-Qwen-7B-GRPO_Math • Text Generation • 8B • Updated Feb 4 • 2
Datasets (3)
Dongwei/Feedback_Friction_Dataset • Viewer • Updated Jun 17 • 394 • 55 • 2
Dongwei/Math_8K_for_GRPO • Viewer • Updated Feb 5 • 8.89k • 29 • 3
Dongwei/reasoning_world_model • Viewer • Updated Apr 22, 2024 • 15.2k • 8 • 6