Wei Xiong

weqweasdas

https://weixiongust.github.io/WeiXiongUST/index.html

AI & ML interests

Machine learning, RLHF

Recent Activity

upvoted a paper 21 days ago

Stream-R1: Reliability-Perplexity Aware Reward Distillation for Streaming Video Generation

Paper • 2605.03849 • Published 23 days ago • 125

upvoted a paper 7 months ago

Visual Backdoor Attacks on MLLM Embodied Decision Making via Contrastive Trigger Learning

Paper • 2510.27623 • Published Oct 31, 2025 • 13

updated a dataset 7 months ago

weqweasdas/qwen15b_train_simple_subset5k_for_difficulty_transition

Viewer • Updated Oct 26, 2025 • 5k • 9

published a dataset 7 months ago

weqweasdas/qwen15b_train_simple_subset5k_for_difficulty_transition

Viewer • Updated Oct 26, 2025 • 5k • 9

upvoted a paper 7 months ago

GAR: Generative Adversarial Reinforcement Learning for Formal Theorem Proving

Paper • 2510.11769 • Published Oct 13, 2025 • 26

upvoted 2 papers 8 months ago

Agent Learning via Early Experience

Paper • 2510.08558 • Published Oct 9, 2025 • 276

Reinforce-Ada: An Adaptive Sampling Framework for Reinforce-Style LLM Training

Paper • 2510.04996 • Published Oct 6, 2025 • 16

commented a paper 8 months ago

Reinforce-Ada: An Adaptive Sampling Framework for Reinforce-Style LLM Training

Paper • 2510.04996 • Published Oct 6, 2025 • 16

updated a dataset 8 months ago

weqweasdas/ultrafeedback_binarized_processed

Viewer • Updated Oct 4, 2025 • 61.1k • 6

published a dataset 8 months ago

weqweasdas/ultrafeedback_binarized_processed

Viewer • Updated Oct 4, 2025 • 61.1k • 6

updated a dataset 8 months ago

weqweasdas/qwen7b_prompt_difficult

Viewer • Updated Sep 29, 2025 • 15.7k • 7

published a dataset 8 months ago

weqweasdas/qwen7b_prompt_difficult

Viewer • Updated Sep 29, 2025 • 15.7k • 7

updated a dataset 8 months ago

weqweasdas/qwen7b_openr1_with_scores_sub

Viewer • Updated Sep 28, 2025 • 57.7k • 3

published a dataset 8 months ago

weqweasdas/qwen7b_openr1_with_scores_sub

Viewer • Updated Sep 28, 2025 • 57.7k • 3

updated a dataset 8 months ago

weqweasdas/qwen7b_openr1_with_scores_filtered_0375

Viewer • Updated Sep 25, 2025 • 24.3k • 37

published a dataset 8 months ago

weqweasdas/qwen7b_openr1_with_scores_filtered_0375

Viewer • Updated Sep 25, 2025 • 24.3k • 37

updated a dataset 8 months ago

weqweasdas/qwen7b_openr1_with_scores

Viewer • Updated Sep 23, 2025 • 75k • 11

published a dataset 8 months ago

weqweasdas/qwen7b_openr1_with_scores

Viewer • Updated Sep 23, 2025 • 75k • 11

updated a dataset 8 months ago

weqweasdas/from_default_filtered_openr1_with_scores_filtered_05_and_filtered_allwrong

Viewer • Updated Sep 18, 2025 • 25k • 67

published a dataset 8 months ago

weqweasdas/from_default_filtered_openr1_with_scores_filtered_05_and_filtered_allwrong

Viewer • Updated Sep 18, 2025 • 25k • 67