ZHANG Jipeng

OldFriends

AI & ML interests

None yet

Organizations

None yet

OldFriends's activity

upvoted a paper 29 days ago

CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training

Paper • 2504.13161 • Published 29 days ago • 89

upvoted a paper 2 months ago

Predictive Data Selection: The Data That Predicts Is the Data That Teaches

Paper • 2503.00808 • Published Mar 2 • 57

upvoted a paper 3 months ago

Self-rewarding correction for mathematical reasoning

Paper • 2502.19613 • Published Feb 26 • 84

upvoted 4 collections 3 months ago

🎯DART-Math

Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving [NeurIPS 2024] @ https://github.com/hkust-nlp/dart-math • 20 items • Updated Feb 19 • 7

Deita

14 items • Updated May 20, 2024 • 13

M-STAR

Resources of M-STAR (Multimodal Self-Evolving Training for Reasoning) https://mstar-lmm.github.io/ • 2 items • Updated Dec 25, 2024 • 4

CodeI/O

Collection for CodeI/O @ https://codei-o.github.io/ • 16 items • Updated 11 days ago • 7

upvoted a collection 4 months ago

UI Agent

a collection of algorithmic agents for user interfaces/interactions, program synthesis, and robotics • 362 items • Updated 1 day ago • 52

upvoted 2 papers 4 months ago

Critic-V: VLM Critics Help Catch VLM Errors in Multimodal Reasoning

Paper • 2411.18203 • Published Nov 27, 2024 • 37

OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis

Paper • 2412.19723 • Published Dec 27, 2024 • 88

upvoted 2 collections 7 months ago

MIT Talk 31/10 Papers

14 items • Updated Oct 28, 2024 • 32

LLaVA-Critic

as a general evaluator for assessing model performance • 6 items • Updated Oct 6, 2024 • 10

upvoted a paper 7 months ago

Personalized Visual Instruction Tuning

Paper • 2410.07113 • Published Oct 9, 2024 • 71

upvoted an article 10 months ago

SmolLM - blazingly fast and remarkably powerful

Article • Jul 16, 2024 • 366

upvoted a collection 10 months ago

NuminaMath

Datasets and models for training SOTA math LLMs. See our GitHub for training & inference code: https://github.com/project-numina/aimo-progress-prize • 7 items • Updated Feb 10 • 78

upvoted an article 10 months ago

How NuminaMath Won the 1st AIMO Progress Prize

Article • Jul 11, 2024 • 120

upvoted a paper 10 months ago

TheoremLlama: Transforming General-Purpose LLMs into Lean4 Experts

Paper • 2407.03203 • Published Jul 3, 2024 • 12

upvoted an article 10 months ago

Large-scale Near-deduplication Behind BigCode

Article • May 16, 2023 • 25

upvoted a paper 11 months ago

Jailbreaking as a Reward Misspecification Problem

Paper • 2406.14393 • Published Jun 20, 2024 • 13

upvoted a paper about 1 year ago

RLHF Workflow: From Reward Modeling to Online RLHF

Paper • 2405.07863 • Published May 13, 2024 • 71