- BitNet b1.58 2B4T Technical Report
  Paper • 2504.12285 • Published • 73
- DataDecide: How to Predict Best Pretraining Data with Small Experiments
  Paper • 2504.11393 • Published • 18
- Efficient Process Reward Model Training via Active Learning
  Paper • 2504.10559 • Published • 13
- CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training
  Paper • 2504.13161 • Published • 92
HyeonwooKim
choco9966
Recent Activity
new activity
5 days ago
choco9966/open-ko-llm-leaderboard-old: Inquiry about access to the old-version evaluation datasets.
new activity
12 days ago
allenai/IF_multi_constraints_upto5: verify tools?
liked
a dataset
19 days ago
Josephgflowers/Finance-Instruct-500k