InHo Won

kotmul

AI & ML interests

None yet

Recent Activity

updated a model 29 days ago

kotmul/SAE-Distill-Llama-mlp

published a model 29 days ago

kotmul/SAE-Distill-Llama-mlp

liked a model about 1 month ago

deepseek-ai/DeepSeek-R1-0528-Qwen3-8B

upvoted 2 papers 5 months ago

s1: Simple test-time scaling

Paper • 2501.19393 • Published Jan 31 • 126

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published Jan 22 • 405

upvoted a paper 7 months ago

Offline Reinforcement Learning for LLM Multi-Step Reasoning

Paper • 2412.16145 • Published Dec 20, 2024 • 39

upvoted an article 7 months ago

wHy DoNt YoU jUsT uSe ThE lLaMa ToKeNiZeR??

Article • Sep 27, 2024 • 46

upvoted 6 papers about 1 year ago

Optimizing Language Augmentation for Multilingual Large Language Models: A Case Study on Korean

Paper • 2403.10882 • Published Mar 16, 2024 • 6

X-LLaVA: Optimizing Bilingual Large Vision-Language Alignment

Paper • 2403.11399 • Published Mar 18, 2024 • 6

BOK-VQA: Bilingual outside Knowledge-Based Visual Question Answering via Graph Representation Pretraining

Paper • 2401.06443 • Published Jan 12, 2024 • 2

upvoted a collection about 1 year ago

Meta Llama 3

Collection

This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated Dec 6, 2024 • 797

upvoted 9 papers over 1 year ago

Can Large Language Models Understand Context?

Paper • 2402.00858 • Published Feb 1, 2024 • 24

Genie: Generative Interactive Environments

Paper • 2402.15391 • Published Feb 23, 2024 • 72

AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning

Paper • 2402.15506 • Published Feb 23, 2024 • 17

In deep reinforcement learning, a pruned network is a good network

Paper • 2402.12479 • Published Feb 19, 2024 • 19

Chain-of-Thought Reasoning Without Prompting

Paper • 2402.10200 • Published Feb 15, 2024 • 110

Mixtures of Experts Unlock Parameter Scaling for Deep RL

Paper • 2402.08609 • Published Feb 13, 2024 • 37

ODIN: Disentangled Reward Mitigates Hacking in RLHF

Paper • 2402.07319 • Published Feb 11, 2024 • 14

Policy Improvement using Language Feedback Models

Paper • 2402.07876 • Published Feb 12, 2024 • 9

Large-scale Reinforcement Learning for Diffusion Models

Paper • 2401.12244 • Published Jan 20, 2024 • 30
