Models
Datasets
Spaces
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2508.14460

A powerful open-source multilingual translation language model series, including instruction and reasoning models.

Running on Zero

22

22

Seed X

💻

A powerful multilingual translation language model
Seed-X: Building Strong Multilingual Translation LLM with 7B Parameters

Paper • 2507.13618 • Published Jul 18 • 16
DuPO: Enabling Reliable LLM Self-Verification via Dual Preference Optimization

Paper • 2508.14460 • Published 8 days ago • 78
ByteDance-Seed/Seed-X-PPO-7B

Translation • Updated about 1 month ago • 18.5k • 244

🌀 Bytedance Papers

about 19 hours ago

Seed-Coder: Let the Code Model Curate Data for Itself

Paper • 2506.03524 • Published Jun 4 • 6
Seed1.5-Thinking: Advancing Superb Reasoning Models with Reinforcement Learning

Paper • 2504.13914 • Published Apr 10 • 4
FlowTok: Flowing Seamlessly Across Text and Image Tokens

Paper • 2503.10772 • Published Mar 13 • 19
UVE: Are MLLMs Unified Evaluators for AI-Generated Videos?

Paper • 2503.09949 • Published Mar 13 • 5

Agent Lightning: Train ANY AI Agents with Reinforcement Learning

Paper • 2508.03680 • Published 23 days ago • 63
CRINN: Contrastive Reinforcement Learning for Approximate Nearest Neighbor Search

Paper • 2508.02091 • Published 24 days ago • 13
DINOv3

Paper • 2508.10104 • Published 15 days ago • 218
SSRL: Self-Search Reinforcement Learning

Paper • 2508.10874 • Published 14 days ago • 88

about 5 hours ago

lusxvr/nanoVLM-222M

Image-Text-to-Text • 0.2B • Updated May 8 • 406 • 95
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning

Paper • 2503.09516 • Published Mar 12 • 35
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time

Paper • 2505.24863 • Published May 30 • 97
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning

Paper • 2505.17667 • Published May 23 • 89

Research Papers/Reviews/Literature

Daily Research papers and review including older relevant content.

Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs

Paper • 2501.18585 • Published Jan 30 • 62
RWKV-7 "Goose" with Expressive Dynamic State Evolution

Paper • 2503.14456 • Published Mar 18 • 153
DeepMesh: Auto-Regressive Artist-mesh Creation with Reinforcement Learning

Paper • 2503.15265 • Published Mar 19 • 47
Cosmos-Reason1: From Physical Common Sense To Embodied Reasoning

Paper • 2503.15558 • Published Mar 18 • 51

DuPO: Enabling Reliable LLM Self-Verification via Dual Preference Optimization

Paper • 2508.14460 • Published 8 days ago • 78
MEML-GRPO: Heterogeneous Multi-Expert Mutual Learning for RLVR Advancement

Paper • 2508.09670 • Published 15 days ago
URPO: A Unified Reward & Policy Optimization Framework for Large Language Models

Paper • 2507.17515 • Published Jul 23

DuPO: Enabling Reliable LLM Self-Verification via Dual Preference Optimization

Paper • 2508.14460 • Published 8 days ago • 78
On-Policy RL Meets Off-Policy Experts: Harmonizing Supervised Fine-Tuning and Reinforcement Learning via Dynamic Weighting

Paper • 2508.11408 • Published 13 days ago • 6

Scaling Test-time Compute for LLM Agents

Paper • 2506.12928 • Published Jun 15 • 61
AgentsNet: Coordination and Collaborative Reasoning in Multi-Agent LLMs

Paper • 2507.08616 • Published Jul 11 • 13
ChemDFM-R: An Chemical Reasoner LLM Enhanced with Atomized Chemical Knowledge

Paper • 2507.21990 • Published 30 days ago • 24
DuPO: Enabling Reliable LLM Self-Verification via Dual Preference Optimization

Paper • 2508.14460 • Published 8 days ago • 78

ByteDance Papers

ByteDance papers collection

about 18 hours ago

Contrastive Learning for Many-to-many Multilingual Neural Machine Translation

Paper • 2105.09501 • Published May 20, 2021
Cross-modal Contrastive Learning for Speech Translation

Paper • 2205.02444 • Published May 5, 2022
ByteTransformer: A High-Performance Transformer Boosted for Variable-Length Inputs

Paper • 2210.03052 • Published Oct 6, 2022
Diffusion Glancing Transformer for Parallel Sequence to Sequence Learning

Paper • 2212.10240 • Published Dec 20, 2022 • 1

RL+reason model

RL + Transformer = A General-Purpose Problem Solver

Paper • 2501.14176 • Published Jan 24 • 28
Towards General-Purpose Model-Free Reinforcement Learning

Paper • 2501.16142 • Published Jan 27 • 31
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Paper • 2501.17161 • Published Jan 28 • 123
MaxInfoRL: Boosting exploration in reinforcement learning through information gain maximization

Paper • 2412.12098 • Published Dec 16, 2024 • 5

A powerful open-source multilingual translation language model series, including instruction and reasoning models.

Running on Zero

22

22

Seed X

💻

A powerful multilingual translation language model
Seed-X: Building Strong Multilingual Translation LLM with 7B Parameters

Paper • 2507.13618 • Published Jul 18 • 16
DuPO: Enabling Reliable LLM Self-Verification via Dual Preference Optimization

Paper • 2508.14460 • Published 8 days ago • 78
ByteDance-Seed/Seed-X-PPO-7B

Translation • Updated about 1 month ago • 18.5k • 244

DuPO: Enabling Reliable LLM Self-Verification via Dual Preference Optimization

Paper • 2508.14460 • Published 8 days ago • 78
MEML-GRPO: Heterogeneous Multi-Expert Mutual Learning for RLVR Advancement

Paper • 2508.09670 • Published 15 days ago
URPO: A Unified Reward & Policy Optimization Framework for Large Language Models

Paper • 2507.17515 • Published Jul 23

🌀 Bytedance Papers

about 19 hours ago

Seed-Coder: Let the Code Model Curate Data for Itself

Paper • 2506.03524 • Published Jun 4 • 6
Seed1.5-Thinking: Advancing Superb Reasoning Models with Reinforcement Learning

Paper • 2504.13914 • Published Apr 10 • 4
FlowTok: Flowing Seamlessly Across Text and Image Tokens

Paper • 2503.10772 • Published Mar 13 • 19
UVE: Are MLLMs Unified Evaluators for AI-Generated Videos?

Paper • 2503.09949 • Published Mar 13 • 5

DuPO: Enabling Reliable LLM Self-Verification via Dual Preference Optimization

Paper • 2508.14460 • Published 8 days ago • 78
On-Policy RL Meets Off-Policy Experts: Harmonizing Supervised Fine-Tuning and Reinforcement Learning via Dynamic Weighting

Paper • 2508.11408 • Published 13 days ago • 6

Agent Lightning: Train ANY AI Agents with Reinforcement Learning

Paper • 2508.03680 • Published 23 days ago • 63
CRINN: Contrastive Reinforcement Learning for Approximate Nearest Neighbor Search

Paper • 2508.02091 • Published 24 days ago • 13
DINOv3

Paper • 2508.10104 • Published 15 days ago • 218
SSRL: Self-Search Reinforcement Learning

Paper • 2508.10874 • Published 14 days ago • 88

Scaling Test-time Compute for LLM Agents

Paper • 2506.12928 • Published Jun 15 • 61
AgentsNet: Coordination and Collaborative Reasoning in Multi-Agent LLMs

Paper • 2507.08616 • Published Jul 11 • 13
ChemDFM-R: An Chemical Reasoner LLM Enhanced with Atomized Chemical Knowledge

Paper • 2507.21990 • Published 30 days ago • 24
DuPO: Enabling Reliable LLM Self-Verification via Dual Preference Optimization

Paper • 2508.14460 • Published 8 days ago • 78

about 5 hours ago

lusxvr/nanoVLM-222M

Image-Text-to-Text • 0.2B • Updated May 8 • 406 • 95
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning

Paper • 2503.09516 • Published Mar 12 • 35
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time

Paper • 2505.24863 • Published May 30 • 97
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning

Paper • 2505.17667 • Published May 23 • 89

ByteDance Papers

ByteDance papers collection

about 18 hours ago

Contrastive Learning for Many-to-many Multilingual Neural Machine Translation

Paper • 2105.09501 • Published May 20, 2021
Cross-modal Contrastive Learning for Speech Translation

Paper • 2205.02444 • Published May 5, 2022
ByteTransformer: A High-Performance Transformer Boosted for Variable-Length Inputs

Paper • 2210.03052 • Published Oct 6, 2022
Diffusion Glancing Transformer for Parallel Sequence to Sequence Learning

Paper • 2212.10240 • Published Dec 20, 2022 • 1

Research Papers/Reviews/Literature

Daily Research papers and review including older relevant content.

Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs

Paper • 2501.18585 • Published Jan 30 • 62
RWKV-7 "Goose" with Expressive Dynamic State Evolution

Paper • 2503.14456 • Published Mar 18 • 153
DeepMesh: Auto-Regressive Artist-mesh Creation with Reinforcement Learning

Paper • 2503.15265 • Published Mar 19 • 47
Cosmos-Reason1: From Physical Common Sense To Embodied Reasoning

Paper • 2503.15558 • Published Mar 18 • 51

RL+reason model

RL + Transformer = A General-Purpose Problem Solver

Paper • 2501.14176 • Published Jan 24 • 28
Towards General-Purpose Model-Free Reinforcement Learning

Paper • 2501.16142 • Published Jan 27 • 31
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Paper • 2501.17161 • Published Jan 28 • 123
MaxInfoRL: Boosting exploration in reinforcement learning through information gain maximization

Paper • 2412.12098 • Published Dec 16, 2024 • 5

Previous
1
2
Next

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs