MoE-Dynamic-Routing

Activity Feed

AI & ML interests

None defined yet.

Spico

authored 7 papers 3 months ago

LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training

Paper • 2411.15708 • Published Nov 24, 2024

Iterative Value Function Optimization for Guided Decoding

Paper • 2503.02368 • Published Mar 4, 2025 • 15

Linear-MoE: Linear Sequence Modeling Meets Mixture-of-Experts

Paper • 2503.05447 • Published Mar 7, 2025 • 8

Chain-of-Tools: Utilizing Massive Unseen Tools in the CoT Reasoning of Frozen Language Models

Paper • 2503.16779 • Published Mar 21, 2025 • 1

Dynamic Data Mixing Maximizes Instruction Tuning for Mixture-of-Experts

Paper • 2406.11256 • Published Jun 17, 2024

Speed Always Wins: A Survey on Efficient Architectures for Large Language Models

Paper • 2508.09834 • Published Aug 13, 2025 • 53

DiffThinker: Towards Generative Multimodal Reasoning with Diffusion Models

Paper • 2512.24165 • Published Dec 30, 2025 • 52

Spico

submitted a paper to Daily Papers 3 months ago

DiffThinker: Towards Generative Multimodal Reasoning with Diffusion Models

Paper • 2512.24165 • Published Dec 30, 2025 • 52

huxy912

authored a paper 6 months ago

Reasoning over Boundaries: Enhancing Specification Alignment via Test-time Deliberation

Paper • 2509.14760 • Published Sep 18, 2025 • 53

GuanjieChen

authored a paper 10 months ago

OpenThinkIMG: Learning to Think with Images via Visual Tool Reinforcement Learning

Paper • 2505.08617 • Published May 13, 2025 • 42

huxy912

authored 2 papers about 1 year ago

Test-Time Preference Optimization: On-the-Fly Alignment via Iterative Textual Feedback

Paper • 2501.12895 • Published Jan 22, 2025 • 61

LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training

Paper • 2411.15708 • Published Nov 24, 2024

Spico

authored 8 papers over 1 year ago

NesTools: A Dataset for Evaluating Nested Tool Learning Abilities of Large Language Models

Paper • 2410.11805 • Published Oct 15, 2024 • 14

ConflictBank: A Benchmark for Evaluating the Influence of Knowledge Conflicts in LLM

Paper • 2408.12076 • Published Aug 22, 2024 • 12

Timo: Towards Better Temporal Reasoning for Language Models

Paper • 2406.14192 • Published Jun 20, 2024 • 1

Seal-Tools: Self-Instruct Tool Learning Dataset for Agent Tuning and Detailed Benchmark

Paper • 2405.08355 • Published May 14, 2024

CLIP-MoE: Towards Building Mixture of Experts for CLIP with Diversified Multiplet Upcycling

Paper • 2409.19291 • Published Sep 28, 2024 • 21

Team members 3