Open to Collab

Malikeh Ehghaghi

Malikeh1375

AI & ML interests

NLP, Modular ML, Model Merging, Decentralized Training, Efficient LLMs

Recent Activity

upvoted a paper 7 days ago

TokSuite: Measuring the Impact of Tokenizer Choice on Language Model Behavior

authored a paper 12 days ago

Merging in a Bottle: Differentiable Adaptive Merging (DAM) and the Path from Averaging to Automation

authored a paper 12 days ago

DEPAC: a Corpus for Depression and Anxiety Detection from Speech

upvoted a paper 7 days ago

TokSuite: Measuring the Impact of Tokenizer Choice on Language Model Behavior

Paper • 2512.20757 • Published 14 days ago • 16

upvoted a collection 22 days ago

TokSuite Tokenization Robustness Benchmark

34 items • Updated Oct 29, 2025 • 1

upvoted a collection 4 months ago

TokSuite Model Collection

14 items • Updated Oct 28, 2025 • 2

upvoted 3 papers 6 months ago

Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM

Paper • 2403.07816 • Published Mar 12, 2024 • 44

Distillation Scaling Laws

Paper • 2502.08606 • Published Feb 12, 2025 • 47

The Common Pile v0.1: An 8TB Dataset of Public Domain and Openly Licensed Text

Paper • 2506.05209 • Published Jun 5, 2025 • 59

upvoted a collection 8 months ago

supertoken

The initial checkpoints for the token comparison research. • 20 items • Updated May 22, 2025 • 2

upvoted a collection 9 months ago

Gemstone Models

Our 22 open source Gemstone models for scaling laws range from 50M to 2B parameters, spanning 11 widths from 256 to 3072 and 18 depths from 3 to 80. • 69 items • Updated Jul 4, 2025 • 10

upvoted 2 papers about 1 year ago

Revealing the Barriers of Language Agents in Planning

Paper • 2410.12409 • Published Oct 16, 2024 • 27

EchoPrime: A Multi-Video View-Informed Vision-Language Model for Comprehensive Echocardiography Interpretation

Paper • 2410.09704 • Published Oct 13, 2024 • 14

upvoted a collection over 1 year ago

LLM Reasoning Papers

Papers to improve reasoning capabilities of LLMs • 20 items • Updated Jan 15, 2025 • 123

upvoted a paper over 1 year ago

Training Language Models to Self-Correct via Reinforcement Learning

Paper • 2409.12917 • Published Sep 19, 2024 • 140

upvoted 3 collections over 1 year ago

Qwen2.5

Qwen2.5 language models, including pretrained and instruction-tuned models of 7 sizes, including 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 46 items • Updated 7 days ago • 673

Korean Datasets I've released so far.

A collection of the Korean datasets I have uploaded so far. • 8 items • Updated May 24, 2024 • 21

Arabic Light Benchmarks

10% sample of the original benchmarks for each dataset from lighteval • 7 items • Updated Sep 10, 2024 • 2

upvoted an article over 1 year ago

Llama-3.1-Storm-8B: Improved SLM with Self-Curation + Model Merging

Article • Aug 19, 2024 • 79

upvoted a collection over 1 year ago

Arabic ORPO-DPO Datasets

12 items • Updated Aug 17, 2024 • 2

upvoted 3 papers over 1 year ago

AgentInstruct: Toward Generative Teaching with Agentic Flows

Paper • 2407.03502 • Published Jul 3, 2024 • 51

Qwen2 Technical Report

Paper • 2407.10671 • Published Jul 15, 2024 • 167

BOND: Aligning LLMs with Best-of-N Distillation

Paper • 2407.14622 • Published Jul 19, 2024 • 20