anthonyivn (Anthony Ivan S)

upvoted 2 papers 4 months ago

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published Feb 4 • 232

Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling

Paper • 2502.06703 • Published Feb 10 • 153

upvoted an article 4 months ago

Open-source DeepResearch – Freeing our search agents

Article • 5 authors • Feb 4 • 1.25k

upvoted a paper 5 months ago

Fourier Position Embedding: Enhancing Attention's Periodic Extension for Length Generalization

Paper • 2412.17739 • Published Dec 23, 2024 • 42

upvoted 2 articles 5 months ago

🪆 Introduction to Matryoshka Embedding Models

Article • 3 authors • Feb 23, 2024 • 124

Train 400x faster Static Embedding Models with Sentence Transformers

Article • Jan 15 • 185

upvoted a paper 5 months ago

Search-o1: Agentic Search-Enhanced Large Reasoning Models

Paper • 2501.05366 • Published Jan 9 • 101

upvoted a paper 7 months ago

Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level

Paper • 2411.03562 • Published Nov 5, 2024 • 68

upvoted an article 9 months ago

Document Similarity Search with ColPali

Article • Sep 21, 2024 • 50

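upvoted 3 papers 9 months ago

Training Language Models to Self-Correct via Reinforcement Learning

Gemma 2: Improving Open Language Models at a Practical Size

Generative Verifiers: Reward Modeling as Next-Token Prediction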

upvoted an article 11 months ago

The Rise of Agentic Data Generation

Article • Jul 15, 2024 • 83

upvoted a paper 11 months ago

SpreadsheetLLM: Encoding Spreadsheets for Large Language Models

Paper • 2407.09025 • Published Jul 12, 2024 • 137

upvoted a collection 11 months ago

InternLM2.5

Collection • 14 items • Updated Feb 11 • 71

upvoted 3 papers 12 months ago

LongIns: A Challenging Long-context Instruction-based Exam for LLMs

Paper • 2406.17588 • Published Jun 25, 2024 • 23

The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale

Paper • 2406.17557 • Published Jun 25, 2024 • 98

Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation

Paper • 2406.06525 • Published Jun 10, 2024 • 71

upvoted 2 articles 12 months ago

Uncensor any LLM with abliteration

Article • Jun 13, 2024 • 605

Putting RL back in RLHF

Article • 2 authors • Jun 12, 2024 • 92
