Garreth Lee

garrethlee

AI & ML interests

None yet

Recent Activity

upvoted a paper 15 days ago

FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language

Paper • 2506.20920 • Published 17 days ago • 61

upvoted a paper 22 days ago

OWSM v4: Improving Open Whisper-Style Speech Models via Data Scaling and Cleaning

Paper • 2506.00338 • Published May 31 • 10

upvoted a changelog about 2 months ago

Changelog

Xet is now the default storage option for new users and organizations

May 23 • 67

upvoted a collection 3 months ago

Llama 4

Collection

Llama 4 release • 13 items • Updated Apr 29 • 568

upvoted an article 4 months ago

Article

Speeding Up LLM Decoding with Advanced Universal Assisted Generation Techniques

and 8 others • Mar 24 • 19

upvoted 4 articles 5 months ago

Article

FastRTC: The Real-Time Communication Library for Python

and 1 other • Feb 25 • 169

Article

1 Billion Classifications

Feb 13 • 43

Article

KV Caching Explained: Optimizing Transformer Inference Efficiency

Jan 30 • 91

Article

How biased is Whisper ? Evaluating Whisper Models for Robustness to Diverse English Accents

Jan 29 • 17

upvoted a paper 7 months ago

Qwen2.5 Technical Report

Paper • 2412.15115 • Published Dec 19, 2024 • 369

upvoted 2 articles 9 months ago

Article

SmolLM - blazingly fast and remarkably powerful

and 2 others • Jul 16, 2024 • 392

Article

🇨🇿 BenCzechMark - Can your LLM Understand Czech?

and 12 others • Oct 1, 2024 • 21

upvoted a paper 10 months ago

Programming Every Example: Lifting Pre-training Data Quality like Experts at Scale

Paper • 2409.17115 • Published Sep 25, 2024 • 63
