Pushkar Patel PRO

thepushkarp

https://thepushkarp.com/

AI & ML interests

Natural Language Processing

Recent Activity


upvoted an article 3 days ago

Article

SmolLM3: smol, multilingual, long-context reasoner

and 22 others • 4 days ago • 479

upvoted an article about 2 months ago

Article

Vision Language Models (Better, Faster, Stronger)

and 4 others • May 12 • 475

upvoted an article 3 months ago

Article

Mixture of Depth is Vibe

Apr 22, 2024 • 48

updated a model 3 months ago

thepushkarp/Dia-1.6B-safetensors-fp16

Text-to-Speech • Updated Apr 23 • 95 • 7

published a model 3 months ago

thepushkarp/Dia-1.6B-safetensors-fp16

Text-to-Speech • Updated Apr 23 • 95 • 7

upvoted an article 3 months ago

Article

Enabling Long Context Training with Sequence Parallelism in Axolotl

and 1 other • Apr 4 • 10

New activity in thepushkarp/csm-1b-safetensors-fp16 4 months ago

fix small typo

👍 1

#2 opened 4 months ago by gianpaj

upvoted 2 articles 4 months ago

Article

Uncensor any LLM with abliteration

Jun 13, 2024 • 626

Article

Open-Source Handwritten Signature Detection Model

Mar 14 • 114

updated a model 4 months ago

thepushkarp/csm-1b-safetensors-fp16

Text-to-Speech • 2B • Updated Mar 23 • 1

New activity in thepushkarp/csm-1b-safetensors-fp16 4 months ago

Can you quantize further? Like FP4 maybe?

#1 opened 4 months ago by etohimself

published a model 4 months ago

thepushkarp/csm-1b-safetensors-fp16

Text-to-Speech • 2B • Updated Mar 23 • 1

liked a Space 4 months ago

1.53k

GGUF My Repo

🦙

Create and quantize Hugging Face models

upvoted an article 4 months ago

Article

Open R1: Update #3

and 9 others • Mar 11 • 295

liked a Space 5 months ago

2.8k

The Ultra-Scale Playbook

🌌

The ultimate guide to training LLM on large GPU Clusters

upvoted an article 5 months ago

Article

Open R1: Update #2

and 6 others • Feb 10 • 216

updated a collection 5 months ago

HF Deep RL Course

Collection

Models cooked in HF Deep RL Course (https://huggingface.co/learn/deep-rl-course) • 1 item • Updated Feb 7

updated a model 5 months ago

thepushkarp/ppo-LunarLander-v2

Reinforcement Learning • Updated Feb 7 • 3

published a model 5 months ago

thepushkarp/ppo-LunarLander-v2

Reinforcement Learning • Updated Feb 7 • 3

liked a Space 7 months ago

573

Scaling test-time compute

📈

Enhance math problem solving by scaling test-time compute
