Article: Illustrating Reinforcement Learning from Human Feedback (RLHF) • By natolambert and 3 others • Dec 9, 2022 • 296
Article: KV Caching Explained: Optimizing Transformer Inference Efficiency • By not-lain • Jan 30 • 91
Article: Prefill and Decode for Concurrent Requests - Optimizing LLM Performance • By tngtech • Apr 16 • 20
Llama 3.2 Collection • This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 models. • 15 items • Updated Dec 6, 2024 • 622
Phi-4 Collection • Phi-4 family of small language, multi-modal and reasoning models. • 17 items • Updated 3 days ago • 168
LLM in a flash: Efficient Large Language Model Inference with Limited Memory • Paper • 2312.11514 • Published Dec 12, 2023 • 257