Edmond Jacoupeau

edmond

AI & ML interests

None yet

Recent Activity

liked a model 5 days ago

Qwen/Qwen3-1.7B-Base

liked a model 10 days ago

meta-llama/Llama-3.1-8B

liked a model 16 days ago

deepseek-ai/DeepSeek-V3-Base

Organizations

edmond's activity

upvoted an article 22 days ago

Finally, a Replacement for BERT: Introducing ModernBERT

Article • By the lead author and 14 others • Dec 19, 2024 • 629

upvoted a collection about 1 month ago

Llama 4

Llama 4 release • 13 items • Updated 18 days ago • 506

upvoted a collection 2 months ago

DeepSeek-V3

4 items • Updated Mar 25 • 250

upvoted a paper 3 months ago

SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Paper • 2501.17161 • Published Jan 28 • 121

upvoted a paper 4 months ago

The GAN is dead; long live the GAN! A Modern GAN Baseline

Paper • 2501.05441 • Published Jan 9 • 92

upvoted a collection 7 months ago

Llama3-8B-1.58

A trio of powerful models: fine-tuned from Llama3-8b-Instruct, with BitNet architecture! • 3 items • Updated Sep 14, 2024 • 12

upvoted an article 8 months ago

Fine-tuning LLMs to 1.58bit: extreme quantization made easy

Article • By the lead author and 5 others • Sep 18, 2024 • 244

upvoted a paper 10 months ago

KAN or MLP: A Fairer Comparison

Paper • 2407.16674 • Published Jul 23, 2024 • 44

upvoted 2 collections 11 months ago

Gemma 2 Release

15 items • Updated Apr 3 • 217

Florence

9 items • Updated 16 days ago • 168

upvoted a paper 12 months ago

Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization

Paper • 2405.15071 • Published May 23, 2024 • 42

upvoted an article about 1 year ago

PaliGemma – Google's Cutting-Edge Open Vision Language Model

Article • By the lead author and 2 others • May 14, 2024 • 251

upvoted 2 collections about 1 year ago

PaliGemma Release

Pretrained and mix checkpoints for PaliGemma • 16 items • Updated Apr 3 • 146

LLaVA++ (LLaMA-3 and Phi-3-Mini)

Extending Visual Capabilities of LLaVA with LLaMA-3 and Phi-3 • 11 items • Updated Jun 11, 2024 • 23

upvoted 3 papers about 1 year ago

Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention

Paper • 2404.07143 • Published Apr 10, 2024 • 110

Voyager: An Open-Ended Embodied Agent with Large Language Models

Paper • 2305.16291 • Published May 25, 2023 • 10

MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge

Paper • 2206.08853 • Published Jun 17, 2022 • 1

upvoted a collection about 1 year ago

Meta Llama 3

This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated Dec 6, 2024 • 763

upvoted a paper about 1 year ago

Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing

Paper • 2404.12253 • Published Apr 18, 2024 • 56

upvoted a paper over 1 year ago

Q-Align: Teaching LMMs for Visual Scoring via Discrete Text-Defined Levels

Paper • 2312.17090 • Published Dec 28, 2023 • 4