Loïck BOURDOIS

lbourdois

AI & ML interests

👀

Recent Activity

posted an update 3 days ago
We introduce FAT5 (Flash Attention T5) ⚡
updated a dataset 8 days ago
Bretagne/Lingua_Libre
updated a dataset 8 days ago
Bretagne/Banque_Sonore_Dialectes_Bretons

Organizations

Notebooks-explorers · Hugging Face Fellows · FRAUG · Word2vec · Blog-explorers · huggingPartyParis · ZeroGPU Explorers · Social Post Explorers · Hugging Face Discord Community · Les papiers de Merve · Bretagne · ml-fw-prerelease

Posts 4

We introduce FAT5 (Flash Attention T5) ⚡

An implementation of T5 in PyTorch with the UL2 objective, optimized for GPGPUs for both training and inference thanks to 13 different optimizations.
The main one is a CUDA kernel we designed that extends the Flash Attention of @tridao with RPE biases and also supports other position encodings such as RoPE, ALiBi or FIRE.
The resulting kernel is 2 times faster than an SDPA implementation.
We also use Triton kernels to optimize certain parts of the architecture, such as the cross-entropy loss and the RMSNorm layer.
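
For context (and not the FAT5 kernel itself), here is a minimal PyTorch sketch of the SDPA-plus-additive-bias baseline the fused kernel is compared against: T5-style RPE is a learned bias per (head, relative-position bucket) that has to be added to the attention scores, which stock Flash Attention does not accept. Shapes and the bucketing scheme are simplified for illustration.

```python
import torch
import torch.nn.functional as F

# Illustrative shapes only (requires a CUDA device).
batch, heads, q_len, k_len, head_dim, num_buckets = 2, 8, 128, 128, 64, 32

q = torch.randn(batch, heads, q_len, head_dim, device="cuda", dtype=torch.bfloat16)
k = torch.randn(batch, heads, k_len, head_dim, device="cuda", dtype=torch.bfloat16)
v = torch.randn(batch, heads, k_len, head_dim, device="cuda", dtype=torch.bfloat16)

# T5-style RPE: one learned bias per (relative-position bucket, head),
# materialized as a (1, heads, q_len, k_len) tensor added to the scores.
rpe = torch.nn.Embedding(num_buckets, heads).to("cuda", torch.bfloat16)
rel_pos = torch.arange(k_len, device="cuda")[None, :] - torch.arange(q_len, device="cuda")[:, None]
buckets = rel_pos.clamp(-num_buckets // 2, num_buckets // 2 - 1) + num_buckets // 2  # simplified (T5 uses log-spaced buckets)
bias = rpe(buckets).permute(2, 0, 1).unsqueeze(0)  # (1, heads, q_len, k_len)

# With a floating-point attn_mask, SDPA cannot use the Flash Attention backend and
# materializes the full score matrix; fusing the bias into the kernel avoids this.
out = F.scaled_dot_product_attention(q, k, v, attn_mask=bias)
```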
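
Likewise, this is not the FAT5 code, but a minimal Triton sketch (forward pass only, illustrative names, no autotuning) of the kind of fused RMSNorm kernel the post refers to, accumulating in FP32 so it stays BF16-friendly:

```python
import torch
import triton
import triton.language as tl

@triton.jit
def rmsnorm_fwd_kernel(X, W, Y, stride, N, eps, BLOCK_SIZE: tl.constexpr):
    # Each program instance normalizes one row of the (M, N) input.
    row = tl.program_id(0)
    cols = tl.arange(0, BLOCK_SIZE)
    mask = cols < N
    x = tl.load(X + row * stride + cols, mask=mask, other=0.0).to(tl.float32)
    # RMSNorm: x * w / sqrt(mean(x^2) + eps), accumulated in FP32.
    rms = tl.sqrt(tl.sum(x * x, axis=0) / N + eps)
    w = tl.load(W + cols, mask=mask, other=0.0).to(tl.float32)
    tl.store(Y + row * stride + cols, x / rms * w, mask=mask)

def rmsnorm(x: torch.Tensor, weight: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    x2d = x.reshape(-1, x.shape[-1]).contiguous()
    M, N = x2d.shape
    y = torch.empty_like(x2d)
    rmsnorm_fwd_kernel[(M,)](x2d, weight, y, x2d.stride(0), N, eps,
                             BLOCK_SIZE=triton.next_power_of_2(N))
    return y.reshape(x.shape)
```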

The various kernels have been carefully written to be compatible with BF16 and with torch.compile, to go even faster and achieve efficient pretraining.
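
As a generic illustration (a stock PyTorch module, not the FAT5 code) of the BF16 autocast + torch.compile training step the kernels have to stay compatible with, i.e. run without graph breaks:

```python
import torch

model = torch.nn.TransformerEncoderLayer(d_model=512, nhead=8, batch_first=True).cuda()
model = torch.compile(model)  # custom kernels must not cause graph breaks here
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

x = torch.randn(8, 128, 512, device="cuda")
with torch.autocast(device_type="cuda", dtype=torch.bfloat16):
    y = model(x)
    loss = y.float().pow(2).mean()  # placeholder loss for illustration
loss.backward()
optimizer.step()
```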

All the other optimizations are described in a 📝 blog post available on @huggingface 🤗: CATIE-AQ/FAT5-report.

This methodology enabled us to efficiently pretrain, as a proof of concept, a FAT5 with 147M parameters in French in a reasonable time (1,461 hours for 419B tokens), with limited resources (a single A100, i.e. a computational budget of ~€1,900) and a low carbon footprint (13.5 kg CO2 eq).

The model's weights are also available on Hugging Face: CATIE-AQ/FAT5-small.
It is not very useful in practice, as it is a PoC and not an instruction-tuned model (that is planned for later).

All the code is available on GitHub if you want to pretrain your own model in your own language or for a specific domain: https://github.com/catie-aq/flashT5
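
If you just want to fetch the released checkpoint, here is a minimal sketch with huggingface_hub (the modeling code needed to load it lives in the GitHub repository above; check the model card for exact instructions):

```python
from huggingface_hub import snapshot_download

# Downloads the CATIE-AQ/FAT5-small checkpoint to the local HF cache and returns
# its path; pair it with the modeling code from github.com/catie-aq/flashT5.
local_dir = snapshot_download("CATIE-AQ/FAT5-small")
print(local_dir)
```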

To conclude, this was a joint project with @BorisAlbar at hf.co/CATIE-AQ.

Articles 3


Introduction to State Space Models (SSM)

Models

None public yet