Lukas

sirluk

AI & ML interests

None yet

Recent Activity

Organizations

Blog-explorers · Institute for Machine Learning, Johannes Kepler University Linz

sirluk's activity


Hey @shantanuagarwal, glad you enjoyed the article! Even though I haven't tried it myself, you should be able to leverage the PyTorch FlexAttention API for this. Have a look at the tutorial here: https://pytorch.org/blog/flexattention/. The section "Document Masking/Jagged Sequences" covers these packed-sequence masks.
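
To make the idea concrete, here is a minimal sketch of the document-masking predicate that tutorial describes, assuming a hypothetical packed batch of three sequences; the `document_id` tensor and sequence lengths are made up for illustration. FlexAttention (torch >= 2.5) would consume a function with this `(b, h, q_idx, kv_idx)` signature via `create_block_mask` rather than materializing the full matrix, but materializing it here makes the behaviour easy to inspect:

```python
import torch

# Hypothetical packed batch: three sequences of lengths 3, 2, 4 packed into
# one row of 9 tokens. document_id maps each token position to its sequence.
document_id = torch.tensor([0, 0, 0, 1, 1, 2, 2, 2, 2])

# FlexAttention-style mask_mod: a query position may attend to a key position
# only if both tokens belong to the same packed document. (Add a
# `q_idx >= kv_idx` conjunct for causal language modeling.)
def document_mask(b, h, q_idx, kv_idx):
    return document_id[q_idx] == document_id[kv_idx]

# Materialize the (9, 9) boolean mask to inspect the block-diagonal structure.
idx = torch.arange(document_id.numel())
mask = document_mask(0, 0, idx[:, None], idx[None, :])
```

With FlexAttention you would instead pass `document_mask` to `create_block_mask` and hand the result to `flex_attention`, so the mask is never fully materialized and cross-document blocks are skipped entirely.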

published an article 7 months ago

Efficient LLM Pretraining: Packed Sequences and Masked Attention

By sirluk
38
published an article over 1 year ago

Multilabel Classification using Mistral-7B on a single GPU with quantization and LoRA

By sirluk
19