- Attention Is All You Need
  Paper • 1706.03762 • Published • 70
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
  Paper • 1810.04805 • Published • 19
- RoBERTa: A Robustly Optimized BERT Pretraining Approach
  Paper • 1907.11692 • Published • 8
- DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
  Paper • 1910.01108 • Published • 17
Taufiq Dwi Purnomo (taufiqdp)

AI & ML interests: SLM, VLM
Recent Activity
- liked a model 2 days ago: bosonai/higgs-audio-v2-generation-3B-base
- liked a model 5 days ago: Qwen/Qwen3-235B-A22B-Instruct-2507
- liked a model 5 days ago: Qwen/Qwen3-Coder-480B-A35B-Instruct