Scaling Smart: Accelerating Large Language Model Pre-training with Small Model Initialization • Paper • 2409.12903 • Published Sep 19, 2024
CatLIP: CLIP-level Visual Recognition Accuracy with 2.7x Faster Pre-training on Web-scale Image-Text Data • Paper • 2404.15653 • Published Apr 24, 2024
ReLU Strikes Back: Exploiting Activation Sparsity in Large Language Models • Paper • 2310.04564 • Published Oct 6, 2023
CLIP meets Model Zoo Experts: Pseudo-Supervision for Visual Enhancement • Paper • 2310.14108 • Published Oct 21, 2023
Reinforce Data, Multiply Impact: Improved Model Accuracy and Robustness with Dataset Reinforcement • Paper • 2303.08983 • Published Mar 15, 2023
LLM in a flash: Efficient Large Language Model Inference with Limited Memory • Paper • 2312.11514 • Published Dec 12, 2023
Weight subcloning: direct initialization of transformers using larger pretrained ones • Paper • 2312.09299 • Published Dec 14, 2023
SAM-CLIP: Merging Vision Foundation Models towards Semantic and Spatial Understanding • Paper • 2310.15308 • Published Oct 23, 2023