Paper: SmolVLM: Redefining small and efficient multimodal models (arXiv:2504.05299)
Getting a new sm_120 (Blackwell) card running: install the CUDA 12.8 toolkit and the open kernel driver:

```
sudo apt -y install cuda-toolkit-12-8 nvidia-open
```

Stable PyTorch builds then refuse the device:

```
NVIDIA Graphics Device with CUDA capability sm_120 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_50 sm_60 sm_70 sm_75 sm_80 sm_86 sm_90.
```

The fix is a nightly wheel built against CUDA 12.8:

```
pip install --pre torch --index-url https://download.pytorch.org/whl/nightly/cu128
```
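Once the nightly is installed, a quick sanity check from Python confirms the card is usable (a minimal sketch; device index 0 is assumed to be the new GPU):

```python
import torch

print(torch.__version__)                    # should report a dev/nightly build
print(torch.cuda.is_available())            # True once driver and wheel agree
print(torch.cuda.get_device_capability(0))  # expect (12, 0) for an sm_120 device

# Launch a tiny kernel to confirm compute works end to end.
x = torch.randn(8, 8, device="cuda")
print((x @ x.T).sum().item())
```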
SmolVLM 2 and SigLIP 2 support shipped in transformers in dedicated releases! Install from the v4.49.0-SmolVLM-2 and v4.49.0-SigLIP-2 tags.
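A minimal sketch of trying the SigLIP 2 release; the checkpoint name is an assumption for illustration, not taken from the release notes:

```python
# pip install git+https://github.com/huggingface/transformers.git@v4.49.0-SigLIP-2
from transformers import AutoModel, AutoProcessor

ckpt = "google/siglip2-base-patch16-224"  # assumed checkpoint name
model = AutoModel.from_pretrained(ckpt)
processor = AutoProcessor.from_pretrained(ckpt)
```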
A refreshed JupyterLab Docker image: it's set up for timm use, but will work great with transformers and other libs. Updated the base image, moved to Python 3.12, and added Pillow-SIMD for better CPU performance with image preprocessing, plus a number of other tweaks. From the Jupyter launcher you can open a terminal and set up a timm environment in moments with the setup_timm_dev or setup_timm_scripts helpers. Give it a try: timm/jupyterlab-timm
timm 1.0.13 and OpenCLIP 2.30.0 releases to start the year. Both modest but worthwhile updates. timm added a number of new model weights and now supports loading OpenCLIP weights for two CLIP models that were previously missed; the DFN L/14 is 🔥. Models with weight remapping from OpenCLIP got their own timm hub instances to allow use with the upcoming Transformers TimmWrapperModel.
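A minimal sketch of the two loading paths this enables; the DFN checkpoint tag below is an assumption for illustration, swap in whatever weights you want to try:

```python
import timm
from transformers import AutoModel

# Directly through timm (tag name assumed, not from the release notes):
tower = timm.create_model(
    "vit_large_patch14_clip_224.dfn2b",
    pretrained=True,
    num_classes=0,  # pooled features instead of classifier logits
)

# Or via the dedicated timm hub instance and the Transformers TimmWrapperModel:
wrapped = AutoModel.from_pretrained("timm/vit_large_patch14_clip_224.dfn2b")
```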
A big timm release, v1.0.12, with a focus on optimizers. The optimizer factory has been refactored, there's now a timm.optim.list_optimizers() and a new way to register optimizers and their attributes. As always, you can use a timm optimizer like a torch one, just replace torch.optim with timm.optim. New optimizers include:
- adafactorbv
- adopt / adoptw (decoupled decay)
- mars
- laprop
- cautious 'c' variants: cadamw, cnadamw, csgdw, clamb, crmsproptf
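A minimal sketch of the factory and the drop-in usage; the model and hyperparameters are placeholders:

```python
import timm
import timm.optim
import torch

# Enumerate everything registered, including the new arrivals above.
print(timm.optim.list_optimizers())

model = timm.create_model("resnet50", pretrained=False)

# Build by registered name through the refactored factory; "cadamw" is the
# cautious AdamW variant from the list above.
opt = timm.optim.create_optimizer_v2(model, opt="cadamw", lr=1e-3, weight_decay=0.05)

# The step loop is identical to torch.optim usage.
loss = model(torch.randn(2, 3, 224, 224)).sum()
loss.backward()
opt.step()
opt.zero_grad()
```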
A new dataset for use with timm, OpenCLIP, and hopefully more. timm scripts soon: timm/plant-pathology-2021
timm support for object detection, and eventually segmentation, is finally under development :O
Want to quickly test a timm model before committing to downloading or training with a large dataset? Try mini-imagenet: timm/mini-imagenet
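A hedged smoke-test sketch along those lines; the split name and "image" column are assumptions about the dataset layout:

```python
import timm
import torch
from datasets import load_dataset

ds = load_dataset("timm/mini-imagenet", split="train")  # split name assumed
model = timm.create_model("resnet18", pretrained=True).eval()

# Build the preprocessing pipeline the checkpoint was trained with.
cfg = timm.data.resolve_data_config({}, model=model)
transform = timm.data.create_transform(**cfg)

img = ds[0]["image"].convert("RGB")  # "image" column assumed
with torch.no_grad():
    logits = model(transform(img).unsqueeze(0))
print(logits.shape)  # e.g. torch.Size([1, 1000])
```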