TimmWrapper

Overview

TimmWrapper is a helper class that lets timm models be loaded and used with the transformers library and its Auto classes.

>>> import torch
>>> from PIL import Image
>>> from urllib.request import urlopen
>>> from transformers import AutoModelForImageClassification, AutoImageProcessor

>>> # Load image
>>> image = Image.open(urlopen(
...     'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
... ))

>>> # Load model and image processor
>>> checkpoint = "timm/resnet50.a1_in1k"
>>> image_processor = AutoImageProcessor.from_pretrained(checkpoint)
>>> model = AutoModelForImageClassification.from_pretrained(checkpoint).eval()

>>> # Preprocess image
>>> inputs = image_processor(image)

>>> # Forward pass
>>> with torch.no_grad():
...     logits = model(**inputs).logits

>>> # Get top 5 predictions
>>> top5_probabilities, top5_class_indices = torch.topk(logits.softmax(dim=1) * 100, k=5)
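To make the last step concrete, the sketch below reproduces what `logits.softmax(dim=1) * 100` followed by `torch.topk(..., k=5)` computes for a single image, using only the Python standard library. The logits and the `id2label` mapping are made-up stand-ins (a real model would supply `model.config.id2label`), so treat this as an illustration rather than the library's implementation.

```python
import math


def softmax_percent(logits):
    """Numerically stable softmax, scaled to percentages."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [100.0 * e / total for e in exps]


def top_k(values, k):
    """Return (values, indices) of the k largest entries, largest first."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)[:k]
    return [values[i] for i in order], order


# Hypothetical logits for one image over six classes (not real model output).
logits = [0.1, 2.3, -1.0, 4.2, 0.7, 1.5]
# Hypothetical stand-in for model.config.id2label.
id2label = {0: "bagel", 1: "pretzel", 2: "plate", 3: "beignet", 4: "cup", 5: "doughnut"}

probabilities = softmax_percent(logits)
top5_probabilities, top5_class_indices = top_k(probabilities, k=5)

# Print each predicted label with its probability in percent.
for prob, idx in zip(top5_probabilities, top5_class_indices):
    print(f"{id2label[idx]}: {prob:.1f}%")
```

With a real checkpoint, the same lookup would index `model.config.id2label` with the integer values taken from `top5_class_indices`.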

TimmWrapperConfig

class transformers.TimmWrapperConfig

( initializer_range: float = 0.02 do_pooling: bool = True **kwargs )

Parameters

  • initializer_range (float, optional, defaults to 0.02) — The standard deviation of the truncated_normal_initializer for initializing all weight matrices.
  • do_pooling (bool, optional, defaults to True) — Whether to do pooling for the last_hidden_state in TimmWrapperModel or not.

This is the configuration class that stores the configuration of a TimmWrapper model.

It is used to instantiate a timm model according to the specified arguments, defining the model.

Configuration objects inherit from PretrainedConfig and can be used to control the model outputs. Read the documentation from PretrainedConfig for more information.

The config loads the ImageNet label descriptions and stores them in the id2label attribute. For default ImageNet models, the label2id attribute is set to None due to occlusions in the label descriptions.

Example:

>>> from transformers import TimmWrapperModel

>>> # Initializing a timm model
>>> model = TimmWrapperModel.from_pretrained("timm/resnet18.a1_in1k")

>>> # Accessing the model configuration
>>> configuration = model.config
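Because `label2id` may be `None` for default ImageNet checkpoints, code that needs a reverse lookup has to build it from `id2label`. The sketch below shows one cautious way to do that, grouping ids under each description so that repeated descriptions are not silently overwritten. The `id2label` dictionary is a small hypothetical stand-in for `configuration.id2label`, and `build_label_index` is an illustrative helper, not part of transformers.

```python
from collections import defaultdict


def build_label_index(id2label):
    """Map each label description to the sorted list of class ids that use it."""
    index = defaultdict(list)
    for class_id, description in id2label.items():
        index[description].append(class_id)
    return {description: sorted(ids) for description, ids in index.items()}


# Hypothetical stand-in for configuration.id2label; ids and descriptions are illustrative only.
id2label = {0: "tench", 1: "goldfish", 2: "crane", 3: "crane"}

label_index = build_label_index(id2label)
print(label_index["goldfish"])  # [1]
print(label_index["crane"])     # [2, 3]
```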

TimmWrapperImageProcessor

class transformers.TimmWrapperImageProcessor

( pretrained_cfg: typing.Dict[str, typing.Any] architecture: typing.Optional[str] = None **kwargs )

Parameters

  • pretrained_cfg (Dict[str, Any]) — The configuration of the pretrained model used to resolve evaluation and training transforms.
  • architecture (Optional[str], optional) — Name of the architecture of the model.

Wrapper class for timm models to be used within transformers.

preprocess

( images: typing.Union[ForwardRef('PIL.Image.Image'), numpy.ndarray, ForwardRef('torch.Tensor'), typing.List[ForwardRef('PIL.Image.Image')], typing.List[numpy.ndarray], typing.List[ForwardRef('torch.Tensor')]] return_tensors: typing.Union[str, transformers.utils.generic.TensorType, NoneType] = 'pt' )

Parameters

  • images (ImageInput) — Image to preprocess. Expects a single image or a batch of images.
  • return_tensors (str or TensorType, optional) — The type of tensors to return.

Preprocess an image or batch of images.

TimmWrapperModel

class transformers.TimmWrapperModel

( config: TimmWrapperConfig )

Wrapper class for timm models to be used in transformers.

forward

( pixel_values: FloatTensor output_attentions: typing.Optional[bool] = None output_hidden_states: typing.Union[bool, typing.List[int], NoneType] = None return_dict: typing.Optional[bool] = None do_pooling: typing.Optional[bool] = None **kwargs ) → transformers.models.timm_wrapper.modeling_timm_wrapper.TimmWrapperModelOutput or tuple(torch.FloatTensor)

Parameters

  • pixel_values (torch.FloatTensor of shape (batch_size, num_channels, height, width)) — Pixel values. Pixel values can be obtained using AutoImageProcessor. See TimmWrapperImageProcessor.preprocess() for details.
  • output_attentions (bool, optional) — Whether or not to return the attentions tensors of all attention layers. Not compatible with timm wrapped models.
  • output_hidden_states (bool, optional) — Whether or not to return the hidden states of all layers. Not compatible with timm wrapped models.
  • return_dict (bool, optional) — Whether or not to return a ModelOutput instead of a plain tuple.
  • **kwargs — Additional keyword arguments passed along to the timm model forward.
  • do_pooling (bool, optional) — Whether to do pooling for the last_hidden_state in TimmWrapperModel or not. If None is passed, the do_pooling value from the config is used.

Returns

transformers.models.timm_wrapper.modeling_timm_wrapper.TimmWrapperModelOutput or tuple(torch.FloatTensor)

A transformers.models.timm_wrapper.modeling_timm_wrapper.TimmWrapperModelOutput or a tuple of torch.FloatTensor (if return_dict=False is passed or when config.return_dict=False) comprising various elements depending on the configuration (TimmWrapperConfig) and inputs.

  • last_hidden_state (torch.FloatTensor) — The last hidden state of the model, output before applying the classification head.
  • pooler_output (torch.FloatTensor, optional) — The pooled output derived from the last hidden state, if applicable.
  • hidden_states (tuple(torch.FloatTensor), optional) — A tuple containing the intermediate hidden states of the model at the output of each layer or specified layers. Returned if output_hidden_states=True is set or if config.output_hidden_states=True.
  • attentions (tuple(torch.FloatTensor), optional) — A tuple containing the intermediate attention weights of the model at the output of each layer. Returned if output_attentions=True is set or if config.output_attentions=True. Note: Currently, Timm models do not support attentions output.
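
The `do_pooling` fallback described in the parameters follows a common pattern: a per-call argument of `None` defers to the config. The sketch below illustrates that rule with toy stand-ins. `ToyConfig` and `resolve_do_pooling` are hypothetical names, not transformers APIs, and the actual resolution inside `TimmWrapperModel.forward` may differ in detail.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyConfig:
    """Hypothetical stand-in for TimmWrapperConfig; only do_pooling is modeled."""
    do_pooling: bool = True


def resolve_do_pooling(config: ToyConfig, do_pooling: Optional[bool] = None) -> bool:
    """Use the per-call value when given, otherwise fall back to the config."""
    return config.do_pooling if do_pooling is None else do_pooling


config = ToyConfig(do_pooling=True)
print(resolve_do_pooling(config))         # True: taken from the config
print(resolve_do_pooling(config, False))  # False: the call-time value wins
```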

The TimmWrapperModel forward method overrides the __call__ special method.

Although the forward pass is defined within this function, call the Module instance rather than forward() directly: calling the instance runs the pre- and post-processing steps, while calling forward() directly silently skips them.
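
The difference between calling the instance and calling `forward` directly can be illustrated with a toy class. This is a simplified stand-in, not `torch.nn.Module` itself: `ToyModule` and its `post_hooks` list are hypothetical and only mimic the idea that `__call__` wraps `forward` with extra steps.

```python
class ToyModule:
    """Toy stand-in for a module whose __call__ wraps forward with hooks."""

    def __init__(self):
        self.post_hooks = []  # callables applied to the output after forward

    def forward(self, x):
        # The "recipe" for the computation lives here.
        return x * 2

    def __call__(self, x):
        output = self.forward(x)
        for hook in self.post_hooks:
            output = hook(output)
        return output


module = ToyModule()
module.post_hooks.append(lambda out: out + 1)

print(module(3))          # 7: forward ran, then the hook
print(module.forward(3))  # 6: the hook was skipped
```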

Examples:

>>> import torch
>>> from PIL import Image
>>> from urllib.request import urlopen
>>> from transformers import AutoModel, AutoImageProcessor

>>> # Load image
>>> image = Image.open(urlopen(
...     'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
... ))

>>> # Load model and image processor
>>> checkpoint = "timm/resnet50.a1_in1k"
>>> image_processor = AutoImageProcessor.from_pretrained(checkpoint)
>>> model = AutoModel.from_pretrained(checkpoint).eval()

>>> # Preprocess image
>>> inputs = image_processor(image)

>>> # Forward pass
>>> with torch.no_grad():
...     outputs = model(**inputs)

>>> # Get pooled output
>>> pooled_output = outputs.pooler_output

>>> # Get last hidden state
>>> last_hidden_state = outputs.last_hidden_state

TimmWrapperForImageClassification

class transformers.TimmWrapperForImageClassification

( config: TimmWrapperConfig )

Wrapper class for timm models to be used in transformers for image classification.

forward

( pixel_values: FloatTensor labels: typing.Optional[torch.LongTensor] = None output_attentions: typing.Optional[bool] = None output_hidden_states: typing.Union[bool, typing.List[int], NoneType] = None return_dict: typing.Optional[bool] = None **kwargs ) → transformers.modeling_outputs.ImageClassifierOutput or tuple(torch.FloatTensor)

Parameters

  • pixel_values (torch.FloatTensor of shape (batch_size, num_channels, height, width)) — Pixel values. Pixel values can be obtained using AutoImageProcessor. See TimmWrapperImageProcessor.preprocess() for details.
  • output_attentions (bool, optional) — Whether or not to return the attentions tensors of all attention layers. Not compatible with timm wrapped models.
  • output_hidden_states (bool, optional) — Whether or not to return the hidden states of all layers. Not compatible with timm wrapped models.
  • return_dict (bool, optional) — Whether or not to return a ModelOutput instead of a plain tuple.
  • **kwargs — Additional keyword arguments passed along to the timm model forward.
  • labels (torch.LongTensor of shape (batch_size,), optional) — Labels for computing the image classification/regression loss. Indices should be in [0, ..., config.num_labels - 1]. If config.num_labels == 1, a regression loss is computed (Mean-Square loss). If config.num_labels > 1, a classification loss is computed (Cross-Entropy).

Returns

transformers.modeling_outputs.ImageClassifierOutput or tuple(torch.FloatTensor)

A transformers.modeling_outputs.ImageClassifierOutput or a tuple of torch.FloatTensor (if return_dict=False is passed or when config.return_dict=False) comprising various elements depending on the configuration (TimmWrapperConfig) and inputs.

  • loss (torch.FloatTensor of shape (1,), optional, returned when labels is provided) — Classification (or regression if config.num_labels==1) loss.

  • logits (torch.FloatTensor of shape (batch_size, config.num_labels)) — Classification (or regression if config.num_labels==1) scores (before SoftMax).

  • hidden_states (tuple(torch.FloatTensor), optional, returned when output_hidden_states=True is passed or when config.output_hidden_states=True) — Tuple of torch.FloatTensor (one for the output of the embeddings, if the model has an embedding layer, + one for the output of each stage) of shape (batch_size, sequence_length, hidden_size). Hidden-states (also called feature maps) of the model at the output of each stage.

  • attentions (tuple(torch.FloatTensor), optional, returned when output_attentions=True is passed or when config.output_attentions=True) — Tuple of torch.FloatTensor (one for each layer) of shape (batch_size, num_heads, patch_size, sequence_length).

    Attention weights after the attention softmax, used to compute the weighted average in the self-attention heads.
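
The loss rule described for `labels` can be sketched in plain Python. This illustrates the stated rule (mean-squared error when `num_labels == 1`, cross-entropy otherwise) and is not the library's code: `classification_or_regression_loss` is a hypothetical helper and the inputs are made up.

```python
import math


def classification_or_regression_loss(logits, labels, num_labels):
    """Mean loss over a batch; logits is a list of per-example score lists."""
    if num_labels == 1:
        # Regression: mean squared error between the single score and the target.
        return sum((row[0] - y) ** 2 for row, y in zip(logits, labels)) / len(labels)
    # Classification: cross-entropy, i.e. negative log-softmax at the true class.
    total = 0.0
    for row, y in zip(logits, labels):
        peak = max(row)
        log_norm = peak + math.log(sum(math.exp(v - peak) for v in row))
        total += log_norm - row[y]
    return total / len(labels)


# Two examples, four classes, uniform scores: the loss equals ln(4).
ce = classification_or_regression_loss([[0.0] * 4, [0.0] * 4], [1, 3], num_labels=4)
# Two examples, one output each: squared errors 0.25 and 1.0 average to 0.625.
mse = classification_or_regression_loss([[0.5], [2.0]], [1.0, 1.0], num_labels=1)
print(round(ce, 4), mse)
```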

The TimmWrapperForImageClassification forward method overrides the __call__ special method.

Although the forward pass is defined within this function, call the Module instance rather than forward() directly: calling the instance runs the pre- and post-processing steps, while calling forward() directly silently skips them.

Examples:

>>> import torch
>>> from PIL import Image
>>> from urllib.request import urlopen
>>> from transformers import AutoModelForImageClassification, AutoImageProcessor

>>> # Load image
>>> image = Image.open(urlopen(
...     'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
... ))

>>> # Load model and image processor
>>> checkpoint = "timm/resnet50.a1_in1k"
>>> image_processor = AutoImageProcessor.from_pretrained(checkpoint)
>>> model = AutoModelForImageClassification.from_pretrained(checkpoint).eval()

>>> # Preprocess image
>>> inputs = image_processor(image)

>>> # Forward pass
>>> with torch.no_grad():
...     logits = model(**inputs).logits

>>> # Get top 5 predictions
>>> top5_probabilities, top5_class_indices = torch.topk(logits.softmax(dim=1) * 100, k=5)