feat: test upload - Trendyol DinoV2 Product Similarity and Retrieval Embedding Model
🧪 Test Upload Details:
- Personal account testing before company publication
- Architecture: DinoV2 ViT-B/14 + ArcFace loss
- Embedding dimension: 256
- Task: Product similarity and retrieval
📁 Repository Contents:
- Model weights in safetensors format
- Complete model card with usage examples
- Apache 2.0 license
- Demo notebook for inference
🔒 Security: Scanned and validated
📋 RFC Compliance: Ready for company publication
Test upload by: Personal Account
- LICENSE +189 -0
- README.md +130 -0
- __init__.py +23 -0
- __pycache__/modeling_trendyol_dinov2.cpython-312.pyc +0 -0
- config.json +54 -0
- image_processing_trendyol_dinov2.py +163 -0
- model.safetensors +3 -0
- modeling_trendyol_dinov2.py +142 -0
- preprocessor_config.json +43 -0
- pytorch_model.bin +3 -0
- requirements.txt +7 -0
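As a quick sanity check on the upload described above, the file list can be verified against the Hub once the commit lands. The sketch below is illustrative only and not part of the repository; it assumes the public repo id referenced in the README (`Trendyol/trendyol-dino-v2-ecommerce-256d`) and uses the standard `huggingface_hub` listing API.

```python
# Illustrative check of the uploaded repository contents via the Hub API.
# The repo id is taken from the README below; adjust it for the personal test account.
from huggingface_hub import list_repo_files

repo_id = "Trendyol/trendyol-dino-v2-ecommerce-256d"
files = set(list_repo_files(repo_id))

# The commit should at least ship the weights, configs, and the custom code modules.
expected = {
    "model.safetensors",
    "config.json",
    "preprocessor_config.json",
    "modeling_trendyol_dinov2.py",
    "image_processing_trendyol_dinov2.py",
    "README.md",
    "LICENSE",
}
print("missing:", (expected - files) or "none")
```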
LICENSE
ADDED
@@ -0,0 +1,189 @@
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/

TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION

1. Definitions.

"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.

"Licensor" shall mean the copyright owner or entity granting the License.

"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of the definition of
"control", an entity controls another entity when such entity:
(i) has the power, direct or indirect, to cause the direction or
management of such other entity, whether by contract or otherwise,
(ii) owns fifty percent (50%) or more of the outstanding shares, or
(iii) has beneficial ownership of such entity.

"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.

"Source" shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.

"Object" shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.

"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(which shall not include communication that is conspicuously
marked or otherwise designated in writing by the copyright owner
as "Not a Contribution").

"Derivative Works" shall mean any work, whether in Source or Object
form, that is based upon (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and derivative works thereof.

"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of the definition of "Contribution",
any such Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be deemed to have been made under the
terms and conditions of this License, without any additional terms or
conditions. Notwithstanding the above, nothing herein shall supersede or
modify the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.

2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to use, reproduce, modify, merge, publish,
distribute, sublicense, and/or sell copies of the Work, and to
permit persons to whom the Work is furnished to do so, subject to
the following conditions:

The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Work.

3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.

4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:

(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and

(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and

(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, trademark, patent,
and attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and

(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.

You may add Your own copyright notice to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.

5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.

6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.

7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.

8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.

9. Accepting Warranty or Support. You are not required to accept
warranty or support for the Work under this License. However, if You
choose to accept warranty or support, You may act only on Your own
behalf and on Your sole responsibility, not on behalf of any other
Contributor, and only if You agree to indemnify, defend, and hold each
Contributor harmless for any liability incurred by, or claims asserted
against, such Contributor by reason of your accepting any such warranty
or support.

END OF TERMS AND CONDITIONS

Copyright 2025 Trendyol

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
README.md
ADDED
@@ -0,0 +1,130 @@
# Trendyol DinoV2 Image Similarity Model

This repository contains a fine-tuned DinoV2 model for image similarity and retrieval tasks, specifically trained on e-commerce product images.

## Model Details

- **Model Type**: Image Similarity/Retrieval
- **Architecture**: DinoV2 ViT-B/14 with ArcFace loss
- **Embedding Dimension**: 256
- **Input Size**: 224x224
- **Framework**: PyTorch
- **Format**: SafeTensors

## Usage

### Quick Start

```python
import torch
from PIL import Image
from transformers import AutoModel, AutoImageProcessor

device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Load model and processor from Hugging Face Hub
model = AutoModel.from_pretrained("Trendyol/trendyol-dino-v2-ecommerce-256d", trust_remote_code=True)
processor = AutoImageProcessor.from_pretrained("Trendyol/trendyol-dino-v2-ecommerce-256d", trust_remote_code=True)
model = model.to(device).eval()

# Load and process an image
image = Image.open('your_image.jpg').convert('RGB')
inputs = processor(images=image, return_tensors="pt")

# Move inputs to the same device as the model
inputs = {k: v.to(device) for k, v in inputs.items()}

# Get embeddings
with torch.no_grad():
    outputs = model(**inputs)
    embeddings = outputs.last_hidden_state  # Shape: [1, 256]

print("Generated embedding dimension:", embeddings.shape[1])
```

### Preprocessing Pipeline

The model uses a specific preprocessing pipeline that is crucial for good performance:

1. **DownScale (Lanczos)**: Resize to a maximum dimension of 332px
2. **JPEG Compression**: Apply quality=75 compression
3. **Scale Image**: Scale to a maximum dimension of 332px
4. **Pad to Square**: Pad with color value 255
5. **Resize**: Resize to 224x224
6. **ToTensor**: Convert to a PyTorch tensor
7. **Normalize**: ImageNet normalization (mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])

### Using with AutoModel and AutoImageProcessor

```python
import torch
from PIL import Image
from transformers import AutoModel, AutoImageProcessor

# Load from Hugging Face Hub (trust_remote_code is required for the custom classes)
model = AutoModel.from_pretrained("Trendyol/trendyol-dino-v2-ecommerce-256d", trust_remote_code=True)
processor = AutoImageProcessor.from_pretrained("Trendyol/trendyol-dino-v2-ecommerce-256d", trust_remote_code=True)

# Full inference pipeline
image = Image.open('your_image.jpg').convert('RGB')
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)
    embeddings = outputs.last_hidden_state  # Shape: [1, 256]

print("Embedding shape:", embeddings.shape)
```

## Installation

Install the required dependencies:

```bash
pip install transformers torch torchvision safetensors pillow numpy opencv-python
```

## Model Architecture

The model consists of:
- **Backbone**: DinoV2 ViT-B/14 (frozen during training)
- **Projection Head**: Linear layer mapping to 256 dimensions
- **Normalization**: L2 normalization for similarity computation

## Training Details

- **Loss Function**: ArcFace loss for metric learning
- **Training Data**: E-commerce product images
- **Epochs**: 9
- **PyTorch Version**: 2.8.0

## Intended Use

This model is designed for:
- Product image similarity search
- Visual product recommendations
- Duplicate product detection
- Content-based image retrieval in e-commerce

## Limitations

- Optimized specifically for product/e-commerce images
- May not generalize well to other image domains
- Requires the specific preprocessing pipeline above for optimal performance
- Requires the transformers library for the custom image processor

## License

This model is released under the Apache 2.0 License. See the LICENSE file for details.

## Citation

```
@misc{trendyol-dinov2-ecommerce,
  title={Trendyol DinoV2 E-commerce Image Similarity Model},
  author={Trendyol Machine Learning Team},
  year={2025},
  url={https://huggingface.co/Trendyol/trendyol-dino-v2-ecommerce-256d}
}
```
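The intended-use list in the README (similarity search, duplicate detection, content-based retrieval) can be made concrete with a small gallery-search sketch. This is an illustrative example rather than part of the model card: the gallery folder and query path are placeholders, and the only property it relies on is that the embeddings come out L2-normalized, so cosine similarity reduces to a dot product.

```python
# Hedged sketch: nearest-neighbour product search over a small local gallery.
# "gallery/" and "query_product.jpg" are placeholders; the repo id follows the README.
from pathlib import Path

import torch
from PIL import Image
from transformers import AutoModel, AutoImageProcessor

repo_id = "Trendyol/trendyol-dino-v2-ecommerce-256d"
model = AutoModel.from_pretrained(repo_id, trust_remote_code=True).eval()
processor = AutoImageProcessor.from_pretrained(repo_id, trust_remote_code=True)

def embed(paths):
    images = [Image.open(p).convert("RGB") for p in paths]
    inputs = processor(images=images, return_tensors="pt")
    with torch.no_grad():
        return model(**inputs).last_hidden_state  # [N, 256], unit-normalized

gallery_paths = sorted(Path("gallery").glob("*.jpg"))  # placeholder image folder
gallery_embeddings = embed(gallery_paths)              # [N, 256]
query_embedding = embed(["query_product.jpg"])         # [1, 256]

# For L2-normalized vectors, cosine similarity is just a matrix product.
scores = query_embedding @ gallery_embeddings.T        # [1, N]
top = torch.topk(scores, k=min(5, len(gallery_paths)), dim=1)
for score, idx in zip(top.values[0], top.indices[0]):
    print(f"{gallery_paths[idx].name}: {score:.3f}")
```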
__init__.py
ADDED
@@ -0,0 +1,23 @@
```python
"""
Trendyol DinoV2 Image Similarity Model

This package contains a fine-tuned DinoV2 model for e-commerce image similarity.
Fully compatible with Hugging Face transformers.
"""

from .modeling_trendyol_dinov2 import TrendyolDinoV2Model, TrendyolDinoV2Config
from .image_processing_trendyol_dinov2 import TrendyolDinoV2ImageProcessor

# Register for AutoModel and AutoImageProcessor
from transformers import AutoConfig, AutoModel, AutoImageProcessor

AutoConfig.register("trendyol_dinov2", TrendyolDinoV2Config)
AutoModel.register(TrendyolDinoV2Config, TrendyolDinoV2Model)
AutoImageProcessor.register(TrendyolDinoV2Config, TrendyolDinoV2ImageProcessor)

__version__ = "1.0.0"
__all__ = [
    "TrendyolDinoV2Model",
    "TrendyolDinoV2Config",
    "TrendyolDinoV2ImageProcessor"
]
```
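The `register()` calls above are what make the custom classes visible to the transformers auto factories when this code is imported as a local package. A minimal, hedged illustration follows; the package name `trendyol_dinov2` is hypothetical, and on the Hub the same role is played by `auto_map` plus `trust_remote_code=True`.

```python
# Sketch assuming the repository has been cloned locally as an importable package
# named "trendyol_dinov2" (hypothetical name). Importing it runs the register() calls.
import trendyol_dinov2  # noqa: F401  side effect: registers config/model/processor
from transformers import AutoModel

from trendyol_dinov2 import TrendyolDinoV2Config

config = TrendyolDinoV2Config()        # defaults: dinov2_vitb14 backbone, 256-d embeddings
model = AutoModel.from_config(config)  # resolves to TrendyolDinoV2Model via the registration
print(type(model).__name__)            # "TrendyolDinoV2Model"
# Note: building the model downloads the DinoV2 backbone through torch.hub.
```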
__pycache__/modeling_trendyol_dinov2.cpython-312.pyc
ADDED
Binary file (7.02 kB)
config.json
ADDED
@@ -0,0 +1,54 @@
```json
{
  "model_type": "trendyol_dinov2",
  "architectures": [
    "TrendyolDinoV2Model"
  ],
  "auto_map": {
    "AutoConfig": "modeling_trendyol_dinov2.TrendyolDinoV2Config",
    "AutoModel": "modeling_trendyol_dinov2.TrendyolDinoV2Model",
    "AutoImageProcessor": "image_processing_trendyol_dinov2.TrendyolDinoV2ImageProcessor"
  },
  "backbone_name": "dinov2_vitb14",
  "embedding_dim": 256,
  "hidden_size": 256,
  "in_features": 768,
  "use_arcface_loss": true,
  "input_size": 224,
  "downscale_size": 332,
  "pad_color": 255,
  "jpeg_quality": 75,
  "normalization": {
    "mean": [0.485, 0.456, 0.406],
    "std": [0.229, 0.224, 0.225]
  },
  "preprocessing": {
    "input_size": 224,
    "downscale_size": 332,
    "pad_color": 255,
    "jpeg_quality": 75,
    "transforms": [
      "DownScaleLanczos",
      "JPEGCompression",
      "ScaleImage",
      "PadToSquare",
      "Resize",
      "ToTensor",
      "Normalize"
    ]
  },
  "task_type": "image-retrieval",
  "training_info": {
    "epoch": "9",
    "torch_version": "2.8.0"
  },
  "torch_dtype": "float32",
  "transformers_version": "4.20.0"
}
```
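For completeness, a small, non-authoritative check of how the `auto_map` above is consumed on the consumer side: loading the config from the Hub with `trust_remote_code=True` should surface the retrieval-specific fields defined in `config.json`.

```python
from transformers import AutoConfig

# trust_remote_code is needed because auto_map points at the custom config class.
config = AutoConfig.from_pretrained(
    "Trendyol/trendyol-dino-v2-ecommerce-256d",
    trust_remote_code=True,
)
print(config.model_type)      # "trendyol_dinov2"
print(config.backbone_name)   # "dinov2_vitb14"
print(config.embedding_dim)   # 256
print(config.input_size, config.downscale_size, config.jpeg_quality)  # 224 332 75
```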
image_processing_trendyol_dinov2.py
ADDED
@@ -0,0 +1,163 @@
```python
"""
Hugging Face compatible image processor for Trendyol DinoV2
"""
from transformers import ImageProcessingMixin, BatchFeature
from transformers.utils import TensorType
from PIL import Image
import torch
import numpy as np
import cv2
from torchvision import transforms
import torchvision.transforms.functional as TF
from io import BytesIO
from typing import Union, List, Optional


def downscale_image(image: Image.Image, max_dimension: int) -> Image.Image:
    """Downscale image while maintaining aspect ratio"""
    original_width, original_height = image.size

    if max(original_width, original_height) <= max_dimension:
        return image

    aspect_ratio = original_width / original_height

    if original_width > original_height:
        new_width = max_dimension
        new_height = int(max_dimension / aspect_ratio)
    else:
        new_height = max_dimension
        new_width = int(max_dimension * aspect_ratio)

    return image.resize((new_width, new_height), Image.LANCZOS)


class DownScaleLanczos:
    def __init__(self, target_size=384):
        self.target_size = target_size

    def __call__(self, img):
        return downscale_image(img, self.target_size)


class JPEGCompression:
    def __init__(self, quality=75):
        self.quality = quality

    def __call__(self, img):
        buffer = BytesIO()
        img.save(buffer, format='JPEG', quality=self.quality)
        buffer.seek(0)
        return Image.open(buffer)


class ScaleImage:
    def __init__(self, target_size):
        self.target_size = target_size

    def __call__(self, img):
        w, h = img.size
        max_size = max(h, w)
        scale = self.target_size / max_size
        new_size = int(w * scale), int(h * scale)
        return img.resize(new_size, Image.BILINEAR)


class PadToSquare:
    def __init__(self, color=255):
        self.color = color

    def __call__(self, img):
        if isinstance(img, np.ndarray):
            img = Image.fromarray(img)

        width, height = img.size
        if self.color != -1:
            padding = abs(width - height) // 2
            if width < height:
                return TF.pad(img, (padding, 0, padding + (height - width) % 2, 0), fill=self.color, padding_mode='constant')
            elif width > height:
                return TF.pad(img, (0, padding, 0, padding + (width - height) % 2), fill=self.color, padding_mode='constant')
        return img


class TrendyolDinoV2ImageProcessor(ImageProcessingMixin):
    """
    Hugging Face compatible image processor for TrendyolDinoV2 model.
    """

    model_input_names = ["pixel_values"]

    def __init__(
        self,
        input_size=224,
        downscale_size=332,
        pad_color=255,
        jpeg_quality=75,
        do_normalize=True,
        image_mean=(0.485, 0.456, 0.406),
        image_std=(0.229, 0.224, 0.225),
        **kwargs
    ):
        super().__init__(**kwargs)

        self.input_size = input_size
        self.downscale_size = downscale_size
        self.pad_color = pad_color
        self.jpeg_quality = jpeg_quality
        self.do_normalize = do_normalize
        self.image_mean = image_mean
        self.image_std = image_std

    def _get_preprocess_fn(self):
        """Create the preprocessing pipeline (not stored as attribute to avoid JSON serialization issues)"""
        return transforms.Compose([
            DownScaleLanczos(self.downscale_size),
            JPEGCompression(self.jpeg_quality),
            ScaleImage(self.downscale_size),
            PadToSquare(self.pad_color),
            transforms.Resize((self.input_size, self.input_size)),
            transforms.ToTensor(),
            transforms.Normalize(self.image_mean, self.image_std)
        ])

    def __call__(
        self,
        images: Union[Image.Image, np.ndarray, torch.Tensor, List[Image.Image], List[np.ndarray], List[torch.Tensor]],
        return_tensors: Optional[Union[str, TensorType]] = None,
        **kwargs
    ) -> BatchFeature:
        """
        Preprocess images for the model.
        """
        # Handle single image
        if not isinstance(images, list):
            images = [images]

        # Get preprocessing pipeline
        preprocess_fn = self._get_preprocess_fn()

        # Preprocess all images
        processed_images = []
        for image in images:
            if isinstance(image, str):
                image = Image.open(image).convert('RGB')
            elif isinstance(image, np.ndarray):
                image = Image.fromarray(image).convert('RGB')
            elif not isinstance(image, Image.Image):
                raise ValueError(f"Unsupported image type: {type(image)}")

            # Apply preprocessing
            processed_tensor = preprocess_fn(image)
            processed_images.append(processed_tensor)

        # Stack tensors
        pixel_values = torch.stack(processed_images)

        # Return BatchFeature
        data = {"pixel_values": pixel_values}
        return BatchFeature(data=data, tensor_type=return_tensors)


# Register for auto class
TrendyolDinoV2ImageProcessor.register_for_auto_class("AutoImageProcessor")
```
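A short, hedged smoke test for the processor defined above: regardless of the input resolution, a single RGB image should come out as a `pixel_values` tensor of shape `[1, 3, 224, 224]`. The random image below is a stand-in for a real product photo.

```python
# Sketch assuming the file above is saved locally as image_processing_trendyol_dinov2.py.
import numpy as np
from PIL import Image

from image_processing_trendyol_dinov2 import TrendyolDinoV2ImageProcessor

processor = TrendyolDinoV2ImageProcessor()  # defaults: 332px downscale, JPEG q=75, 224x224 output

# Placeholder input: random 500x400 RGB noise instead of a real product image.
fake_image = Image.fromarray(np.random.randint(0, 256, (400, 500, 3), dtype=np.uint8))

batch = processor(images=fake_image, return_tensors="pt")
print(batch["pixel_values"].shape)  # expected: torch.Size([1, 3, 224, 224])
```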
model.safetensors
ADDED
@@ -0,0 +1,3 @@
```
version https://git-lfs.github.com/spec/v1
oid sha256:cb41c67595af4eb4ce357fbf55c7fc238436f0b24cc2b53a46f35f3cca0e0424
size 547685752
```
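Since the entry above is only a Git LFS pointer (the actual weights are roughly 548 MB), a hedged way to verify the resolved file after `git lfs pull` or a Hub download is to open it with the safetensors API and list a few tensors; the exact parameter names depend on the checkpoint and are not asserted here.

```python
# Sketch: inspect the resolved model.safetensors without loading it fully into memory.
from safetensors import safe_open

with safe_open("model.safetensors", framework="pt") as f:
    keys = list(f.keys())
    print(f"{len(keys)} tensors stored")
    for name in keys[:5]:  # peek at a few parameter names and shapes
        print(name, tuple(f.get_slice(name).get_shape()))
```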
modeling_trendyol_dinov2.py
ADDED
@@ -0,0 +1,142 @@
```python
"""
Hugging Face compatible model implementation for Trendyol DinoV2
"""
import torch
import torch.nn as nn
from transformers import PreTrainedModel, PretrainedConfig
from transformers.modeling_outputs import BaseModelOutput
from typing import Optional, Tuple, Union
import torch.nn.functional as F


class TrendyolDinoV2Config(PretrainedConfig):
    """
    Configuration class for TrendyolDinoV2 model.
    """
    model_type = "trendyol_dinov2"

    def __init__(
        self,
        embedding_dim=256,
        input_size=224,
        hidden_size=256,
        backbone_name="dinov2_vitb14",
        in_features=768,
        downscale_size=332,
        pad_color=255,
        jpeg_quality=75,
        **kwargs
    ):
        super().__init__(**kwargs)
        self.embedding_dim = embedding_dim
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.backbone_name = backbone_name
        self.in_features = in_features
        self.downscale_size = downscale_size
        self.pad_color = pad_color
        self.jpeg_quality = jpeg_quality


class TYArcFaceDinoV2(nn.Module):
    """Core model architecture"""
    def __init__(self, config):
        super(TYArcFaceDinoV2, self).__init__()
        self.config = config

        # Load DinoV2 backbone
        try:
            self.backbone = torch.hub.load('facebookresearch/dinov2', config.backbone_name)
        except Exception as e:
            raise RuntimeError(f"Failed to load DinoV2 backbone: {e}")

        self.hidden_size = config.hidden_size
        self.in_features = config.in_features
        self.embedding_dim = config.embedding_dim

        self.bn1 = nn.BatchNorm2d(self.in_features)
        # Freeze backbone
        self.backbone.requires_grad_(False)

        # Projection layers
        self.fc11 = nn.Linear(self.in_features * self.hidden_size, self.embedding_dim)
        self.bn11 = nn.BatchNorm1d(self.embedding_dim)

    def forward(self, pixel_values):
        try:
            features = self.backbone.get_intermediate_layers(
                pixel_values, return_class_token=True, reshape=True
            )
            features = features[0][0]  # Get the features
            features = self.bn1(features)
            features = features.flatten(start_dim=1)
            features = self.fc11(features)
            features = self.bn11(features)
            features = F.normalize(features)
            return features
        except Exception as e:
            raise RuntimeError(f"Forward pass failed: {e}")


class TrendyolDinoV2Model(PreTrainedModel):
    """
    Hugging Face compatible wrapper for TrendyolDinoV2
    """
    config_class = TrendyolDinoV2Config
    base_model_prefix = "model"

    def __init__(self, config):
        super().__init__(config)
        self.model = TYArcFaceDinoV2(config)

        # Initialize weights
        self.init_weights()

    def _init_weights(self, module):
        """Initialize weights (required by PreTrainedModel)"""
        if isinstance(module, nn.Linear):
            module.weight.data.normal_(mean=0.0, std=0.02)
            if module.bias is not None:
                module.bias.data.zero_()
        elif isinstance(module, nn.BatchNorm1d) or isinstance(module, nn.BatchNorm2d):
            module.bias.data.zero_()
            module.weight.data.fill_(1.0)

    def init_weights(self):
        """Initialize all weights in the model"""
        self.apply(self._init_weights)

    def forward(
        self,
        pixel_values: Optional[torch.Tensor] = None,
        output_hidden_states: Optional[bool] = None,
        return_dict: Optional[bool] = None,
        **kwargs
    ):
        return_dict = return_dict if return_dict is not None else getattr(self.config, 'use_return_dict', True)

        if pixel_values is None:
            raise ValueError("pixel_values cannot be None")

        # Get embeddings from the model
        embeddings = self.model(pixel_values)

        if not return_dict:
            return (embeddings,)

        return BaseModelOutput(
            last_hidden_state=embeddings,
            hidden_states=None,
            attentions=None
        )

    def get_embeddings(self, pixel_values):
        """Convenience method to get embeddings directly"""
        with torch.no_grad():
            outputs = self.forward(pixel_values, return_dict=True)
            return outputs.last_hidden_state


# Register the configuration
TrendyolDinoV2Config.register_for_auto_class()
TrendyolDinoV2Model.register_for_auto_class("AutoModel")
```
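To make the wrapper's contract concrete, here is a hedged sketch that builds the model directly from its config and runs a dummy batch through `get_embeddings`. Note that constructing `TYArcFaceDinoV2` downloads the `dinov2_vitb14` backbone via `torch.hub` on first use, and the random input below is a placeholder.

```python
# Sketch assuming the file above is saved locally as modeling_trendyol_dinov2.py.
import torch

from modeling_trendyol_dinov2 import TrendyolDinoV2Config, TrendyolDinoV2Model

config = TrendyolDinoV2Config()             # embedding_dim=256, input_size=224, ...
model = TrendyolDinoV2Model(config).eval()  # eval(): the projection head uses BatchNorm

dummy = torch.randn(2, 3, 224, 224)         # placeholder batch of two 224x224 RGB images
embeddings = model.get_embeddings(dummy)    # no_grad forward: backbone -> projection -> L2 norm

print(embeddings.shape)         # torch.Size([2, 256])
print(embeddings.norm(dim=-1))  # ~1.0 per row because of F.normalize
```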
preprocessor_config.json
ADDED
@@ -0,0 +1,43 @@
```json
{
  "image_processor_type": "TrendyolDinoV2ImageProcessor",
  "processor_class": "TrendyolDinoV2ImageProcessor",
  "auto_map": {
    "AutoImageProcessor": "image_processing_trendyol_dinov2.TrendyolDinoV2ImageProcessor"
  },
  "input_size": 224,
  "downscale_size": 332,
  "pad_color": 255,
  "jpeg_quality": 75,
  "do_normalize": true,
  "image_mean": [0.485, 0.456, 0.406],
  "image_std": [0.229, 0.224, 0.225],
  "do_resize": true,
  "size": {
    "height": 224,
    "width": 224
  },
  "resample": 3,
  "do_center_crop": false,
  "crop_size": {
    "height": 224,
    "width": 224
  },
  "do_convert_rgb": true,
  "transforms": [
    "DownScaleLanczos",
    "JPEGCompression",
    "ScaleImage",
    "PadToSquare",
    "Resize",
    "ToTensor",
    "Normalize"
  ]
}
```
pytorch_model.bin
ADDED
@@ -0,0 +1,3 @@
```
version https://git-lfs.github.com/spec/v1
oid sha256:60a38364dc18e4dd31a5bda0e8c36223a9b3518112ceeee7650ef59fd072a6cd
size 547728271
```
requirements.txt
ADDED
@@ -0,0 +1,7 @@
```
torch>=1.9.0
torchvision>=0.10.0
safetensors>=0.3.0
Pillow>=8.0.0
numpy>=1.20.0
opencv-python>=4.5.0
transformers>=4.20.0
```