---
library_name: transformers
tags:
- image-classification
- vision
- avatar
- katara
datasets:
- deepghs/nozomi_standalone_full
language:
- en
metrics:
- f1
base_model:
- facebook/dinov2-small
pipeline_tag: image-classification
---
# Model Card for Katara Detector
This model identifies whether an image contains Katara from Avatar: The Last Airbender (ATLA). It achieves 96.0% accuracy and a 96.1% F1 score on the validation set.
## Model Details
### Model Description
A binary image classifier that determines if Katara from the animated series "Avatar: The Last Airbender" is present in an image.
- **Developed by:** Your Name/Organization
- **Model type:** Image Classification
- **License:** MIT
- **Finetuned from model:** facebook/dinov2-small
## Uses
### Direct Use
This model can be used to:
- Identify Katara in screenshots or fan art
- Filter or categorize ATLA-related image collections
- Power fan applications that track character appearances
```python
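# Use a pipeline as a high-level helper
from PIL import Image
from transformers import pipeline

# Load the fine-tuned classifier from the Hugging Face Hub
pipe = pipeline("image-classification", model="lumenggan/katara-detector")
# Replace "yourimage.png" with the path to your own image
image = Image.open("yourimage.png")
# The pipeline returns a list of {"label": ..., "score": ...} dicts
predictions = pipe(image)
print(predictions)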
```
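To turn the pipeline output into a yes/no decision, something like the following sketch may help. It assumes the positive class is labeled `katara` and uses a 0.5 threshold; both are illustrative assumptions, so check the label names the model actually emits before relying on it.

```python
from typing import Dict, List, Union

Prediction = Dict[str, Union[str, float]]


def is_katara(
    predictions: List[Prediction],
    positive_label: str = "katara",  # assumed label name, not confirmed by this card
    threshold: float = 0.5,          # assumed decision threshold
) -> bool:
    """Return True if the positive label's score meets the threshold."""
    for pred in predictions:
        if pred["label"] == positive_label:
            return float(pred["score"]) >= threshold
    # The positive label was not among the returned predictions
    return False


# Hypothetical pipeline output, for illustration only
example = [
    {"label": "katara", "score": 0.93},
    {"label": "not_katara", "score": 0.07},
]
print(is_katara(example))
```

Raising the threshold trades recall for precision, which may suit filtering tasks where false positives are costly.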
### Out-of-Scope Use
This model should not be used for:
- Critical identification tasks
- Monitoring or surveillance purposes
- Making judgments about real people
## Training Details
### Training Data
The model was trained on a custom dataset of Katara and non-Katara images from Avatar: The Last Airbender. The dataset was split 80/20 into training and validation sets.
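An 80/20 split can be sketched as below. This is a minimal illustration, assuming a simple seeded random shuffle; the card does not say whether the actual split was random, stratified, or grouped by episode, and the file names are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def split_80_20(items: Sequence[str], seed: int = 0) -> Tuple[List[str], List[str]]:
    """Shuffle items reproducibly and split them 80/20 into train and validation."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 4 // 5  # integer arithmetic avoids float rounding
    return shuffled[:cut], shuffled[cut:]


# Hypothetical file names, for illustration only
paths = [f"img_{i:03d}.png" for i in range(100)]
train, val = split_80_20(paths)
print(len(train), len(val))  # 80 20
```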
### Training Procedure
The model was fine-tuned from DINOv2-small using the following techniques:
- Dropout regularization (rate=0.3)
- Weight decay (0.01-0.05)
- Cosine learning rate schedule with restarts
#### Training Hyperparameters
- **Learning rate:** 2e-5
- **Weight decay:** 0.01-0.05
- **Epochs:** 5-15
- **Batch size:** 16 (effective 32 with gradient accumulation)
- **Training regime:** fp16 mixed precision
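As a rough illustration of a cosine learning rate schedule with restarts, the sketch below uses the 2e-5 peak listed above. The cycle length, the minimum learning rate, and the absence of warmup are assumptions; the card does not specify them, and the actual training code may differ.

```python
import math


def cosine_with_restarts(
    step: int,
    cycle_steps: int,        # assumed cycle length; not stated in this card
    peak_lr: float = 2e-5,   # learning rate listed above
    min_lr: float = 0.0,     # assumed floor
) -> float:
    """Learning rate at `step`, decaying along a cosine and resetting every cycle."""
    position = (step % cycle_steps) / cycle_steps  # progress within the cycle, in [0, 1)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * position))


for step in (0, 50, 99, 100):
    print(step, cosine_with_restarts(step, cycle_steps=100))
```

With a per-step batch size of 16, an effective batch size of 32 corresponds to accumulating gradients over 2 steps before each optimizer update.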
## Evaluation
### Metrics
- **Accuracy:** 96.0%
- **F1 Score:** 96.1%
- **Precision:** 96.8%
- **Recall:** 95.5%
- **ROC AUC:** 99.4%
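The F1 score is the harmonic mean of precision and recall, so the figures above can be cross-checked with a short calculation:

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported above
f1 = f1_score(0.968, 0.955)
print(f"{f1:.3f}")  # 0.961
```

The result matches the reported 96.1% F1 score.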
## Technical Specifications
### Model Architecture
- Base model: facebook/dinov2-with-registers-small
- Custom classification head with dropout
- Input size: 224x224 RGB images
### Compute Infrastructure
- GPU: (e.g., NVIDIA T4, A100, etc.)
- Training time: Approximately 1-2 hours
## Model Card Contact
https://github.com/unLomTrois/