nobodyPerfecZ committed on
Commit d912803 · 1 Parent(s): e6eba41

- Updated README.md

Files changed (1)
  1. README.md +7 -7
README.md CHANGED
@@ -5,17 +5,17 @@ tags:
 - image classification
 - recaptchav2
 datasets:
-- recaptchav2-dataset
+- recaptchav2-29k
 ---
 
 # Finetuned Vision Transformer
 
-This repository contains a Vision Transformer (ViT) model fine-tuned on the ReCAPTCHAv2 dataset.
+This repository contains a Vision Transformer (ViT) model fine-tuned on the ReCAPTCHAv2-29k dataset.
 The dataset comprises 29,568 labeled images spanning 5 classes, each resized to a resolution of 224×224 pixels.
 
 ## Model description
 
-This model builds on a pre-trained ViT backbone and is fine-tuned on the ReCAPTCHAv2 dataset.
+This model builds on a pre-trained ViT backbone and is fine-tuned on the ReCAPTCHAv2-29k dataset.
 It leverages the transformer-based architecture to capture global contextual information effectively, making it well-suited for tasks with diverse visual patterns like ReCAPTCHA classification.
 
 ## Intended uses & limitations
@@ -31,7 +31,7 @@ The model is particularly useful in academic and experimental contexts where und
 
 ## How to use
 
-Here is how to use this model to classify an image of the ReCAPTCHAv2 dataset into one of the 5 classes:
+Here is how to use this model to classify an image of the ReCAPTCHAv2-29k dataset into one of the 5 classes:
 
 ```python
 import requests
@@ -39,7 +39,7 @@ import torch
 from PIL import Image
 from transformers import ViTForImageClassification, ViTImageProcessor
 
-url = "https://raw.githubusercontent.com/nobodyPerfecZ/recaptchav2-dataset/refs/heads/master/data/bicycle/bicycle_0.png"
+url = "https://raw.githubusercontent.com/nobodyPerfecZ/recaptchav2-29k/refs/heads/master/data/bicycle/bicycle_0.png"
 image = Image.open(requests.get(url, stream=True).raw)
 processor = ViTImageProcessor.from_pretrained(
     "nobodyPerfecZ/vit-finetuned-patch16-224-recaptchav2-v1"
@@ -59,7 +59,7 @@ print(f"Predicted labels: {labels}")
 
 ## Training data
 
-The ViT model was fine-tuned on the [ReCAPTCHAv2 dataset](https://huggingface.co/datasets/nobodyPerfecZ/recaptchav2-dataset), a dataset consisting of 29,568 images and 5 classes.
+The ViT model was fine-tuned on the [ReCAPTCHAv2-29k dataset](https://huggingface.co/datasets/nobodyPerfecZ/recaptchav2-29k), a dataset consisting of 29,568 images and 5 classes.
 
 ## Training procedure
@@ -71,7 +71,7 @@ Images are resized/rescaled to the same resolution (224x224) and normalized acro
 
 ## Evaluation results
 
-The ViT model was evaluated on a held-out test set from the ReCAPTCHAv2 dataset.
+The ViT model was evaluated on a held-out test set from the ReCAPTCHAv2-29k dataset.
 Two key metrics were used to assess performance:
 
 | Metric | Score |
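
For reference, the diff above only shows the hunk context of the README's usage snippet, so its middle (model loading, preprocessing, and label lookup, old lines 46–58) never appears in the commit. The sketch below fills that gap under two assumptions: that the fine-tuned checkpoint is published under the same repo id as the processor, and that the elided lines follow the standard `transformers` image-classification pattern (argmax over the logits, then `config.id2label`). Treat it as illustrative, not as the committed code.

```python
import requests
import torch
from PIL import Image
from transformers import ViTForImageClassification, ViTImageProcessor

# Sample image from the ReCAPTCHAv2-29k dataset (URL taken from the diff above)
url = "https://raw.githubusercontent.com/nobodyPerfecZ/recaptchav2-29k/refs/heads/master/data/bicycle/bicycle_0.png"
image = Image.open(requests.get(url, stream=True).raw)

repo_id = "nobodyPerfecZ/vit-finetuned-patch16-224-recaptchav2-v1"
processor = ViTImageProcessor.from_pretrained(repo_id)
# Assumption: the fine-tuned checkpoint lives under the same repo id as the processor
model = ViTForImageClassification.from_pretrained(repo_id)

# Preprocess the image into pixel-value tensors and run a forward pass
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Map the highest-scoring logit per image to its class name via the model config
predictions = logits.argmax(dim=-1)
labels = [model.config.id2label[idx.item()] for idx in predictions]
print(f"Predicted labels: {labels}")
```

Note that `labels` comes out as a list with one entry per image in the batch, which matches the plural `print(f"Predicted labels: {labels}")` line that closes the snippet in the diff.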