bnina-ayoub committed on
Commit b231bbf · verified · 1 Parent(s): 4ff63d9

Update README.md

Files changed (1):
  1. README.md +33 -8
README.md CHANGED
@@ -1,16 +1,22 @@
  ---
  library_name: transformers
- base_model: bninaos/fine-tuned-vit
  tags:
  - generated_from_trainer
  model-index:
  - name: finetuned-ViT-model
    results: []
  ---

- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->
-
  # finetuned-ViT-model

  This model is a fine-tuned version of [bninaos/fine-tuned-vit](https://huggingface.co/bninaos/fine-tuned-vit) on an unknown dataset.
@@ -19,18 +25,37 @@ It achieves the following results on the evaluation set:

  ## Model description

- More information needed

  ## Intended uses & limitations

- More information needed

  ## Training and evaluation data

- More information needed

  ## Training procedure

  ### Training hyperparameters

  The following hyperparameters were used during training:
@@ -52,4 +77,4 @@ The following hyperparameters were used during training:
  - Transformers 4.50.1
  - Pytorch 2.5.1+cu121
  - Datasets 3.4.1
- - Tokenizers 0.21.0
  ---
  library_name: transformers
+ base_model:
+ - facebook/detr-resnet-50
  tags:
  - generated_from_trainer
+ - industry
+ - construction
  model-index:
  - name: finetuned-ViT-model
    results: []
+ license: mit
+ datasets:
+ - hf-vision/hardhat
+ language:
+ - en
+ pipeline_tag: object-detection
  ---

  # finetuned-ViT-model

  This model is a fine-tuned version of [bninaos/fine-tuned-vit](https://huggingface.co/bninaos/fine-tuned-vit) on an unknown dataset.
 

  ## Model description

+ This model is a demonstration project for the Hugging Face certification assignment and was created for educational purposes.
+ It is a fine-tuned transformer-based object detection model, trained to detect hard hats, heads, and people in images. It uses the `facebook/detr-resnet-50-dc5` checkpoint as a base and is further trained on the `hf-vision/hardhat` dataset.
+
+ The model uses a transformer encoder-decoder on top of a convolutional backbone to predict bounding boxes and class labels for the objects of interest.
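Since the card describes box-plus-label predictions in the DETR style, one concrete detail is worth illustrating: DETR-family models emit boxes as normalized `(center_x, center_y, width, height)`, which downstream tools usually want as absolute corner coordinates. This is a generic sketch, not code from this repository, and the function name is made up for illustration:

```python
# Sketch of DETR-style box post-processing: predicted boxes are
# normalized (center_x, center_y, width, height); convert them to
# absolute (xmin, ymin, xmax, ymax) pixel coordinates.

def cxcywh_to_xyxy(box, img_w, img_h):
    """Convert one normalized center-format box to pixel corner format."""
    cx, cy, w, h = box
    return (
        (cx - w / 2) * img_w,
        (cy - h / 2) * img_h,
        (cx + w / 2) * img_w,
        (cy + h / 2) * img_h,
    )

# A normalized box covering the center half of a 640x480 image:
print(cxcywh_to_xyxy((0.5, 0.5, 0.5, 0.5), 640, 480))
# → (160.0, 120.0, 480.0, 360.0)
```

The `transformers` image processors handle this conversion internally; the sketch only shows the geometry.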

  ## Intended uses & limitations

+ - **Intended uses:** This model can be used to demonstrate object detection with a fine-tuned transformer model. It can potentially be used in safety applications to identify individuals wearing or not wearing hardhats on construction sites or in industrial environments.
+ - **Limitations:** The model was trained on a limited amount of data and may not generalize well to images with significantly different characteristics, viewpoints, or lighting conditions. It is not intended for production use without further evaluation and validation.

  ## Training and evaluation data

+ - **Dataset:** The model was trained on the `hf-vision/hardhat` dataset from Hugging Face Datasets, which contains images of construction sites and industrial settings annotated with hardhats, heads, and people.
+ - **Data splits:** The dataset is divided into "train" and "test" splits.
+ - **Data augmentation:** Data augmentation was applied during training using `albumentations` to improve model generalization, including random horizontal flips and random brightness/contrast adjustments.
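One subtlety of the horizontal-flip augmentation above is that bounding boxes must be mirrored together with the pixels. The card says `albumentations` handled this; the toy function below is a hypothetical stand-in that shows the geometry by hand:

```python
# Toy version of the horizontal-flip augmentation's effect on boxes:
# mirroring the image across its vertical axis means every box's x-range
# must be mirrored too (y-coordinates are unchanged).

def hflip_box_xyxy(box, img_w):
    """Mirror an (xmin, ymin, xmax, ymax) box in an image of width img_w."""
    xmin, ymin, xmax, ymax = box
    # The old right edge becomes the new left edge, and vice versa.
    return (img_w - xmax, ymin, img_w - xmin, ymax)

# A hard-hat box near the left edge of a 640-px-wide image:
print(hflip_box_xyxy((10, 50, 110, 150), 640))
# → (530, 50, 630, 150)
```

In `albumentations` this is what passing a `bbox_params` configuration to the transform pipeline takes care of automatically.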

  ## Training procedure

+ - **Base model:** The model was initialized from the `facebook/detr-resnet-50-dc5` checkpoint, a pre-trained DETR model with a ResNet-50 backbone.
+ - **Fine-tuning:** The model was fine-tuned using the Hugging Face `Trainer` with the following hyperparameters:
+   - Learning rate: 1e-6
+   - Weight decay: 1e-4
+   - Batch size: 1
+   - Epochs: 3
+   - Max steps: 500
+   - Optimizer: AdamW
+ - **Evaluation:** The model was evaluated on the test set using standard object detection metrics, including COCO metrics (Average Precision, Average Recall).
+ - **Hardware:** Training was performed on Google Colab using GPU acceleration.
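The fine-tuning setup above could be sketched with the Hugging Face `Trainer` roughly as follows. This is a hedged reconstruction from the listed hyperparameters, not the repository's actual training script; `model`, `train_ds`, and `collate_fn` are placeholders:

```python
from transformers import TrainingArguments, Trainer

# Hyperparameters taken from the list above; everything else is assumed.
args = TrainingArguments(
    output_dir="finetuned-ViT-model",
    learning_rate=1e-6,
    weight_decay=1e-4,
    per_device_train_batch_size=1,
    num_train_epochs=3,
    max_steps=500,                 # when set, max_steps overrides num_train_epochs
    remove_unused_columns=False,   # keep image/annotation columns for the collator
)

# AdamW is the Trainer's default optimizer, matching the card.
# trainer = Trainer(model=model, args=args, train_dataset=train_ds,
#                   data_collator=collate_fn)
# trainer.train()
```

With a batch size of 1, the 500-step cap means at most 500 images are seen per pass, which is consistent with the card's note that training was limited.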
  ### Training hyperparameters

  The following hyperparameters were used during training:

  […]

  - Transformers 4.50.1
  - Pytorch 2.5.1+cu121
  - Datasets 3.4.1
+ - Tokenizers 0.21.0