k4tel committed on
Commit bd10bd4 · verified · 1 Parent(s): 433aca4

Update README.md

  1. README.md +34 -37
README.md CHANGED
@@ -11,7 +11,8 @@ tags:
  **Scope:** Processing of images, training and evaluation of ViT model,
  input file/directory processing, class (category) results of top
- N predictions output, predictions summarizing into a tabular format

  ## Model description:
@@ -21,29 +22,29 @@ Training set of the model: **8950** images
  #### Categories:

- - **DRAW**: 1182 (11.89%) - drawings, maps, paintings

- - **DRAW_L**: 813 (8.17%) - drawings, maps, paintings inside tabular layout

- - **LINE_HW**: 596 (5.99%) - handwritten text lines inside tabular layout

- - **LINE_P**: 603 (6.06%) - printed text lines inside tabular layout

- - **LINE_T**: 1332 (13.39%) - typed text lines inside tabular layout

  - **PHOTO**: 1015 (10.21%) - photos with text

- - **PHOTO_L**: 782 (7.86%) - photos inside tabular layout

- - **TEXT**: 853 (8.58%) - mixed types, printed, and handwritten texts

- - **TEXT_HW**: 732 (7.36%) - handwritten text

- - **TEXT_P**: 691 (6.95%) - printed text

- - **TEXT_T**: 1346 (13.53%) - typed text

- Evaluation set (10% of the above stats): **995** images - percentage correct (Top-3): **99.6%**

  ### Result tables:
@@ -68,7 +69,7 @@ Page images classification in to 11 predefined categories.
  #### Preprocessing

- train_transforms = transforms.Compose([

  transforms.RandomApply([
@@ -86,37 +87,33 @@ train_transforms = transforms.Compose([
  ], p=0.5),

- ...

- ])

  eval_transforms - basic.

  #### Training Hyperparameters

- training_args = TrainingArguments(

- eval_strategy="epoch",
-
- save_strategy="epoch",
-
- learning_rate=5e-5,
-
- per_device_train_batch_size=8,
-
- per_device_eval_batch_size=8,
-
- num_train_epochs=3,
-
- warmup_ratio=0.1,
-
- logging_steps=10,
-
- load_best_model_at_end=True,
-
- metric_for_best_model="accuracy",
-
- )

  ## Evaluation
 
  **Scope:** Processing of images, training and evaluation of ViT model,
  input file/directory processing, class (category) results of top
+ N predictions output, predictions summarizing into a tabular format,
+ HF hub support for the model

  ## Model description:
 
  #### Categories:

+ - **DRAW**: 1182 (11.89%) - drawings, maps, paintings with text

+ - **DRAW_L**: 813 (8.17%) - drawings, maps, paintings with a table legend or inside tabular layout / forms

+ - **LINE_HW**: 596 (5.99%) - handwritten text lines inside tabular layout / forms

+ - **LINE_P**: 603 (6.06%) - printed text lines inside tabular layout / forms

+ - **LINE_T**: 1332 (13.39%) - machine typed text lines inside tabular layout / forms

  - **PHOTO**: 1015 (10.21%) - photos with text

+ - **PHOTO_L**: 782 (7.86%) - photos inside tabular layout / forms

+ - **TEXT**: 853 (8.58%) - mixed types, printed, and handwritten texts

+ - **TEXT_HW**: 732 (7.36%) - only handwritten text

+ - **TEXT_P**: 691 (6.95%) - only printed text

+ - **TEXT_T**: 1346 (13.53%) - only machine typed text

+ Evaluation set (10% of the above stats): **995** images
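
The category counts and percentages listed above can be cross-checked with a few lines of Python. This is a minimal sketch, not code from the repository: the `counts` dictionary only restates the figures in the list, and it assumes the percentages are taken over the full annotated set (the 8950 training images plus the 995 evaluation images).

```python
# Per-category image counts, copied from the category list above.
counts = {
    "DRAW": 1182, "DRAW_L": 813, "LINE_HW": 596, "LINE_P": 603,
    "LINE_T": 1332, "PHOTO": 1015, "PHOTO_L": 782, "TEXT": 853,
    "TEXT_HW": 732, "TEXT_P": 691, "TEXT_T": 1346,
}

total = sum(counts.values())

# Share of each category in percent, rounded to two decimals as in the list.
shares = {name: round(100 * n / total, 2) for name, n in counts.items()}

# Training and evaluation set sizes as stated in the README.
train_size, eval_size = 8950, 995
eval_fraction = eval_size / total

for name, share in shares.items():
    print(f"{name:8s} {counts[name]:5d} {share:6.2f}%")
print(f"total={total}, eval fraction={eval_fraction:.3f}")
```

Under that assumption the counts add up to 9945, which matches 8950 + 995, so the evaluation set is about 10% of all annotated pages.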

  ### Result tables:
 
  #### Preprocessing

+ train_transforms:

  transforms.RandomApply([
 
  ], p=0.5),



  eval_transforms - basic.

  #### Training Hyperparameters


+ eval_strategy="epoch",
+
+ save_strategy="epoch",
+
+ learning_rate=5e-5,
+
+ per_device_train_batch_size=8,
+
+ per_device_eval_batch_size=8,
+
+ num_train_epochs=3,
+
+ warmup_ratio=0.1,
+
+ logging_steps=10,
+
+ load_best_model_at_end=True,
+
+ metric_for_best_model="accuracy",
+
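
To make the schedule implied by these values concrete, the sketch below estimates the number of optimizer steps and warmup steps. It assumes a single device, no gradient accumulation, a final partial batch that is kept, warmup steps rounded up, and the 8950-image training set stated earlier in the README. None of these assumptions is confirmed by the README, and `estimate_schedule` is a hypothetical helper, not part of `transformers`.

```python
import math


def estimate_schedule(num_examples, batch_size, epochs, warmup_ratio):
    """Return (steps_per_epoch, total_steps, warmup_steps) for one device.

    Assumes no gradient accumulation and that the last, smaller batch
    of each epoch is still used (hence the ceiling division).
    """
    steps_per_epoch = math.ceil(num_examples / batch_size)
    total_steps = steps_per_epoch * epochs
    # Assumption: the warmup length is the ratio of total steps, rounded up.
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    return steps_per_epoch, total_steps, warmup_steps


# Values taken from the hyperparameter list above.
steps_per_epoch, total_steps, warmup_steps = estimate_schedule(
    num_examples=8950,  # training set size stated in the README
    batch_size=8,       # per_device_train_batch_size
    epochs=3,           # num_train_epochs
    warmup_ratio=0.1,   # warmup_ratio
)
print(steps_per_epoch, total_steps, warmup_steps)
```

With `logging_steps=10`, such a run would log a few hundred times in total; with `eval_strategy="epoch"` and `save_strategy="epoch"` it would evaluate and save a checkpoint three times, once per epoch.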
 

  ## Evaluation