Thibaut committed on
Commit 3ed150d · verified · 1 Parent(s): 3e6ecca

End of training

README.md CHANGED
@@ -3,6 +3,8 @@ library_name: transformers
  license: other
  base_model: nvidia/segformer-b3-finetuned-cityscapes-1024-1024
  tags:
+ - image-segmentation
+ - vision
  - generated_from_trainer
  model-index:
  - name: route_background_semantic
@@ -14,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 
  # route_background_semantic
 
- This model is a fine-tuned version of [nvidia/segformer-b3-finetuned-cityscapes-1024-1024](https://huggingface.co/nvidia/segformer-b3-finetuned-cityscapes-1024-1024) on an unknown dataset.
+ This model is a fine-tuned version of [nvidia/segformer-b3-finetuned-cityscapes-1024-1024](https://huggingface.co/nvidia/segformer-b3-finetuned-cityscapes-1024-1024) on the Logiroad/route_background_semantic dataset.
  It achieves the following results on the evaluation set:
  - Loss: 0.2360
  - Mean Iou: 0.1916
all_results.json ADDED
@@ -0,0 +1,27 @@
+ {
+ "epoch": 4.120313143798929,
+ "eval_accuracy_Autre r\u00e9paration": 0.34369405810457515,
+ "eval_accuracy_D\u00e9coupe": 0.2864541960267422,
+ "eval_accuracy_Emergence": 0.5548598133737452,
+ "eval_accuracy_Gla\u00e7age ou Ressuage": 0.03860482159488221,
+ "eval_accuracy_Reflet m\u00e9t\u00e9o": 0.0,
+ "eval_accuracy_Unlabeled": NaN,
+ "eval_iou_Autre r\u00e9paration": 0.32304877421180617,
+ "eval_iou_D\u00e9coupe": 0.2515107459482324,
+ "eval_iou_Emergence": 0.5379450939388203,
+ "eval_iou_Gla\u00e7age ou Ressuage": 0.03692047935180606,
+ "eval_iou_Reflet m\u00e9t\u00e9o": 0.0,
+ "eval_iou_Unlabeled": 0.0,
+ "eval_loss": 0.23602528870105743,
+ "eval_mean_accuracy": 0.244722577819989,
+ "eval_mean_iou": 0.19157084890844414,
+ "eval_overall_accuracy": 0.29617685609695316,
+ "eval_runtime": 139.8938,
+ "eval_samples_per_second": 12.96,
+ "eval_steps_per_second": 3.245,
+ "total_flos": 8.912029734867567e+18,
+ "train_loss": 0.29374205589294433,
+ "train_runtime": 4666.1544,
+ "train_samples_per_second": 8.572,
+ "train_steps_per_second": 2.143
+ }
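The aggregate figures in all_results.json can be cross-checked against the per-class ones. The sketch below is only a sanity check under an assumption inferred from the numbers, not stated anywhere in this commit: that `eval_mean_iou` and `eval_mean_accuracy` are plain averages over classes with NaN entries skipped. The helper name `nan_mean` is hypothetical.

```python
import math

# Per-class values copied from all_results.json in this commit.
iou = {
    "Autre réparation": 0.32304877421180617,
    "Découpe": 0.2515107459482324,
    "Emergence": 0.5379450939388203,
    "Glaçage ou Ressuage": 0.03692047935180606,
    "Reflet météo": 0.0,
    "Unlabeled": 0.0,
}
accuracy = {
    "Autre réparation": 0.34369405810457515,
    "Découpe": 0.2864541960267422,
    "Emergence": 0.5548598133737452,
    "Glaçage ou Ressuage": 0.03860482159488221,
    "Reflet météo": 0.0,
    "Unlabeled": float("nan"),
}


def nan_mean(values):
    """Average the values, ignoring NaN entries (hypothetical helper)."""
    kept = [v for v in values if not math.isnan(v)]
    return sum(kept) / len(kept)


mean_iou = nan_mean(iou.values())            # averaged over all 6 classes
mean_accuracy = nan_mean(accuracy.values())  # averaged over 5 classes (NaN skipped)
print(f"mean IoU: {mean_iou:.4f}, mean accuracy: {mean_accuracy:.4f}")
```

Under that assumption both values agree with the logged `eval_mean_iou` and `eval_mean_accuracy`. Note that the `Unlabeled` class contributes an IoU of 0.0 to the average; over the five labeled classes alone the mean IoU would be roughly 0.23.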
eval_results.json ADDED
@@ -0,0 +1,22 @@
+ {
+ "epoch": 4.120313143798929,
+ "eval_accuracy_Autre r\u00e9paration": 0.34369405810457515,
+ "eval_accuracy_D\u00e9coupe": 0.2864541960267422,
+ "eval_accuracy_Emergence": 0.5548598133737452,
+ "eval_accuracy_Gla\u00e7age ou Ressuage": 0.03860482159488221,
+ "eval_accuracy_Reflet m\u00e9t\u00e9o": 0.0,
+ "eval_accuracy_Unlabeled": NaN,
+ "eval_iou_Autre r\u00e9paration": 0.32304877421180617,
+ "eval_iou_D\u00e9coupe": 0.2515107459482324,
+ "eval_iou_Emergence": 0.5379450939388203,
+ "eval_iou_Gla\u00e7age ou Ressuage": 0.03692047935180606,
+ "eval_iou_Reflet m\u00e9t\u00e9o": 0.0,
+ "eval_iou_Unlabeled": 0.0,
+ "eval_loss": 0.23602528870105743,
+ "eval_mean_accuracy": 0.244722577819989,
+ "eval_mean_iou": 0.19157084890844414,
+ "eval_overall_accuracy": 0.29617685609695316,
+ "eval_runtime": 139.8938,
+ "eval_samples_per_second": 12.96,
+ "eval_steps_per_second": 3.245
+ }
runs/Apr03_14-14-32_algo-1/events.out.tfevents.1743694546.algo-1.68.1 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:59595d0d3c23c844777eb70744722185bd0829a3684dd30cd84350fd4d945dd0
+ size 1288
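The three added lines are a Git LFS pointer rather than the TensorBoard event file itself: the repository stores only the spec version, a content hash, and the byte size. As a rough illustration, the pointer could be read like this; `parse_lfs_pointer` is a hypothetical helper written for this note, not part of any Git LFS tooling.

```python
def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each pointer line is a key, one space, then the value.
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


# Pointer content copied from the diff above.
pointer_text = """\
version https://git-lfs.github.com/spec/v1
oid sha256:59595d0d3c23c844777eb70744722185bd0829a3684dd30cd84350fd4d945dd0
size 1288
"""

pointer = parse_lfs_pointer(pointer_text)
hash_algo, _, digest = pointer["oid"].partition(":")
size_bytes = int(pointer["size"])
print(hash_algo, len(digest), size_bytes)
```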
train_results.json ADDED
@@ -0,0 +1,8 @@
+ {
+ "epoch": 4.120313143798929,
+ "total_flos": 8.912029734867567e+18,
+ "train_loss": 0.29374205589294433,
+ "train_runtime": 4666.1544,
+ "train_samples_per_second": 8.572,
+ "train_steps_per_second": 2.143
+ }
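The throughput and epoch figures in train_results.json are consistent with the step count and batch size recorded in trainer_state.json. The arithmetic below is a sketch under an assumption that this commit does not confirm: a single device and no gradient accumulation, so that each optimizer step consumes exactly `train_batch_size` samples.

```python
# Values copied from train_results.json and trainer_state.json in this commit.
global_step = 10_000
train_batch_size = 4       # "train_batch_size" in trainer_state.json
train_runtime = 4666.1544  # seconds
steps_per_epoch = 2427     # the step logged at "epoch": 1.0

# Assumption: one device, no gradient accumulation.
samples_seen = global_step * train_batch_size

samples_per_second = samples_seen / train_runtime
steps_per_second = global_step / train_runtime
epochs_completed = global_step / steps_per_epoch

print(round(samples_per_second, 3), round(steps_per_second, 3), epochs_completed)
```

Under that assumption the results round to the logged `train_samples_per_second` and `train_steps_per_second`, and the epoch count matches the logged `epoch`. It also suggests why training stopped a little past epoch 4 even though `num_train_epochs` is 5: the run hit `max_steps` (10000) first.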
trainer_state.json ADDED
@@ -0,0 +1,857 @@
1
+ {
2
+ "best_metric": null,
3
+ "best_model_checkpoint": null,
4
+ "epoch": 4.120313143798929,
5
+ "eval_steps": 500,
6
+ "global_step": 10000,
7
+ "is_hyper_param_search": false,
8
+ "is_local_process_zero": true,
9
+ "is_world_process_zero": true,
10
+ "log_history": [
11
+ {
12
+ "epoch": 0.04120313143798929,
13
+ "grad_norm": 2.6929988861083984,
14
+ "learning_rate": 5.9401e-05,
15
+ "loss": 1.2381,
16
+ "step": 100
17
+ },
18
+ {
19
+ "epoch": 0.08240626287597858,
20
+ "grad_norm": 0.9781515002250671,
21
+ "learning_rate": 5.8802000000000004e-05,
22
+ "loss": 0.5408,
23
+ "step": 200
24
+ },
25
+ {
26
+ "epoch": 0.12360939431396786,
27
+ "grad_norm": 1.4702354669570923,
28
+ "learning_rate": 5.8203e-05,
29
+ "loss": 0.3679,
30
+ "step": 300
31
+ },
32
+ {
33
+ "epoch": 0.16481252575195715,
34
+ "grad_norm": 0.458916038274765,
35
+ "learning_rate": 5.7604e-05,
36
+ "loss": 0.4243,
37
+ "step": 400
38
+ },
39
+ {
40
+ "epoch": 0.20601565718994644,
41
+ "grad_norm": 2.584094524383545,
42
+ "learning_rate": 5.7005e-05,
43
+ "loss": 0.341,
44
+ "step": 500
45
+ },
46
+ {
47
+ "epoch": 0.24721878862793573,
48
+ "grad_norm": 4.845738410949707,
49
+ "learning_rate": 5.6406e-05,
50
+ "loss": 0.3247,
51
+ "step": 600
52
+ },
53
+ {
54
+ "epoch": 0.288421920065925,
55
+ "grad_norm": 0.6360086798667908,
56
+ "learning_rate": 5.5806999999999996e-05,
57
+ "loss": 0.3736,
58
+ "step": 700
59
+ },
60
+ {
61
+ "epoch": 0.3296250515039143,
62
+ "grad_norm": 0.5916720628738403,
63
+ "learning_rate": 5.5208000000000004e-05,
64
+ "loss": 0.3775,
65
+ "step": 800
66
+ },
67
+ {
68
+ "epoch": 0.37082818294190356,
69
+ "grad_norm": 1.5302131175994873,
70
+ "learning_rate": 5.4609000000000005e-05,
71
+ "loss": 0.3549,
72
+ "step": 900
73
+ },
74
+ {
75
+ "epoch": 0.4120313143798929,
76
+ "grad_norm": 1.928609848022461,
77
+ "learning_rate": 5.401e-05,
78
+ "loss": 0.311,
79
+ "step": 1000
80
+ },
81
+ {
82
+ "epoch": 0.45323444581788214,
83
+ "grad_norm": 4.012447834014893,
84
+ "learning_rate": 5.3411e-05,
85
+ "loss": 0.3266,
86
+ "step": 1100
87
+ },
88
+ {
89
+ "epoch": 0.49443757725587145,
90
+ "grad_norm": 1.2283966541290283,
91
+ "learning_rate": 5.2812e-05,
92
+ "loss": 0.3581,
93
+ "step": 1200
94
+ },
95
+ {
96
+ "epoch": 0.5356407086938607,
97
+ "grad_norm": 2.1378538608551025,
98
+ "learning_rate": 5.2213e-05,
99
+ "loss": 0.3654,
100
+ "step": 1300
101
+ },
102
+ {
103
+ "epoch": 0.57684384013185,
104
+ "grad_norm": 1.014821171760559,
105
+ "learning_rate": 5.1614000000000004e-05,
106
+ "loss": 0.3019,
107
+ "step": 1400
108
+ },
109
+ {
110
+ "epoch": 0.6180469715698393,
111
+ "grad_norm": 0.6981366872787476,
112
+ "learning_rate": 5.1015e-05,
113
+ "loss": 0.297,
114
+ "step": 1500
115
+ },
116
+ {
117
+ "epoch": 0.6592501030078286,
118
+ "grad_norm": 0.8797981142997742,
119
+ "learning_rate": 5.0416e-05,
120
+ "loss": 0.3543,
121
+ "step": 1600
122
+ },
123
+ {
124
+ "epoch": 0.7004532344458179,
125
+ "grad_norm": 0.34348738193511963,
126
+ "learning_rate": 4.9817e-05,
127
+ "loss": 0.3102,
128
+ "step": 1700
129
+ },
130
+ {
131
+ "epoch": 0.7416563658838071,
132
+ "grad_norm": 0.9645235538482666,
133
+ "learning_rate": 4.9218e-05,
134
+ "loss": 0.2859,
135
+ "step": 1800
136
+ },
137
+ {
138
+ "epoch": 0.7828594973217965,
139
+ "grad_norm": 1.9135812520980835,
140
+ "learning_rate": 4.8619e-05,
141
+ "loss": 0.3493,
142
+ "step": 1900
143
+ },
144
+ {
145
+ "epoch": 0.8240626287597858,
146
+ "grad_norm": 1.7853527069091797,
147
+ "learning_rate": 4.8020000000000004e-05,
148
+ "loss": 0.3256,
149
+ "step": 2000
150
+ },
151
+ {
152
+ "epoch": 0.865265760197775,
153
+ "grad_norm": 1.7780035734176636,
154
+ "learning_rate": 4.7421000000000006e-05,
155
+ "loss": 0.2621,
156
+ "step": 2100
157
+ },
158
+ {
159
+ "epoch": 0.9064688916357643,
160
+ "grad_norm": 0.8148425221443176,
161
+ "learning_rate": 4.6822e-05,
162
+ "loss": 0.3273,
163
+ "step": 2200
164
+ },
165
+ {
166
+ "epoch": 0.9476720230737536,
167
+ "grad_norm": 2.2365009784698486,
168
+ "learning_rate": 4.6223e-05,
169
+ "loss": 0.2879,
170
+ "step": 2300
171
+ },
172
+ {
173
+ "epoch": 0.9888751545117429,
174
+ "grad_norm": 1.7118935585021973,
175
+ "learning_rate": 4.5624e-05,
176
+ "loss": 0.2715,
177
+ "step": 2400
178
+ },
179
+ {
180
+ "epoch": 1.0,
181
+ "eval_accuracy_Autre r\u00e9paration": 0.2533258414054248,
182
+ "eval_accuracy_D\u00e9coupe": 0.08133862998794246,
183
+ "eval_accuracy_Emergence": 0.0,
184
+ "eval_accuracy_Gla\u00e7age ou Ressuage": 0.0,
185
+ "eval_accuracy_Reflet m\u00e9t\u00e9o": 0.0,
186
+ "eval_accuracy_Unlabeled": NaN,
187
+ "eval_iou_Autre r\u00e9paration": 0.23621639872040598,
188
+ "eval_iou_D\u00e9coupe": 0.07661059644544693,
189
+ "eval_iou_Emergence": 0.0,
190
+ "eval_iou_Gla\u00e7age ou Ressuage": 0.0,
191
+ "eval_iou_Reflet m\u00e9t\u00e9o": 0.0,
192
+ "eval_iou_Unlabeled": 0.0,
193
+ "eval_loss": 0.2682347893714905,
194
+ "eval_mean_accuracy": 0.06693289427867345,
195
+ "eval_mean_iou": 0.05213783252764215,
196
+ "eval_overall_accuracy": 0.18279080675579093,
197
+ "eval_runtime": 150.5227,
198
+ "eval_samples_per_second": 12.045,
199
+ "eval_steps_per_second": 3.016,
200
+ "step": 2427
201
+ },
202
+ {
203
+ "epoch": 1.0300782859497322,
204
+ "grad_norm": 3.537090539932251,
205
+ "learning_rate": 4.5025000000000003e-05,
206
+ "loss": 0.2763,
207
+ "step": 2500
208
+ },
209
+ {
210
+ "epoch": 1.0712814173877214,
211
+ "grad_norm": 2.1730239391326904,
212
+ "learning_rate": 4.4426000000000005e-05,
213
+ "loss": 0.2981,
214
+ "step": 2600
215
+ },
216
+ {
217
+ "epoch": 1.1124845488257107,
218
+ "grad_norm": 1.0320223569869995,
219
+ "learning_rate": 4.3827e-05,
220
+ "loss": 0.3227,
221
+ "step": 2700
222
+ },
223
+ {
224
+ "epoch": 1.1536876802637002,
225
+ "grad_norm": 4.7768635749816895,
226
+ "learning_rate": 4.3228e-05,
227
+ "loss": 0.3398,
228
+ "step": 2800
229
+ },
230
+ {
231
+ "epoch": 1.1948908117016894,
232
+ "grad_norm": 1.5758723020553589,
233
+ "learning_rate": 4.2629e-05,
234
+ "loss": 0.334,
235
+ "step": 2900
236
+ },
237
+ {
238
+ "epoch": 1.2360939431396787,
239
+ "grad_norm": 4.915160655975342,
240
+ "learning_rate": 4.203e-05,
241
+ "loss": 0.2577,
242
+ "step": 3000
243
+ },
244
+ {
245
+ "epoch": 1.277297074577668,
246
+ "grad_norm": 0.7495476603507996,
247
+ "learning_rate": 4.1431e-05,
248
+ "loss": 0.2807,
249
+ "step": 3100
250
+ },
251
+ {
252
+ "epoch": 1.3185002060156572,
253
+ "grad_norm": 1.0287623405456543,
254
+ "learning_rate": 4.0832e-05,
255
+ "loss": 0.3277,
256
+ "step": 3200
257
+ },
258
+ {
259
+ "epoch": 1.3597033374536465,
260
+ "grad_norm": 3.6160237789154053,
261
+ "learning_rate": 4.0233e-05,
262
+ "loss": 0.3073,
263
+ "step": 3300
264
+ },
265
+ {
266
+ "epoch": 1.4009064688916357,
267
+ "grad_norm": 6.738962173461914,
268
+ "learning_rate": 3.9634e-05,
269
+ "loss": 0.2744,
270
+ "step": 3400
271
+ },
272
+ {
273
+ "epoch": 1.442109600329625,
274
+ "grad_norm": 0.7060651779174805,
275
+ "learning_rate": 3.9035e-05,
276
+ "loss": 0.2976,
277
+ "step": 3500
278
+ },
279
+ {
280
+ "epoch": 1.4833127317676142,
281
+ "grad_norm": 4.404435634613037,
282
+ "learning_rate": 3.8436e-05,
283
+ "loss": 0.2646,
284
+ "step": 3600
285
+ },
286
+ {
287
+ "epoch": 1.5245158632056035,
288
+ "grad_norm": 1.1246055364608765,
289
+ "learning_rate": 3.7837000000000004e-05,
290
+ "loss": 0.3497,
291
+ "step": 3700
292
+ },
293
+ {
294
+ "epoch": 1.5657189946435928,
295
+ "grad_norm": 3.132385015487671,
296
+ "learning_rate": 3.7238000000000005e-05,
297
+ "loss": 0.2437,
298
+ "step": 3800
299
+ },
300
+ {
301
+ "epoch": 1.6069221260815822,
302
+ "grad_norm": 0.3945494592189789,
303
+ "learning_rate": 3.6639e-05,
304
+ "loss": 0.2616,
305
+ "step": 3900
306
+ },
307
+ {
308
+ "epoch": 1.6481252575195715,
309
+ "grad_norm": 0.8652153015136719,
310
+ "learning_rate": 3.604e-05,
311
+ "loss": 0.2466,
312
+ "step": 4000
313
+ },
314
+ {
315
+ "epoch": 1.6893283889575608,
316
+ "grad_norm": 0.44899633526802063,
317
+ "learning_rate": 3.544100000000001e-05,
318
+ "loss": 0.2562,
319
+ "step": 4100
320
+ },
321
+ {
322
+ "epoch": 1.73053152039555,
323
+ "grad_norm": 3.39601993560791,
324
+ "learning_rate": 3.4842e-05,
325
+ "loss": 0.2795,
326
+ "step": 4200
327
+ },
328
+ {
329
+ "epoch": 1.7717346518335395,
330
+ "grad_norm": 2.5917625427246094,
331
+ "learning_rate": 3.4243000000000004e-05,
332
+ "loss": 0.2933,
333
+ "step": 4300
334
+ },
335
+ {
336
+ "epoch": 1.8129377832715288,
337
+ "grad_norm": 1.0517610311508179,
338
+ "learning_rate": 3.3644000000000005e-05,
339
+ "loss": 0.2632,
340
+ "step": 4400
341
+ },
342
+ {
343
+ "epoch": 1.854140914709518,
344
+ "grad_norm": 1.573089361190796,
345
+ "learning_rate": 3.3045000000000006e-05,
346
+ "loss": 0.2554,
347
+ "step": 4500
348
+ },
349
+ {
350
+ "epoch": 1.8953440461475073,
351
+ "grad_norm": 1.3932527303695679,
352
+ "learning_rate": 3.2446e-05,
353
+ "loss": 0.2676,
354
+ "step": 4600
355
+ },
356
+ {
357
+ "epoch": 1.9365471775854965,
358
+ "grad_norm": 7.98951530456543,
359
+ "learning_rate": 3.1847e-05,
360
+ "loss": 0.2906,
361
+ "step": 4700
362
+ },
363
+ {
364
+ "epoch": 1.9777503090234858,
365
+ "grad_norm": 0.578360378742218,
366
+ "learning_rate": 3.1248e-05,
367
+ "loss": 0.2815,
368
+ "step": 4800
369
+ },
370
+ {
371
+ "epoch": 2.0,
372
+ "eval_accuracy_Autre r\u00e9paration": 0.19815143518295517,
373
+ "eval_accuracy_D\u00e9coupe": 0.11079467411500263,
374
+ "eval_accuracy_Emergence": 0.4089615931721195,
375
+ "eval_accuracy_Gla\u00e7age ou Ressuage": 0.0,
376
+ "eval_accuracy_Reflet m\u00e9t\u00e9o": 0.0,
377
+ "eval_accuracy_Unlabeled": NaN,
378
+ "eval_iou_Autre r\u00e9paration": 0.19162433877536195,
379
+ "eval_iou_D\u00e9coupe": 0.10140688937641373,
380
+ "eval_iou_Emergence": 0.40571014840298464,
381
+ "eval_iou_Gla\u00e7age ou Ressuage": 0.0,
382
+ "eval_iou_Reflet m\u00e9t\u00e9o": 0.0,
383
+ "eval_iou_Unlabeled": 0.0,
384
+ "eval_loss": 0.26819199323654175,
385
+ "eval_mean_accuracy": 0.14358154049401545,
386
+ "eval_mean_iou": 0.11645689609246006,
387
+ "eval_overall_accuracy": 0.15928522569775833,
388
+ "eval_runtime": 140.1786,
389
+ "eval_samples_per_second": 12.934,
390
+ "eval_steps_per_second": 3.239,
391
+ "step": 4854
392
+ },
393
+ {
394
+ "epoch": 2.018953440461475,
395
+ "grad_norm": 0.5752081871032715,
396
+ "learning_rate": 3.0649000000000004e-05,
397
+ "loss": 0.2768,
398
+ "step": 4900
399
+ },
400
+ {
401
+ "epoch": 2.0601565718994643,
402
+ "grad_norm": 0.6111757755279541,
403
+ "learning_rate": 3.0050000000000002e-05,
404
+ "loss": 0.2226,
405
+ "step": 5000
406
+ },
407
+ {
408
+ "epoch": 2.1013597033374536,
409
+ "grad_norm": 0.48088550567626953,
410
+ "learning_rate": 2.9451e-05,
411
+ "loss": 0.334,
412
+ "step": 5100
413
+ },
414
+ {
415
+ "epoch": 2.142562834775443,
416
+ "grad_norm": 1.2190054655075073,
417
+ "learning_rate": 2.8851999999999998e-05,
418
+ "loss": 0.2868,
419
+ "step": 5200
420
+ },
421
+ {
422
+ "epoch": 2.183765966213432,
423
+ "grad_norm": 2.414565324783325,
424
+ "learning_rate": 2.8253e-05,
425
+ "loss": 0.3291,
426
+ "step": 5300
427
+ },
428
+ {
429
+ "epoch": 2.2249690976514214,
430
+ "grad_norm": 0.2674981653690338,
431
+ "learning_rate": 2.7653999999999996e-05,
432
+ "loss": 0.2687,
433
+ "step": 5400
434
+ },
435
+ {
436
+ "epoch": 2.2661722290894106,
437
+ "grad_norm": 2.053374767303467,
438
+ "learning_rate": 2.7054999999999998e-05,
439
+ "loss": 0.2559,
440
+ "step": 5500
441
+ },
442
+ {
443
+ "epoch": 2.3073753605274003,
444
+ "grad_norm": 3.9835445880889893,
445
+ "learning_rate": 2.6455999999999995e-05,
446
+ "loss": 0.282,
447
+ "step": 5600
448
+ },
449
+ {
450
+ "epoch": 2.348578491965389,
451
+ "grad_norm": 3.391972303390503,
452
+ "learning_rate": 2.5857e-05,
453
+ "loss": 0.3191,
454
+ "step": 5700
455
+ },
456
+ {
457
+ "epoch": 2.389781623403379,
458
+ "grad_norm": 0.4526354968547821,
459
+ "learning_rate": 2.5258e-05,
460
+ "loss": 0.2732,
461
+ "step": 5800
462
+ },
463
+ {
464
+ "epoch": 2.430984754841368,
465
+ "grad_norm": 1.3189719915390015,
466
+ "learning_rate": 2.4659e-05,
467
+ "loss": 0.242,
468
+ "step": 5900
469
+ },
470
+ {
471
+ "epoch": 2.4721878862793574,
472
+ "grad_norm": 1.6163711547851562,
473
+ "learning_rate": 2.406e-05,
474
+ "loss": 0.278,
475
+ "step": 6000
476
+ },
477
+ {
478
+ "epoch": 2.5133910177173466,
479
+ "grad_norm": 1.5330442190170288,
480
+ "learning_rate": 2.3460999999999998e-05,
481
+ "loss": 0.29,
482
+ "step": 6100
483
+ },
484
+ {
485
+ "epoch": 2.554594149155336,
486
+ "grad_norm": 4.686217784881592,
487
+ "learning_rate": 2.2862e-05,
488
+ "loss": 0.2586,
489
+ "step": 6200
490
+ },
491
+ {
492
+ "epoch": 2.595797280593325,
493
+ "grad_norm": 3.333735942840576,
494
+ "learning_rate": 2.2263e-05,
495
+ "loss": 0.2794,
496
+ "step": 6300
497
+ },
498
+ {
499
+ "epoch": 2.6370004120313144,
500
+ "grad_norm": 1.2093195915222168,
501
+ "learning_rate": 2.1663999999999998e-05,
502
+ "loss": 0.2466,
503
+ "step": 6400
504
+ },
505
+ {
506
+ "epoch": 2.6782035434693037,
507
+ "grad_norm": 1.6071631908416748,
508
+ "learning_rate": 2.1065e-05,
509
+ "loss": 0.21,
510
+ "step": 6500
511
+ },
512
+ {
513
+ "epoch": 2.719406674907293,
514
+ "grad_norm": 1.4164949655532837,
515
+ "learning_rate": 2.0465999999999997e-05,
516
+ "loss": 0.2822,
517
+ "step": 6600
518
+ },
519
+ {
520
+ "epoch": 2.760609806345282,
521
+ "grad_norm": 8.471506118774414,
522
+ "learning_rate": 1.9866999999999998e-05,
523
+ "loss": 0.2475,
524
+ "step": 6700
525
+ },
526
+ {
527
+ "epoch": 2.8018129377832715,
528
+ "grad_norm": 8.533307075500488,
529
+ "learning_rate": 1.9267999999999996e-05,
530
+ "loss": 0.2806,
531
+ "step": 6800
532
+ },
533
+ {
534
+ "epoch": 2.8430160692212607,
535
+ "grad_norm": 0.49498608708381653,
536
+ "learning_rate": 1.8669e-05,
537
+ "loss": 0.2682,
538
+ "step": 6900
539
+ },
540
+ {
541
+ "epoch": 2.88421920065925,
542
+ "grad_norm": 1.339969515800476,
543
+ "learning_rate": 1.807e-05,
544
+ "loss": 0.2435,
545
+ "step": 7000
546
+ },
547
+ {
548
+ "epoch": 2.9254223320972392,
549
+ "grad_norm": 1.8642264604568481,
550
+ "learning_rate": 1.7471e-05,
551
+ "loss": 0.2518,
552
+ "step": 7100
553
+ },
554
+ {
555
+ "epoch": 2.9666254635352285,
556
+ "grad_norm": 2.9471471309661865,
557
+ "learning_rate": 1.6872e-05,
558
+ "loss": 0.2638,
559
+ "step": 7200
560
+ },
561
+ {
562
+ "epoch": 3.0,
563
+ "eval_accuracy_Autre r\u00e9paration": 0.30393303904730357,
564
+ "eval_accuracy_D\u00e9coupe": 0.23455367948789083,
565
+ "eval_accuracy_Emergence": 0.5085131571199683,
566
+ "eval_accuracy_Gla\u00e7age ou Ressuage": 0.003045137463105984,
567
+ "eval_accuracy_Reflet m\u00e9t\u00e9o": 0.0,
568
+ "eval_accuracy_Unlabeled": NaN,
569
+ "eval_iou_Autre r\u00e9paration": 0.2853778307692313,
570
+ "eval_iou_D\u00e9coupe": 0.21276477560584842,
571
+ "eval_iou_Emergence": 0.49725063677040354,
572
+ "eval_iou_Gla\u00e7age ou Ressuage": 0.002998539305038369,
573
+ "eval_iou_Reflet m\u00e9t\u00e9o": 0.0,
574
+ "eval_iou_Unlabeled": 0.0,
575
+ "eval_loss": 0.2419871985912323,
576
+ "eval_mean_accuracy": 0.2100090026236537,
577
+ "eval_mean_iou": 0.16639863040842026,
578
+ "eval_overall_accuracy": 0.2563620151228916,
579
+ "eval_runtime": 137.8421,
580
+ "eval_samples_per_second": 13.153,
581
+ "eval_steps_per_second": 3.294,
582
+ "step": 7281
583
+ },
584
+ {
585
+ "epoch": 3.0078285949732178,
586
+ "grad_norm": 0.4593660533428192,
587
+ "learning_rate": 1.6272999999999998e-05,
588
+ "loss": 0.2486,
589
+ "step": 7300
590
+ },
591
+ {
592
+ "epoch": 3.0490317264112075,
593
+ "grad_norm": 0.8246074318885803,
594
+ "learning_rate": 1.5674e-05,
595
+ "loss": 0.2251,
596
+ "step": 7400
597
+ },
598
+ {
599
+ "epoch": 3.0902348578491967,
600
+ "grad_norm": 0.9824215769767761,
601
+ "learning_rate": 1.5075000000000002e-05,
602
+ "loss": 0.2386,
603
+ "step": 7500
604
+ },
605
+ {
606
+ "epoch": 3.131437989287186,
607
+ "grad_norm": 6.623724937438965,
608
+ "learning_rate": 1.4476e-05,
609
+ "loss": 0.2635,
610
+ "step": 7600
611
+ },
612
+ {
613
+ "epoch": 3.1726411207251752,
614
+ "grad_norm": 0.816888689994812,
615
+ "learning_rate": 1.3877e-05,
616
+ "loss": 0.2821,
617
+ "step": 7700
618
+ },
619
+ {
620
+ "epoch": 3.2138442521631645,
621
+ "grad_norm": 0.45224809646606445,
622
+ "learning_rate": 1.3277999999999999e-05,
623
+ "loss": 0.2238,
624
+ "step": 7800
625
+ },
626
+ {
627
+ "epoch": 3.2550473836011538,
628
+ "grad_norm": 0.9230859279632568,
629
+ "learning_rate": 1.2678999999999998e-05,
630
+ "loss": 0.2238,
631
+ "step": 7900
632
+ },
633
+ {
634
+ "epoch": 3.296250515039143,
635
+ "grad_norm": 2.5414812564849854,
636
+ "learning_rate": 1.2079999999999998e-05,
637
+ "loss": 0.2046,
638
+ "step": 8000
639
+ },
640
+ {
641
+ "epoch": 3.3374536464771323,
642
+ "grad_norm": 1.6467418670654297,
643
+ "learning_rate": 1.1480999999999997e-05,
644
+ "loss": 0.2343,
645
+ "step": 8100
646
+ },
647
+ {
648
+ "epoch": 3.3786567779151215,
649
+ "grad_norm": 0.6073494553565979,
650
+ "learning_rate": 1.0882000000000004e-05,
651
+ "loss": 0.2162,
652
+ "step": 8200
653
+ },
654
+ {
655
+ "epoch": 3.419859909353111,
656
+ "grad_norm": 2.7378017902374268,
657
+ "learning_rate": 1.0283000000000003e-05,
658
+ "loss": 0.2868,
659
+ "step": 8300
660
+ },
661
+ {
662
+ "epoch": 3.4610630407911,
663
+ "grad_norm": 1.4614454507827759,
664
+ "learning_rate": 9.684000000000002e-06,
665
+ "loss": 0.2145,
666
+ "step": 8400
667
+ },
668
+ {
669
+ "epoch": 3.5022661722290893,
670
+ "grad_norm": 2.336061954498291,
671
+ "learning_rate": 9.085000000000002e-06,
672
+ "loss": 0.2918,
673
+ "step": 8500
674
+ },
675
+ {
676
+ "epoch": 3.5434693036670786,
677
+ "grad_norm": 1.7232545614242554,
678
+ "learning_rate": 8.486000000000001e-06,
679
+ "loss": 0.2854,
680
+ "step": 8600
681
+ },
682
+ {
683
+ "epoch": 3.584672435105068,
684
+ "grad_norm": 0.514677882194519,
685
+ "learning_rate": 7.887000000000001e-06,
686
+ "loss": 0.2514,
687
+ "step": 8700
688
+ },
689
+ {
690
+ "epoch": 3.6258755665430575,
691
+ "grad_norm": 0.9662112593650818,
692
+ "learning_rate": 7.2879999999999995e-06,
693
+ "loss": 0.2714,
694
+ "step": 8800
695
+ },
696
+ {
697
+ "epoch": 3.6670786979810464,
698
+ "grad_norm": 10.60983657836914,
699
+ "learning_rate": 6.688999999999999e-06,
700
+ "loss": 0.2548,
701
+ "step": 8900
702
+ },
703
+ {
704
+ "epoch": 3.708281829419036,
705
+ "grad_norm": 2.669593572616577,
706
+ "learning_rate": 6.0899999999999984e-06,
707
+ "loss": 0.2591,
708
+ "step": 9000
709
+ },
710
+ {
711
+ "epoch": 3.749484960857025,
712
+ "grad_norm": 1.071542501449585,
713
+ "learning_rate": 5.490999999999998e-06,
714
+ "loss": 0.2763,
715
+ "step": 9100
716
+ },
717
+ {
718
+ "epoch": 3.7906880922950146,
719
+ "grad_norm": 2.664677381515503,
720
+ "learning_rate": 4.891999999999997e-06,
721
+ "loss": 0.2178,
722
+ "step": 9200
723
+ },
724
+ {
725
+ "epoch": 3.831891223733004,
726
+ "grad_norm": 9.70131778717041,
727
+ "learning_rate": 4.292999999999997e-06,
728
+ "loss": 0.2674,
729
+ "step": 9300
730
+ },
731
+ {
732
+ "epoch": 3.873094355170993,
733
+ "grad_norm": 4.843862056732178,
734
+ "learning_rate": 3.694000000000003e-06,
735
+ "loss": 0.2581,
736
+ "step": 9400
737
+ },
738
+ {
739
+ "epoch": 3.9142974866089824,
740
+ "grad_norm": 0.8629316091537476,
741
+ "learning_rate": 3.0950000000000026e-06,
742
+ "loss": 0.2642,
743
+ "step": 9500
744
+ },
745
+ {
746
+ "epoch": 3.9555006180469716,
747
+ "grad_norm": 4.216986179351807,
748
+ "learning_rate": 2.496000000000002e-06,
749
+ "loss": 0.1965,
750
+ "step": 9600
751
+ },
752
+ {
753
+ "epoch": 3.996703749484961,
754
+ "grad_norm": 2.241065502166748,
755
+ "learning_rate": 1.8970000000000013e-06,
756
+ "loss": 0.2703,
757
+ "step": 9700
758
+ },
759
+ {
760
+ "epoch": 4.0,
761
+ "eval_accuracy_Autre r\u00e9paration": 0.36122042935066995,
762
+ "eval_accuracy_D\u00e9coupe": 0.28433059478878114,
763
+ "eval_accuracy_Emergence": 0.5473337114203988,
764
+ "eval_accuracy_Gla\u00e7age ou Ressuage": 0.0446288018012878,
765
+ "eval_accuracy_Reflet m\u00e9t\u00e9o": 0.0,
766
+ "eval_accuracy_Unlabeled": NaN,
767
+ "eval_iou_Autre r\u00e9paration": 0.33834292206885924,
768
+ "eval_iou_D\u00e9coupe": 0.251159370886517,
769
+ "eval_iou_Emergence": 0.5319718670461905,
770
+ "eval_iou_Gla\u00e7age ou Ressuage": 0.042908421138837,
771
+ "eval_iou_Reflet m\u00e9t\u00e9o": 0.0,
772
+ "eval_iou_Unlabeled": 0.0,
773
+ "eval_loss": 0.2333020269870758,
774
+ "eval_mean_accuracy": 0.24750270747222752,
775
+ "eval_mean_iou": 0.1940637635234006,
776
+ "eval_overall_accuracy": 0.30742270034207847,
777
+ "eval_runtime": 138.4117,
778
+ "eval_samples_per_second": 13.099,
779
+ "eval_steps_per_second": 3.28,
780
+ "step": 9708
781
+ },
782
+ {
783
+ "epoch": 4.03790688092295,
784
+ "grad_norm": 1.052063226699829,
785
+ "learning_rate": 1.298000000000001e-06,
786
+ "loss": 0.242,
787
+ "step": 9800
788
+ },
789
+ {
790
+ "epoch": 4.07911001236094,
791
+ "grad_norm": 6.82352876663208,
792
+ "learning_rate": 6.990000000000005e-07,
793
+ "loss": 0.2482,
794
+ "step": 9900
795
+ },
796
+ {
797
+ "epoch": 4.120313143798929,
798
+ "grad_norm": 2.648499011993408,
799
+ "learning_rate": 1e-07,
800
+ "loss": 0.2197,
801
+ "step": 10000
802
+ },
803
+ {
804
+ "epoch": 4.120313143798929,
805
+ "eval_accuracy_Autre r\u00e9paration": 0.34369405810457515,
806
+ "eval_accuracy_D\u00e9coupe": 0.2864541960267422,
807
+ "eval_accuracy_Emergence": 0.5548598133737452,
808
+ "eval_accuracy_Gla\u00e7age ou Ressuage": 0.03860482159488221,
809
+ "eval_accuracy_Reflet m\u00e9t\u00e9o": 0.0,
810
+ "eval_accuracy_Unlabeled": NaN,
811
+ "eval_iou_Autre r\u00e9paration": 0.32304877421180617,
812
+ "eval_iou_D\u00e9coupe": 0.2515107459482324,
813
+ "eval_iou_Emergence": 0.5379450939388203,
814
+ "eval_iou_Gla\u00e7age ou Ressuage": 0.03692047935180606,
815
+ "eval_iou_Reflet m\u00e9t\u00e9o": 0.0,
816
+ "eval_iou_Unlabeled": 0.0,
817
+ "eval_loss": 0.23602528870105743,
818
+ "eval_mean_accuracy": 0.244722577819989,
819
+ "eval_mean_iou": 0.19157084890844414,
820
+ "eval_overall_accuracy": 0.29617685609695316,
821
+ "eval_runtime": 141.1695,
822
+ "eval_samples_per_second": 12.843,
823
+ "eval_steps_per_second": 3.216,
824
+ "step": 10000
825
+ },
826
+ {
827
+ "epoch": 4.120313143798929,
828
+ "step": 10000,
829
+ "total_flos": 8.912029734867567e+18,
830
+ "train_loss": 0.29374205589294433,
831
+ "train_runtime": 4666.1544,
832
+ "train_samples_per_second": 8.572,
833
+ "train_steps_per_second": 2.143
834
+ }
835
+ ],
836
+ "logging_steps": 100,
837
+ "max_steps": 10000,
838
+ "num_input_tokens_seen": 0,
839
+ "num_train_epochs": 5,
840
+ "save_steps": 500,
841
+ "stateful_callbacks": {
842
+ "TrainerControl": {
843
+ "args": {
844
+ "should_epoch_stop": false,
845
+ "should_evaluate": false,
846
+ "should_log": false,
847
+ "should_save": true,
848
+ "should_training_stop": true
849
+ },
850
+ "attributes": {}
851
+ }
852
+ },
853
+ "total_flos": 8.912029734867567e+18,
854
+ "train_batch_size": 4,
855
+ "trial_name": null,
856
+ "trial_params": null
857
+ }
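The `learning_rate` values in `log_history` fall on a straight line from about 6e-05 down to 1e-07 at step 10000. The sketch below reconstructs that line and checks it against a few logged points. The starting rate `lr_init=6e-5` and the absence of warmup are inferred from the log, not read from a training config (none is included in this commit), and `logged_learning_rate` is a hypothetical name.

```python
def logged_learning_rate(step, lr_init=6e-5, lr_end=1e-7, max_steps=10_000):
    """Linear decay from lr_init to lr_end over max_steps, with no warmup.

    lr_init and the no-warmup shape are inferred from log_history.
    """
    remaining = 1.0 - step / max_steps
    return (lr_init - lr_end) * remaining + lr_end


# (step, learning_rate) pairs copied from log_history in trainer_state.json.
logged = {
    100: 5.9401e-05,
    200: 5.8802000000000004e-05,
    5000: 3.0050000000000002e-05,
    9900: 6.990000000000005e-07,
    10000: 1e-07,
}

max_error = max(abs(logged_learning_rate(s) - lr) for s, lr in logged.items())
print(f"largest deviation from the logged values: {max_error:.2e}")
```

The shape matches what a power-1 polynomial decay with a small non-zero end value would produce, but whether the run actually used such a scheduler cannot be confirmed from the files in this diff.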