mervenoyan committed on
Commit 96b7b35 · 1 Parent(s): 8256d51

remove colab badges

ColPali_+_Qwen2_VL.ipynb CHANGED
@@ -8,9 +8,6 @@
8
  "source": [
9
  "# Multimodal RAG using ColPali (with Byaldi) and Qwen2-VL\n",
10
  "\n",
11
- "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/merveenoyan/smol-vision/blob/main/ColPali_%2B_Qwen2_VL.ipynb)\n",
12
- "\n",
13
- "\n",
14
  "[ColPali](https://huggingface.co/blog/manu/colpali) is a multimodal retriever that removes the need for hefty and brittle document processors. It natively handles images and processes and encodes image patches to be compatible with text, thus removing need to do OCR, or image captioning.\n",
15
  "\n",
16
  "![ColPali](https://cdn-uploads.huggingface.co/production/uploads/60f2e021adf471cbdf8bb660/La8vRJ_dtobqs6WQGKTzB.png)\n",
@@ -4462,4 +4459,4 @@
4462
  },
4463
  "nbformat": 4,
4464
  "nbformat_minor": 0
4465
- }
 
8
  "source": [
9
  "# Multimodal RAG using ColPali (with Byaldi) and Qwen2-VL\n",
10
  "\n",
 
 
 
11
  "[ColPali](https://huggingface.co/blog/manu/colpali) is a multimodal retriever that removes the need for hefty and brittle document processors. It natively handles images and processes and encodes image patches to be compatible with text, thus removing need to do OCR, or image captioning.\n",
12
  "\n",
13
  "![ColPali](https://cdn-uploads.huggingface.co/production/uploads/60f2e021adf471cbdf8bb660/La8vRJ_dtobqs6WQGKTzB.png)\n",
 
4459
  },
4460
  "nbformat": 4,
4461
  "nbformat_minor": 0
4462
+ }
Faster_Zero_shot_Object_Detection_with_Optimum.ipynb CHANGED
The diff for this file is too large to render. See raw diff
 
Faster_foundation_models_with_torch_compile.ipynb CHANGED
@@ -1,42 +1,28 @@
1
  {
2
- "nbformat": 4,
3
- "nbformat_minor": 0,
4
- "metadata": {
5
- "colab": {
6
- "provenance": [],
7
- "machine_shape": "hm",
8
- "gpuType": "L4"
9
- },
10
- "kernelspec": {
11
- "name": "python3",
12
- "display_name": "Python 3"
13
- },
14
- "language_info": {
15
- "name": "python"
16
- },
17
- "accelerator": "GPU"
18
- },
19
  "cells": [
20
  {
21
  "cell_type": "markdown",
22
- "source": [
23
- "# Faster Foundation Models with `torch.compile`"
24
- ],
25
  "metadata": {
26
  "id": "axYlcDTznci4"
27
- }
 
 
 
28
  },
29
  {
30
  "cell_type": "markdown",
31
- "source": [
32
- "## Introduction to `torch.compile()`"
33
- ],
34
  "metadata": {
35
  "id": "B-yw8KMWsjfY"
36
- }
 
 
 
37
  },
38
  {
39
  "cell_type": "markdown",
 
 
 
40
  "source": [
41
  "This guide aims to provide a benchmark on the inference speed-ups introduced with `torch.compile()` with no reduction in model performance for foundation models in 🤗 Transformers.\n",
42
  "\n",
@@ -47,71 +33,57 @@
47
  "- \"reduce-overhead\" reduces the overhead of python with CUDA graphs, useful for small batches, consumes a lot of memory. As of now only works for CUDA only graphs which do not mutate inputs.\n",
48
  "\n",
49
  "If you have a lot of memory to use, the best speed-up is through `reduce-overhead`. How much speed-up one can get depends on the model, so in this tutorial we will check the most used foundation models."
50
- ],
51
- "metadata": {
52
- "id": "AmmT4aDnqgOB"
53
- }
54
  },
55
  {
56
  "cell_type": "markdown",
 
 
 
57
  "source": [
58
  "## OWLv2\n",
59
  "\n",
60
  "OWLv2 is a zero-shot object detection model released by Google Brain. We will load base version."
61
- ],
62
- "metadata": {
63
- "id": "5sCfbPTn7wBE"
64
- }
65
  },
66
  {
67
  "cell_type": "markdown",
68
- "source": [
69
- "Let's load the model and processor for OWLv2."
70
- ],
71
  "metadata": {
72
  "id": "joeX3J315K0G"
73
- }
 
 
 
74
  },
75
  {
76
  "cell_type": "code",
 
 
 
 
 
77
  "source": [
78
  "from PIL import Image\n",
79
  "import requests\n",
80
  "\n",
81
  "url = 'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/bee.jpg'\n",
82
  "image = Image.open(requests.get(url, stream=True).raw)"
83
- ],
84
- "metadata": {
85
- "id": "Ztfcdqkul62z"
86
- },
87
- "execution_count": 1,
88
- "outputs": []
89
  },
90
  {
91
  "cell_type": "code",
92
- "source": [
93
- "from transformers import AutoProcessor, Owlv2ForObjectDetection\n",
94
- "import torch\n",
95
- "import numpy as np\n",
96
- "\n",
97
- "processor = AutoProcessor.from_pretrained(\"google/owlv2-base-patch16-ensemble\")\n",
98
- "model = Owlv2ForObjectDetection.from_pretrained(\"google/owlv2-base-patch16-ensemble\").to(\"cuda\")\n",
99
- "\n",
100
- "texts = [[\"a photo of a bee\", \"a photo of a bird\"]]\n",
101
- "inputs = processor(text=texts, images=image, return_tensors=\"pt\").to(\"cuda\")"
102
- ],
103
  "metadata": {
104
- "id": "84npPHCQpHZ6",
105
  "colab": {
106
  "base_uri": "https://localhost:8080/"
107
  },
 
108
  "outputId": "f30c41c7-b897-460d-d2a4-a1276bf2263e"
109
  },
110
- "execution_count": 2,
111
  "outputs": [
112
  {
113
- "output_type": "stream",
114
  "name": "stderr",
 
115
  "text": [
116
  "/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_token.py:89: UserWarning: \n",
117
  "The secret `HF_TOKEN` does not exist in your Colab secrets.\n",
@@ -121,96 +93,83 @@
121
  " warnings.warn(\n"
122
  ]
123
  }
 
 
 
 
 
 
 
 
 
 
 
124
  ]
125
  },
126
  {
127
  "cell_type": "markdown",
128
- "source": [
129
- "We can now get to benchmarking. We will benchmark the model itself and the compiled model."
130
- ],
131
  "metadata": {
132
  "id": "3AedkjLu5PRo"
133
- }
 
 
 
134
  },
135
  {
136
  "cell_type": "code",
137
- "source": [
138
- "starter, ender = torch.cuda.Event(enable_timing=True), torch.cuda.Event(enable_timing=True)\n",
139
- "repetitions = 30\n",
140
- "timings=np.zeros((repetitions,1))\n",
141
- "\n",
142
- "for _ in range(10):\n",
143
- " _ = model(**inputs)\n",
144
- "\n",
145
- "with torch.no_grad():\n",
146
- " for rep in range(repetitions):\n",
147
- " torch.cuda.synchronize()\n",
148
- " starter.record()\n",
149
- " output = model(**inputs)\n",
150
- " ender.record()\n",
151
- " torch.cuda.synchronize()\n",
152
- " curr_time = starter.elapsed_time(ender)\n",
153
- " timings[rep] = curr_time\n",
154
- "\n",
155
- "mean_syn = np.sum(timings) / repetitions\n",
156
- "print(mean_syn)\n"
157
- ],
158
  "metadata": {
159
- "id": "RQQSEgkQtXEV",
160
  "colab": {
161
  "base_uri": "https://localhost:8080/"
162
  },
 
163
  "outputId": "8003590b-c4bc-4b3d-9b1b-dade853b8dd8"
164
  },
165
- "execution_count": 3,
166
  "outputs": [
167
  {
168
- "output_type": "stream",
169
  "name": "stdout",
 
170
  "text": [
171
  "255.7331792195638\n"
172
  ]
173
  }
174
- ]
175
- },
176
- {
177
- "cell_type": "code",
178
  "source": [
179
  "starter, ender = torch.cuda.Event(enable_timing=True), torch.cuda.Event(enable_timing=True)\n",
 
180
  "timings=np.zeros((repetitions,1))\n",
181
  "\n",
182
- "compiled_model = torch.compile(model, mode=\"reduce-overhead\").to(\"cuda\")\n",
183
- "\n",
184
- "for _ in range(30):\n",
185
- " with torch.no_grad():\n",
186
- " _ = compiled_model(**inputs)\n",
187
- "\n",
188
  "\n",
189
  "with torch.no_grad():\n",
190
  " for rep in range(repetitions):\n",
191
  " torch.cuda.synchronize()\n",
192
  " starter.record()\n",
193
- " output = compiled_model(**inputs)\n",
194
  " ender.record()\n",
195
  " torch.cuda.synchronize()\n",
196
  " curr_time = starter.elapsed_time(ender)\n",
197
  " timings[rep] = curr_time\n",
198
  "\n",
199
  "mean_syn = np.sum(timings) / repetitions\n",
200
- "print(mean_syn)"
201
- ],
 
 
 
 
202
  "metadata": {
203
- "id": "bEZiNgaupOx6",
204
  "colab": {
205
  "base_uri": "https://localhost:8080/"
206
  },
 
207
  "outputId": "e5d47875-1e40-4997-e533-94bf0ff34d14"
208
  },
209
- "execution_count": 4,
210
  "outputs": [
211
  {
212
- "output_type": "stream",
213
  "name": "stderr",
 
214
  "text": [
215
  "/usr/lib/python3.10/multiprocessing/popen_fork.py:66: RuntimeWarning: os.fork() was called. os.fork() is incompatible with multithreaded code, and JAX is multithreaded, so this will likely lead to a deadlock.\n",
216
  " self.pid = os.fork()\n",
@@ -225,38 +184,79 @@
225
  ]
226
  },
227
  {
228
- "output_type": "stream",
229
  "name": "stdout",
 
230
  "text": [
231
  "154.6884775797526\n"
232
  ]
233
  }
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
234
  ]
235
  },
236
  {
237
  "cell_type": "markdown",
238
- "source": [
239
- "We got nearly 40 percent speed-up! You can also increase the batch size and see how much further speed-up you can get."
240
- ],
241
  "metadata": {
242
  "id": "d_0d7DwN6gBt"
243
- }
 
 
 
244
  },
245
  {
246
  "cell_type": "code",
 
 
 
 
 
247
  "source": [
248
  "texts = [[\"a photo of a bee\", \"a photo of a bird\"] for _ in range(8)]\n",
249
  "images = [image for _ in range(8)]\n",
250
  "inputs = processor(text=texts, images=image, return_tensors=\"pt\").to(\"cuda\")"
251
- ],
252
- "metadata": {
253
- "id": "exKoOptB61UL"
254
- },
255
- "execution_count": 11,
256
- "outputs": []
257
  },
258
  {
259
  "cell_type": "code",
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
260
  "source": [
261
  "starter, ender = torch.cuda.Event(enable_timing=True), torch.cuda.Event(enable_timing=True)\n",
262
  "repetitions = 30\n",
@@ -277,27 +277,27 @@
277
  "\n",
278
  "mean_syn = np.sum(timings) / repetitions\n",
279
  "print(mean_syn)"
280
- ],
 
 
 
 
281
  "metadata": {
282
  "colab": {
283
  "base_uri": "https://localhost:8080/"
284
  },
285
- "id": "EFj9Pgra7Km8",
286
- "outputId": "5fefb8c0-9e86-478c-e9e2-0dbc0fa8a37b"
287
  },
288
- "execution_count": 12,
289
  "outputs": [
290
  {
291
- "output_type": "stream",
292
  "name": "stdout",
 
293
  "text": [
294
- "269.3023401896159\n"
295
  ]
296
  }
297
- ]
298
- },
299
- {
300
- "cell_type": "code",
301
  "source": [
302
  "starter, ender = torch.cuda.Event(enable_timing=True), torch.cuda.Event(enable_timing=True)\n",
303
  "timings=np.zeros((repetitions,1))\n",
@@ -321,24 +321,24 @@
321
  "\n",
322
  "mean_syn = np.sum(timings) / repetitions\n",
323
  "print(mean_syn)"
324
- ],
325
- "metadata": {
326
- "colab": {
327
- "base_uri": "https://localhost:8080/"
328
- },
329
- "id": "OuQZmgTK7UCo",
330
- "outputId": "7184eb1d-b545-4bb6-b544-3effd5c2545a"
331
- },
332
- "execution_count": 13,
333
- "outputs": [
334
- {
335
- "output_type": "stream",
336
- "name": "stdout",
337
- "text": [
338
- "159.77137603759766\n"
339
- ]
340
- }
341
  ]
342
  }
343
- ]
344
- }
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
  {
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
2
  "cells": [
3
  {
4
  "cell_type": "markdown",
 
 
 
5
  "metadata": {
6
  "id": "axYlcDTznci4"
7
+ },
8
+ "source": [
9
+ "# Faster Foundation Models with `torch.compile`"
10
+ ]
11
  },
12
  {
13
  "cell_type": "markdown",
 
 
 
14
  "metadata": {
15
  "id": "B-yw8KMWsjfY"
16
+ },
17
+ "source": [
18
+ "## Introduction to `torch.compile()`"
19
+ ]
20
  },
21
  {
22
  "cell_type": "markdown",
23
+ "metadata": {
24
+ "id": "AmmT4aDnqgOB"
25
+ },
26
  "source": [
27
  "This guide aims to provide a benchmark on the inference speed-ups introduced with `torch.compile()` with no reduction in model performance for foundation models in 🤗 Transformers.\n",
28
  "\n",
 
33
  "- \"reduce-overhead\" reduces the overhead of python with CUDA graphs, useful for small batches, consumes a lot of memory. As of now only works for CUDA only graphs which do not mutate inputs.\n",
34
  "\n",
35
  "If you have a lot of memory to use, the best speed-up is through `reduce-overhead`. How much speed-up one can get depends on the model, so in this tutorial we will check the most used foundation models."
36
+ ]
 
 
 
37
  },
38
  {
39
  "cell_type": "markdown",
40
+ "metadata": {
41
+ "id": "5sCfbPTn7wBE"
42
+ },
43
  "source": [
44
  "## OWLv2\n",
45
  "\n",
46
  "OWLv2 is a zero-shot object detection model released by Google Brain. We will load base version."
47
+ ]
 
 
 
48
  },
49
  {
50
  "cell_type": "markdown",
 
 
 
51
  "metadata": {
52
  "id": "joeX3J315K0G"
53
+ },
54
+ "source": [
55
+ "Let's load the model and processor for OWLv2."
56
+ ]
57
  },
58
  {
59
  "cell_type": "code",
60
+ "execution_count": 1,
61
+ "metadata": {
62
+ "id": "Ztfcdqkul62z"
63
+ },
64
+ "outputs": [],
65
  "source": [
66
  "from PIL import Image\n",
67
  "import requests\n",
68
  "\n",
69
  "url = 'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/bee.jpg'\n",
70
  "image = Image.open(requests.get(url, stream=True).raw)"
71
+ ]
 
 
 
 
 
72
  },
73
  {
74
  "cell_type": "code",
75
+ "execution_count": 2,
 
 
 
 
 
 
 
 
 
 
76
  "metadata": {
 
77
  "colab": {
78
  "base_uri": "https://localhost:8080/"
79
  },
80
+ "id": "84npPHCQpHZ6",
81
  "outputId": "f30c41c7-b897-460d-d2a4-a1276bf2263e"
82
  },
 
83
  "outputs": [
84
  {
 
85
  "name": "stderr",
86
+ "output_type": "stream",
87
  "text": [
88
  "/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_token.py:89: UserWarning: \n",
89
  "The secret `HF_TOKEN` does not exist in your Colab secrets.\n",
 
93
  " warnings.warn(\n"
94
  ]
95
  }
96
+ ],
97
+ "source": [
98
+ "from transformers import AutoProcessor, Owlv2ForObjectDetection\n",
99
+ "import torch\n",
100
+ "import numpy as np\n",
101
+ "\n",
102
+ "processor = AutoProcessor.from_pretrained(\"google/owlv2-base-patch16-ensemble\")\n",
103
+ "model = Owlv2ForObjectDetection.from_pretrained(\"google/owlv2-base-patch16-ensemble\").to(\"cuda\")\n",
104
+ "\n",
105
+ "texts = [[\"a photo of a bee\", \"a photo of a bird\"]]\n",
106
+ "inputs = processor(text=texts, images=image, return_tensors=\"pt\").to(\"cuda\")"
107
  ]
108
  },
109
  {
110
  "cell_type": "markdown",
 
 
 
111
  "metadata": {
112
  "id": "3AedkjLu5PRo"
113
+ },
114
+ "source": [
115
+ "We can now get to benchmarking. We will benchmark the model itself and the compiled model."
116
+ ]
117
  },
118
  {
119
  "cell_type": "code",
120
+ "execution_count": 3,
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
121
  "metadata": {
 
122
  "colab": {
123
  "base_uri": "https://localhost:8080/"
124
  },
125
+ "id": "RQQSEgkQtXEV",
126
  "outputId": "8003590b-c4bc-4b3d-9b1b-dade853b8dd8"
127
  },
 
128
  "outputs": [
129
  {
 
130
  "name": "stdout",
131
+ "output_type": "stream",
132
  "text": [
133
  "255.7331792195638\n"
134
  ]
135
  }
136
+ ],
 
 
 
137
  "source": [
138
  "starter, ender = torch.cuda.Event(enable_timing=True), torch.cuda.Event(enable_timing=True)\n",
139
+ "repetitions = 30\n",
140
  "timings=np.zeros((repetitions,1))\n",
141
  "\n",
142
+ "for _ in range(10):\n",
143
+ " _ = model(**inputs)\n",
 
 
 
 
144
  "\n",
145
  "with torch.no_grad():\n",
146
  " for rep in range(repetitions):\n",
147
  " torch.cuda.synchronize()\n",
148
  " starter.record()\n",
149
+ " output = model(**inputs)\n",
150
  " ender.record()\n",
151
  " torch.cuda.synchronize()\n",
152
  " curr_time = starter.elapsed_time(ender)\n",
153
  " timings[rep] = curr_time\n",
154
  "\n",
155
  "mean_syn = np.sum(timings) / repetitions\n",
156
+ "print(mean_syn)\n"
157
+ ]
158
+ },
159
+ {
160
+ "cell_type": "code",
161
+ "execution_count": 4,
162
  "metadata": {
 
163
  "colab": {
164
  "base_uri": "https://localhost:8080/"
165
  },
166
+ "id": "bEZiNgaupOx6",
167
  "outputId": "e5d47875-1e40-4997-e533-94bf0ff34d14"
168
  },
 
169
  "outputs": [
170
  {
 
171
  "name": "stderr",
172
+ "output_type": "stream",
173
  "text": [
174
  "/usr/lib/python3.10/multiprocessing/popen_fork.py:66: RuntimeWarning: os.fork() was called. os.fork() is incompatible with multithreaded code, and JAX is multithreaded, so this will likely lead to a deadlock.\n",
175
  " self.pid = os.fork()\n",
 
184
  ]
185
  },
186
  {
 
187
  "name": "stdout",
188
+ "output_type": "stream",
189
  "text": [
190
  "154.6884775797526\n"
191
  ]
192
  }
193
+ ],
194
+ "source": [
195
+ "starter, ender = torch.cuda.Event(enable_timing=True), torch.cuda.Event(enable_timing=True)\n",
196
+ "timings=np.zeros((repetitions,1))\n",
197
+ "\n",
198
+ "compiled_model = torch.compile(model, mode=\"reduce-overhead\").to(\"cuda\")\n",
199
+ "\n",
200
+ "for _ in range(30):\n",
201
+ " with torch.no_grad():\n",
202
+ " _ = compiled_model(**inputs)\n",
203
+ "\n",
204
+ "\n",
205
+ "with torch.no_grad():\n",
206
+ " for rep in range(repetitions):\n",
207
+ " torch.cuda.synchronize()\n",
208
+ " starter.record()\n",
209
+ " output = compiled_model(**inputs)\n",
210
+ " ender.record()\n",
211
+ " torch.cuda.synchronize()\n",
212
+ " curr_time = starter.elapsed_time(ender)\n",
213
+ " timings[rep] = curr_time\n",
214
+ "\n",
215
+ "mean_syn = np.sum(timings) / repetitions\n",
216
+ "print(mean_syn)"
217
  ]
218
  },
219
  {
220
  "cell_type": "markdown",
 
 
 
221
  "metadata": {
222
  "id": "d_0d7DwN6gBt"
223
+ },
224
+ "source": [
225
+ "We got nearly 40 percent speed-up! You can also increase the batch size and see how much further speed-up you can get."
226
+ ]
227
  },
228
  {
229
  "cell_type": "code",
230
+ "execution_count": 11,
231
+ "metadata": {
232
+ "id": "exKoOptB61UL"
233
+ },
234
+ "outputs": [],
235
  "source": [
236
  "texts = [[\"a photo of a bee\", \"a photo of a bird\"] for _ in range(8)]\n",
237
  "images = [image for _ in range(8)]\n",
238
  "inputs = processor(text=texts, images=image, return_tensors=\"pt\").to(\"cuda\")"
239
+ ]
 
 
 
 
 
240
  },
241
  {
242
  "cell_type": "code",
243
+ "execution_count": 12,
244
+ "metadata": {
245
+ "colab": {
246
+ "base_uri": "https://localhost:8080/"
247
+ },
248
+ "id": "EFj9Pgra7Km8",
249
+ "outputId": "5fefb8c0-9e86-478c-e9e2-0dbc0fa8a37b"
250
+ },
251
+ "outputs": [
252
+ {
253
+ "name": "stdout",
254
+ "output_type": "stream",
255
+ "text": [
256
+ "269.3023401896159\n"
257
+ ]
258
+ }
259
+ ],
260
  "source": [
261
  "starter, ender = torch.cuda.Event(enable_timing=True), torch.cuda.Event(enable_timing=True)\n",
262
  "repetitions = 30\n",
 
277
  "\n",
278
  "mean_syn = np.sum(timings) / repetitions\n",
279
  "print(mean_syn)"
280
+ ]
281
+ },
282
+ {
283
+ "cell_type": "code",
284
+ "execution_count": 13,
285
  "metadata": {
286
  "colab": {
287
  "base_uri": "https://localhost:8080/"
288
  },
289
+ "id": "OuQZmgTK7UCo",
290
+ "outputId": "7184eb1d-b545-4bb6-b544-3effd5c2545a"
291
  },
 
292
  "outputs": [
293
  {
 
294
  "name": "stdout",
295
+ "output_type": "stream",
296
  "text": [
297
+ "159.77137603759766\n"
298
  ]
299
  }
300
+ ],
 
 
 
301
  "source": [
302
  "starter, ender = torch.cuda.Event(enable_timing=True), torch.cuda.Event(enable_timing=True)\n",
303
  "timings=np.zeros((repetitions,1))\n",
 
321
  "\n",
322
  "mean_syn = np.sum(timings) / repetitions\n",
323
  "print(mean_syn)"
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
324
  ]
325
  }
326
+ ],
327
+ "metadata": {
328
+ "accelerator": "GPU",
329
+ "colab": {
330
+ "gpuType": "L4",
331
+ "machine_shape": "hm",
332
+ "provenance": []
333
+ },
334
+ "kernelspec": {
335
+ "display_name": "Python 3",
336
+ "name": "python3"
337
+ },
338
+ "language_info": {
339
+ "name": "python"
340
+ }
341
+ },
342
+ "nbformat": 4,
343
+ "nbformat_minor": 0
344
+ }
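The benchmark outputs recorded in the notebook above can be turned into a relative speed-up with a few lines of arithmetic. The sketch below is not part of the commit. The helper names `latency_reduction` and `speedup_factor` are hypothetical, and the four latencies are copied from the printed cell outputs, assumed to be mean milliseconds over the 30 timed repetitions.

```python
def latency_reduction(baseline_ms: float, compiled_ms: float) -> float:
    """Fractional drop in mean latency (0.40 means 40% less time per call)."""
    return (baseline_ms - compiled_ms) / baseline_ms


def speedup_factor(baseline_ms: float, compiled_ms: float) -> float:
    """How many times faster the compiled model is than the eager one."""
    return baseline_ms / compiled_ms


# (eager, compiled) mean latencies in ms, copied from the notebook outputs.
runs = {
    "first benchmark": (255.7331792195638, 154.6884775797526),
    "second benchmark": (269.3023401896159, 159.77137603759766),
}

for name, (baseline, compiled) in runs.items():
    reduction = latency_reduction(baseline, compiled)
    factor = speedup_factor(baseline, compiled)
    print(f"{name}: {reduction:.1%} lower latency, {factor:.2f}x faster")
```

Both runs come out close to a 40% reduction in mean latency, which matches the "nearly 40 percent" figure quoted in the notebook text.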
Fine_tune_Florence_2.ipynb CHANGED
The diff for this file is too large to render. See raw diff
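Many hunks in this commit move keys such as `source`, `metadata`, and `output_type` within a notebook cell rather than changing their values. That is consistent with the notebooks having been re-saved by a tool that writes keys in sorted order, although the commit itself does not say so. A minimal sketch, using only Python's standard `json` module and two made-up cell fragments modeled on the diff, shows that such a reordering leaves the parsed content unchanged. The helper names `same_content` and `canonical` are hypothetical.

```python
import json

# Two serializations of the same markdown cell, modeled on the diff:
# the old side lists "source" before "metadata", the new side the reverse.
old_cell = '{"cell_type": "markdown", "source": ["## OWLv2\\n"], "metadata": {"id": "5sCfbPTn7wBE"}}'
new_cell = '{"cell_type": "markdown", "metadata": {"id": "5sCfbPTn7wBE"}, "source": ["## OWLv2\\n"]}'


def same_content(a: str, b: str) -> bool:
    """Compare two JSON documents by parsed value, ignoring key order."""
    return json.loads(a) == json.loads(b)


def canonical(text: str) -> str:
    """Re-serialize with sorted keys so a textual diff shows only value changes."""
    return json.dumps(json.loads(text), sort_keys=True, indent=1)


print(same_content(old_cell, new_cell))
print(canonical(old_cell) == canonical(new_cell))
```

Canonicalizing both versions of a notebook this way before diffing is one option for separating real edits, such as the removed Colab badge cells, from pure key reordering.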
 
Fine_tune_PaliGemma.ipynb CHANGED
@@ -1,15 +1,5 @@
1
  {
2
  "cells": [
3
- {
4
- "cell_type": "markdown",
5
- "metadata": {
6
- "id": "view-in-github",
7
- "colab_type": "text"
8
- },
9
- "source": [
10
- "<a href=\"https://colab.research.google.com/github/merveenoyan/smol-vision/blob/main/Fine_tune_PaliGemma.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
11
- ]
12
- },
13
  {
14
  "cell_type": "markdown",
15
  "metadata": {
@@ -23,21 +13,18 @@
23
  },
24
  {
25
  "cell_type": "code",
26
- "source": [
27
- "!pip install -q -U datasets bitsandbytes peft git+https://github.com/huggingface/transformers.git"
28
- ],
29
  "metadata": {
30
- "id": "EB0gv8OzHfLV",
31
  "colab": {
32
  "base_uri": "https://localhost:8080/"
33
  },
 
34
  "outputId": "9de07e75-ddf4-4347-fc41-432a23774e2c"
35
  },
36
- "execution_count": 1,
37
  "outputs": [
38
  {
39
- "output_type": "stream",
40
  "name": "stdout",
 
41
  "text": [
42
  " Installing build dependencies ... \u001b[?25l\u001b[?25hdone\n",
43
  " Getting requirements to build wheel ... \u001b[?25l\u001b[?25hdone\n",
@@ -55,6 +42,9 @@
55
  "\u001b[0m"
56
  ]
57
  }
 
 
 
58
  ]
59
  },
60
  {
@@ -70,7 +60,6 @@
70
  "cell_type": "code",
71
  "execution_count": 2,
72
  "metadata": {
73
- "id": "NzJZSHD8tZZy",
74
  "colab": {
75
  "base_uri": "https://localhost:8080/",
76
  "height": 17,
@@ -97,22 +86,23 @@
97
  "80df5f3cd6c646808b09d99daed5bfd2"
98
  ]
99
  },
 
100
  "outputId": "c01b2b6f-3c1e-45da-9fc0-f4f518bcca24"
101
  },
102
  "outputs": [
103
  {
104
- "output_type": "display_data",
105
  "data": {
106
- "text/plain": [
107
- "VBox(children=(HTML(value='<center> <img\\nsrc=https://huggingface.co/front/assets/huggingface_logo-noborder.sv…"
108
- ],
109
  "application/vnd.jupyter.widget-view+json": {
 
110
  "version_major": 2,
111
- "version_minor": 0,
112
- "model_id": "4f0e85aa740146d3aca81588a0288031"
113
- }
 
 
114
  },
115
- "metadata": {}
 
116
  }
117
  ],
118
  "source": [
@@ -133,16 +123,16 @@
133
  "cell_type": "code",
134
  "execution_count": 1,
135
  "metadata": {
136
- "id": "az5kdSbNpjgH",
137
  "colab": {
138
  "base_uri": "https://localhost:8080/"
139
  },
 
140
  "outputId": "2d9f379c-eb31-45b0-b84c-79c2a2577d01"
141
  },
142
  "outputs": [
143
  {
144
- "output_type": "stream",
145
  "name": "stderr",
 
146
  "text": [
147
  "/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_auth.py:94: UserWarning: \n",
148
  "The secret `HF_TOKEN` does not exist in your Colab secrets.\n",
@@ -174,15 +164,14 @@
174
  "cell_type": "code",
175
  "execution_count": 3,
176
  "metadata": {
177
- "id": "TNJW2ty4yy4L",
178
  "colab": {
179
  "base_uri": "https://localhost:8080/"
180
  },
 
181
  "outputId": "f76414b2-8f37-48ae-d369-b977323fa892"
182
  },
183
  "outputs": [
184
  {
185
- "output_type": "execute_result",
186
  "data": {
187
  "text/plain": [
188
  "Dataset({\n",
@@ -191,8 +180,9 @@
191
  "})"
192
  ]
193
  },
 
194
  "metadata": {},
195
- "execution_count": 3
196
  }
197
  ],
198
  "source": [
@@ -224,7 +214,6 @@
224
  "cell_type": "code",
225
  "execution_count": 5,
226
  "metadata": {
227
- "id": "iZRvrfUquH1y",
228
  "colab": {
229
  "base_uri": "https://localhost:8080/",
230
  "height": 49,
@@ -242,22 +231,23 @@
242
  "a9a5503caf384b93bf987e5271a577d2"
243
  ]
244
  },
 
245
  "outputId": "34f12289-6ef4-49d9-9257-ad0328961190"
246
  },
247
  "outputs": [
248
  {
249
- "output_type": "display_data",
250
  "data": {
251
- "text/plain": [
252
- "Loading checkpoint shards: 0%| | 0/2 [00:00<?, ?it/s]"
253
- ],
254
  "application/vnd.jupyter.widget-view+json": {
 
255
  "version_major": 2,
256
- "version_minor": 0,
257
- "model_id": "8458933373264dbeb58d0b5ace4fd9c6"
258
- }
 
 
259
  },
260
- "metadata": {}
 
261
  }
262
  ],
263
  "source": [
@@ -286,7 +276,6 @@
286
  "cell_type": "code",
287
  "execution_count": 6,
288
  "metadata": {
289
- "id": "9AYeuyzNuJ9X",
290
  "colab": {
291
  "base_uri": "https://localhost:8080/",
292
  "height": 66,
@@ -304,26 +293,27 @@
304
  "8859eb8d9c154cb79a302db1568768fa"
305
  ]
306
  },
 
307
  "outputId": "aaedd707-f694-4ba8-ba43-7ae2a3739e73"
308
  },
309
  "outputs": [
310
  {
311
- "output_type": "display_data",
312
  "data": {
313
- "text/plain": [
314
- "Loading checkpoint shards: 0%| | 0/2 [00:00<?, ?it/s]"
315
- ],
316
  "application/vnd.jupyter.widget-view+json": {
 
317
  "version_major": 2,
318
- "version_minor": 0,
319
- "model_id": "c68f0fe7a6bb4060afcb05e3f6422288"
320
- }
 
 
321
  },
322
- "metadata": {}
 
323
  },
324
  {
325
- "output_type": "stream",
326
  "name": "stdout",
 
327
  "text": [
328
  "trainable params: 11,876,352 || all params: 3,044,118,768 || trainable%: 0.3901\n"
329
  ]
@@ -349,23 +339,23 @@
349
  },
350
  {
351
  "cell_type": "markdown",
352
- "source": [
353
- "We need to take tokens to same dtype as model so need to store it as a variable."
354
- ],
355
  "metadata": {
356
  "id": "sfxtN1iKRWXX"
357
- }
 
 
 
358
  },
359
  {
360
  "cell_type": "code",
361
- "source": [
362
- "DTYPE = model.dtype"
363
- ],
364
  "metadata": {
365
  "id": "uGZ6FnioRWEc"
366
  },
367
- "execution_count": 7,
368
- "outputs": []
 
 
369
  },
370
  {
371
  "cell_type": "markdown",
@@ -378,14 +368,14 @@
378
  },
379
  {
380
  "cell_type": "code",
381
- "source": [
382
- "processor = PaliGemmaProcessor.from_pretrained(model_id)"
383
- ],
384
  "metadata": {
385
  "id": "wQ_gbnXARKz1"
386
  },
387
- "execution_count": 8,
388
- "outputs": []
 
 
389
  },
390
  {
391
  "cell_type": "markdown",
@@ -486,13 +476,13 @@
486
  },
487
  {
488
  "cell_type": "markdown",
 
 
 
489
  "source": [
490
  "LoRA with bsz of 2 works on A100 Colab. You can apply gradient accumulation (which is enabled in this notebook) to simulate larger batch sizes.\n",
491
  "Currently there's an issue with QLoRA, we are investigating and will solve soon."
492
- ],
493
- "metadata": {
494
- "id": "ZX912_liP-Eh"
495
- }
496
  },
497
  {
498
  "cell_type": "code",
@@ -530,8 +520,8 @@
530
  "accelerator": "GPU",
531
  "colab": {
532
  "gpuType": "A100",
533
- "provenance": [],
534
- "include_colab_link": true
535
  },
536
  "kernelspec": {
537
  "display_name": "Python 3",
@@ -551,116 +541,92 @@
551
  },
552
  "widgets": {
553
  "application/vnd.jupyter.widget-state+json": {
554
- "4f0e85aa740146d3aca81588a0288031": {
555
- "model_module": "@jupyter-widgets/controls",
556
- "model_name": "VBoxModel",
557
- "model_module_version": "1.5.0",
558
- "state": {
559
- "_dom_classes": [],
560
- "_model_module": "@jupyter-widgets/controls",
561
- "_model_module_version": "1.5.0",
562
- "_model_name": "VBoxModel",
563
- "_view_count": null,
564
- "_view_module": "@jupyter-widgets/controls",
565
- "_view_module_version": "1.5.0",
566
- "_view_name": "VBoxView",
567
- "box_style": "",
568
- "children": [],
569
- "layout": "IPY_MODEL_c25efe32ee7c40d3a4c95093abb2a720"
570
- }
571
- },
572
- "c7fcb9dd46e649c4b8bd967b69bdb867": {
573
- "model_module": "@jupyter-widgets/controls",
574
- "model_name": "HTMLModel",
575
- "model_module_version": "1.5.0",
576
- "state": {
577
- "_dom_classes": [],
578
- "_model_module": "@jupyter-widgets/controls",
579
- "_model_module_version": "1.5.0",
580
- "_model_name": "HTMLModel",
581
- "_view_count": null,
582
- "_view_module": "@jupyter-widgets/controls",
583
- "_view_module_version": "1.5.0",
584
- "_view_name": "HTMLView",
585
- "description": "",
586
- "description_tooltip": null,
587
- "layout": "IPY_MODEL_55c01e2c04d1499ca5b9b19dea7e4e02",
588
- "placeholder": "​",
589
- "style": "IPY_MODEL_bf9da831d7ad4651a262c5e7f80bbf87",
590
- "value": "<center> <img\nsrc=https://huggingface.co/front/assets/huggingface_logo-noborder.svg\nalt='Hugging Face'> <br> Copy a token from <a\nhref=\"https://huggingface.co/settings/tokens\" target=\"_blank\">your Hugging Face\ntokens page</a> and paste it below. <br> Immediately click login after copying\nyour token or it might be stored in plain text in this notebook file. </center>"
591
- }
592
- },
593
- "c3fad0f1cb954317a20ee158f7e10363": {
594
- "model_module": "@jupyter-widgets/controls",
595
- "model_name": "PasswordModel",
596
- "model_module_version": "1.5.0",
597
  "state": {
598
- "_dom_classes": [],
599
- "_model_module": "@jupyter-widgets/controls",
600
- "_model_module_version": "1.5.0",
601
- "_model_name": "PasswordModel",
602
  "_view_count": null,
603
- "_view_module": "@jupyter-widgets/controls",
604
- "_view_module_version": "1.5.0",
605
- "_view_name": "PasswordView",
606
- "continuous_update": true,
607
- "description": "Token:",
608
- "description_tooltip": null,
609
- "disabled": false,
610
- "layout": "IPY_MODEL_ed2d3d1a700143d2a48e9a9b13bd1200",
611
- "placeholder": "​",
612
- "style": "IPY_MODEL_40782cfc43a8437da5534feee03c6ba6",
613
- "value": ""
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
614
  }
615
  },
616
- "3deca9286f89422aa691325b39347b0b": {
617
  "model_module": "@jupyter-widgets/controls",
618
- "model_name": "CheckboxModel",
619
  "model_module_version": "1.5.0",
 
620
  "state": {
621
- "_dom_classes": [],
622
  "_model_module": "@jupyter-widgets/controls",
623
  "_model_module_version": "1.5.0",
624
- "_model_name": "CheckboxModel",
625
  "_view_count": null,
626
- "_view_module": "@jupyter-widgets/controls",
627
- "_view_module_version": "1.5.0",
628
- "_view_name": "CheckboxView",
629
- "description": "Add token as git credential?",
630
- "description_tooltip": null,
631
- "disabled": false,
632
- "indent": true,
633
- "layout": "IPY_MODEL_b6fac3155dd140bc8e1b010270bc3cc2",
634
- "style": "IPY_MODEL_ca348c721475417582ed5018ed43151f",
635
- "value": true
636
  }
637
  },
638
- "ca1c290bfb654f1190bbde68d51167f1": {
639
  "model_module": "@jupyter-widgets/controls",
640
- "model_name": "ButtonModel",
641
  "model_module_version": "1.5.0",
 
642
  "state": {
643
- "_dom_classes": [],
644
  "_model_module": "@jupyter-widgets/controls",
645
  "_model_module_version": "1.5.0",
646
- "_model_name": "ButtonModel",
647
  "_view_count": null,
648
- "_view_module": "@jupyter-widgets/controls",
649
- "_view_module_version": "1.5.0",
650
- "_view_name": "ButtonView",
651
- "button_style": "",
652
- "description": "Login",
653
- "disabled": false,
654
- "icon": "",
655
- "layout": "IPY_MODEL_3f07afac7c194db7a16167d177562a46",
656
- "style": "IPY_MODEL_5515d96f0c8947f0ad4b7f17eb7d63f6",
657
- "tooltip": ""
658
  }
659
  },
660
  "2d8493a60b7a42c1b25ec0bbe0a59043": {
661
  "model_module": "@jupyter-widgets/controls",
662
- "model_name": "HTMLModel",
663
  "model_module_version": "1.5.0",
 
664
  "state": {
665
  "_dom_classes": [],
666
  "_model_module": "@jupyter-widgets/controls",
@@ -678,10 +644,10 @@
678
  "value": "\n<b>Pro Tip:</b> If you don't already have one, you can create a dedicated\n'notebooks' token with 'write' access, that you can then easily reuse for all\nnotebooks. </center>"
679
  }
680
  },
681
- "c25efe32ee7c40d3a4c95093abb2a720": {
682
  "model_module": "@jupyter-widgets/base",
683
- "model_name": "LayoutModel",
684
  "model_module_version": "1.2.0",
 
685
  "state": {
686
  "_model_module": "@jupyter-widgets/base",
687
  "_model_module_version": "1.2.0",
@@ -691,13 +657,13 @@
691
  "_view_module_version": "1.2.0",
692
  "_view_name": "LayoutView",
693
  "align_content": null,
694
- "align_items": "center",
695
  "align_self": null,
696
  "border": null,
697
  "bottom": null,
698
- "display": "flex",
699
  "flex": null,
700
- "flex_flow": "column",
701
  "grid_area": null,
702
  "grid_auto_columns": null,
703
  "grid_auto_flow": null,
@@ -727,80 +693,35 @@
727
  "right": null,
728
  "top": null,
729
  "visibility": null,
730
- "width": "50%"
731
  }
732
  },
733
- "55c01e2c04d1499ca5b9b19dea7e4e02": {
734
- "model_module": "@jupyter-widgets/base",
735
- "model_name": "LayoutModel",
736
- "model_module_version": "1.2.0",
737
- "state": {
738
- "_model_module": "@jupyter-widgets/base",
739
- "_model_module_version": "1.2.0",
740
- "_model_name": "LayoutModel",
741
- "_view_count": null,
742
- "_view_module": "@jupyter-widgets/base",
743
- "_view_module_version": "1.2.0",
744
- "_view_name": "LayoutView",
745
- "align_content": null,
746
- "align_items": null,
747
- "align_self": null,
748
- "border": null,
749
- "bottom": null,
750
- "display": null,
751
- "flex": null,
752
- "flex_flow": null,
753
- "grid_area": null,
754
- "grid_auto_columns": null,
755
- "grid_auto_flow": null,
756
- "grid_auto_rows": null,
757
- "grid_column": null,
758
- "grid_gap": null,
759
- "grid_row": null,
760
- "grid_template_areas": null,
761
- "grid_template_columns": null,
762
- "grid_template_rows": null,
763
- "height": null,
764
- "justify_content": null,
765
- "justify_items": null,
766
- "left": null,
767
- "margin": null,
768
- "max_height": null,
769
- "max_width": null,
770
- "min_height": null,
771
- "min_width": null,
772
- "object_fit": null,
773
- "object_position": null,
774
- "order": null,
775
- "overflow": null,
776
- "overflow_x": null,
777
- "overflow_y": null,
778
- "padding": null,
779
- "right": null,
780
- "top": null,
781
- "visibility": null,
782
- "width": null
783
- }
784
- },
785
- "bf9da831d7ad4651a262c5e7f80bbf87": {
786
  "model_module": "@jupyter-widgets/controls",
787
- "model_name": "DescriptionStyleModel",
788
  "model_module_version": "1.5.0",
 
789
  "state": {
 
790
  "_model_module": "@jupyter-widgets/controls",
791
  "_model_module_version": "1.5.0",
792
- "_model_name": "DescriptionStyleModel",
793
  "_view_count": null,
794
- "_view_module": "@jupyter-widgets/base",
795
- "_view_module_version": "1.2.0",
796
- "_view_name": "StyleView",
797
- "description_width": ""
 
 
 
 
 
 
798
  }
799
  },
800
- "ed2d3d1a700143d2a48e9a9b13bd1200": {
801
  "model_module": "@jupyter-widgets/base",
802
- "model_name": "LayoutModel",
803
  "model_module_version": "1.2.0",
 
804
  "state": {
805
  "_model_module": "@jupyter-widgets/base",
806
  "_model_module_version": "1.2.0",
@@ -851,8 +772,8 @@
851
  },
852
  "40782cfc43a8437da5534feee03c6ba6": {
853
  "model_module": "@jupyter-widgets/controls",
854
- "model_name": "DescriptionStyleModel",
855
  "model_module_version": "1.5.0",
 
856
  "state": {
857
  "_model_module": "@jupyter-widgets/controls",
858
  "_model_module_version": "1.5.0",
@@ -864,10 +785,10 @@
864
  "description_width": ""
865
  }
866
  },
867
- "b6fac3155dd140bc8e1b010270bc3cc2": {
868
  "model_module": "@jupyter-widgets/base",
869
- "model_name": "LayoutModel",
870
  "model_module_version": "1.2.0",
 
871
  "state": {
872
  "_model_module": "@jupyter-widgets/base",
873
  "_model_module_version": "1.2.0",
@@ -916,77 +837,28 @@
916
  "width": null
917
  }
918
  },
919
- "ca348c721475417582ed5018ed43151f": {
920
  "model_module": "@jupyter-widgets/controls",
921
- "model_name": "DescriptionStyleModel",
922
  "model_module_version": "1.5.0",
 
923
  "state": {
 
924
  "_model_module": "@jupyter-widgets/controls",
925
  "_model_module_version": "1.5.0",
926
- "_model_name": "DescriptionStyleModel",
927
- "_view_count": null,
928
- "_view_module": "@jupyter-widgets/base",
929
- "_view_module_version": "1.2.0",
930
- "_view_name": "StyleView",
931
- "description_width": ""
932
- }
933
- },
934
- "3f07afac7c194db7a16167d177562a46": {
935
- "model_module": "@jupyter-widgets/base",
936
- "model_name": "LayoutModel",
937
- "model_module_version": "1.2.0",
938
- "state": {
939
- "_model_module": "@jupyter-widgets/base",
940
- "_model_module_version": "1.2.0",
941
- "_model_name": "LayoutModel",
942
  "_view_count": null,
943
- "_view_module": "@jupyter-widgets/base",
944
- "_view_module_version": "1.2.0",
945
- "_view_name": "LayoutView",
946
- "align_content": null,
947
- "align_items": null,
948
- "align_self": null,
949
- "border": null,
950
- "bottom": null,
951
- "display": null,
952
- "flex": null,
953
- "flex_flow": null,
954
- "grid_area": null,
955
- "grid_auto_columns": null,
956
- "grid_auto_flow": null,
957
- "grid_auto_rows": null,
958
- "grid_column": null,
959
- "grid_gap": null,
960
- "grid_row": null,
961
- "grid_template_areas": null,
962
- "grid_template_columns": null,
963
- "grid_template_rows": null,
964
- "height": null,
965
- "justify_content": null,
966
- "justify_items": null,
967
- "left": null,
968
- "margin": null,
969
- "max_height": null,
970
- "max_width": null,
971
- "min_height": null,
972
- "min_width": null,
973
- "object_fit": null,
974
- "object_position": null,
975
- "order": null,
976
- "overflow": null,
977
- "overflow_x": null,
978
- "overflow_y": null,
979
- "padding": null,
980
- "right": null,
981
- "top": null,
982
- "visibility": null,
983
- "width": null
984
  }
985
  },
986
  "5515d96f0c8947f0ad4b7f17eb7d63f6": {
987
  "model_module": "@jupyter-widgets/controls",
988
- "model_name": "ButtonStyleModel",
989
  "model_module_version": "1.5.0",
 
990
  "state": {
991
  "_model_module": "@jupyter-widgets/controls",
992
  "_model_module_version": "1.5.0",
@@ -999,10 +871,10 @@
999
  "font_weight": ""
1000
  }
1001
  },
1002
- "d703de12cf9d4f87aa6ec2cc52f1090a": {
1003
  "model_module": "@jupyter-widgets/base",
1004
- "model_name": "LayoutModel",
1005
  "model_module_version": "1.2.0",
 
1006
  "state": {
1007
  "_model_module": "@jupyter-widgets/base",
1008
  "_model_module_version": "1.2.0",
@@ -1051,25 +923,26 @@
1051
  "width": null
1052
  }
1053
  },
1054
- "757bc788bd6842d28a9f889187ffb88e": {
1055
  "model_module": "@jupyter-widgets/controls",
1056
- "model_name": "DescriptionStyleModel",
1057
  "model_module_version": "1.5.0",
 
1058
  "state": {
1059
  "_model_module": "@jupyter-widgets/controls",
1060
  "_model_module_version": "1.5.0",
1061
- "_model_name": "DescriptionStyleModel",
1062
  "_view_count": null,
1063
  "_view_module": "@jupyter-widgets/base",
1064
  "_view_module_version": "1.2.0",
1065
  "_view_name": "StyleView",
 
1066
  "description_width": ""
1067
  }
1068
  },
1069
  "65f10d2456cb4ee1963fac050e4c34f7": {
1070
  "model_module": "@jupyter-widgets/controls",
1071
- "model_name": "LabelModel",
1072
  "model_module_version": "1.5.0",
 
1073
  "state": {
1074
  "_dom_classes": [],
1075
  "_model_module": "@jupyter-widgets/controls",
@@ -1087,10 +960,67 @@
1087
  "value": "Connecting..."
1088
  }
1089
  },
1090
- "9335e48fe8ba4fe9b535b5ece1be6ff5": {
1091
  "model_module": "@jupyter-widgets/base",
1092
- "model_name": "LayoutModel",
1093
  "model_module_version": "1.2.0",
 
1094
  "state": {
1095
  "_model_module": "@jupyter-widgets/base",
1096
  "_model_module_version": "1.2.0",
@@ -1141,8 +1071,8 @@
1141
  },
1142
  "80df5f3cd6c646808b09d99daed5bfd2": {
1143
  "model_module": "@jupyter-widgets/controls",
1144
- "model_name": "DescriptionStyleModel",
1145
  "model_module_version": "1.5.0",
 
1146
  "state": {
1147
  "_model_module": "@jupyter-widgets/controls",
1148
  "_model_module_version": "1.5.0",
@@ -1156,8 +1086,8 @@
1156
  },
1157
  "8458933373264dbeb58d0b5ace4fd9c6": {
1158
  "model_module": "@jupyter-widgets/controls",
1159
- "model_name": "HBoxModel",
1160
  "model_module_version": "1.5.0",
 
1161
  "state": {
1162
  "_dom_classes": [],
1163
  "_model_module": "@jupyter-widgets/controls",
@@ -1176,76 +1106,65 @@
1176
  "layout": "IPY_MODEL_46810cc7c7c54e31a65e609c386d86d9"
1177
  }
1178
  },
1179
- "714009484da745dc8a87e5066b939de2": {
1180
  "model_module": "@jupyter-widgets/controls",
1181
- "model_name": "HTMLModel",
1182
  "model_module_version": "1.5.0",
 
1183
  "state": {
1184
- "_dom_classes": [],
1185
  "_model_module": "@jupyter-widgets/controls",
1186
  "_model_module_version": "1.5.0",
1187
- "_model_name": "HTMLModel",
1188
  "_view_count": null,
1189
- "_view_module": "@jupyter-widgets/controls",
1190
- "_view_module_version": "1.5.0",
1191
- "_view_name": "HTMLView",
1192
- "description": "",
1193
- "description_tooltip": null,
1194
- "layout": "IPY_MODEL_cfed7deef0b74f4b9d160e9fdc2b138e",
1195
- "placeholder": "​",
1196
- "style": "IPY_MODEL_23ddab24ac304751b3babfaeec9360eb",
1197
- "value": "Loading checkpoint shards: 100%"
1198
  }
1199
  },
1200
- "e43e970ce8ba477e83081a4c7fea05f5": {
1201
  "model_module": "@jupyter-widgets/controls",
1202
- "model_name": "FloatProgressModel",
1203
  "model_module_version": "1.5.0",
 
1204
  "state": {
1205
- "_dom_classes": [],
1206
  "_model_module": "@jupyter-widgets/controls",
1207
  "_model_module_version": "1.5.0",
1208
- "_model_name": "FloatProgressModel",
1209
  "_view_count": null,
1210
- "_view_module": "@jupyter-widgets/controls",
1211
- "_view_module_version": "1.5.0",
1212
- "_view_name": "ProgressView",
1213
- "bar_style": "success",
1214
- "description": "",
1215
- "description_tooltip": null,
1216
- "layout": "IPY_MODEL_79e87175ffb949bd8cddf4577210a42d",
1217
- "max": 2,
1218
- "min": 0,
1219
- "orientation": "horizontal",
1220
- "style": "IPY_MODEL_5aed84a20ac34f2b943d26d66decc88f",
1221
- "value": 2
1222
  }
1223
  },
1224
- "7138aa9537fc4b4f809e57665be87139": {
1225
  "model_module": "@jupyter-widgets/controls",
1226
- "model_name": "HTMLModel",
1227
  "model_module_version": "1.5.0",
 
1228
  "state": {
1229
  "_dom_classes": [],
1230
  "_model_module": "@jupyter-widgets/controls",
1231
  "_model_module_version": "1.5.0",
1232
- "_model_name": "HTMLModel",
1233
  "_view_count": null,
1234
  "_view_module": "@jupyter-widgets/controls",
1235
  "_view_module_version": "1.5.0",
1236
- "_view_name": "HTMLView",
 
1237
  "description": "",
1238
  "description_tooltip": null,
1239
- "layout": "IPY_MODEL_3ca0e1427ac6477c9921929af7ff00d1",
1240
- "placeholder": "​",
1241
- "style": "IPY_MODEL_a9a5503caf384b93bf987e5271a577d2",
1242
- "value": " 2/2 [00:00&lt;00:00,  2.83it/s]"
 
 
1243
  }
1244
  },
1245
- "46810cc7c7c54e31a65e609c386d86d9": {
1246
  "model_module": "@jupyter-widgets/base",
1247
- "model_name": "LayoutModel",
1248
  "model_module_version": "1.2.0",
 
1249
  "state": {
1250
  "_model_module": "@jupyter-widgets/base",
1251
  "_model_module_version": "1.2.0",
@@ -1294,10 +1213,25 @@
1294
  "width": null
1295
  }
1296
  },
1297
- "cfed7deef0b74f4b9d160e9fdc2b138e": {
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1298
  "model_module": "@jupyter-widgets/base",
1299
- "model_name": "LayoutModel",
1300
  "model_module_version": "1.2.0",
 
1301
  "state": {
1302
  "_model_module": "@jupyter-widgets/base",
1303
  "_model_module_version": "1.2.0",
@@ -1346,25 +1280,10 @@
1346
  "width": null
1347
  }
1348
  },
1349
- "23ddab24ac304751b3babfaeec9360eb": {
1350
- "model_module": "@jupyter-widgets/controls",
1351
- "model_name": "DescriptionStyleModel",
1352
- "model_module_version": "1.5.0",
1353
- "state": {
1354
- "_model_module": "@jupyter-widgets/controls",
1355
- "_model_module_version": "1.5.0",
1356
- "_model_name": "DescriptionStyleModel",
1357
- "_view_count": null,
1358
- "_view_module": "@jupyter-widgets/base",
1359
- "_view_module_version": "1.2.0",
1360
- "_view_name": "StyleView",
1361
- "description_width": ""
1362
- }
1363
- },
1364
- "79e87175ffb949bd8cddf4577210a42d": {
1365
  "model_module": "@jupyter-widgets/base",
1366
- "model_name": "LayoutModel",
1367
  "model_module_version": "1.2.0",
 
1368
  "state": {
1369
  "_model_module": "@jupyter-widgets/base",
1370
  "_model_module_version": "1.2.0",
@@ -1413,26 +1332,25 @@
1413
  "width": null
1414
  }
1415
  },
1416
- "5aed84a20ac34f2b943d26d66decc88f": {
1417
  "model_module": "@jupyter-widgets/controls",
1418
- "model_name": "ProgressStyleModel",
1419
  "model_module_version": "1.5.0",
 
1420
  "state": {
1421
  "_model_module": "@jupyter-widgets/controls",
1422
  "_model_module_version": "1.5.0",
1423
- "_model_name": "ProgressStyleModel",
1424
  "_view_count": null,
1425
  "_view_module": "@jupyter-widgets/base",
1426
  "_view_module_version": "1.2.0",
1427
  "_view_name": "StyleView",
1428
- "bar_color": null,
1429
  "description_width": ""
1430
  }
1431
  },
1432
- "3ca0e1427ac6477c9921929af7ff00d1": {
1433
  "model_module": "@jupyter-widgets/base",
1434
- "model_name": "LayoutModel",
1435
  "model_module_version": "1.2.0",
 
1436
  "state": {
1437
  "_model_module": "@jupyter-widgets/base",
1438
  "_model_module_version": "1.2.0",
@@ -1442,13 +1360,13 @@
1442
  "_view_module_version": "1.2.0",
1443
  "_view_name": "LayoutView",
1444
  "align_content": null,
1445
- "align_items": null,
1446
  "align_self": null,
1447
  "border": null,
1448
  "bottom": null,
1449
- "display": null,
1450
  "flex": null,
1451
- "flex_flow": null,
1452
  "grid_area": null,
1453
  "grid_auto_columns": null,
1454
  "grid_auto_flow": null,
@@ -1478,28 +1396,36 @@
1478
  "right": null,
1479
  "top": null,
1480
  "visibility": null,
1481
- "width": null
1482
  }
1483
  },
1484
- "a9a5503caf384b93bf987e5271a577d2": {
1485
  "model_module": "@jupyter-widgets/controls",
1486
- "model_name": "DescriptionStyleModel",
1487
  "model_module_version": "1.5.0",
 
1488
  "state": {
 
1489
  "_model_module": "@jupyter-widgets/controls",
1490
  "_model_module_version": "1.5.0",
1491
- "_model_name": "DescriptionStyleModel",
1492
  "_view_count": null,
1493
- "_view_module": "@jupyter-widgets/base",
1494
- "_view_module_version": "1.2.0",
1495
- "_view_name": "StyleView",
1496
- "description_width": ""
 
 
 
 
 
 
 
1497
  }
1498
  },
1499
  "c68f0fe7a6bb4060afcb05e3f6422288": {
1500
  "model_module": "@jupyter-widgets/controls",
1501
- "model_name": "HBoxModel",
1502
  "model_module_version": "1.5.0",
 
1503
  "state": {
1504
  "_dom_classes": [],
1505
  "_model_module": "@jupyter-widgets/controls",
@@ -1518,10 +1444,10 @@
1518
  "layout": "IPY_MODEL_1a29c71234d74f08b2645f9383fee126"
1519
  }
1520
  },
1521
- "fef3c94897fc4ffa86f91aac7a45ac7f": {
1522
  "model_module": "@jupyter-widgets/controls",
1523
- "model_name": "HTMLModel",
1524
  "model_module_version": "1.5.0",
 
1525
  "state": {
1526
  "_dom_classes": [],
1527
  "_model_module": "@jupyter-widgets/controls",
@@ -1533,61 +1459,53 @@
1533
  "_view_name": "HTMLView",
1534
  "description": "",
1535
  "description_tooltip": null,
1536
- "layout": "IPY_MODEL_f8553ec713ea440eb0208a1012547988",
1537
  "placeholder": "​",
1538
- "style": "IPY_MODEL_25e0373512b747ba8ebe020b8b8ab932",
1539
- "value": "Loading checkpoint shards: 100%"
1540
  }
1541
  },
1542
- "92881d2e3f1a438b92a389cc6022f7ad": {
1543
  "model_module": "@jupyter-widgets/controls",
1544
- "model_name": "FloatProgressModel",
1545
  "model_module_version": "1.5.0",
 
1546
  "state": {
1547
  "_dom_classes": [],
1548
  "_model_module": "@jupyter-widgets/controls",
1549
  "_model_module_version": "1.5.0",
1550
- "_model_name": "FloatProgressModel",
1551
  "_view_count": null,
1552
  "_view_module": "@jupyter-widgets/controls",
1553
  "_view_module_version": "1.5.0",
1554
- "_view_name": "ProgressView",
1555
- "bar_style": "success",
1556
- "description": "",
1557
- "description_tooltip": null,
1558
- "layout": "IPY_MODEL_daff4ba27c68441395aa5377111f30f1",
1559
- "max": 2,
1560
- "min": 0,
1561
- "orientation": "horizontal",
1562
- "style": "IPY_MODEL_863090b3318e4e0186bd46d3d1479de4",
1563
- "value": 2
1564
  }
1565
  },
1566
- "f518ab021bc648f188638fd168879edd": {
1567
  "model_module": "@jupyter-widgets/controls",
1568
- "model_name": "HTMLModel",
1569
  "model_module_version": "1.5.0",
 
1570
  "state": {
1571
- "_dom_classes": [],
1572
  "_model_module": "@jupyter-widgets/controls",
1573
  "_model_module_version": "1.5.0",
1574
- "_model_name": "HTMLModel",
1575
  "_view_count": null,
1576
- "_view_module": "@jupyter-widgets/controls",
1577
- "_view_module_version": "1.5.0",
1578
- "_view_name": "HTMLView",
1579
- "description": "",
1580
- "description_tooltip": null,
1581
- "layout": "IPY_MODEL_acae1751ff5d4293bb588c2d9c7ab851",
1582
- "placeholder": "​",
1583
- "style": "IPY_MODEL_8859eb8d9c154cb79a302db1568768fa",
1584
- "value": " 2/2 [00:05&lt;00:00,  2.39s/it]"
1585
  }
1586
  },
1587
- "1a29c71234d74f08b2645f9383fee126": {
1588
  "model_module": "@jupyter-widgets/base",
1589
- "model_name": "LayoutModel",
1590
  "model_module_version": "1.2.0",
 
1591
  "state": {
1592
  "_model_module": "@jupyter-widgets/base",
1593
  "_model_module_version": "1.2.0",
@@ -1636,10 +1554,62 @@
1636
  "width": null
1637
  }
1638
  },
1639
- "f8553ec713ea440eb0208a1012547988": {
1640
  "model_module": "@jupyter-widgets/base",
 
1641
  "model_name": "LayoutModel",
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1642
  "model_module_version": "1.2.0",
 
1643
  "state": {
1644
  "_model_module": "@jupyter-widgets/base",
1645
  "_model_module_version": "1.2.0",
@@ -1688,25 +1658,34 @@
1688
  "width": null
1689
  }
1690
  },
1691
- "25e0373512b747ba8ebe020b8b8ab932": {
1692
  "model_module": "@jupyter-widgets/controls",
1693
- "model_name": "DescriptionStyleModel",
1694
  "model_module_version": "1.5.0",
 
1695
  "state": {
 
1696
  "_model_module": "@jupyter-widgets/controls",
1697
  "_model_module_version": "1.5.0",
1698
- "_model_name": "DescriptionStyleModel",
1699
  "_view_count": null,
1700
- "_view_module": "@jupyter-widgets/base",
1701
- "_view_module_version": "1.2.0",
1702
- "_view_name": "StyleView",
1703
- "description_width": ""
 
 
 
 
 
 
 
 
1704
  }
1705
  },
1706
- "daff4ba27c68441395aa5377111f30f1": {
1707
  "model_module": "@jupyter-widgets/base",
1708
- "model_name": "LayoutModel",
1709
  "model_module_version": "1.2.0",
 
1710
  "state": {
1711
  "_model_module": "@jupyter-widgets/base",
1712
  "_model_module_version": "1.2.0",
@@ -1755,26 +1734,31 @@
1755
  "width": null
1756
  }
1757
  },
1758
- "863090b3318e4e0186bd46d3d1479de4": {
1759
  "model_module": "@jupyter-widgets/controls",
1760
- "model_name": "ProgressStyleModel",
1761
  "model_module_version": "1.5.0",
 
1762
  "state": {
 
1763
  "_model_module": "@jupyter-widgets/controls",
1764
  "_model_module_version": "1.5.0",
1765
- "_model_name": "ProgressStyleModel",
1766
  "_view_count": null,
1767
- "_view_module": "@jupyter-widgets/base",
1768
- "_view_module_version": "1.2.0",
1769
- "_view_name": "StyleView",
1770
- "bar_color": null,
1771
- "description_width": ""
 
 
 
 
1772
  }
1773
  },
1774
- "acae1751ff5d4293bb588c2d9c7ab851": {
1775
  "model_module": "@jupyter-widgets/base",
1776
- "model_name": "LayoutModel",
1777
  "model_module_version": "1.2.0",
 
1778
  "state": {
1779
  "_model_module": "@jupyter-widgets/base",
1780
  "_model_module_version": "1.2.0",
@@ -1823,19 +1807,25 @@
1823
  "width": null
1824
  }
1825
  },
1826
- "8859eb8d9c154cb79a302db1568768fa": {
1827
  "model_module": "@jupyter-widgets/controls",
1828
- "model_name": "DescriptionStyleModel",
1829
  "model_module_version": "1.5.0",
 
1830
  "state": {
 
1831
  "_model_module": "@jupyter-widgets/controls",
1832
  "_model_module_version": "1.5.0",
1833
- "_model_name": "DescriptionStyleModel",
1834
  "_view_count": null,
1835
- "_view_module": "@jupyter-widgets/base",
1836
- "_view_module_version": "1.2.0",
1837
- "_view_name": "StyleView",
1838
- "description_width": ""
 
 
 
 
 
1839
  }
1840
  }
1841
  }
@@ -1843,4 +1833,4 @@
1843
  },
1844
  "nbformat": 4,
1845
  "nbformat_minor": 0
1846
- }
 
1
  {
2
  "cells": [
 
 
 
 
 
 
 
 
 
 
3
  {
4
  "cell_type": "markdown",
5
  "metadata": {
 
13
  },
14
  {
15
  "cell_type": "code",
16
+ "execution_count": 1,
 
 
17
  "metadata": {
 
18
  "colab": {
19
  "base_uri": "https://localhost:8080/"
20
  },
21
+ "id": "EB0gv8OzHfLV",
22
  "outputId": "9de07e75-ddf4-4347-fc41-432a23774e2c"
23
  },
 
24
  "outputs": [
25
  {
 
26
  "name": "stdout",
27
+ "output_type": "stream",
28
  "text": [
29
  " Installing build dependencies ... \u001b[?25l\u001b[?25hdone\n",
30
  " Getting requirements to build wheel ... \u001b[?25l\u001b[?25hdone\n",
 
42
  "\u001b[0m"
43
  ]
44
  }
45
+ ],
46
+ "source": [
47
+ "!pip install -q -U datasets bitsandbytes peft git+https://github.com/huggingface/transformers.git"
48
  ]
49
  },
50
  {
 
60
  "cell_type": "code",
61
  "execution_count": 2,
62
  "metadata": {
 
63
  "colab": {
64
  "base_uri": "https://localhost:8080/",
65
  "height": 17,
 
86
  "80df5f3cd6c646808b09d99daed5bfd2"
87
  ]
88
  },
89
+ "id": "NzJZSHD8tZZy",
90
  "outputId": "c01b2b6f-3c1e-45da-9fc0-f4f518bcca24"
91
  },
92
  "outputs": [
93
  {
 
94
  "data": {
 
 
 
95
  "application/vnd.jupyter.widget-view+json": {
96
+ "model_id": "4f0e85aa740146d3aca81588a0288031",
97
  "version_major": 2,
98
+ "version_minor": 0
99
+ },
100
+ "text/plain": [
101
+ "VBox(children=(HTML(value='<center> <img\\nsrc=https://huggingface.co/front/assets/huggingface_logo-noborder.sv…"
102
+ ]
103
  },
104
+ "metadata": {},
105
+ "output_type": "display_data"
106
  }
107
  ],
108
  "source": [
 
123
  "cell_type": "code",
124
  "execution_count": 1,
125
  "metadata": {
 
126
  "colab": {
127
  "base_uri": "https://localhost:8080/"
128
  },
129
+ "id": "az5kdSbNpjgH",
130
  "outputId": "2d9f379c-eb31-45b0-b84c-79c2a2577d01"
131
  },
132
  "outputs": [
133
  {
 
134
  "name": "stderr",
135
+ "output_type": "stream",
136
  "text": [
137
  "/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_auth.py:94: UserWarning: \n",
138
  "The secret `HF_TOKEN` does not exist in your Colab secrets.\n",
 
164
  "cell_type": "code",
165
  "execution_count": 3,
166
  "metadata": {
 
167
  "colab": {
168
  "base_uri": "https://localhost:8080/"
169
  },
170
+ "id": "TNJW2ty4yy4L",
171
  "outputId": "f76414b2-8f37-48ae-d369-b977323fa892"
172
  },
173
  "outputs": [
174
  {
 
175
  "data": {
176
  "text/plain": [
177
  "Dataset({\n",
 
180
  "})"
181
  ]
182
  },
183
+ "execution_count": 3,
184
  "metadata": {},
185
+ "output_type": "execute_result"
186
  }
187
  ],
188
  "source": [
 
214
  "cell_type": "code",
215
  "execution_count": 5,
216
  "metadata": {
 
217
  "colab": {
218
  "base_uri": "https://localhost:8080/",
219
  "height": 49,
 
231
  "a9a5503caf384b93bf987e5271a577d2"
232
  ]
233
  },
234
+ "id": "iZRvrfUquH1y",
235
  "outputId": "34f12289-6ef4-49d9-9257-ad0328961190"
236
  },
237
  "outputs": [
238
  {
 
239
  "data": {
 
 
 
240
  "application/vnd.jupyter.widget-view+json": {
241
+ "model_id": "8458933373264dbeb58d0b5ace4fd9c6",
242
  "version_major": 2,
243
+ "version_minor": 0
244
+ },
245
+ "text/plain": [
246
+ "Loading checkpoint shards: 0%| | 0/2 [00:00<?, ?it/s]"
247
+ ]
248
  },
249
+ "metadata": {},
250
+ "output_type": "display_data"
251
  }
252
  ],
253
  "source": [
 
276
  "cell_type": "code",
277
  "execution_count": 6,
278
  "metadata": {
 
279
  "colab": {
280
  "base_uri": "https://localhost:8080/",
281
  "height": 66,
 
293
  "8859eb8d9c154cb79a302db1568768fa"
294
  ]
295
  },
296
+ "id": "9AYeuyzNuJ9X",
297
  "outputId": "aaedd707-f694-4ba8-ba43-7ae2a3739e73"
298
  },
299
  "outputs": [
300
  {
 
301
  "data": {
 
 
 
302
  "application/vnd.jupyter.widget-view+json": {
303
+ "model_id": "c68f0fe7a6bb4060afcb05e3f6422288",
304
  "version_major": 2,
305
+ "version_minor": 0
306
+ },
307
+ "text/plain": [
308
+ "Loading checkpoint shards: 0%| | 0/2 [00:00<?, ?it/s]"
309
+ ]
310
  },
311
+ "metadata": {},
312
+ "output_type": "display_data"
313
  },
314
  {
 
315
  "name": "stdout",
316
+ "output_type": "stream",
317
  "text": [
318
  "trainable params: 11,876,352 || all params: 3,044,118,768 || trainable%: 0.3901\n"
319
  ]
 
339
  },
340
  {
341
  "cell_type": "markdown",
 
 
 
342
  "metadata": {
343
  "id": "sfxtN1iKRWXX"
344
+ },
345
+ "source": [
346
+ "We need to cast the tokens to the same dtype as the model, so we store the model's dtype in a variable."
347
+ ]
348
  },
349
  {
350
  "cell_type": "code",
351
+ "execution_count": 7,
 
 
352
  "metadata": {
353
  "id": "uGZ6FnioRWEc"
354
  },
355
+ "outputs": [],
356
+ "source": [
357
+ "DTYPE = model.dtype"
358
+ ]
359
  },
360
  {
361
  "cell_type": "markdown",
 
368
  },
369
  {
370
  "cell_type": "code",
371
+ "execution_count": 8,
 
 
372
  "metadata": {
373
  "id": "wQ_gbnXARKz1"
374
  },
375
+ "outputs": [],
376
+ "source": [
377
+ "processor = PaliGemmaProcessor.from_pretrained(model_id)"
378
+ ]
379
  },
380
  {
381
  "cell_type": "markdown",
 
476
  },
477
  {
478
  "cell_type": "markdown",
479
+ "metadata": {
480
+ "id": "ZX912_liP-Eh"
481
+ },
482
  "source": [
483
  "LoRA with a batch size of 2 works on an A100 Colab instance. You can apply gradient accumulation (which is enabled in this notebook) to simulate larger batch sizes.\n",
484
  "There is currently an issue with QLoRA; we are investigating and will fix it soon."
485
+ ]
 
 
 
486
  },
487
  {
488
  "cell_type": "code",
 
520
  "accelerator": "GPU",
521
  "colab": {
522
  "gpuType": "A100",
523
+ "include_colab_link": true,
524
+ "provenance": []
525
  },
526
  "kernelspec": {
527
  "display_name": "Python 3",
 
541
  },
542
  "widgets": {
543
  "application/vnd.jupyter.widget-state+json": {
544
+ "1a29c71234d74f08b2645f9383fee126": {
545
+ "model_module": "@jupyter-widgets/base",
546
+ "model_module_version": "1.2.0",
547
+ "model_name": "LayoutModel",
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
548
  "state": {
549
+ "_model_module": "@jupyter-widgets/base",
550
+ "_model_module_version": "1.2.0",
551
+ "_model_name": "LayoutModel",
 
552
  "_view_count": null,
553
+ "_view_module": "@jupyter-widgets/base",
554
+ "_view_module_version": "1.2.0",
555
+ "_view_name": "LayoutView",
556
+ "align_content": null,
557
+ "align_items": null,
558
+ "align_self": null,
559
+ "border": null,
560
+ "bottom": null,
561
+ "display": null,
562
+ "flex": null,
563
+ "flex_flow": null,
564
+ "grid_area": null,
565
+ "grid_auto_columns": null,
566
+ "grid_auto_flow": null,
567
+ "grid_auto_rows": null,
568
+ "grid_column": null,
569
+ "grid_gap": null,
570
+ "grid_row": null,
571
+ "grid_template_areas": null,
572
+ "grid_template_columns": null,
573
+ "grid_template_rows": null,
574
+ "height": null,
575
+ "justify_content": null,
576
+ "justify_items": null,
577
+ "left": null,
578
+ "margin": null,
579
+ "max_height": null,
580
+ "max_width": null,
581
+ "min_height": null,
582
+ "min_width": null,
583
+ "object_fit": null,
584
+ "object_position": null,
585
+ "order": null,
586
+ "overflow": null,
587
+ "overflow_x": null,
588
+ "overflow_y": null,
589
+ "padding": null,
590
+ "right": null,
591
+ "top": null,
592
+ "visibility": null,
593
+ "width": null
594
  }
595
  },
596
+ "23ddab24ac304751b3babfaeec9360eb": {
597
  "model_module": "@jupyter-widgets/controls",
 
598
  "model_module_version": "1.5.0",
599
+ "model_name": "DescriptionStyleModel",
600
  "state": {
 
601
  "_model_module": "@jupyter-widgets/controls",
602
  "_model_module_version": "1.5.0",
603
+ "_model_name": "DescriptionStyleModel",
604
  "_view_count": null,
605
+ "_view_module": "@jupyter-widgets/base",
606
+ "_view_module_version": "1.2.0",
607
+ "_view_name": "StyleView",
608
+ "description_width": ""
 
 
 
 
 
 
609
  }
610
  },
611
+ "25e0373512b747ba8ebe020b8b8ab932": {
612
  "model_module": "@jupyter-widgets/controls",
 
613
  "model_module_version": "1.5.0",
614
+ "model_name": "DescriptionStyleModel",
615
  "state": {
 
616
  "_model_module": "@jupyter-widgets/controls",
617
  "_model_module_version": "1.5.0",
618
+ "_model_name": "DescriptionStyleModel",
619
  "_view_count": null,
620
+ "_view_module": "@jupyter-widgets/base",
621
+ "_view_module_version": "1.2.0",
622
+ "_view_name": "StyleView",
623
+ "description_width": ""
 
 
 
 
 
 
624
  }
625
  },
626
  "2d8493a60b7a42c1b25ec0bbe0a59043": {
627
  "model_module": "@jupyter-widgets/controls",
 
628
  "model_module_version": "1.5.0",
629
+ "model_name": "HTMLModel",
630
  "state": {
631
  "_dom_classes": [],
632
  "_model_module": "@jupyter-widgets/controls",
 
644
  "value": "\n<b>Pro Tip:</b> If you don't already have one, you can create a dedicated\n'notebooks' token with 'write' access, that you can then easily reuse for all\nnotebooks. </center>"
645
  }
646
  },
647
+ "3ca0e1427ac6477c9921929af7ff00d1": {
648
  "model_module": "@jupyter-widgets/base",
 
649
  "model_module_version": "1.2.0",
650
+ "model_name": "LayoutModel",
651
  "state": {
652
  "_model_module": "@jupyter-widgets/base",
653
  "_model_module_version": "1.2.0",
 
657
  "_view_module_version": "1.2.0",
658
  "_view_name": "LayoutView",
659
  "align_content": null,
660
+ "align_items": null,
661
  "align_self": null,
662
  "border": null,
663
  "bottom": null,
664
+ "display": null,
665
  "flex": null,
666
+ "flex_flow": null,
667
  "grid_area": null,
668
  "grid_auto_columns": null,
669
  "grid_auto_flow": null,
 
693
  "right": null,
694
  "top": null,
695
  "visibility": null,
696
+ "width": null
697
  }
698
  },
699
+ "3deca9286f89422aa691325b39347b0b": {
700
  "model_module": "@jupyter-widgets/controls",
 
701
  "model_module_version": "1.5.0",
702
+ "model_name": "CheckboxModel",
703
  "state": {
704
+ "_dom_classes": [],
705
  "_model_module": "@jupyter-widgets/controls",
706
  "_model_module_version": "1.5.0",
707
+ "_model_name": "CheckboxModel",
708
  "_view_count": null,
709
+ "_view_module": "@jupyter-widgets/controls",
710
+ "_view_module_version": "1.5.0",
711
+ "_view_name": "CheckboxView",
712
+ "description": "Add token as git credential?",
713
+ "description_tooltip": null,
714
+ "disabled": false,
715
+ "indent": true,
716
+ "layout": "IPY_MODEL_b6fac3155dd140bc8e1b010270bc3cc2",
717
+ "style": "IPY_MODEL_ca348c721475417582ed5018ed43151f",
718
+ "value": true
719
  }
720
  },
721
+ "3f07afac7c194db7a16167d177562a46": {
722
  "model_module": "@jupyter-widgets/base",
 
723
  "model_module_version": "1.2.0",
724
+ "model_name": "LayoutModel",
725
  "state": {
726
  "_model_module": "@jupyter-widgets/base",
727
  "_model_module_version": "1.2.0",
 
772
  },
773
  "40782cfc43a8437da5534feee03c6ba6": {
774
  "model_module": "@jupyter-widgets/controls",
 
775
  "model_module_version": "1.5.0",
776
+ "model_name": "DescriptionStyleModel",
777
  "state": {
778
  "_model_module": "@jupyter-widgets/controls",
779
  "_model_module_version": "1.5.0",
 
785
  "description_width": ""
786
  }
787
  },
788
+ "46810cc7c7c54e31a65e609c386d86d9": {
789
  "model_module": "@jupyter-widgets/base",
 
790
  "model_module_version": "1.2.0",
791
+ "model_name": "LayoutModel",
792
  "state": {
793
  "_model_module": "@jupyter-widgets/base",
794
  "_model_module_version": "1.2.0",
 
837
  "width": null
838
  }
839
  },
840
+ "4f0e85aa740146d3aca81588a0288031": {
841
  "model_module": "@jupyter-widgets/controls",
 
842
  "model_module_version": "1.5.0",
843
+ "model_name": "VBoxModel",
844
  "state": {
845
+ "_dom_classes": [],
846
  "_model_module": "@jupyter-widgets/controls",
847
  "_model_module_version": "1.5.0",
848
+ "_model_name": "VBoxModel",
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
849
  "_view_count": null,
850
+ "_view_module": "@jupyter-widgets/controls",
851
+ "_view_module_version": "1.5.0",
852
+ "_view_name": "VBoxView",
853
+ "box_style": "",
854
+ "children": [],
855
+ "layout": "IPY_MODEL_c25efe32ee7c40d3a4c95093abb2a720"
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
856
  }
857
  },
858
  "5515d96f0c8947f0ad4b7f17eb7d63f6": {
859
  "model_module": "@jupyter-widgets/controls",
 
860
  "model_module_version": "1.5.0",
861
+ "model_name": "ButtonStyleModel",
862
  "state": {
863
  "_model_module": "@jupyter-widgets/controls",
864
  "_model_module_version": "1.5.0",
 
871
  "font_weight": ""
872
  }
873
  },
874
+ "55c01e2c04d1499ca5b9b19dea7e4e02": {
875
  "model_module": "@jupyter-widgets/base",
 
876
  "model_module_version": "1.2.0",
877
+ "model_name": "LayoutModel",
878
  "state": {
879
  "_model_module": "@jupyter-widgets/base",
880
  "_model_module_version": "1.2.0",
 
923
  "width": null
924
  }
925
  },
926
+ "5aed84a20ac34f2b943d26d66decc88f": {
927
  "model_module": "@jupyter-widgets/controls",
 
928
  "model_module_version": "1.5.0",
929
+ "model_name": "ProgressStyleModel",
930
  "state": {
931
  "_model_module": "@jupyter-widgets/controls",
932
  "_model_module_version": "1.5.0",
933
+ "_model_name": "ProgressStyleModel",
934
  "_view_count": null,
935
  "_view_module": "@jupyter-widgets/base",
936
  "_view_module_version": "1.2.0",
937
  "_view_name": "StyleView",
938
+ "bar_color": null,
939
  "description_width": ""
940
  }
941
  },
942
  "65f10d2456cb4ee1963fac050e4c34f7": {
943
  "model_module": "@jupyter-widgets/controls",
 
944
  "model_module_version": "1.5.0",
945
+ "model_name": "LabelModel",
946
  "state": {
947
  "_dom_classes": [],
948
  "_model_module": "@jupyter-widgets/controls",
 
960
  "value": "Connecting..."
961
  }
962
  },
963
+ "7138aa9537fc4b4f809e57665be87139": {
964
+ "model_module": "@jupyter-widgets/controls",
965
+ "model_module_version": "1.5.0",
966
+ "model_name": "HTMLModel",
967
+ "state": {
968
+ "_dom_classes": [],
969
+ "_model_module": "@jupyter-widgets/controls",
970
+ "_model_module_version": "1.5.0",
971
+ "_model_name": "HTMLModel",
972
+ "_view_count": null,
973
+ "_view_module": "@jupyter-widgets/controls",
974
+ "_view_module_version": "1.5.0",
975
+ "_view_name": "HTMLView",
976
+ "description": "",
977
+ "description_tooltip": null,
978
+ "layout": "IPY_MODEL_3ca0e1427ac6477c9921929af7ff00d1",
979
+ "placeholder": "​",
980
+ "style": "IPY_MODEL_a9a5503caf384b93bf987e5271a577d2",
981
+ "value": " 2/2 [00:00&lt;00:00,  2.83it/s]"
982
+ }
983
+ },
984
+ "714009484da745dc8a87e5066b939de2": {
985
+ "model_module": "@jupyter-widgets/controls",
986
+ "model_module_version": "1.5.0",
987
+ "model_name": "HTMLModel",
988
+ "state": {
989
+ "_dom_classes": [],
990
+ "_model_module": "@jupyter-widgets/controls",
991
+ "_model_module_version": "1.5.0",
992
+ "_model_name": "HTMLModel",
993
+ "_view_count": null,
994
+ "_view_module": "@jupyter-widgets/controls",
995
+ "_view_module_version": "1.5.0",
996
+ "_view_name": "HTMLView",
997
+ "description": "",
998
+ "description_tooltip": null,
999
+ "layout": "IPY_MODEL_cfed7deef0b74f4b9d160e9fdc2b138e",
1000
+ "placeholder": "​",
1001
+ "style": "IPY_MODEL_23ddab24ac304751b3babfaeec9360eb",
1002
+ "value": "Loading checkpoint shards: 100%"
1003
+ }
1004
+ },
1005
+ "757bc788bd6842d28a9f889187ffb88e": {
1006
+ "model_module": "@jupyter-widgets/controls",
1007
+ "model_module_version": "1.5.0",
1008
+ "model_name": "DescriptionStyleModel",
1009
+ "state": {
1010
+ "_model_module": "@jupyter-widgets/controls",
1011
+ "_model_module_version": "1.5.0",
1012
+ "_model_name": "DescriptionStyleModel",
1013
+ "_view_count": null,
1014
+ "_view_module": "@jupyter-widgets/base",
1015
+ "_view_module_version": "1.2.0",
1016
+ "_view_name": "StyleView",
1017
+ "description_width": ""
1018
+ }
1019
+ },
1020
+ "79e87175ffb949bd8cddf4577210a42d": {
1021
  "model_module": "@jupyter-widgets/base",
 
1022
  "model_module_version": "1.2.0",
1023
+ "model_name": "LayoutModel",
1024
  "state": {
1025
  "_model_module": "@jupyter-widgets/base",
1026
  "_model_module_version": "1.2.0",
 
1071
  },
1072
  "80df5f3cd6c646808b09d99daed5bfd2": {
1073
  "model_module": "@jupyter-widgets/controls",
 
1074
  "model_module_version": "1.5.0",
1075
+ "model_name": "DescriptionStyleModel",
1076
  "state": {
1077
  "_model_module": "@jupyter-widgets/controls",
1078
  "_model_module_version": "1.5.0",
 
1086
  },
1087
  "8458933373264dbeb58d0b5ace4fd9c6": {
1088
  "model_module": "@jupyter-widgets/controls",
 
1089
  "model_module_version": "1.5.0",
1090
+ "model_name": "HBoxModel",
1091
  "state": {
1092
  "_dom_classes": [],
1093
  "_model_module": "@jupyter-widgets/controls",
 
1106
  "layout": "IPY_MODEL_46810cc7c7c54e31a65e609c386d86d9"
1107
  }
1108
  },
1109
+ "863090b3318e4e0186bd46d3d1479de4": {
1110
  "model_module": "@jupyter-widgets/controls",
 
1111
  "model_module_version": "1.5.0",
1112
+ "model_name": "ProgressStyleModel",
1113
  "state": {
 
1114
  "_model_module": "@jupyter-widgets/controls",
1115
  "_model_module_version": "1.5.0",
1116
+ "_model_name": "ProgressStyleModel",
1117
  "_view_count": null,
1118
+ "_view_module": "@jupyter-widgets/base",
1119
+ "_view_module_version": "1.2.0",
1120
+ "_view_name": "StyleView",
1121
+ "bar_color": null,
1122
+ "description_width": ""
 
 
 
 
1123
  }
1124
  },
1125
+ "8859eb8d9c154cb79a302db1568768fa": {
1126
  "model_module": "@jupyter-widgets/controls",
 
1127
  "model_module_version": "1.5.0",
1128
+ "model_name": "DescriptionStyleModel",
1129
  "state": {
 
1130
  "_model_module": "@jupyter-widgets/controls",
1131
  "_model_module_version": "1.5.0",
1132
+ "_model_name": "DescriptionStyleModel",
1133
  "_view_count": null,
1134
+ "_view_module": "@jupyter-widgets/base",
1135
+ "_view_module_version": "1.2.0",
1136
+ "_view_name": "StyleView",
1137
+ "description_width": ""
1138
  }
1139
  },
1140
+ "92881d2e3f1a438b92a389cc6022f7ad": {
1141
  "model_module": "@jupyter-widgets/controls",
 
1142
  "model_module_version": "1.5.0",
1143
+ "model_name": "FloatProgressModel",
1144
  "state": {
1145
  "_dom_classes": [],
1146
  "_model_module": "@jupyter-widgets/controls",
1147
  "_model_module_version": "1.5.0",
1148
+ "_model_name": "FloatProgressModel",
1149
  "_view_count": null,
1150
  "_view_module": "@jupyter-widgets/controls",
1151
  "_view_module_version": "1.5.0",
1152
+ "_view_name": "ProgressView",
1153
+ "bar_style": "success",
1154
  "description": "",
1155
  "description_tooltip": null,
1156
+ "layout": "IPY_MODEL_daff4ba27c68441395aa5377111f30f1",
1157
+ "max": 2,
1158
+ "min": 0,
1159
+ "orientation": "horizontal",
1160
+ "style": "IPY_MODEL_863090b3318e4e0186bd46d3d1479de4",
1161
+ "value": 2
1162
  }
1163
  },
1164
+ "9335e48fe8ba4fe9b535b5ece1be6ff5": {
1165
  "model_module": "@jupyter-widgets/base",
 
1166
  "model_module_version": "1.2.0",
1167
+ "model_name": "LayoutModel",
1168
  "state": {
1169
  "_model_module": "@jupyter-widgets/base",
1170
  "_model_module_version": "1.2.0",
 
1213
  "width": null
1214
  }
1215
  },
1216
+ "a9a5503caf384b93bf987e5271a577d2": {
1217
+ "model_module": "@jupyter-widgets/controls",
1218
+ "model_module_version": "1.5.0",
1219
+ "model_name": "DescriptionStyleModel",
1220
+ "state": {
1221
+ "_model_module": "@jupyter-widgets/controls",
1222
+ "_model_module_version": "1.5.0",
1223
+ "_model_name": "DescriptionStyleModel",
1224
+ "_view_count": null,
1225
+ "_view_module": "@jupyter-widgets/base",
1226
+ "_view_module_version": "1.2.0",
1227
+ "_view_name": "StyleView",
1228
+ "description_width": ""
1229
+ }
1230
+ },
1231
+ "acae1751ff5d4293bb588c2d9c7ab851": {
1232
  "model_module": "@jupyter-widgets/base",
 
1233
  "model_module_version": "1.2.0",
1234
+ "model_name": "LayoutModel",
1235
  "state": {
1236
  "_model_module": "@jupyter-widgets/base",
1237
  "_model_module_version": "1.2.0",
 
1280
  "width": null
1281
  }
1282
  },
1283
+ "b6fac3155dd140bc8e1b010270bc3cc2": {
1284
  "model_module": "@jupyter-widgets/base",
 
1285
  "model_module_version": "1.2.0",
1286
+ "model_name": "LayoutModel",
1287
  "state": {
1288
  "_model_module": "@jupyter-widgets/base",
1289
  "_model_module_version": "1.2.0",
 
1332
  "width": null
1333
  }
1334
  },
1335
+ "bf9da831d7ad4651a262c5e7f80bbf87": {
1336
  "model_module": "@jupyter-widgets/controls",
 
1337
  "model_module_version": "1.5.0",
1338
+ "model_name": "DescriptionStyleModel",
1339
  "state": {
1340
  "_model_module": "@jupyter-widgets/controls",
1341
  "_model_module_version": "1.5.0",
1342
+ "_model_name": "DescriptionStyleModel",
1343
  "_view_count": null,
1344
  "_view_module": "@jupyter-widgets/base",
1345
  "_view_module_version": "1.2.0",
1346
  "_view_name": "StyleView",
 
1347
  "description_width": ""
1348
  }
1349
  },
1350
+ "c25efe32ee7c40d3a4c95093abb2a720": {
1351
  "model_module": "@jupyter-widgets/base",
 
1352
  "model_module_version": "1.2.0",
1353
+ "model_name": "LayoutModel",
1354
  "state": {
1355
  "_model_module": "@jupyter-widgets/base",
1356
  "_model_module_version": "1.2.0",
 
1360
  "_view_module_version": "1.2.0",
1361
  "_view_name": "LayoutView",
1362
  "align_content": null,
1363
+ "align_items": "center",
1364
  "align_self": null,
1365
  "border": null,
1366
  "bottom": null,
1367
+ "display": "flex",
1368
  "flex": null,
1369
+ "flex_flow": "column",
1370
  "grid_area": null,
1371
  "grid_auto_columns": null,
1372
  "grid_auto_flow": null,
 
1396
  "right": null,
1397
  "top": null,
1398
  "visibility": null,
1399
+ "width": "50%"
1400
  }
1401
  },
1402
+ "c3fad0f1cb954317a20ee158f7e10363": {
1403
  "model_module": "@jupyter-widgets/controls",
 
1404
  "model_module_version": "1.5.0",
1405
+ "model_name": "PasswordModel",
1406
  "state": {
1407
+ "_dom_classes": [],
1408
  "_model_module": "@jupyter-widgets/controls",
1409
  "_model_module_version": "1.5.0",
1410
+ "_model_name": "PasswordModel",
1411
  "_view_count": null,
1412
+ "_view_module": "@jupyter-widgets/controls",
1413
+ "_view_module_version": "1.5.0",
1414
+ "_view_name": "PasswordView",
1415
+ "continuous_update": true,
1416
+ "description": "Token:",
1417
+ "description_tooltip": null,
1418
+ "disabled": false,
1419
+ "layout": "IPY_MODEL_ed2d3d1a700143d2a48e9a9b13bd1200",
1420
+ "placeholder": "​",
1421
+ "style": "IPY_MODEL_40782cfc43a8437da5534feee03c6ba6",
1422
+ "value": ""
1423
  }
1424
  },
1425
  "c68f0fe7a6bb4060afcb05e3f6422288": {
1426
  "model_module": "@jupyter-widgets/controls",
 
1427
  "model_module_version": "1.5.0",
1428
+ "model_name": "HBoxModel",
1429
  "state": {
1430
  "_dom_classes": [],
1431
  "_model_module": "@jupyter-widgets/controls",
 
1444
  "layout": "IPY_MODEL_1a29c71234d74f08b2645f9383fee126"
1445
  }
1446
  },
1447
+ "c7fcb9dd46e649c4b8bd967b69bdb867": {
1448
  "model_module": "@jupyter-widgets/controls",
 
1449
  "model_module_version": "1.5.0",
1450
+ "model_name": "HTMLModel",
1451
  "state": {
1452
  "_dom_classes": [],
1453
  "_model_module": "@jupyter-widgets/controls",
 
1459
  "_view_name": "HTMLView",
1460
  "description": "",
1461
  "description_tooltip": null,
1462
+ "layout": "IPY_MODEL_55c01e2c04d1499ca5b9b19dea7e4e02",
1463
  "placeholder": "​",
1464
+ "style": "IPY_MODEL_bf9da831d7ad4651a262c5e7f80bbf87",
1465
+ "value": "<center> <img\nsrc=https://huggingface.co/front/assets/huggingface_logo-noborder.svg\nalt='Hugging Face'> <br> Copy a token from <a\nhref=\"https://huggingface.co/settings/tokens\" target=\"_blank\">your Hugging Face\ntokens page</a> and paste it below. <br> Immediately click login after copying\nyour token or it might be stored in plain text in this notebook file. </center>"
1466
  }
1467
  },
1468
+ "ca1c290bfb654f1190bbde68d51167f1": {
1469
  "model_module": "@jupyter-widgets/controls",
 
1470
  "model_module_version": "1.5.0",
1471
+ "model_name": "ButtonModel",
1472
  "state": {
1473
  "_dom_classes": [],
1474
  "_model_module": "@jupyter-widgets/controls",
1475
  "_model_module_version": "1.5.0",
1476
+ "_model_name": "ButtonModel",
1477
  "_view_count": null,
1478
  "_view_module": "@jupyter-widgets/controls",
1479
  "_view_module_version": "1.5.0",
1480
+ "_view_name": "ButtonView",
1481
+ "button_style": "",
1482
+ "description": "Login",
1483
+ "disabled": false,
1484
+ "icon": "",
1485
+ "layout": "IPY_MODEL_3f07afac7c194db7a16167d177562a46",
1486
+ "style": "IPY_MODEL_5515d96f0c8947f0ad4b7f17eb7d63f6",
1487
+ "tooltip": ""
 
 
1488
  }
1489
  },
1490
+ "ca348c721475417582ed5018ed43151f": {
1491
  "model_module": "@jupyter-widgets/controls",
 
1492
  "model_module_version": "1.5.0",
1493
+ "model_name": "DescriptionStyleModel",
1494
  "state": {
 
1495
  "_model_module": "@jupyter-widgets/controls",
1496
  "_model_module_version": "1.5.0",
1497
+ "_model_name": "DescriptionStyleModel",
1498
  "_view_count": null,
1499
+ "_view_module": "@jupyter-widgets/base",
1500
+ "_view_module_version": "1.2.0",
1501
+ "_view_name": "StyleView",
1502
+ "description_width": ""
 
 
 
 
 
1503
  }
1504
  },
1505
+ "cfed7deef0b74f4b9d160e9fdc2b138e": {
1506
  "model_module": "@jupyter-widgets/base",
 
1507
  "model_module_version": "1.2.0",
1508
+ "model_name": "LayoutModel",
1509
  "state": {
1510
  "_model_module": "@jupyter-widgets/base",
1511
  "_model_module_version": "1.2.0",
 
1554
  "width": null
1555
  }
1556
  },
1557
+ "d703de12cf9d4f87aa6ec2cc52f1090a": {
1558
  "model_module": "@jupyter-widgets/base",
1559
+ "model_module_version": "1.2.0",
1560
  "model_name": "LayoutModel",
1561
+ "state": {
1562
+ "_model_module": "@jupyter-widgets/base",
1563
+ "_model_module_version": "1.2.0",
1564
+ "_model_name": "LayoutModel",
1565
+ "_view_count": null,
1566
+ "_view_module": "@jupyter-widgets/base",
1567
+ "_view_module_version": "1.2.0",
1568
+ "_view_name": "LayoutView",
1569
+ "align_content": null,
1570
+ "align_items": null,
1571
+ "align_self": null,
1572
+ "border": null,
1573
+ "bottom": null,
1574
+ "display": null,
1575
+ "flex": null,
1576
+ "flex_flow": null,
1577
+ "grid_area": null,
1578
+ "grid_auto_columns": null,
1579
+ "grid_auto_flow": null,
1580
+ "grid_auto_rows": null,
1581
+ "grid_column": null,
1582
+ "grid_gap": null,
1583
+ "grid_row": null,
1584
+ "grid_template_areas": null,
1585
+ "grid_template_columns": null,
1586
+ "grid_template_rows": null,
1587
+ "height": null,
1588
+ "justify_content": null,
1589
+ "justify_items": null,
1590
+ "left": null,
1591
+ "margin": null,
1592
+ "max_height": null,
1593
+ "max_width": null,
1594
+ "min_height": null,
1595
+ "min_width": null,
1596
+ "object_fit": null,
1597
+ "object_position": null,
1598
+ "order": null,
1599
+ "overflow": null,
1600
+ "overflow_x": null,
1601
+ "overflow_y": null,
1602
+ "padding": null,
1603
+ "right": null,
1604
+ "top": null,
1605
+ "visibility": null,
1606
+ "width": null
1607
+ }
1608
+ },
1609
+ "daff4ba27c68441395aa5377111f30f1": {
1610
+ "model_module": "@jupyter-widgets/base",
1611
  "model_module_version": "1.2.0",
1612
+ "model_name": "LayoutModel",
1613
  "state": {
1614
  "_model_module": "@jupyter-widgets/base",
1615
  "_model_module_version": "1.2.0",
 
1658
  "width": null
1659
  }
1660
  },
1661
+ "e43e970ce8ba477e83081a4c7fea05f5": {
1662
  "model_module": "@jupyter-widgets/controls",
 
1663
  "model_module_version": "1.5.0",
1664
+ "model_name": "FloatProgressModel",
1665
  "state": {
1666
+ "_dom_classes": [],
1667
  "_model_module": "@jupyter-widgets/controls",
1668
  "_model_module_version": "1.5.0",
1669
+ "_model_name": "FloatProgressModel",
1670
  "_view_count": null,
1671
+ "_view_module": "@jupyter-widgets/controls",
1672
+ "_view_module_version": "1.5.0",
1673
+ "_view_name": "ProgressView",
1674
+ "bar_style": "success",
1675
+ "description": "",
1676
+ "description_tooltip": null,
1677
+ "layout": "IPY_MODEL_79e87175ffb949bd8cddf4577210a42d",
1678
+ "max": 2,
1679
+ "min": 0,
1680
+ "orientation": "horizontal",
1681
+ "style": "IPY_MODEL_5aed84a20ac34f2b943d26d66decc88f",
1682
+ "value": 2
1683
  }
1684
  },
1685
+ "ed2d3d1a700143d2a48e9a9b13bd1200": {
1686
  "model_module": "@jupyter-widgets/base",
 
1687
  "model_module_version": "1.2.0",
1688
+ "model_name": "LayoutModel",
1689
  "state": {
1690
  "_model_module": "@jupyter-widgets/base",
1691
  "_model_module_version": "1.2.0",
 
1734
  "width": null
1735
  }
1736
  },
1737
+ "f518ab021bc648f188638fd168879edd": {
1738
  "model_module": "@jupyter-widgets/controls",
 
1739
  "model_module_version": "1.5.0",
1740
+ "model_name": "HTMLModel",
1741
  "state": {
1742
+ "_dom_classes": [],
1743
  "_model_module": "@jupyter-widgets/controls",
1744
  "_model_module_version": "1.5.0",
1745
+ "_model_name": "HTMLModel",
1746
  "_view_count": null,
1747
+ "_view_module": "@jupyter-widgets/controls",
1748
+ "_view_module_version": "1.5.0",
1749
+ "_view_name": "HTMLView",
1750
+ "description": "",
1751
+ "description_tooltip": null,
1752
+ "layout": "IPY_MODEL_acae1751ff5d4293bb588c2d9c7ab851",
1753
+ "placeholder": "​",
1754
+ "style": "IPY_MODEL_8859eb8d9c154cb79a302db1568768fa",
1755
+ "value": " 2/2 [00:05&lt;00:00,  2.39s/it]"
1756
  }
1757
  },
1758
+ "f8553ec713ea440eb0208a1012547988": {
1759
  "model_module": "@jupyter-widgets/base",
 
1760
  "model_module_version": "1.2.0",
1761
+ "model_name": "LayoutModel",
1762
  "state": {
1763
  "_model_module": "@jupyter-widgets/base",
1764
  "_model_module_version": "1.2.0",
 
1807
  "width": null
1808
  }
1809
  },
1810
+ "fef3c94897fc4ffa86f91aac7a45ac7f": {
1811
  "model_module": "@jupyter-widgets/controls",
 
1812
  "model_module_version": "1.5.0",
1813
+ "model_name": "HTMLModel",
1814
  "state": {
1815
+ "_dom_classes": [],
1816
  "_model_module": "@jupyter-widgets/controls",
1817
  "_model_module_version": "1.5.0",
1818
+ "_model_name": "HTMLModel",
1819
  "_view_count": null,
1820
+ "_view_module": "@jupyter-widgets/controls",
1821
+ "_view_module_version": "1.5.0",
1822
+ "_view_name": "HTMLView",
1823
+ "description": "",
1824
+ "description_tooltip": null,
1825
+ "layout": "IPY_MODEL_f8553ec713ea440eb0208a1012547988",
1826
+ "placeholder": "​",
1827
+ "style": "IPY_MODEL_25e0373512b747ba8ebe020b8b8ab932",
1828
+ "value": "Loading checkpoint shards: 100%"
1829
  }
1830
  }
1831
  }
 
1833
  },
1834
  "nbformat": 4,
1835
  "nbformat_minor": 0
1836
+ }
Fine_tune_SmolVLM2_on_Video.ipynb CHANGED
@@ -1,15 +1,5 @@
1
  {
2
  "cells": [
3
- {
4
- "cell_type": "markdown",
5
- "metadata": {
6
- "id": "view-in-github",
7
- "colab_type": "text"
8
- },
9
- "source": [
10
- "<a href=\"https://colab.research.google.com/github/merveenoyan/smol-vision/blob/main/Fine_tune_SmolVLM2_on_Video.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
11
- ]
12
- },
13
  {
14
  "cell_type": "markdown",
15
  "metadata": {
@@ -32,8 +22,8 @@
32
  },
33
  "outputs": [
34
  {
35
- "output_type": "stream",
36
  "name": "stdout",
 
37
  "text": [
38
  " Preparing metadata (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
39
  "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m163.5/163.5 kB\u001b[0m \u001b[31m5.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
@@ -47,14 +37,14 @@
47
  },
48
  {
49
  "cell_type": "code",
50
- "source": [
51
- "!pip install -q git+https://github.com/huggingface/transformers.git"
52
- ],
53
  "metadata": {
54
  "id": "FCYgmJtDRElR"
55
  },
56
- "execution_count": null,
57
- "outputs": []
 
 
58
  },
59
  {
60
  "cell_type": "code",
@@ -111,18 +101,18 @@
111
  },
112
  "outputs": [
113
  {
114
- "output_type": "display_data",
115
  "data": {
116
- "text/plain": [
117
- "VBox(children=(HTML(value='<center> <img\\nsrc=https://huggingface.co/front/assets/huggingface_logo-noborder.sv…"
118
- ],
119
  "application/vnd.jupyter.widget-view+json": {
 
120
  "version_major": 2,
121
- "version_minor": 0,
122
- "model_id": "112da28d935543069e7a1a2abc22f9f4"
123
- }
 
 
124
  },
125
- "metadata": {}
 
126
  }
127
  ],
128
  "source": [
@@ -154,8 +144,8 @@
154
  },
155
  "outputs": [
156
  {
157
- "output_type": "stream",
158
  "name": "stderr",
 
159
  "text": [
160
  "/usr/local/lib/python3.11/dist-packages/huggingface_hub/utils/_auth.py:94: UserWarning: \n",
161
  "The secret `HF_TOKEN` does not exist in your Colab secrets.\n",
@@ -167,8 +157,8 @@
167
  ]
168
  },
169
  {
170
- "output_type": "stream",
171
  "name": "stdout",
 
172
  "text": [
173
  "The model as is is holding: 0.97 of GPU RAM\n"
174
  ]
@@ -280,14 +270,14 @@
280
  },
281
  {
282
  "cell_type": "code",
283
- "source": [
284
- "del split_ds, ds"
285
- ],
286
  "metadata": {
287
  "id": "KKEZPwinSwTr"
288
  },
289
- "execution_count": null,
290
- "outputs": []
 
 
291
  },
292
  {
293
  "cell_type": "markdown",
@@ -310,8 +300,8 @@
310
  },
311
  "outputs": [
312
  {
313
- "output_type": "stream",
314
  "name": "stdout",
 
315
  "text": [
316
  "prompt: A dog inside of a dog kennel on a patio., video: https://huggingface.co/datasets/hexuan21/VideoFeedback-videos-mp4/resolve/main/p/p110924.mp4\n"
317
  ]
@@ -500,28 +490,24 @@
500
  "cell_type": "code",
501
  "execution_count": null,
502
  "metadata": {
503
- "id": "_QOCpw_-uYYo",
504
  "colab": {
505
  "base_uri": "https://localhost:8080/",
506
  "height": 1000
507
  },
 
508
  "outputId": "ad1fd1f6-41f9-4fa2-ae89-e75c9876cd65"
509
  },
510
  "outputs": [
511
  {
512
- "output_type": "stream",
513
  "name": "stderr",
 
514
  "text": [
515
  "/usr/local/lib/python3.11/dist-packages/transformers/optimization.py:640: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set `no_deprecation_warning=True` to disable this warning\n",
516
  " warnings.warn(\n"
517
  ]
518
  },
519
  {
520
- "output_type": "display_data",
521
  "data": {
522
- "text/plain": [
523
- "<IPython.core.display.HTML object>"
524
- ],
525
  "text/html": [
526
  "\n",
527
  " <div>\n",
@@ -699,19 +685,23 @@
699
  " </tr>\n",
700
  " </tbody>\n",
701
  "</table><p>"
 
 
 
702
  ]
703
  },
704
- "metadata": {}
 
705
  },
706
  {
707
- "output_type": "execute_result",
708
  "data": {
709
  "text/plain": [
710
  "TrainOutput(global_step=1000, training_loss=0.3446595501899719, metrics={'train_runtime': 1194.5916, 'train_samples_per_second': 1.674, 'train_steps_per_second': 0.837, 'total_flos': 1550232912784896.0, 'train_loss': 0.3446595501899719, 'epoch': 1.0})"
711
  ]
712
  },
 
713
  "metadata": {},
714
- "execution_count": 9
715
  }
716
  ],
717
  "source": [
@@ -722,7 +712,6 @@
722
  "cell_type": "code",
723
  "execution_count": null,
724
  "metadata": {
725
- "id": "0hN0QD9_uYYo",
726
  "colab": {
727
  "base_uri": "https://localhost:8080/",
728
  "height": 214,
@@ -773,77 +762,78 @@
773
  "3a51fdcde7984422b7a0925057f6cc37"
774
  ]
775
  },
 
776
  "outputId": "20daaa82-d090-4a7b-c655-60eccf851f47"
777
  },
778
  "outputs": [
779
  {
780
- "output_type": "display_data",
781
  "data": {
782
- "text/plain": [
783
- "Upload 3 LFS files: 0%| | 0/3 [00:00<?, ?it/s]"
784
- ],
785
  "application/vnd.jupyter.widget-view+json": {
 
786
  "version_major": 2,
787
- "version_minor": 0,
788
- "model_id": "50f23a17be4f46b687b1b1df1e70238a"
789
- }
 
 
790
  },
791
- "metadata": {}
 
792
  },
793
  {
794
- "output_type": "display_data",
795
  "data": {
796
- "text/plain": [
797
- "events.out.tfevents.1740055910.82ea94387a47.41010.0: 0%| | 0.00/17.1k [00:00<?, ?B/s]"
798
- ],
799
  "application/vnd.jupyter.widget-view+json": {
 
800
  "version_major": 2,
801
- "version_minor": 0,
802
- "model_id": "03e863ab55424bdabbc87bb965562b9b"
803
- }
 
 
804
  },
805
- "metadata": {}
 
806
  },
807
  {
808
- "output_type": "display_data",
809
  "data": {
810
- "text/plain": [
811
- "model.safetensors: 0%| | 0.00/1.02G [00:00<?, ?B/s]"
812
- ],
813
  "application/vnd.jupyter.widget-view+json": {
 
814
  "version_major": 2,
815
- "version_minor": 0,
816
- "model_id": "3d6e35795ba24eed96f2fa842b265e5b"
817
- }
 
 
818
  },
819
- "metadata": {}
 
820
  },
821
  {
822
- "output_type": "display_data",
823
  "data": {
824
- "text/plain": [
825
- "training_args.bin: 0%| | 0.00/5.43k [00:00<?, ?B/s]"
826
- ],
827
  "application/vnd.jupyter.widget-view+json": {
 
828
  "version_major": 2,
829
- "version_minor": 0,
830
- "model_id": "bc2065597db04146a6df7ed10de7b93c"
831
- }
 
 
832
  },
833
- "metadata": {}
 
834
  },
835
  {
836
- "output_type": "execute_result",
837
  "data": {
838
- "text/plain": [
839
- "CommitInfo(commit_url='https://huggingface.co/merve/SmolVLM2-500M-Video-Instruct-video-feedback/commit/2f33b0685d991475ac091593e224f3e5e7b7cac7', commit_message='End of training', commit_description='', oid='2f33b0685d991475ac091593e224f3e5e7b7cac7', pr_url=None, repo_url=RepoUrl('https://huggingface.co/merve/SmolVLM2-500M-Video-Instruct-video-feedback', endpoint='https://huggingface.co', repo_type='model', repo_id='merve/SmolVLM2-500M-Video-Instruct-video-feedback'), pr_revision=None, pr_num=None)"
840
- ],
841
  "application/vnd.google.colaboratory.intrinsic+json": {
842
  "type": "string"
843
- }
 
 
 
844
  },
 
845
  "metadata": {},
846
- "execution_count": 10
847
  }
848
  ],
849
  "source": [
@@ -852,12 +842,12 @@
852
  },
853
  {
854
  "cell_type": "markdown",
855
- "source": [
856
- "The test example is a video of a woman walking by, you can download and check from [here](https://huggingface.co/datasets/hexuan21/VideoFeedback-videos-mp4/blob/main/p/p000304.mp4)."
857
- ],
858
  "metadata": {
859
  "id": "4dewIZzjfpNx"
860
- }
 
 
 
861
  },
862
  {
863
  "cell_type": "code",
@@ -871,8 +861,8 @@
871
  },
872
  "outputs": [
873
  {
874
- "output_type": "stream",
875
  "name": "stdout",
 
876
  "text": [
877
  "User: Caption the video.You are provided the following series of three frames from a 0:00:03 [H:MM:SS] video.\n",
878
  "\n",
@@ -908,8 +898,8 @@
908
  "accelerator": "GPU",
909
  "colab": {
910
  "gpuType": "A100",
911
- "provenance": [],
912
- "include_colab_link": true
913
  },
914
  "kernelspec": {
915
  "display_name": "Python 3 (ipykernel)",
@@ -930,137 +920,87 @@
930
  },
931
  "widgets": {
932
  "application/vnd.jupyter.widget-state+json": {
933
- "112da28d935543069e7a1a2abc22f9f4": {
934
- "model_module": "@jupyter-widgets/controls",
935
- "model_name": "VBoxModel",
936
- "model_module_version": "1.5.0",
937
- "state": {
938
- "_dom_classes": [],
939
- "_model_module": "@jupyter-widgets/controls",
940
- "_model_module_version": "1.5.0",
941
- "_model_name": "VBoxModel",
942
- "_view_count": null,
943
- "_view_module": "@jupyter-widgets/controls",
944
- "_view_module_version": "1.5.0",
945
- "_view_name": "VBoxView",
946
- "box_style": "",
947
- "children": [],
948
- "layout": "IPY_MODEL_f6362bc7b5b24dd592d35a76a1fbf26b"
949
- }
950
- },
951
- "0d22c009aa584ca1a71e32336a7985e0": {
952
  "model_module": "@jupyter-widgets/controls",
953
- "model_name": "HTMLModel",
954
  "model_module_version": "1.5.0",
 
955
  "state": {
956
  "_dom_classes": [],
957
  "_model_module": "@jupyter-widgets/controls",
958
  "_model_module_version": "1.5.0",
959
- "_model_name": "HTMLModel",
960
  "_view_count": null,
961
  "_view_module": "@jupyter-widgets/controls",
962
  "_view_module_version": "1.5.0",
963
- "_view_name": "HTMLView",
 
964
  "description": "",
965
  "description_tooltip": null,
966
- "layout": "IPY_MODEL_e99fbdfc8a22408a8c728a36c8744b24",
967
- "placeholder": "​",
968
- "style": "IPY_MODEL_0fee30c9bf2b4bdfad7a37261f92db64",
969
- "value": "<center> <img\nsrc=https://huggingface.co/front/assets/huggingface_logo-noborder.svg\nalt='Hugging Face'> <br> Copy a token from <a\nhref=\"https://huggingface.co/settings/tokens\" target=\"_blank\">your Hugging Face\ntokens page</a> and paste it below. <br> Immediately click login after copying\nyour token or it might be stored in plain text in this notebook file. </center>"
970
- }
971
- },
972
- "ad17e30049cb4b5aa4046d94690f87d3": {
973
- "model_module": "@jupyter-widgets/controls",
974
- "model_name": "PasswordModel",
975
- "model_module_version": "1.5.0",
976
- "state": {
977
- "_dom_classes": [],
978
- "_model_module": "@jupyter-widgets/controls",
979
- "_model_module_version": "1.5.0",
980
- "_model_name": "PasswordModel",
981
- "_view_count": null,
982
- "_view_module": "@jupyter-widgets/controls",
983
- "_view_module_version": "1.5.0",
984
- "_view_name": "PasswordView",
985
- "continuous_update": true,
986
- "description": "Token:",
987
- "description_tooltip": null,
988
- "disabled": false,
989
- "layout": "IPY_MODEL_4cd8babc92cc4aeba74d2147f28dee7d",
990
- "placeholder": "​",
991
- "style": "IPY_MODEL_a4fbf37fe0fe44cfbf72ca1e82af3467",
992
- "value": ""
993
  }
994
  },
995
- "e77d3520a2d64f9a840652669c9a0ba1": {
996
  "model_module": "@jupyter-widgets/controls",
997
- "model_name": "CheckboxModel",
998
  "model_module_version": "1.5.0",
 
999
  "state": {
1000
  "_dom_classes": [],
1001
  "_model_module": "@jupyter-widgets/controls",
1002
  "_model_module_version": "1.5.0",
1003
- "_model_name": "CheckboxModel",
1004
  "_view_count": null,
1005
  "_view_module": "@jupyter-widgets/controls",
1006
  "_view_module_version": "1.5.0",
1007
- "_view_name": "CheckboxView",
1008
- "description": "Add token as git credential?",
1009
- "description_tooltip": null,
1010
- "disabled": false,
1011
- "indent": true,
1012
- "layout": "IPY_MODEL_be50e04c5629463eb18d029d045f25b3",
1013
- "style": "IPY_MODEL_5490c69c251144c4979e346c66ac1e53",
1014
- "value": true
1015
  }
1016
  },
1017
- "1852745b0de44f4281cea0cbb3508459": {
1018
  "model_module": "@jupyter-widgets/controls",
1019
- "model_name": "ButtonModel",
1020
  "model_module_version": "1.5.0",
 
1021
  "state": {
1022
- "_dom_classes": [],
1023
  "_model_module": "@jupyter-widgets/controls",
1024
  "_model_module_version": "1.5.0",
1025
- "_model_name": "ButtonModel",
1026
  "_view_count": null,
1027
- "_view_module": "@jupyter-widgets/controls",
1028
- "_view_module_version": "1.5.0",
1029
- "_view_name": "ButtonView",
1030
- "button_style": "",
1031
- "description": "Login",
1032
- "disabled": false,
1033
- "icon": "",
1034
- "layout": "IPY_MODEL_44d0e1db5f664b3fb7c146c216566776",
1035
- "style": "IPY_MODEL_7af918a10ec745d7a3f4a883dbdc8b6a",
1036
- "tooltip": ""
1037
  }
1038
  },
1039
- "166c19ec6d9f4455a56a0f146d1c0abc": {
1040
  "model_module": "@jupyter-widgets/controls",
1041
- "model_name": "HTMLModel",
1042
  "model_module_version": "1.5.0",
 
1043
  "state": {
1044
- "_dom_classes": [],
1045
  "_model_module": "@jupyter-widgets/controls",
1046
  "_model_module_version": "1.5.0",
1047
- "_model_name": "HTMLModel",
1048
  "_view_count": null,
1049
- "_view_module": "@jupyter-widgets/controls",
1050
- "_view_module_version": "1.5.0",
1051
- "_view_name": "HTMLView",
1052
- "description": "",
1053
- "description_tooltip": null,
1054
- "layout": "IPY_MODEL_4156b6897089446984196606ef0d3461",
1055
- "placeholder": "​",
1056
- "style": "IPY_MODEL_cf4b5a9cefe84fd9a4d120ab1da6f3f4",
1057
- "value": "\n<b>Pro Tip:</b> If you don't already have one, you can create a dedicated\n'notebooks' token with 'write' access, that you can then easily reuse for all\nnotebooks. </center>"
1058
  }
1059
  },
1060
- "f6362bc7b5b24dd592d35a76a1fbf26b": {
1061
  "model_module": "@jupyter-widgets/base",
1062
- "model_name": "LayoutModel",
1063
  "model_module_version": "1.2.0",
 
1064
  "state": {
1065
  "_model_module": "@jupyter-widgets/base",
1066
  "_model_module_version": "1.2.0",
@@ -1070,13 +1010,13 @@
1070
  "_view_module_version": "1.2.0",
1071
  "_view_name": "LayoutView",
1072
  "align_content": null,
1073
- "align_items": "center",
1074
  "align_self": null,
1075
  "border": null,
1076
  "bottom": null,
1077
- "display": "flex",
1078
  "flex": null,
1079
- "flex_flow": "column",
1080
  "grid_area": null,
1081
  "grid_auto_columns": null,
1082
  "grid_auto_flow": null,
@@ -1106,65 +1046,49 @@
1106
  "right": null,
1107
  "top": null,
1108
  "visibility": null,
1109
- "width": "50%"
1110
  }
1111
  },
1112
- "e99fbdfc8a22408a8c728a36c8744b24": {
1113
- "model_module": "@jupyter-widgets/base",
1114
- "model_name": "LayoutModel",
1115
- "model_module_version": "1.2.0",
1116
  "state": {
1117
- "_model_module": "@jupyter-widgets/base",
1118
- "_model_module_version": "1.2.0",
1119
- "_model_name": "LayoutModel",
 
1120
  "_view_count": null,
1121
- "_view_module": "@jupyter-widgets/base",
1122
- "_view_module_version": "1.2.0",
1123
- "_view_name": "LayoutView",
1124
- "align_content": null,
1125
- "align_items": null,
1126
- "align_self": null,
1127
- "border": null,
1128
- "bottom": null,
1129
- "display": null,
1130
- "flex": null,
1131
- "flex_flow": null,
1132
- "grid_area": null,
1133
- "grid_auto_columns": null,
1134
- "grid_auto_flow": null,
1135
- "grid_auto_rows": null,
1136
- "grid_column": null,
1137
- "grid_gap": null,
1138
- "grid_row": null,
1139
- "grid_template_areas": null,
1140
- "grid_template_columns": null,
1141
- "grid_template_rows": null,
1142
- "height": null,
1143
- "justify_content": null,
1144
- "justify_items": null,
1145
- "left": null,
1146
- "margin": null,
1147
- "max_height": null,
1148
- "max_width": null,
1149
- "min_height": null,
1150
- "min_width": null,
1151
- "object_fit": null,
1152
- "object_position": null,
1153
- "order": null,
1154
- "overflow": null,
1155
- "overflow_x": null,
1156
- "overflow_y": null,
1157
- "padding": null,
1158
- "right": null,
1159
- "top": null,
1160
- "visibility": null,
1161
- "width": null
1162
  }
1163
  },
1164
- "0fee30c9bf2b4bdfad7a37261f92db64": {
1165
  "model_module": "@jupyter-widgets/controls",
 
1166
  "model_name": "DescriptionStyleModel",
1167
  "model_module_version": "1.5.0",
 
1168
  "state": {
1169
  "_model_module": "@jupyter-widgets/controls",
1170
  "_model_module_version": "1.5.0",
@@ -1176,10 +1100,28 @@
1176
  "description_width": ""
1177
  }
1178
  },
1179
- "4cd8babc92cc4aeba74d2147f28dee7d": {
1180
  "model_module": "@jupyter-widgets/base",
1181
- "model_name": "LayoutModel",
1182
  "model_module_version": "1.2.0",
 
1183
  "state": {
1184
  "_model_module": "@jupyter-widgets/base",
1185
  "_model_module_version": "1.2.0",
@@ -1228,25 +1170,10 @@
1228
  "width": null
1229
  }
1230
  },
1231
- "a4fbf37fe0fe44cfbf72ca1e82af3467": {
1232
- "model_module": "@jupyter-widgets/controls",
1233
- "model_name": "DescriptionStyleModel",
1234
- "model_module_version": "1.5.0",
1235
- "state": {
1236
- "_model_module": "@jupyter-widgets/controls",
1237
- "_model_module_version": "1.5.0",
1238
- "_model_name": "DescriptionStyleModel",
1239
- "_view_count": null,
1240
- "_view_module": "@jupyter-widgets/base",
1241
- "_view_module_version": "1.2.0",
1242
- "_view_name": "StyleView",
1243
- "description_width": ""
1244
- }
1245
- },
1246
- "be50e04c5629463eb18d029d045f25b3": {
1247
  "model_module": "@jupyter-widgets/base",
1248
- "model_name": "LayoutModel",
1249
  "model_module_version": "1.2.0",
 
1250
  "state": {
1251
  "_model_module": "@jupyter-widgets/base",
1252
  "_model_module_version": "1.2.0",
@@ -1295,10 +1222,116 @@
1295
  "width": null
1296
  }
1297
  },
1298
- "5490c69c251144c4979e346c66ac1e53": {
1299
  "model_module": "@jupyter-widgets/controls",
1300
- "model_name": "DescriptionStyleModel",
1301
  "model_module_version": "1.5.0",
1302
  "state": {
1303
  "_model_module": "@jupyter-widgets/controls",
1304
  "_model_module_version": "1.5.0",
@@ -1310,10 +1343,10 @@
1310
  "description_width": ""
1311
  }
1312
  },
1313
- "44d0e1db5f664b3fb7c146c216566776": {
1314
  "model_module": "@jupyter-widgets/base",
1315
- "model_name": "LayoutModel",
1316
  "model_module_version": "1.2.0",
 
1317
  "state": {
1318
  "_model_module": "@jupyter-widgets/base",
1319
  "_model_module_version": "1.2.0",
@@ -1362,26 +1395,32 @@
1362
  "width": null
1363
  }
1364
  },
1365
- "7af918a10ec745d7a3f4a883dbdc8b6a": {
1366
  "model_module": "@jupyter-widgets/controls",
1367
- "model_name": "ButtonStyleModel",
1368
  "model_module_version": "1.5.0",
 
1369
  "state": {
 
1370
  "_model_module": "@jupyter-widgets/controls",
1371
  "_model_module_version": "1.5.0",
1372
- "_model_name": "ButtonStyleModel",
1373
  "_view_count": null,
1374
- "_view_module": "@jupyter-widgets/base",
1375
- "_view_module_version": "1.2.0",
1376
- "_view_name": "StyleView",
1377
- "button_color": null,
1378
- "font_weight": ""
 
 
 
 
 
1379
  }
1380
  },
1381
  "4156b6897089446984196606ef0d3461": {
1382
  "model_module": "@jupyter-widgets/base",
1383
- "model_name": "LayoutModel",
1384
  "model_module_version": "1.2.0",
 
1385
  "state": {
1386
  "_model_module": "@jupyter-widgets/base",
1387
  "_model_module_version": "1.2.0",
@@ -1430,46 +1469,10 @@
1430
  "width": null
1431
  }
1432
  },
1433
- "cf4b5a9cefe84fd9a4d120ab1da6f3f4": {
1434
- "model_module": "@jupyter-widgets/controls",
1435
- "model_name": "DescriptionStyleModel",
1436
- "model_module_version": "1.5.0",
1437
- "state": {
1438
- "_model_module": "@jupyter-widgets/controls",
1439
- "_model_module_version": "1.5.0",
1440
- "_model_name": "DescriptionStyleModel",
1441
- "_view_count": null,
1442
- "_view_module": "@jupyter-widgets/base",
1443
- "_view_module_version": "1.2.0",
1444
- "_view_name": "StyleView",
1445
- "description_width": ""
1446
- }
1447
- },
1448
- "484155e67e36453c9d1ebd2ea1768eca": {
1449
- "model_module": "@jupyter-widgets/controls",
1450
- "model_name": "LabelModel",
1451
- "model_module_version": "1.5.0",
1452
- "state": {
1453
- "_dom_classes": [],
1454
- "_model_module": "@jupyter-widgets/controls",
1455
- "_model_module_version": "1.5.0",
1456
- "_model_name": "LabelModel",
1457
- "_view_count": null,
1458
- "_view_module": "@jupyter-widgets/controls",
1459
- "_view_module_version": "1.5.0",
1460
- "_view_name": "LabelView",
1461
- "description": "",
1462
- "description_tooltip": null,
1463
- "layout": "IPY_MODEL_48bb89c434284b639f45b5929cf8d1a9",
1464
- "placeholder": "​",
1465
- "style": "IPY_MODEL_0ead4ab9bb7648c69352094bfbcb8800",
1466
- "value": "Connecting..."
1467
- }
1468
- },
1469
- "48bb89c434284b639f45b5929cf8d1a9": {
1470
  "model_module": "@jupyter-widgets/base",
1471
- "model_name": "LayoutModel",
1472
  "model_module_version": "1.2.0",
 
1473
  "state": {
1474
  "_model_module": "@jupyter-widgets/base",
1475
  "_model_module_version": "1.2.0",
@@ -1518,113 +1521,98 @@
1518
  "width": null
1519
  }
1520
  },
1521
- "0ead4ab9bb7648c69352094bfbcb8800": {
1522
- "model_module": "@jupyter-widgets/controls",
1523
- "model_name": "DescriptionStyleModel",
1524
- "model_module_version": "1.5.0",
1525
  "state": {
1526
- "_model_module": "@jupyter-widgets/controls",
1527
- "_model_module_version": "1.5.0",
1528
- "_model_name": "DescriptionStyleModel",
1529
  "_view_count": null,
1530
  "_view_module": "@jupyter-widgets/base",
1531
  "_view_module_version": "1.2.0",
1532
- "_view_name": "StyleView",
1533
- "description_width": ""
1534
- }
1535
- },
1536
- "50f23a17be4f46b687b1b1df1e70238a": {
1537
- "model_module": "@jupyter-widgets/controls",
1538
- "model_name": "HBoxModel",
1539
- "model_module_version": "1.5.0",
1540
- "state": {
1541
- "_dom_classes": [],
1542
- "_model_module": "@jupyter-widgets/controls",
1543
- "_model_module_version": "1.5.0",
1544
- "_model_name": "HBoxModel",
1545
- "_view_count": null,
1546
- "_view_module": "@jupyter-widgets/controls",
1547
- "_view_module_version": "1.5.0",
1548
- "_view_name": "HBoxView",
1549
- "box_style": "",
1550
- "children": [
1551
- "IPY_MODEL_f1a0aa50142044f5817f8676103bff58",
1552
- "IPY_MODEL_02cb2b35986f4ea294fbf6b2490972d5",
1553
- "IPY_MODEL_ecc5cc7a30fa48f2afc708f8a40f50eb"
1554
- ],
1555
- "layout": "IPY_MODEL_6e80b0bf0aa2433fa97d6f43dd21e2cf"
1556
- }
1557
- },
1558
- "f1a0aa50142044f5817f8676103bff58": {
1559
- "model_module": "@jupyter-widgets/controls",
1560
- "model_name": "HTMLModel",
1561
- "model_module_version": "1.5.0",
1562
- "state": {
1563
- "_dom_classes": [],
1564
- "_model_module": "@jupyter-widgets/controls",
1565
- "_model_module_version": "1.5.0",
1566
- "_model_name": "HTMLModel",
1567
- "_view_count": null,
1568
- "_view_module": "@jupyter-widgets/controls",
1569
- "_view_module_version": "1.5.0",
1570
- "_view_name": "HTMLView",
1571
- "description": "",
1572
- "description_tooltip": null,
1573
- "layout": "IPY_MODEL_d7d628d4ef7c4b888fd2f70e472dac10",
1574
- "placeholder": "​",
1575
- "style": "IPY_MODEL_c42bbb0b19a544f58a5737acb1d97a85",
1576
- "value": "Upload 3 LFS files: 100%"
1577
  }
1578
  },
1579
- "02cb2b35986f4ea294fbf6b2490972d5": {
1580
  "model_module": "@jupyter-widgets/controls",
1581
- "model_name": "FloatProgressModel",
1582
  "model_module_version": "1.5.0",
 
1583
  "state": {
1584
  "_dom_classes": [],
1585
  "_model_module": "@jupyter-widgets/controls",
1586
  "_model_module_version": "1.5.0",
1587
- "_model_name": "FloatProgressModel",
1588
  "_view_count": null,
1589
  "_view_module": "@jupyter-widgets/controls",
1590
  "_view_module_version": "1.5.0",
1591
- "_view_name": "ProgressView",
1592
- "bar_style": "success",
1593
  "description": "",
1594
  "description_tooltip": null,
1595
- "layout": "IPY_MODEL_7cd065e777f54efb857efb92415997b2",
1596
- "max": 3,
1597
- "min": 0,
1598
- "orientation": "horizontal",
1599
- "style": "IPY_MODEL_970c3d13cbbf47da986503c6ad99f506",
1600
- "value": 3
1601
  }
1602
  },
1603
- "ecc5cc7a30fa48f2afc708f8a40f50eb": {
1604
  "model_module": "@jupyter-widgets/controls",
1605
- "model_name": "HTMLModel",
1606
  "model_module_version": "1.5.0",
 
1607
  "state": {
1608
- "_dom_classes": [],
1609
  "_model_module": "@jupyter-widgets/controls",
1610
  "_model_module_version": "1.5.0",
1611
- "_model_name": "HTMLModel",
1612
  "_view_count": null,
1613
- "_view_module": "@jupyter-widgets/controls",
1614
- "_view_module_version": "1.5.0",
1615
- "_view_name": "HTMLView",
1616
- "description": "",
1617
- "description_tooltip": null,
1618
- "layout": "IPY_MODEL_96ba1124ba7642fe9d616b348499d0ef",
1619
- "placeholder": "​",
1620
- "style": "IPY_MODEL_484d11b997454f88902fe507c1156698",
1621
- "value": " 3/3 [00:24&lt;00:00, 24.17s/it]"
1622
  }
1623
  },
1624
- "6e80b0bf0aa2433fa97d6f43dd21e2cf": {
1625
  "model_module": "@jupyter-widgets/base",
1626
- "model_name": "LayoutModel",
1627
  "model_module_version": "1.2.0",
 
1628
  "state": {
1629
  "_model_module": "@jupyter-widgets/base",
1630
  "_model_module_version": "1.2.0",
@@ -1673,10 +1661,10 @@
1673
  "width": null
1674
  }
1675
  },
1676
- "d7d628d4ef7c4b888fd2f70e472dac10": {
1677
  "model_module": "@jupyter-widgets/base",
1678
- "model_name": "LayoutModel",
1679
  "model_module_version": "1.2.0",
 
1680
  "state": {
1681
  "_model_module": "@jupyter-widgets/base",
1682
  "_model_module_version": "1.2.0",
@@ -1725,25 +1713,10 @@
1725
  "width": null
1726
  }
1727
  },
1728
- "c42bbb0b19a544f58a5737acb1d97a85": {
1729
- "model_module": "@jupyter-widgets/controls",
1730
- "model_name": "DescriptionStyleModel",
1731
- "model_module_version": "1.5.0",
1732
- "state": {
1733
- "_model_module": "@jupyter-widgets/controls",
1734
- "_model_module_version": "1.5.0",
1735
- "_model_name": "DescriptionStyleModel",
1736
- "_view_count": null,
1737
- "_view_module": "@jupyter-widgets/base",
1738
- "_view_module_version": "1.2.0",
1739
- "_view_name": "StyleView",
1740
- "description_width": ""
1741
- }
1742
- },
1743
- "7cd065e777f54efb857efb92415997b2": {
1744
  "model_module": "@jupyter-widgets/base",
1745
- "model_name": "LayoutModel",
1746
  "model_module_version": "1.2.0",
 
1747
  "state": {
1748
  "_model_module": "@jupyter-widgets/base",
1749
  "_model_module_version": "1.2.0",
@@ -1792,26 +1765,89 @@
1792
  "width": null
1793
  }
1794
  },
1795
- "970c3d13cbbf47da986503c6ad99f506": {
1796
  "model_module": "@jupyter-widgets/controls",
1797
- "model_name": "ProgressStyleModel",
1798
  "model_module_version": "1.5.0",
 
1799
  "state": {
 
1800
  "_model_module": "@jupyter-widgets/controls",
1801
  "_model_module_version": "1.5.0",
1802
- "_model_name": "ProgressStyleModel",
1803
  "_view_count": null,
1804
  "_view_module": "@jupyter-widgets/base",
1805
  "_view_module_version": "1.2.0",
1806
  "_view_name": "StyleView",
1807
- "bar_color": null,
1808
  "description_width": ""
1809
  }
1810
  },
1811
- "96ba1124ba7642fe9d616b348499d0ef": {
1812
  "model_module": "@jupyter-widgets/base",
1813
- "model_name": "LayoutModel",
1814
  "model_module_version": "1.2.0",
 
1815
  "state": {
1816
  "_model_module": "@jupyter-widgets/base",
1817
  "_model_module_version": "1.2.0",
@@ -1860,68 +1896,78 @@
1860
  "width": null
1861
  }
1862
  },
1863
- "484d11b997454f88902fe507c1156698": {
1864
  "model_module": "@jupyter-widgets/controls",
1865
- "model_name": "DescriptionStyleModel",
1866
  "model_module_version": "1.5.0",
 
1867
  "state": {
1868
  "_model_module": "@jupyter-widgets/controls",
1869
  "_model_module_version": "1.5.0",
1870
- "_model_name": "DescriptionStyleModel",
1871
  "_view_count": null,
1872
  "_view_module": "@jupyter-widgets/base",
1873
  "_view_module_version": "1.2.0",
1874
  "_view_name": "StyleView",
1875
- "description_width": ""
 
1876
  }
1877
  },
1878
- "03e863ab55424bdabbc87bb965562b9b": {
1879
- "model_module": "@jupyter-widgets/controls",
1880
- "model_name": "HBoxModel",
1881
- "model_module_version": "1.5.0",
1882
  "state": {
1883
- "_dom_classes": [],
1884
- "_model_module": "@jupyter-widgets/controls",
1885
- "_model_module_version": "1.5.0",
1886
- "_model_name": "HBoxModel",
1887
  "_view_count": null,
1888
- "_view_module": "@jupyter-widgets/controls",
1889
- "_view_module_version": "1.5.0",
1890
- "_view_name": "HBoxView",
1891
- "box_style": "",
1892
- "children": [
1893
- "IPY_MODEL_62c1056ac5f14b1caa235010d33f241a",
1894
- "IPY_MODEL_a415d0d029ec4225864ed59db18c20b3",
1895
- "IPY_MODEL_56cda03822db434bb17b7cadbbaeb81b"
1896
- ],
1897
- "layout": "IPY_MODEL_4e085c242353430b8e32afa3bb260aa9"
1898
  }
1899
  },
1900
- "62c1056ac5f14b1caa235010d33f241a": {
1901
  "model_module": "@jupyter-widgets/controls",
1902
- "model_name": "HTMLModel",
1903
  "model_module_version": "1.5.0",
1904
- "state": {
1905
- "_dom_classes": [],
1906
- "_model_module": "@jupyter-widgets/controls",
1907
- "_model_module_version": "1.5.0",
1908
- "_model_name": "HTMLModel",
1909
- "_view_count": null,
1910
- "_view_module": "@jupyter-widgets/controls",
1911
- "_view_module_version": "1.5.0",
1912
- "_view_name": "HTMLView",
1913
- "description": "",
1914
- "description_tooltip": null,
1915
- "layout": "IPY_MODEL_c55a0063346d4c429cbc000bbd612287",
1916
- "placeholder": "​",
1917
- "style": "IPY_MODEL_ba8348a627bc41149e888f0deae68a51",
1918
- "value": "events.out.tfevents.1740055910.82ea94387a47.41010.0: 100%"
1919
- }
1920
- },
1921
- "a415d0d029ec4225864ed59db18c20b3": {
1922
- "model_module": "@jupyter-widgets/controls",
1923
  "model_name": "FloatProgressModel",
1924
- "model_module_version": "1.5.0",
1925
  "state": {
1926
  "_dom_classes": [],
1927
  "_model_module": "@jupyter-widgets/controls",
@@ -1934,18 +1980,18 @@
1934
  "bar_style": "success",
1935
  "description": "",
1936
  "description_tooltip": null,
1937
- "layout": "IPY_MODEL_bef10393380046c3a842dc979ed8c01f",
1938
- "max": 17132,
1939
  "min": 0,
1940
  "orientation": "horizontal",
1941
- "style": "IPY_MODEL_052ce17694bd4d3291ff5a10d2702b4b",
1942
- "value": 17132
1943
  }
1944
  },
1945
- "56cda03822db434bb17b7cadbbaeb81b": {
1946
  "model_module": "@jupyter-widgets/controls",
1947
- "model_name": "HTMLModel",
1948
  "model_module_version": "1.5.0",
 
1949
  "state": {
1950
  "_dom_classes": [],
1951
  "_model_module": "@jupyter-widgets/controls",
@@ -1957,16 +2003,32 @@
1957
  "_view_name": "HTMLView",
1958
  "description": "",
1959
  "description_tooltip": null,
1960
- "layout": "IPY_MODEL_11775cc8d35442c3a31452d66f6104e7",
1961
  "placeholder": "​",
1962
- "style": "IPY_MODEL_af6707547c5243eb9227efc0eb76134e",
1963
- "value": " 17.1k/17.1k [00:00&lt;00:00,121kB/s]"
1964
  }
1965
  },
1966
- "4e085c242353430b8e32afa3bb260aa9": {
1967
  "model_module": "@jupyter-widgets/base",
1968
- "model_name": "LayoutModel",
1969
  "model_module_version": "1.2.0",
 
1970
  "state": {
1971
  "_model_module": "@jupyter-widgets/base",
1972
  "_model_module_version": "1.2.0",
@@ -2015,10 +2077,10 @@
2015
  "width": null
2016
  }
2017
  },
2018
- "c55a0063346d4c429cbc000bbd612287": {
2019
  "model_module": "@jupyter-widgets/base",
2020
- "model_name": "LayoutModel",
2021
  "model_module_version": "1.2.0",
 
2022
  "state": {
2023
  "_model_module": "@jupyter-widgets/base",
2024
  "_model_module_version": "1.2.0",
@@ -2067,25 +2129,26 @@
2067
  "width": null
2068
  }
2069
  },
2070
- "ba8348a627bc41149e888f0deae68a51": {
2071
  "model_module": "@jupyter-widgets/controls",
2072
- "model_name": "DescriptionStyleModel",
2073
  "model_module_version": "1.5.0",
 
2074
  "state": {
2075
  "_model_module": "@jupyter-widgets/controls",
2076
  "_model_module_version": "1.5.0",
2077
- "_model_name": "DescriptionStyleModel",
2078
  "_view_count": null,
2079
  "_view_module": "@jupyter-widgets/base",
2080
  "_view_module_version": "1.2.0",
2081
  "_view_name": "StyleView",
 
2082
  "description_width": ""
2083
  }
2084
  },
2085
- "bef10393380046c3a842dc979ed8c01f": {
2086
  "model_module": "@jupyter-widgets/base",
2087
- "model_name": "LayoutModel",
2088
  "model_module_version": "1.2.0",
 
2089
  "state": {
2090
  "_model_module": "@jupyter-widgets/base",
2091
  "_model_module_version": "1.2.0",
@@ -2134,26 +2197,10 @@
2134
  "width": null
2135
  }
2136
  },
2137
- "052ce17694bd4d3291ff5a10d2702b4b": {
2138
- "model_module": "@jupyter-widgets/controls",
2139
- "model_name": "ProgressStyleModel",
2140
- "model_module_version": "1.5.0",
2141
- "state": {
2142
- "_model_module": "@jupyter-widgets/controls",
2143
- "_model_module_version": "1.5.0",
2144
- "_model_name": "ProgressStyleModel",
2145
- "_view_count": null,
2146
- "_view_module": "@jupyter-widgets/base",
2147
- "_view_module_version": "1.2.0",
2148
- "_view_name": "StyleView",
2149
- "bar_color": null,
2150
- "description_width": ""
2151
- }
2152
- },
2153
- "11775cc8d35442c3a31452d66f6104e7": {
2154
  "model_module": "@jupyter-widgets/base",
2155
- "model_name": "LayoutModel",
2156
  "model_module_version": "1.2.0",
 
2157
  "state": {
2158
  "_model_module": "@jupyter-widgets/base",
2159
  "_model_module_version": "1.2.0",
@@ -2202,68 +2249,10 @@
2202
  "width": null
2203
  }
2204
  },
2205
- "af6707547c5243eb9227efc0eb76134e": {
2206
- "model_module": "@jupyter-widgets/controls",
2207
- "model_name": "DescriptionStyleModel",
2208
- "model_module_version": "1.5.0",
2209
- "state": {
2210
- "_model_module": "@jupyter-widgets/controls",
2211
- "_model_module_version": "1.5.0",
2212
- "_model_name": "DescriptionStyleModel",
2213
- "_view_count": null,
2214
- "_view_module": "@jupyter-widgets/base",
2215
- "_view_module_version": "1.2.0",
2216
- "_view_name": "StyleView",
2217
- "description_width": ""
2218
- }
2219
- },
2220
- "3d6e35795ba24eed96f2fa842b265e5b": {
2221
- "model_module": "@jupyter-widgets/controls",
2222
- "model_name": "HBoxModel",
2223
- "model_module_version": "1.5.0",
2224
- "state": {
2225
- "_dom_classes": [],
2226
- "_model_module": "@jupyter-widgets/controls",
2227
- "_model_module_version": "1.5.0",
2228
- "_model_name": "HBoxModel",
2229
- "_view_count": null,
2230
- "_view_module": "@jupyter-widgets/controls",
2231
- "_view_module_version": "1.5.0",
2232
- "_view_name": "HBoxView",
2233
- "box_style": "",
2234
- "children": [
2235
- "IPY_MODEL_84376aa81cae42a18fc49bdded395187",
2236
- "IPY_MODEL_805784ce9e65411dbf35373db3680920",
2237
- "IPY_MODEL_308ab776682a4e93ac05e06aa98a77f1"
2238
- ],
2239
- "layout": "IPY_MODEL_13ece6fbb1d84f03ae434119de486f07"
2240
- }
2241
- },
2242
- "84376aa81cae42a18fc49bdded395187": {
2243
  "model_module": "@jupyter-widgets/controls",
2244
- "model_name": "HTMLModel",
2245
  "model_module_version": "1.5.0",
2246
- "state": {
2247
- "_dom_classes": [],
2248
- "_model_module": "@jupyter-widgets/controls",
2249
- "_model_module_version": "1.5.0",
2250
- "_model_name": "HTMLModel",
2251
- "_view_count": null,
2252
- "_view_module": "@jupyter-widgets/controls",
2253
- "_view_module_version": "1.5.0",
2254
- "_view_name": "HTMLView",
2255
- "description": "",
2256
- "description_tooltip": null,
2257
- "layout": "IPY_MODEL_88ef32028fd640de85d75c197eca36eb",
2258
- "placeholder": "​",
2259
- "style": "IPY_MODEL_d926557788ba45c0bef88d9e8a4b56aa",
2260
- "value": "model.safetensors: 100%"
2261
- }
2262
- },
2263
- "805784ce9e65411dbf35373db3680920": {
2264
- "model_module": "@jupyter-widgets/controls",
2265
  "model_name": "FloatProgressModel",
2266
- "model_module_version": "1.5.0",
2267
  "state": {
2268
  "_dom_classes": [],
2269
  "_model_module": "@jupyter-widgets/controls",
@@ -2276,39 +2265,33 @@
2276
  "bar_style": "success",
2277
  "description": "",
2278
  "description_tooltip": null,
2279
- "layout": "IPY_MODEL_a5faec577a9844ea921c2bce1d472b23",
2280
- "max": 1015025832,
2281
  "min": 0,
2282
  "orientation": "horizontal",
2283
- "style": "IPY_MODEL_f1e2134eb4624735842db7c112b515a0",
2284
- "value": 1015025832
2285
  }
2286
  },
2287
- "308ab776682a4e93ac05e06aa98a77f1": {
2288
  "model_module": "@jupyter-widgets/controls",
2289
- "model_name": "HTMLModel",
2290
  "model_module_version": "1.5.0",
 
2291
  "state": {
2292
- "_dom_classes": [],
2293
  "_model_module": "@jupyter-widgets/controls",
2294
  "_model_module_version": "1.5.0",
2295
- "_model_name": "HTMLModel",
2296
  "_view_count": null,
2297
- "_view_module": "@jupyter-widgets/controls",
2298
- "_view_module_version": "1.5.0",
2299
- "_view_name": "HTMLView",
2300
- "description": "",
2301
- "description_tooltip": null,
2302
- "layout": "IPY_MODEL_0bd50b2853324f5c832821b7174c5ce2",
2303
- "placeholder": "​",
2304
- "style": "IPY_MODEL_077f3bcf99044d168250d1e6c4abbcae",
2305
- "value": " 1.02G/1.02G [00:23&lt;00:00, 46.4MB/s]"
2306
  }
2307
  },
2308
- "13ece6fbb1d84f03ae434119de486f07": {
2309
  "model_module": "@jupyter-widgets/base",
2310
- "model_name": "LayoutModel",
2311
  "model_module_version": "1.2.0",
 
2312
  "state": {
2313
  "_model_module": "@jupyter-widgets/base",
2314
  "_model_module_version": "1.2.0",
@@ -2357,62 +2340,63 @@
2357
  "width": null
2358
  }
2359
  },
2360
- "88ef32028fd640de85d75c197eca36eb": {
2361
- "model_module": "@jupyter-widgets/base",
2362
- "model_name": "LayoutModel",
2363
- "model_module_version": "1.2.0",
2364
  "state": {
2365
- "_model_module": "@jupyter-widgets/base",
2366
- "_model_module_version": "1.2.0",
2367
- "_model_name": "LayoutModel",
2368
  "_view_count": null,
2369
  "_view_module": "@jupyter-widgets/base",
2370
  "_view_module_version": "1.2.0",
2371
- "_view_name": "LayoutView",
2372
- "align_content": null,
2373
- "align_items": null,
2374
- "align_self": null,
2375
- "border": null,
2376
- "bottom": null,
2377
- "display": null,
2378
- "flex": null,
2379
- "flex_flow": null,
2380
- "grid_area": null,
2381
- "grid_auto_columns": null,
2382
- "grid_auto_flow": null,
2383
- "grid_auto_rows": null,
2384
- "grid_column": null,
2385
- "grid_gap": null,
2386
- "grid_row": null,
2387
- "grid_template_areas": null,
2388
- "grid_template_columns": null,
2389
- "grid_template_rows": null,
2390
- "height": null,
2391
- "justify_content": null,
2392
- "justify_items": null,
2393
- "left": null,
2394
- "margin": null,
2395
- "max_height": null,
2396
- "max_width": null,
2397
- "min_height": null,
2398
- "min_width": null,
2399
- "object_fit": null,
2400
- "object_position": null,
2401
- "order": null,
2402
- "overflow": null,
2403
- "overflow_x": null,
2404
- "overflow_y": null,
2405
- "padding": null,
2406
- "right": null,
2407
- "top": null,
2408
- "visibility": null,
2409
- "width": null
2410
  }
2411
  },
2412
- "d926557788ba45c0bef88d9e8a4b56aa": {
2413
  "model_module": "@jupyter-widgets/controls",
2414
- "model_name": "DescriptionStyleModel",
2415
  "model_module_version": "1.5.0",
 
2416
  "state": {
2417
  "_model_module": "@jupyter-widgets/controls",
2418
  "_model_module_version": "1.5.0",
@@ -2424,10 +2408,32 @@
2424
  "description_width": ""
2425
  }
2426
  },
2427
- "a5faec577a9844ea921c2bce1d472b23": {
2428
  "model_module": "@jupyter-widgets/base",
2429
- "model_name": "LayoutModel",
2430
  "model_module_version": "1.2.0",
 
2431
  "state": {
2432
  "_model_module": "@jupyter-widgets/base",
2433
  "_model_module_version": "1.2.0",
@@ -2476,26 +2482,10 @@
2476
  "width": null
2477
  }
2478
  },
2479
- "f1e2134eb4624735842db7c112b515a0": {
2480
- "model_module": "@jupyter-widgets/controls",
2481
- "model_name": "ProgressStyleModel",
2482
- "model_module_version": "1.5.0",
2483
- "state": {
2484
- "_model_module": "@jupyter-widgets/controls",
2485
- "_model_module_version": "1.5.0",
2486
- "_model_name": "ProgressStyleModel",
2487
- "_view_count": null,
2488
- "_view_module": "@jupyter-widgets/base",
2489
- "_view_module_version": "1.2.0",
2490
- "_view_name": "StyleView",
2491
- "bar_color": null,
2492
- "description_width": ""
2493
- }
2494
- },
2495
- "0bd50b2853324f5c832821b7174c5ce2": {
2496
  "model_module": "@jupyter-widgets/base",
2497
- "model_name": "LayoutModel",
2498
  "model_module_version": "1.2.0",
 
2499
  "state": {
2500
  "_model_module": "@jupyter-widgets/base",
2501
  "_model_module_version": "1.2.0",
@@ -2544,68 +2534,10 @@
2544
  "width": null
2545
  }
2546
  },
2547
- "077f3bcf99044d168250d1e6c4abbcae": {
2548
- "model_module": "@jupyter-widgets/controls",
2549
- "model_name": "DescriptionStyleModel",
2550
- "model_module_version": "1.5.0",
2551
- "state": {
2552
- "_model_module": "@jupyter-widgets/controls",
2553
- "_model_module_version": "1.5.0",
2554
- "_model_name": "DescriptionStyleModel",
2555
- "_view_count": null,
2556
- "_view_module": "@jupyter-widgets/base",
2557
- "_view_module_version": "1.2.0",
2558
- "_view_name": "StyleView",
2559
- "description_width": ""
2560
- }
2561
- },
2562
- "bc2065597db04146a6df7ed10de7b93c": {
2563
- "model_module": "@jupyter-widgets/controls",
2564
- "model_name": "HBoxModel",
2565
- "model_module_version": "1.5.0",
2566
- "state": {
2567
- "_dom_classes": [],
2568
- "_model_module": "@jupyter-widgets/controls",
2569
- "_model_module_version": "1.5.0",
2570
- "_model_name": "HBoxModel",
2571
- "_view_count": null,
2572
- "_view_module": "@jupyter-widgets/controls",
2573
- "_view_module_version": "1.5.0",
2574
- "_view_name": "HBoxView",
2575
- "box_style": "",
2576
- "children": [
2577
- "IPY_MODEL_24cf286de1cf40f299bf0797f74c85eb",
2578
- "IPY_MODEL_c349943637234fbc96a2a9f325d3c9f1",
2579
- "IPY_MODEL_2d4d2f5ffae5451ebf1583362da3e9a9"
2580
- ],
2581
- "layout": "IPY_MODEL_3cd8e1e9fc234219b8fbd4161799640f"
2582
- }
2583
- },
2584
- "24cf286de1cf40f299bf0797f74c85eb": {
2585
- "model_module": "@jupyter-widgets/controls",
2586
- "model_name": "HTMLModel",
2587
- "model_module_version": "1.5.0",
2588
- "state": {
2589
- "_dom_classes": [],
2590
- "_model_module": "@jupyter-widgets/controls",
2591
- "_model_module_version": "1.5.0",
2592
- "_model_name": "HTMLModel",
2593
- "_view_count": null,
2594
- "_view_module": "@jupyter-widgets/controls",
2595
- "_view_module_version": "1.5.0",
2596
- "_view_name": "HTMLView",
2597
- "description": "",
2598
- "description_tooltip": null,
2599
- "layout": "IPY_MODEL_9e604469cb34439c944e84238b1ec055",
2600
- "placeholder": "​",
2601
- "style": "IPY_MODEL_ba94df0961c44b6d974ef882297731d8",
2602
- "value": "training_args.bin: 100%"
2603
- }
2604
- },
2605
  "c349943637234fbc96a2a9f325d3c9f1": {
2606
  "model_module": "@jupyter-widgets/controls",
2607
- "model_name": "FloatProgressModel",
2608
  "model_module_version": "1.5.0",
 
2609
  "state": {
2610
  "_dom_classes": [],
2611
  "_model_module": "@jupyter-widgets/controls",
@@ -2626,31 +2558,25 @@
2626
  "value": 5432
2627
  }
2628
  },
2629
- "2d4d2f5ffae5451ebf1583362da3e9a9": {
2630
  "model_module": "@jupyter-widgets/controls",
2631
- "model_name": "HTMLModel",
2632
  "model_module_version": "1.5.0",
 
2633
  "state": {
2634
- "_dom_classes": [],
2635
  "_model_module": "@jupyter-widgets/controls",
2636
  "_model_module_version": "1.5.0",
2637
- "_model_name": "HTMLModel",
2638
  "_view_count": null,
2639
- "_view_module": "@jupyter-widgets/controls",
2640
- "_view_module_version": "1.5.0",
2641
- "_view_name": "HTMLView",
2642
- "description": "",
2643
- "description_tooltip": null,
2644
- "layout": "IPY_MODEL_9a6a8a5bf8f1479eb478a8c81b58aa69",
2645
- "placeholder": "​",
2646
- "style": "IPY_MODEL_3a51fdcde7984422b7a0925057f6cc37",
2647
- "value": " 5.43k/5.43k [00:00&lt;00:00, 40.2kB/s]"
2648
  }
2649
  },
2650
- "3cd8e1e9fc234219b8fbd4161799640f": {
2651
  "model_module": "@jupyter-widgets/base",
2652
- "model_name": "LayoutModel",
2653
  "model_module_version": "1.2.0",
 
2654
  "state": {
2655
  "_model_module": "@jupyter-widgets/base",
2656
  "_model_module_version": "1.2.0",
@@ -2699,10 +2625,25 @@
2699
  "width": null
2700
  }
2701
  },
2702
- "9e604469cb34439c944e84238b1ec055": {
2703
  "model_module": "@jupyter-widgets/base",
2704
- "model_name": "LayoutModel",
2705
  "model_module_version": "1.2.0",
 
2706
  "state": {
2707
  "_model_module": "@jupyter-widgets/base",
2708
  "_model_module_version": "1.2.0",
@@ -2751,10 +2692,10 @@
2751
  "width": null
2752
  }
2753
  },
2754
- "ba94df0961c44b6d974ef882297731d8": {
2755
  "model_module": "@jupyter-widgets/controls",
2756
- "model_name": "DescriptionStyleModel",
2757
  "model_module_version": "1.5.0",
 
2758
  "state": {
2759
  "_model_module": "@jupyter-widgets/controls",
2760
  "_model_module_version": "1.5.0",
@@ -2766,10 +2707,32 @@
2766
  "description_width": ""
2767
  }
2768
  },
2769
- "419572dbd59a4583831961b7d8ecfa4a": {
2770
  "model_module": "@jupyter-widgets/base",
2771
- "model_name": "LayoutModel",
2772
  "model_module_version": "1.2.0",
 
2773
  "state": {
2774
  "_model_module": "@jupyter-widgets/base",
2775
  "_model_module_version": "1.2.0",
@@ -2818,10 +2781,52 @@
2818
  "width": null
2819
  }
2820
  },
2821
- "88cf901fb47a4925930b7deffe98a9ce": {
2822
  "model_module": "@jupyter-widgets/controls",
2823
- "model_name": "ProgressStyleModel",
2824
  "model_module_version": "1.5.0",
 
2825
  "state": {
2826
  "_model_module": "@jupyter-widgets/controls",
2827
  "_model_module_version": "1.5.0",
@@ -2834,10 +2839,10 @@
2834
  "description_width": ""
2835
  }
2836
  },
2837
- "9a6a8a5bf8f1479eb478a8c81b58aa69": {
2838
  "model_module": "@jupyter-widgets/base",
2839
- "model_name": "LayoutModel",
2840
  "model_module_version": "1.2.0",
 
2841
  "state": {
2842
  "_model_module": "@jupyter-widgets/base",
2843
  "_model_module_version": "1.2.0",
@@ -2847,13 +2852,13 @@
2847
  "_view_module_version": "1.2.0",
2848
  "_view_name": "LayoutView",
2849
  "align_content": null,
2850
- "align_items": null,
2851
  "align_self": null,
2852
  "border": null,
2853
  "bottom": null,
2854
- "display": null,
2855
  "flex": null,
2856
- "flex_flow": null,
2857
  "grid_area": null,
2858
  "grid_auto_columns": null,
2859
  "grid_auto_flow": null,
@@ -2883,22 +2888,7 @@
2883
  "right": null,
2884
  "top": null,
2885
  "visibility": null,
2886
- "width": null
2887
- }
2888
- },
2889
- "3a51fdcde7984422b7a0925057f6cc37": {
2890
- "model_module": "@jupyter-widgets/controls",
2891
- "model_name": "DescriptionStyleModel",
2892
- "model_module_version": "1.5.0",
2893
- "state": {
2894
- "_model_module": "@jupyter-widgets/controls",
2895
- "_model_module_version": "1.5.0",
2896
- "_model_name": "DescriptionStyleModel",
2897
- "_view_count": null,
2898
- "_view_module": "@jupyter-widgets/base",
2899
- "_view_module_version": "1.2.0",
2900
- "_view_name": "StyleView",
2901
- "description_width": ""
2902
  }
2903
  }
2904
  }
@@ -2906,4 +2896,4 @@
2906
  },
2907
  "nbformat": 4,
2908
  "nbformat_minor": 0
2909
- }
 
1
  {
2
  "cells": [
3
  {
4
  "cell_type": "markdown",
5
  "metadata": {
 
22
  },
23
  "outputs": [
24
  {
 
25
  "name": "stdout",
26
+ "output_type": "stream",
27
  "text": [
28
  " Preparing metadata (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
29
  "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m163.5/163.5 kB\u001b[0m \u001b[31m5.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
 
37
  },
38
  {
39
  "cell_type": "code",
40
+ "execution_count": null,
 
 
41
  "metadata": {
42
  "id": "FCYgmJtDRElR"
43
  },
44
+ "outputs": [],
45
+ "source": [
46
+ "!pip install -q git+https://github.com/huggingface/transformers.git"
47
+ ]
48
  },
49
  {
50
  "cell_type": "code",
 
101
  },
102
  "outputs": [
103
  {
 
104
  "data": {
 
 
 
105
  "application/vnd.jupyter.widget-view+json": {
106
+ "model_id": "112da28d935543069e7a1a2abc22f9f4",
107
  "version_major": 2,
108
+ "version_minor": 0
109
+ },
110
+ "text/plain": [
111
+ "VBox(children=(HTML(value='<center> <img\\nsrc=https://huggingface.co/front/assets/huggingface_logo-noborder.sv…"
112
+ ]
113
  },
114
+ "metadata": {},
115
+ "output_type": "display_data"
116
  }
117
  ],
118
  "source": [
 
144
  },
145
  "outputs": [
146
  {
 
147
  "name": "stderr",
148
+ "output_type": "stream",
149
  "text": [
150
  "/usr/local/lib/python3.11/dist-packages/huggingface_hub/utils/_auth.py:94: UserWarning: \n",
151
  "The secret `HF_TOKEN` does not exist in your Colab secrets.\n",
 
157
  ]
158
  },
159
  {
 
160
  "name": "stdout",
161
+ "output_type": "stream",
162
  "text": [
163
  "The model as is is holding: 0.97 of GPU RAM\n"
164
  ]
 
270
  },
271
  {
272
  "cell_type": "code",
273
+ "execution_count": null,
 
 
274
  "metadata": {
275
  "id": "KKEZPwinSwTr"
276
  },
277
+ "outputs": [],
278
+ "source": [
279
+ "del split_ds, ds"
280
+ ]
281
  },
282
  {
283
  "cell_type": "markdown",
 
300
  },
301
  "outputs": [
302
  {
 
303
  "name": "stdout",
304
+ "output_type": "stream",
305
  "text": [
306
  "prompt: A dog inside of a dog kennel on a patio., video: https://huggingface.co/datasets/hexuan21/VideoFeedback-videos-mp4/resolve/main/p/p110924.mp4\n"
307
  ]
 
490
  "cell_type": "code",
491
  "execution_count": null,
492
  "metadata": {
 
493
  "colab": {
494
  "base_uri": "https://localhost:8080/",
495
  "height": 1000
496
  },
497
+ "id": "_QOCpw_-uYYo",
498
  "outputId": "ad1fd1f6-41f9-4fa2-ae89-e75c9876cd65"
499
  },
500
  "outputs": [
501
  {
 
502
  "name": "stderr",
503
+ "output_type": "stream",
504
  "text": [
505
  "/usr/local/lib/python3.11/dist-packages/transformers/optimization.py:640: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set `no_deprecation_warning=True` to disable this warning\n",
506
  " warnings.warn(\n"
507
  ]
508
  },
509
  {
 
510
  "data": {
 
 
 
511
  "text/html": [
512
  "\n",
513
  " <div>\n",
 
685
  " </tr>\n",
686
  " </tbody>\n",
687
  "</table><p>"
688
+ ],
689
+ "text/plain": [
690
+ "<IPython.core.display.HTML object>"
691
  ]
692
  },
693
+ "metadata": {},
694
+ "output_type": "display_data"
695
  },
696
  {
 
697
  "data": {
698
  "text/plain": [
699
  "TrainOutput(global_step=1000, training_loss=0.3446595501899719, metrics={'train_runtime': 1194.5916, 'train_samples_per_second': 1.674, 'train_steps_per_second': 0.837, 'total_flos': 1550232912784896.0, 'train_loss': 0.3446595501899719, 'epoch': 1.0})"
700
  ]
701
  },
702
+ "execution_count": 9,
703
  "metadata": {},
704
+ "output_type": "execute_result"
705
  }
706
  ],
707
  "source": [
 
712
  "cell_type": "code",
713
  "execution_count": null,
714
  "metadata": {
 
715
  "colab": {
716
  "base_uri": "https://localhost:8080/",
717
  "height": 214,
 
762
  "3a51fdcde7984422b7a0925057f6cc37"
763
  ]
764
  },
765
+ "id": "0hN0QD9_uYYo",
766
  "outputId": "20daaa82-d090-4a7b-c655-60eccf851f47"
767
  },
768
  "outputs": [
769
  {
 
770
  "data": {
 
 
 
771
  "application/vnd.jupyter.widget-view+json": {
772
+ "model_id": "50f23a17be4f46b687b1b1df1e70238a",
773
  "version_major": 2,
774
+ "version_minor": 0
775
+ },
776
+ "text/plain": [
777
+ "Upload 3 LFS files: 0%| | 0/3 [00:00<?, ?it/s]"
778
+ ]
779
  },
780
+ "metadata": {},
781
+ "output_type": "display_data"
782
  },
783
  {
 
784
  "data": {
 
 
 
785
  "application/vnd.jupyter.widget-view+json": {
786
+ "model_id": "03e863ab55424bdabbc87bb965562b9b",
787
  "version_major": 2,
788
+ "version_minor": 0
789
+ },
790
+ "text/plain": [
791
+ "events.out.tfevents.1740055910.82ea94387a47.41010.0: 0%| | 0.00/17.1k [00:00<?, ?B/s]"
792
+ ]
793
  },
794
+ "metadata": {},
795
+ "output_type": "display_data"
796
  },
797
  {
 
798
  "data": {
 
 
 
799
  "application/vnd.jupyter.widget-view+json": {
800
+ "model_id": "3d6e35795ba24eed96f2fa842b265e5b",
801
  "version_major": 2,
802
+ "version_minor": 0
803
+ },
804
+ "text/plain": [
805
+ "model.safetensors: 0%| | 0.00/1.02G [00:00<?, ?B/s]"
806
+ ]
807
  },
808
+ "metadata": {},
809
+ "output_type": "display_data"
810
  },
811
  {
 
812
  "data": {
 
 
 
813
  "application/vnd.jupyter.widget-view+json": {
814
+ "model_id": "bc2065597db04146a6df7ed10de7b93c",
815
  "version_major": 2,
816
+ "version_minor": 0
817
+ },
818
+ "text/plain": [
819
+ "training_args.bin: 0%| | 0.00/5.43k [00:00<?, ?B/s]"
820
+ ]
821
  },
822
+ "metadata": {},
823
+ "output_type": "display_data"
824
  },
825
  {
 
826
  "data": {
 
 
 
827
  "application/vnd.google.colaboratory.intrinsic+json": {
828
  "type": "string"
829
+ },
830
+ "text/plain": [
831
+ "CommitInfo(commit_url='https://huggingface.co/merve/SmolVLM2-500M-Video-Instruct-video-feedback/commit/2f33b0685d991475ac091593e224f3e5e7b7cac7', commit_message='End of training', commit_description='', oid='2f33b0685d991475ac091593e224f3e5e7b7cac7', pr_url=None, repo_url=RepoUrl('https://huggingface.co/merve/SmolVLM2-500M-Video-Instruct-video-feedback', endpoint='https://huggingface.co', repo_type='model', repo_id='merve/SmolVLM2-500M-Video-Instruct-video-feedback'), pr_revision=None, pr_num=None)"
832
+ ]
833
  },
834
+ "execution_count": 10,
835
  "metadata": {},
836
+ "output_type": "execute_result"
837
  }
838
  ],
839
  "source": [
 
842
  },
843
  {
844
  "cell_type": "markdown",
 
 
 
845
  "metadata": {
846
  "id": "4dewIZzjfpNx"
847
+ },
848
+ "source": [
849
+ "The test example is a video of a woman walking by; you can download and view it [here](https://huggingface.co/datasets/hexuan21/VideoFeedback-videos-mp4/blob/main/p/p000304.mp4)."
850
+ ]
851
  },
852
  {
853
  "cell_type": "code",
 
861
  },
862
  "outputs": [
863
  {
 
864
  "name": "stdout",
865
+ "output_type": "stream",
866
  "text": [
867
  "User: Caption the video.You are provided the following series of three frames from a 0:00:03 [H:MM:SS] video.\n",
868
  "\n",
 
898
  "accelerator": "GPU",
899
  "colab": {
900
  "gpuType": "A100",
901
+ "include_colab_link": true,
902
+ "provenance": []
903
  },
904
  "kernelspec": {
905
  "display_name": "Python 3 (ipykernel)",
 
920
  },
921
  "widgets": {
922
  "application/vnd.jupyter.widget-state+json": {
923
+ "02cb2b35986f4ea294fbf6b2490972d5": {
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
924
  "model_module": "@jupyter-widgets/controls",
 
925
  "model_module_version": "1.5.0",
926
+ "model_name": "FloatProgressModel",
927
  "state": {
928
  "_dom_classes": [],
929
  "_model_module": "@jupyter-widgets/controls",
930
  "_model_module_version": "1.5.0",
931
+ "_model_name": "FloatProgressModel",
932
  "_view_count": null,
933
  "_view_module": "@jupyter-widgets/controls",
934
  "_view_module_version": "1.5.0",
935
+ "_view_name": "ProgressView",
936
+ "bar_style": "success",
937
  "description": "",
938
  "description_tooltip": null,
939
+ "layout": "IPY_MODEL_7cd065e777f54efb857efb92415997b2",
940
+ "max": 3,
941
+ "min": 0,
942
+ "orientation": "horizontal",
943
+ "style": "IPY_MODEL_970c3d13cbbf47da986503c6ad99f506",
944
+ "value": 3
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
945
  }
946
  },
947
+ "03e863ab55424bdabbc87bb965562b9b": {
948
  "model_module": "@jupyter-widgets/controls",
 
949
  "model_module_version": "1.5.0",
950
+ "model_name": "HBoxModel",
951
  "state": {
952
  "_dom_classes": [],
953
  "_model_module": "@jupyter-widgets/controls",
954
  "_model_module_version": "1.5.0",
955
+ "_model_name": "HBoxModel",
956
  "_view_count": null,
957
  "_view_module": "@jupyter-widgets/controls",
958
  "_view_module_version": "1.5.0",
959
+ "_view_name": "HBoxView",
960
+ "box_style": "",
961
+ "children": [
962
+ "IPY_MODEL_62c1056ac5f14b1caa235010d33f241a",
963
+ "IPY_MODEL_a415d0d029ec4225864ed59db18c20b3",
964
+ "IPY_MODEL_56cda03822db434bb17b7cadbbaeb81b"
965
+ ],
966
+ "layout": "IPY_MODEL_4e085c242353430b8e32afa3bb260aa9"
967
  }
968
  },
969
+ "052ce17694bd4d3291ff5a10d2702b4b": {
970
  "model_module": "@jupyter-widgets/controls",
 
971
  "model_module_version": "1.5.0",
972
+ "model_name": "ProgressStyleModel",
973
  "state": {
 
974
  "_model_module": "@jupyter-widgets/controls",
975
  "_model_module_version": "1.5.0",
976
+ "_model_name": "ProgressStyleModel",
977
  "_view_count": null,
978
+ "_view_module": "@jupyter-widgets/base",
979
+ "_view_module_version": "1.2.0",
980
+ "_view_name": "StyleView",
981
+ "bar_color": null,
982
+ "description_width": ""
 
 
 
 
 
983
  }
984
  },
985
+ "077f3bcf99044d168250d1e6c4abbcae": {
986
  "model_module": "@jupyter-widgets/controls",
 
987
  "model_module_version": "1.5.0",
988
+ "model_name": "DescriptionStyleModel",
989
  "state": {
 
990
  "_model_module": "@jupyter-widgets/controls",
991
  "_model_module_version": "1.5.0",
992
+ "_model_name": "DescriptionStyleModel",
993
  "_view_count": null,
994
+ "_view_module": "@jupyter-widgets/base",
995
+ "_view_module_version": "1.2.0",
996
+ "_view_name": "StyleView",
997
+ "description_width": ""
 
 
 
 
 
998
  }
999
  },
1000
+ "0bd50b2853324f5c832821b7174c5ce2": {
1001
  "model_module": "@jupyter-widgets/base",
 
1002
  "model_module_version": "1.2.0",
1003
+ "model_name": "LayoutModel",
1004
  "state": {
1005
  "_model_module": "@jupyter-widgets/base",
1006
  "_model_module_version": "1.2.0",
 
1010
  "_view_module_version": "1.2.0",
1011
  "_view_name": "LayoutView",
1012
  "align_content": null,
1013
+ "align_items": null,
1014
  "align_self": null,
1015
  "border": null,
1016
  "bottom": null,
1017
+ "display": null,
1018
  "flex": null,
1019
+ "flex_flow": null,
1020
  "grid_area": null,
1021
  "grid_auto_columns": null,
1022
  "grid_auto_flow": null,
 
1046
  "right": null,
1047
  "top": null,
1048
  "visibility": null,
1049
+ "width": null
1050
  }
1051
  },
1052
+ "0d22c009aa584ca1a71e32336a7985e0": {
1053
+ "model_module": "@jupyter-widgets/controls",
1054
+ "model_module_version": "1.5.0",
1055
+ "model_name": "HTMLModel",
1056
  "state": {
1057
+ "_dom_classes": [],
1058
+ "_model_module": "@jupyter-widgets/controls",
1059
+ "_model_module_version": "1.5.0",
1060
+ "_model_name": "HTMLModel",
1061
  "_view_count": null,
1062
+ "_view_module": "@jupyter-widgets/controls",
1063
+ "_view_module_version": "1.5.0",
1064
+ "_view_name": "HTMLView",
1065
+ "description": "",
1066
+ "description_tooltip": null,
1067
+ "layout": "IPY_MODEL_e99fbdfc8a22408a8c728a36c8744b24",
1068
+ "placeholder": "​",
1069
+ "style": "IPY_MODEL_0fee30c9bf2b4bdfad7a37261f92db64",
1070
+ "value": "<center> <img\nsrc=https://huggingface.co/front/assets/huggingface_logo-noborder.svg\nalt='Hugging Face'> <br> Copy a token from <a\nhref=\"https://huggingface.co/settings/tokens\" target=\"_blank\">your Hugging Face\ntokens page</a> and paste it below. <br> Immediately click login after copying\nyour token or it might be stored in plain text in this notebook file. </center>"
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1071
  }
1072
  },
1073
+ "0ead4ab9bb7648c69352094bfbcb8800": {
1074
  "model_module": "@jupyter-widgets/controls",
1075
+ "model_module_version": "1.5.0",
1076
  "model_name": "DescriptionStyleModel",
1077
+ "state": {
1078
+ "_model_module": "@jupyter-widgets/controls",
1079
+ "_model_module_version": "1.5.0",
1080
+ "_model_name": "DescriptionStyleModel",
1081
+ "_view_count": null,
1082
+ "_view_module": "@jupyter-widgets/base",
1083
+ "_view_module_version": "1.2.0",
1084
+ "_view_name": "StyleView",
1085
+ "description_width": ""
1086
+ }
1087
+ },
1088
+ "0fee30c9bf2b4bdfad7a37261f92db64": {
1089
+ "model_module": "@jupyter-widgets/controls",
1090
  "model_module_version": "1.5.0",
1091
+ "model_name": "DescriptionStyleModel",
1092
  "state": {
1093
  "_model_module": "@jupyter-widgets/controls",
1094
  "_model_module_version": "1.5.0",
 
1100
  "description_width": ""
1101
  }
1102
  },
1103
+ "112da28d935543069e7a1a2abc22f9f4": {
1104
+ "model_module": "@jupyter-widgets/controls",
1105
+ "model_module_version": "1.5.0",
1106
+ "model_name": "VBoxModel",
1107
+ "state": {
1108
+ "_dom_classes": [],
1109
+ "_model_module": "@jupyter-widgets/controls",
1110
+ "_model_module_version": "1.5.0",
1111
+ "_model_name": "VBoxModel",
1112
+ "_view_count": null,
1113
+ "_view_module": "@jupyter-widgets/controls",
1114
+ "_view_module_version": "1.5.0",
1115
+ "_view_name": "VBoxView",
1116
+ "box_style": "",
1117
+ "children": [],
1118
+ "layout": "IPY_MODEL_f6362bc7b5b24dd592d35a76a1fbf26b"
1119
+ }
1120
+ },
1121
+ "11775cc8d35442c3a31452d66f6104e7": {
1122
  "model_module": "@jupyter-widgets/base",
 
1123
  "model_module_version": "1.2.0",
1124
+ "model_name": "LayoutModel",
1125
  "state": {
1126
  "_model_module": "@jupyter-widgets/base",
1127
  "_model_module_version": "1.2.0",
 
1170
  "width": null
1171
  }
1172
  },
1173
+ "13ece6fbb1d84f03ae434119de486f07": {
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1174
  "model_module": "@jupyter-widgets/base",
 
1175
  "model_module_version": "1.2.0",
1176
+ "model_name": "LayoutModel",
1177
  "state": {
1178
  "_model_module": "@jupyter-widgets/base",
1179
  "_model_module_version": "1.2.0",
 
1222
  "width": null
1223
  }
1224
  },
1225
+ "166c19ec6d9f4455a56a0f146d1c0abc": {
1226
+ "model_module": "@jupyter-widgets/controls",
1227
+ "model_module_version": "1.5.0",
1228
+ "model_name": "HTMLModel",
1229
+ "state": {
1230
+ "_dom_classes": [],
1231
+ "_model_module": "@jupyter-widgets/controls",
1232
+ "_model_module_version": "1.5.0",
1233
+ "_model_name": "HTMLModel",
1234
+ "_view_count": null,
1235
+ "_view_module": "@jupyter-widgets/controls",
1236
+ "_view_module_version": "1.5.0",
1237
+ "_view_name": "HTMLView",
1238
+ "description": "",
1239
+ "description_tooltip": null,
1240
+ "layout": "IPY_MODEL_4156b6897089446984196606ef0d3461",
1241
+ "placeholder": "​",
1242
+ "style": "IPY_MODEL_cf4b5a9cefe84fd9a4d120ab1da6f3f4",
1243
+ "value": "\n<b>Pro Tip:</b> If you don't already have one, you can create a dedicated\n'notebooks' token with 'write' access, that you can then easily reuse for all\nnotebooks. </center>"
1244
+ }
1245
+ },
1246
+ "1852745b0de44f4281cea0cbb3508459": {
1247
+ "model_module": "@jupyter-widgets/controls",
1248
+ "model_module_version": "1.5.0",
1249
+ "model_name": "ButtonModel",
1250
+ "state": {
1251
+ "_dom_classes": [],
1252
+ "_model_module": "@jupyter-widgets/controls",
1253
+ "_model_module_version": "1.5.0",
1254
+ "_model_name": "ButtonModel",
1255
+ "_view_count": null,
1256
+ "_view_module": "@jupyter-widgets/controls",
1257
+ "_view_module_version": "1.5.0",
1258
+ "_view_name": "ButtonView",
1259
+ "button_style": "",
1260
+ "description": "Login",
1261
+ "disabled": false,
1262
+ "icon": "",
1263
+ "layout": "IPY_MODEL_44d0e1db5f664b3fb7c146c216566776",
1264
+ "style": "IPY_MODEL_7af918a10ec745d7a3f4a883dbdc8b6a",
1265
+ "tooltip": ""
1266
+ }
1267
+ },
1268
+ "24cf286de1cf40f299bf0797f74c85eb": {
1269
  "model_module": "@jupyter-widgets/controls",
 
1270
  "model_module_version": "1.5.0",
1271
+ "model_name": "HTMLModel",
1272
+ "state": {
1273
+ "_dom_classes": [],
1274
+ "_model_module": "@jupyter-widgets/controls",
1275
+ "_model_module_version": "1.5.0",
1276
+ "_model_name": "HTMLModel",
1277
+ "_view_count": null,
1278
+ "_view_module": "@jupyter-widgets/controls",
1279
+ "_view_module_version": "1.5.0",
1280
+ "_view_name": "HTMLView",
1281
+ "description": "",
1282
+ "description_tooltip": null,
1283
+ "layout": "IPY_MODEL_9e604469cb34439c944e84238b1ec055",
1284
+ "placeholder": "​",
1285
+ "style": "IPY_MODEL_ba94df0961c44b6d974ef882297731d8",
1286
+ "value": "training_args.bin: 100%"
1287
+ }
1288
+ },
1289
+ "2d4d2f5ffae5451ebf1583362da3e9a9": {
1290
+ "model_module": "@jupyter-widgets/controls",
1291
+ "model_module_version": "1.5.0",
1292
+ "model_name": "HTMLModel",
1293
+ "state": {
1294
+ "_dom_classes": [],
1295
+ "_model_module": "@jupyter-widgets/controls",
1296
+ "_model_module_version": "1.5.0",
1297
+ "_model_name": "HTMLModel",
1298
+ "_view_count": null,
1299
+ "_view_module": "@jupyter-widgets/controls",
1300
+ "_view_module_version": "1.5.0",
1301
+ "_view_name": "HTMLView",
1302
+ "description": "",
1303
+ "description_tooltip": null,
1304
+ "layout": "IPY_MODEL_9a6a8a5bf8f1479eb478a8c81b58aa69",
1305
+ "placeholder": "​",
1306
+ "style": "IPY_MODEL_3a51fdcde7984422b7a0925057f6cc37",
1307
+ "value": " 5.43k/5.43k [00:00&lt;00:00, 40.2kB/s]"
1308
+ }
1309
+ },
1310
+ "308ab776682a4e93ac05e06aa98a77f1": {
1311
+ "model_module": "@jupyter-widgets/controls",
1312
+ "model_module_version": "1.5.0",
1313
+ "model_name": "HTMLModel",
1314
+ "state": {
1315
+ "_dom_classes": [],
1316
+ "_model_module": "@jupyter-widgets/controls",
1317
+ "_model_module_version": "1.5.0",
1318
+ "_model_name": "HTMLModel",
1319
+ "_view_count": null,
1320
+ "_view_module": "@jupyter-widgets/controls",
1321
+ "_view_module_version": "1.5.0",
1322
+ "_view_name": "HTMLView",
1323
+ "description": "",
1324
+ "description_tooltip": null,
1325
+ "layout": "IPY_MODEL_0bd50b2853324f5c832821b7174c5ce2",
1326
+ "placeholder": "​",
1327
+ "style": "IPY_MODEL_077f3bcf99044d168250d1e6c4abbcae",
1328
+ "value": " 1.02G/1.02G [00:23&lt;00:00, 46.4MB/s]"
1329
+ }
1330
+ },
1331
+ "3a51fdcde7984422b7a0925057f6cc37": {
1332
+ "model_module": "@jupyter-widgets/controls",
1333
+ "model_module_version": "1.5.0",
1334
+ "model_name": "DescriptionStyleModel",
1335
  "state": {
1336
  "_model_module": "@jupyter-widgets/controls",
1337
  "_model_module_version": "1.5.0",
 
1343
  "description_width": ""
1344
  }
1345
  },
1346
+ "3cd8e1e9fc234219b8fbd4161799640f": {
1347
  "model_module": "@jupyter-widgets/base",
 
1348
  "model_module_version": "1.2.0",
1349
+ "model_name": "LayoutModel",
1350
  "state": {
1351
  "_model_module": "@jupyter-widgets/base",
1352
  "_model_module_version": "1.2.0",
 
1395
  "width": null
1396
  }
1397
  },
1398
+ "3d6e35795ba24eed96f2fa842b265e5b": {
1399
  "model_module": "@jupyter-widgets/controls",
 
1400
  "model_module_version": "1.5.0",
1401
+ "model_name": "HBoxModel",
1402
  "state": {
1403
+ "_dom_classes": [],
1404
  "_model_module": "@jupyter-widgets/controls",
1405
  "_model_module_version": "1.5.0",
1406
+ "_model_name": "HBoxModel",
1407
  "_view_count": null,
1408
+ "_view_module": "@jupyter-widgets/controls",
1409
+ "_view_module_version": "1.5.0",
1410
+ "_view_name": "HBoxView",
1411
+ "box_style": "",
1412
+ "children": [
1413
+ "IPY_MODEL_84376aa81cae42a18fc49bdded395187",
1414
+ "IPY_MODEL_805784ce9e65411dbf35373db3680920",
1415
+ "IPY_MODEL_308ab776682a4e93ac05e06aa98a77f1"
1416
+ ],
1417
+ "layout": "IPY_MODEL_13ece6fbb1d84f03ae434119de486f07"
1418
  }
1419
  },
1420
  "4156b6897089446984196606ef0d3461": {
1421
  "model_module": "@jupyter-widgets/base",
 
1422
  "model_module_version": "1.2.0",
1423
+ "model_name": "LayoutModel",
1424
  "state": {
1425
  "_model_module": "@jupyter-widgets/base",
1426
  "_model_module_version": "1.2.0",
 
1469
  "width": null
1470
  }
1471
  },
1472
+ "419572dbd59a4583831961b7d8ecfa4a": {
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1473
  "model_module": "@jupyter-widgets/base",
 
1474
  "model_module_version": "1.2.0",
1475
+ "model_name": "LayoutModel",
1476
  "state": {
1477
  "_model_module": "@jupyter-widgets/base",
1478
  "_model_module_version": "1.2.0",
 
1521
  "width": null
1522
  }
1523
  },
1524
+ "44d0e1db5f664b3fb7c146c216566776": {
1525
+ "model_module": "@jupyter-widgets/base",
1526
+ "model_module_version": "1.2.0",
1527
+ "model_name": "LayoutModel",
1528
  "state": {
1529
+ "_model_module": "@jupyter-widgets/base",
1530
+ "_model_module_version": "1.2.0",
1531
+ "_model_name": "LayoutModel",
1532
  "_view_count": null,
1533
  "_view_module": "@jupyter-widgets/base",
1534
  "_view_module_version": "1.2.0",
1535
+ "_view_name": "LayoutView",
1536
+ "align_content": null,
1537
+ "align_items": null,
1538
+ "align_self": null,
1539
+ "border": null,
1540
+ "bottom": null,
1541
+ "display": null,
1542
+ "flex": null,
1543
+ "flex_flow": null,
1544
+ "grid_area": null,
1545
+ "grid_auto_columns": null,
1546
+ "grid_auto_flow": null,
1547
+ "grid_auto_rows": null,
1548
+ "grid_column": null,
1549
+ "grid_gap": null,
1550
+ "grid_row": null,
1551
+ "grid_template_areas": null,
1552
+ "grid_template_columns": null,
1553
+ "grid_template_rows": null,
1554
+ "height": null,
1555
+ "justify_content": null,
1556
+ "justify_items": null,
1557
+ "left": null,
1558
+ "margin": null,
1559
+ "max_height": null,
1560
+ "max_width": null,
1561
+ "min_height": null,
1562
+ "min_width": null,
1563
+ "object_fit": null,
1564
+ "object_position": null,
1565
+ "order": null,
1566
+ "overflow": null,
1567
+ "overflow_x": null,
1568
+ "overflow_y": null,
1569
+ "padding": null,
1570
+ "right": null,
1571
+ "top": null,
1572
+ "visibility": null,
1573
+ "width": null
 
 
 
 
 
 
1574
  }
1575
  },
1576
+ "484155e67e36453c9d1ebd2ea1768eca": {
1577
  "model_module": "@jupyter-widgets/controls",
 
1578
  "model_module_version": "1.5.0",
1579
+ "model_name": "LabelModel",
1580
  "state": {
1581
  "_dom_classes": [],
1582
  "_model_module": "@jupyter-widgets/controls",
1583
  "_model_module_version": "1.5.0",
1584
+ "_model_name": "LabelModel",
1585
  "_view_count": null,
1586
  "_view_module": "@jupyter-widgets/controls",
1587
  "_view_module_version": "1.5.0",
1588
+ "_view_name": "LabelView",
 
1589
  "description": "",
1590
  "description_tooltip": null,
1591
+ "layout": "IPY_MODEL_48bb89c434284b639f45b5929cf8d1a9",
1592
+ "placeholder": "​",
1593
+ "style": "IPY_MODEL_0ead4ab9bb7648c69352094bfbcb8800",
1594
+ "value": "Connecting..."
 
 
1595
  }
1596
  },
1597
+ "484d11b997454f88902fe507c1156698": {
1598
  "model_module": "@jupyter-widgets/controls",
 
1599
  "model_module_version": "1.5.0",
1600
+ "model_name": "DescriptionStyleModel",
1601
  "state": {
 
1602
  "_model_module": "@jupyter-widgets/controls",
1603
  "_model_module_version": "1.5.0",
1604
+ "_model_name": "DescriptionStyleModel",
1605
  "_view_count": null,
1606
+ "_view_module": "@jupyter-widgets/base",
1607
+ "_view_module_version": "1.2.0",
1608
+ "_view_name": "StyleView",
1609
+ "description_width": ""
 
 
 
 
 
1610
  }
1611
  },
1612
+ "48bb89c434284b639f45b5929cf8d1a9": {
1613
  "model_module": "@jupyter-widgets/base",
 
1614
  "model_module_version": "1.2.0",
1615
+ "model_name": "LayoutModel",
1616
  "state": {
1617
  "_model_module": "@jupyter-widgets/base",
1618
  "_model_module_version": "1.2.0",
 
1661
  "width": null
1662
  }
1663
  },
1664
+ "4cd8babc92cc4aeba74d2147f28dee7d": {
1665
  "model_module": "@jupyter-widgets/base",
 
1666
  "model_module_version": "1.2.0",
1667
+ "model_name": "LayoutModel",
1668
  "state": {
1669
  "_model_module": "@jupyter-widgets/base",
1670
  "_model_module_version": "1.2.0",
 
1713
  "width": null
1714
  }
1715
  },
1716
+ "4e085c242353430b8e32afa3bb260aa9": {
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1717
  "model_module": "@jupyter-widgets/base",
 
1718
  "model_module_version": "1.2.0",
1719
+ "model_name": "LayoutModel",
1720
  "state": {
1721
  "_model_module": "@jupyter-widgets/base",
1722
  "_model_module_version": "1.2.0",
 
1765
  "width": null
1766
  }
1767
  },
1768
+ "50f23a17be4f46b687b1b1df1e70238a": {
1769
  "model_module": "@jupyter-widgets/controls",
 
1770
  "model_module_version": "1.5.0",
1771
+ "model_name": "HBoxModel",
1772
  "state": {
1773
+ "_dom_classes": [],
1774
  "_model_module": "@jupyter-widgets/controls",
1775
  "_model_module_version": "1.5.0",
1776
+ "_model_name": "HBoxModel",
1777
+ "_view_count": null,
1778
+ "_view_module": "@jupyter-widgets/controls",
1779
+ "_view_module_version": "1.5.0",
1780
+ "_view_name": "HBoxView",
1781
+ "box_style": "",
1782
+ "children": [
1783
+ "IPY_MODEL_f1a0aa50142044f5817f8676103bff58",
1784
+ "IPY_MODEL_02cb2b35986f4ea294fbf6b2490972d5",
1785
+ "IPY_MODEL_ecc5cc7a30fa48f2afc708f8a40f50eb"
1786
+ ],
1787
+ "layout": "IPY_MODEL_6e80b0bf0aa2433fa97d6f43dd21e2cf"
1788
+ }
1789
+ },
1790
+ "5490c69c251144c4979e346c66ac1e53": {
1791
+ "model_module": "@jupyter-widgets/controls",
1792
+ "model_module_version": "1.5.0",
1793
+ "model_name": "DescriptionStyleModel",
1794
+ "state": {
1795
+ "_model_module": "@jupyter-widgets/controls",
1796
+ "_model_module_version": "1.5.0",
1797
+ "_model_name": "DescriptionStyleModel",
1798
  "_view_count": null,
1799
  "_view_module": "@jupyter-widgets/base",
1800
  "_view_module_version": "1.2.0",
1801
  "_view_name": "StyleView",
 
1802
  "description_width": ""
1803
  }
1804
  },
1805
+ "56cda03822db434bb17b7cadbbaeb81b": {
1806
+ "model_module": "@jupyter-widgets/controls",
1807
+ "model_module_version": "1.5.0",
1808
+ "model_name": "HTMLModel",
1809
+ "state": {
1810
+ "_dom_classes": [],
1811
+ "_model_module": "@jupyter-widgets/controls",
1812
+ "_model_module_version": "1.5.0",
1813
+ "_model_name": "HTMLModel",
1814
+ "_view_count": null,
1815
+ "_view_module": "@jupyter-widgets/controls",
1816
+ "_view_module_version": "1.5.0",
1817
+ "_view_name": "HTMLView",
1818
+ "description": "",
1819
+ "description_tooltip": null,
1820
+ "layout": "IPY_MODEL_11775cc8d35442c3a31452d66f6104e7",
1821
+ "placeholder": "​",
1822
+ "style": "IPY_MODEL_af6707547c5243eb9227efc0eb76134e",
1823
+ "value": " 17.1k/17.1k [00:00&lt;00:00, 121kB/s]"
1824
+ }
1825
+ },
1826
+ "62c1056ac5f14b1caa235010d33f241a": {
1827
+ "model_module": "@jupyter-widgets/controls",
1828
+ "model_module_version": "1.5.0",
1829
+ "model_name": "HTMLModel",
1830
+ "state": {
1831
+ "_dom_classes": [],
1832
+ "_model_module": "@jupyter-widgets/controls",
1833
+ "_model_module_version": "1.5.0",
1834
+ "_model_name": "HTMLModel",
1835
+ "_view_count": null,
1836
+ "_view_module": "@jupyter-widgets/controls",
1837
+ "_view_module_version": "1.5.0",
1838
+ "_view_name": "HTMLView",
1839
+ "description": "",
1840
+ "description_tooltip": null,
1841
+ "layout": "IPY_MODEL_c55a0063346d4c429cbc000bbd612287",
1842
+ "placeholder": "​",
1843
+ "style": "IPY_MODEL_ba8348a627bc41149e888f0deae68a51",
1844
+ "value": "events.out.tfevents.1740055910.82ea94387a47.41010.0: 100%"
1845
+ }
1846
+ },
1847
+ "6e80b0bf0aa2433fa97d6f43dd21e2cf": {
1848
  "model_module": "@jupyter-widgets/base",
 
1849
  "model_module_version": "1.2.0",
1850
+ "model_name": "LayoutModel",
1851
  "state": {
1852
  "_model_module": "@jupyter-widgets/base",
1853
  "_model_module_version": "1.2.0",
 
1896
  "width": null
1897
  }
1898
  },
1899
+ "7af918a10ec745d7a3f4a883dbdc8b6a": {
1900
  "model_module": "@jupyter-widgets/controls",
 
1901
  "model_module_version": "1.5.0",
1902
+ "model_name": "ButtonStyleModel",
1903
  "state": {
1904
  "_model_module": "@jupyter-widgets/controls",
1905
  "_model_module_version": "1.5.0",
1906
+ "_model_name": "ButtonStyleModel",
1907
  "_view_count": null,
1908
  "_view_module": "@jupyter-widgets/base",
1909
  "_view_module_version": "1.2.0",
1910
  "_view_name": "StyleView",
1911
+ "button_color": null,
1912
+ "font_weight": ""
1913
  }
1914
  },
1915
+ "7cd065e777f54efb857efb92415997b2": {
1916
+ "model_module": "@jupyter-widgets/base",
1917
+ "model_module_version": "1.2.0",
1918
+ "model_name": "LayoutModel",
1919
  "state": {
1920
+ "_model_module": "@jupyter-widgets/base",
1921
+ "_model_module_version": "1.2.0",
1922
+ "_model_name": "LayoutModel",
 
1923
  "_view_count": null,
1924
+ "_view_module": "@jupyter-widgets/base",
1925
+ "_view_module_version": "1.2.0",
1926
+ "_view_name": "LayoutView",
1927
+ "align_content": null,
1928
+ "align_items": null,
1929
+ "align_self": null,
1930
+ "border": null,
1931
+ "bottom": null,
1932
+ "display": null,
1933
+ "flex": null,
1934
+ "flex_flow": null,
1935
+ "grid_area": null,
1936
+ "grid_auto_columns": null,
1937
+ "grid_auto_flow": null,
1938
+ "grid_auto_rows": null,
1939
+ "grid_column": null,
1940
+ "grid_gap": null,
1941
+ "grid_row": null,
1942
+ "grid_template_areas": null,
1943
+ "grid_template_columns": null,
1944
+ "grid_template_rows": null,
1945
+ "height": null,
1946
+ "justify_content": null,
1947
+ "justify_items": null,
1948
+ "left": null,
1949
+ "margin": null,
1950
+ "max_height": null,
1951
+ "max_width": null,
1952
+ "min_height": null,
1953
+ "min_width": null,
1954
+ "object_fit": null,
1955
+ "object_position": null,
1956
+ "order": null,
1957
+ "overflow": null,
1958
+ "overflow_x": null,
1959
+ "overflow_y": null,
1960
+ "padding": null,
1961
+ "right": null,
1962
+ "top": null,
1963
+ "visibility": null,
1964
+ "width": null
1965
  }
1966
  },
1967
+ "805784ce9e65411dbf35373db3680920": {
1968
  "model_module": "@jupyter-widgets/controls",
 
1969
  "model_module_version": "1.5.0",
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1970
  "model_name": "FloatProgressModel",
 
1971
  "state": {
1972
  "_dom_classes": [],
1973
  "_model_module": "@jupyter-widgets/controls",
 
1980
  "bar_style": "success",
1981
  "description": "",
1982
  "description_tooltip": null,
1983
+ "layout": "IPY_MODEL_a5faec577a9844ea921c2bce1d472b23",
1984
+ "max": 1015025832,
1985
  "min": 0,
1986
  "orientation": "horizontal",
1987
+ "style": "IPY_MODEL_f1e2134eb4624735842db7c112b515a0",
1988
+ "value": 1015025832
1989
  }
1990
  },
1991
+ "84376aa81cae42a18fc49bdded395187": {
1992
  "model_module": "@jupyter-widgets/controls",
 
1993
  "model_module_version": "1.5.0",
1994
+ "model_name": "HTMLModel",
1995
  "state": {
1996
  "_dom_classes": [],
1997
  "_model_module": "@jupyter-widgets/controls",
 
2003
  "_view_name": "HTMLView",
2004
  "description": "",
2005
  "description_tooltip": null,
2006
+ "layout": "IPY_MODEL_88ef32028fd640de85d75c197eca36eb",
2007
  "placeholder": "​",
2008
+ "style": "IPY_MODEL_d926557788ba45c0bef88d9e8a4b56aa",
2009
+ "value": "model.safetensors: 100%"
2010
  }
2011
  },
2012
+ "88cf901fb47a4925930b7deffe98a9ce": {
2013
+ "model_module": "@jupyter-widgets/controls",
2014
+ "model_module_version": "1.5.0",
2015
+ "model_name": "ProgressStyleModel",
2016
+ "state": {
2017
+ "_model_module": "@jupyter-widgets/controls",
2018
+ "_model_module_version": "1.5.0",
2019
+ "_model_name": "ProgressStyleModel",
2020
+ "_view_count": null,
2021
+ "_view_module": "@jupyter-widgets/base",
2022
+ "_view_module_version": "1.2.0",
2023
+ "_view_name": "StyleView",
2024
+ "bar_color": null,
2025
+ "description_width": ""
2026
+ }
2027
+ },
2028
+ "88ef32028fd640de85d75c197eca36eb": {
2029
  "model_module": "@jupyter-widgets/base",
 
2030
  "model_module_version": "1.2.0",
2031
+ "model_name": "LayoutModel",
2032
  "state": {
2033
  "_model_module": "@jupyter-widgets/base",
2034
  "_model_module_version": "1.2.0",
 
2077
  "width": null
2078
  }
2079
  },
2080
+ "96ba1124ba7642fe9d616b348499d0ef": {
2081
  "model_module": "@jupyter-widgets/base",
 
2082
  "model_module_version": "1.2.0",
2083
+ "model_name": "LayoutModel",
2084
  "state": {
2085
  "_model_module": "@jupyter-widgets/base",
2086
  "_model_module_version": "1.2.0",
 
2129
  "width": null
2130
  }
2131
  },
2132
+ "970c3d13cbbf47da986503c6ad99f506": {
2133
  "model_module": "@jupyter-widgets/controls",
 
2134
  "model_module_version": "1.5.0",
2135
+ "model_name": "ProgressStyleModel",
2136
  "state": {
2137
  "_model_module": "@jupyter-widgets/controls",
2138
  "_model_module_version": "1.5.0",
2139
+ "_model_name": "ProgressStyleModel",
2140
  "_view_count": null,
2141
  "_view_module": "@jupyter-widgets/base",
2142
  "_view_module_version": "1.2.0",
2143
  "_view_name": "StyleView",
2144
+ "bar_color": null,
2145
  "description_width": ""
2146
  }
2147
  },
2148
+ "9a6a8a5bf8f1479eb478a8c81b58aa69": {
2149
  "model_module": "@jupyter-widgets/base",
 
2150
  "model_module_version": "1.2.0",
2151
+ "model_name": "LayoutModel",
2152
  "state": {
2153
  "_model_module": "@jupyter-widgets/base",
2154
  "_model_module_version": "1.2.0",
 
2197
  "width": null
2198
  }
2199
  },
2200
+ "9e604469cb34439c944e84238b1ec055": {
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
2201
  "model_module": "@jupyter-widgets/base",
 
2202
  "model_module_version": "1.2.0",
2203
+ "model_name": "LayoutModel",
2204
  "state": {
2205
  "_model_module": "@jupyter-widgets/base",
2206
  "_model_module_version": "1.2.0",
 
2249
  "width": null
2250
  }
2251
  },
2252
+ "a415d0d029ec4225864ed59db18c20b3": {
2253
  "model_module": "@jupyter-widgets/controls",
 
2254
  "model_module_version": "1.5.0",
2255
  "model_name": "FloatProgressModel",
 
2256
  "state": {
2257
  "_dom_classes": [],
2258
  "_model_module": "@jupyter-widgets/controls",
 
2265
  "bar_style": "success",
2266
  "description": "",
2267
  "description_tooltip": null,
2268
+ "layout": "IPY_MODEL_bef10393380046c3a842dc979ed8c01f",
2269
+ "max": 17132,
2270
  "min": 0,
2271
  "orientation": "horizontal",
2272
+ "style": "IPY_MODEL_052ce17694bd4d3291ff5a10d2702b4b",
2273
+ "value": 17132
2274
  }
2275
  },
2276
+ "a4fbf37fe0fe44cfbf72ca1e82af3467": {
2277
  "model_module": "@jupyter-widgets/controls",
 
2278
  "model_module_version": "1.5.0",
2279
+ "model_name": "DescriptionStyleModel",
2280
  "state": {
 
2281
  "_model_module": "@jupyter-widgets/controls",
2282
  "_model_module_version": "1.5.0",
2283
+ "_model_name": "DescriptionStyleModel",
2284
  "_view_count": null,
2285
+ "_view_module": "@jupyter-widgets/base",
2286
+ "_view_module_version": "1.2.0",
2287
+ "_view_name": "StyleView",
2288
+ "description_width": ""
 
 
 
 
 
2289
  }
2290
  },
2291
+ "a5faec577a9844ea921c2bce1d472b23": {
2292
  "model_module": "@jupyter-widgets/base",
 
2293
  "model_module_version": "1.2.0",
2294
+ "model_name": "LayoutModel",
2295
  "state": {
2296
  "_model_module": "@jupyter-widgets/base",
2297
  "_model_module_version": "1.2.0",
 
2340
  "width": null
2341
  }
2342
  },
2343
+ "ad17e30049cb4b5aa4046d94690f87d3": {
2344
+ "model_module": "@jupyter-widgets/controls",
2345
+ "model_module_version": "1.5.0",
2346
+ "model_name": "PasswordModel",
2347
  "state": {
2348
+ "_dom_classes": [],
2349
+ "_model_module": "@jupyter-widgets/controls",
2350
+ "_model_module_version": "1.5.0",
2351
+ "_model_name": "PasswordModel",
2352
+ "_view_count": null,
2353
+ "_view_module": "@jupyter-widgets/controls",
2354
+ "_view_module_version": "1.5.0",
2355
+ "_view_name": "PasswordView",
2356
+ "continuous_update": true,
2357
+ "description": "Token:",
2358
+ "description_tooltip": null,
2359
+ "disabled": false,
2360
+ "layout": "IPY_MODEL_4cd8babc92cc4aeba74d2147f28dee7d",
2361
+ "placeholder": "​",
2362
+ "style": "IPY_MODEL_a4fbf37fe0fe44cfbf72ca1e82af3467",
2363
+ "value": ""
2364
+ }
2365
+ },
2366
+ "af6707547c5243eb9227efc0eb76134e": {
2367
+ "model_module": "@jupyter-widgets/controls",
2368
+ "model_module_version": "1.5.0",
2369
+ "model_name": "DescriptionStyleModel",
2370
+ "state": {
2371
+ "_model_module": "@jupyter-widgets/controls",
2372
+ "_model_module_version": "1.5.0",
2373
+ "_model_name": "DescriptionStyleModel",
2374
  "_view_count": null,
2375
  "_view_module": "@jupyter-widgets/base",
2376
  "_view_module_version": "1.2.0",
2377
+ "_view_name": "StyleView",
2378
+ "description_width": ""
2379
+ }
2380
+ },
2381
+ "ba8348a627bc41149e888f0deae68a51": {
2382
+ "model_module": "@jupyter-widgets/controls",
2383
+ "model_module_version": "1.5.0",
2384
+ "model_name": "DescriptionStyleModel",
2385
+ "state": {
2386
+ "_model_module": "@jupyter-widgets/controls",
2387
+ "_model_module_version": "1.5.0",
2388
+ "_model_name": "DescriptionStyleModel",
2389
+ "_view_count": null,
2390
+ "_view_module": "@jupyter-widgets/base",
2391
+ "_view_module_version": "1.2.0",
2392
+ "_view_name": "StyleView",
2393
+ "description_width": ""
2394
  }
2395
  },
2396
+ "ba94df0961c44b6d974ef882297731d8": {
2397
  "model_module": "@jupyter-widgets/controls",
 
2398
  "model_module_version": "1.5.0",
2399
+ "model_name": "DescriptionStyleModel",
2400
  "state": {
2401
  "_model_module": "@jupyter-widgets/controls",
2402
  "_model_module_version": "1.5.0",
 
2408
  "description_width": ""
2409
  }
2410
  },
2411
+ "bc2065597db04146a6df7ed10de7b93c": {
2412
+ "model_module": "@jupyter-widgets/controls",
2413
+ "model_module_version": "1.5.0",
2414
+ "model_name": "HBoxModel",
2415
+ "state": {
2416
+ "_dom_classes": [],
2417
+ "_model_module": "@jupyter-widgets/controls",
2418
+ "_model_module_version": "1.5.0",
2419
+ "_model_name": "HBoxModel",
2420
+ "_view_count": null,
2421
+ "_view_module": "@jupyter-widgets/controls",
2422
+ "_view_module_version": "1.5.0",
2423
+ "_view_name": "HBoxView",
2424
+ "box_style": "",
2425
+ "children": [
2426
+ "IPY_MODEL_24cf286de1cf40f299bf0797f74c85eb",
2427
+ "IPY_MODEL_c349943637234fbc96a2a9f325d3c9f1",
2428
+ "IPY_MODEL_2d4d2f5ffae5451ebf1583362da3e9a9"
2429
+ ],
2430
+ "layout": "IPY_MODEL_3cd8e1e9fc234219b8fbd4161799640f"
2431
+ }
2432
+ },
2433
+ "be50e04c5629463eb18d029d045f25b3": {
2434
  "model_module": "@jupyter-widgets/base",
 
2435
  "model_module_version": "1.2.0",
2436
+ "model_name": "LayoutModel",
2437
  "state": {
2438
  "_model_module": "@jupyter-widgets/base",
2439
  "_model_module_version": "1.2.0",
 
2482
  "width": null
2483
  }
2484
  },
2485
+ "bef10393380046c3a842dc979ed8c01f": {
2486
  "model_module": "@jupyter-widgets/base",
 
2487
  "model_module_version": "1.2.0",
2488
+ "model_name": "LayoutModel",
2489
  "state": {
2490
  "_model_module": "@jupyter-widgets/base",
2491
  "_model_module_version": "1.2.0",
 
2534
  "width": null
2535
  }
2536
  },
2537
  "c349943637234fbc96a2a9f325d3c9f1": {
2538
  "model_module": "@jupyter-widgets/controls",
 
2539
  "model_module_version": "1.5.0",
2540
+ "model_name": "FloatProgressModel",
2541
  "state": {
2542
  "_dom_classes": [],
2543
  "_model_module": "@jupyter-widgets/controls",
 
2558
  "value": 5432
2559
  }
2560
  },
2561
+ "c42bbb0b19a544f58a5737acb1d97a85": {
2562
  "model_module": "@jupyter-widgets/controls",
 
2563
  "model_module_version": "1.5.0",
2564
+ "model_name": "DescriptionStyleModel",
2565
  "state": {
 
2566
  "_model_module": "@jupyter-widgets/controls",
2567
  "_model_module_version": "1.5.0",
2568
+ "_model_name": "DescriptionStyleModel",
2569
  "_view_count": null,
2570
+ "_view_module": "@jupyter-widgets/base",
2571
+ "_view_module_version": "1.2.0",
2572
+ "_view_name": "StyleView",
2573
+ "description_width": ""
 
 
 
 
 
2574
  }
2575
  },
2576
+ "c55a0063346d4c429cbc000bbd612287": {
2577
  "model_module": "@jupyter-widgets/base",
 
2578
  "model_module_version": "1.2.0",
2579
+ "model_name": "LayoutModel",
2580
  "state": {
2581
  "_model_module": "@jupyter-widgets/base",
2582
  "_model_module_version": "1.2.0",
 
2625
  "width": null
2626
  }
2627
  },
2628
+ "cf4b5a9cefe84fd9a4d120ab1da6f3f4": {
2629
+ "model_module": "@jupyter-widgets/controls",
2630
+ "model_module_version": "1.5.0",
2631
+ "model_name": "DescriptionStyleModel",
2632
+ "state": {
2633
+ "_model_module": "@jupyter-widgets/controls",
2634
+ "_model_module_version": "1.5.0",
2635
+ "_model_name": "DescriptionStyleModel",
2636
+ "_view_count": null,
2637
+ "_view_module": "@jupyter-widgets/base",
2638
+ "_view_module_version": "1.2.0",
2639
+ "_view_name": "StyleView",
2640
+ "description_width": ""
2641
+ }
2642
+ },
2643
+ "d7d628d4ef7c4b888fd2f70e472dac10": {
2644
  "model_module": "@jupyter-widgets/base",
 
2645
  "model_module_version": "1.2.0",
2646
+ "model_name": "LayoutModel",
2647
  "state": {
2648
  "_model_module": "@jupyter-widgets/base",
2649
  "_model_module_version": "1.2.0",
 
2692
  "width": null
2693
  }
2694
  },
2695
+ "d926557788ba45c0bef88d9e8a4b56aa": {
2696
  "model_module": "@jupyter-widgets/controls",
 
2697
  "model_module_version": "1.5.0",
2698
+ "model_name": "DescriptionStyleModel",
2699
  "state": {
2700
  "_model_module": "@jupyter-widgets/controls",
2701
  "_model_module_version": "1.5.0",
 
2707
  "description_width": ""
2708
  }
2709
  },
2710
+ "e77d3520a2d64f9a840652669c9a0ba1": {
2711
+ "model_module": "@jupyter-widgets/controls",
2712
+ "model_module_version": "1.5.0",
2713
+ "model_name": "CheckboxModel",
2714
+ "state": {
2715
+ "_dom_classes": [],
2716
+ "_model_module": "@jupyter-widgets/controls",
2717
+ "_model_module_version": "1.5.0",
2718
+ "_model_name": "CheckboxModel",
2719
+ "_view_count": null,
2720
+ "_view_module": "@jupyter-widgets/controls",
2721
+ "_view_module_version": "1.5.0",
2722
+ "_view_name": "CheckboxView",
2723
+ "description": "Add token as git credential?",
2724
+ "description_tooltip": null,
2725
+ "disabled": false,
2726
+ "indent": true,
2727
+ "layout": "IPY_MODEL_be50e04c5629463eb18d029d045f25b3",
2728
+ "style": "IPY_MODEL_5490c69c251144c4979e346c66ac1e53",
2729
+ "value": true
2730
+ }
2731
+ },
2732
+ "e99fbdfc8a22408a8c728a36c8744b24": {
2733
  "model_module": "@jupyter-widgets/base",
 
2734
  "model_module_version": "1.2.0",
2735
+ "model_name": "LayoutModel",
2736
  "state": {
2737
  "_model_module": "@jupyter-widgets/base",
2738
  "_model_module_version": "1.2.0",
 
2781
  "width": null
2782
  }
2783
  },
2784
+ "ecc5cc7a30fa48f2afc708f8a40f50eb": {
2785
+ "model_module": "@jupyter-widgets/controls",
2786
+ "model_module_version": "1.5.0",
2787
+ "model_name": "HTMLModel",
2788
+ "state": {
2789
+ "_dom_classes": [],
2790
+ "_model_module": "@jupyter-widgets/controls",
2791
+ "_model_module_version": "1.5.0",
2792
+ "_model_name": "HTMLModel",
2793
+ "_view_count": null,
2794
+ "_view_module": "@jupyter-widgets/controls",
2795
+ "_view_module_version": "1.5.0",
2796
+ "_view_name": "HTMLView",
2797
+ "description": "",
2798
+ "description_tooltip": null,
2799
+ "layout": "IPY_MODEL_96ba1124ba7642fe9d616b348499d0ef",
2800
+ "placeholder": "​",
2801
+ "style": "IPY_MODEL_484d11b997454f88902fe507c1156698",
2802
+ "value": " 3/3 [00:24&lt;00:00, 24.17s/it]"
2803
+ }
2804
+ },
2805
+ "f1a0aa50142044f5817f8676103bff58": {
2806
+ "model_module": "@jupyter-widgets/controls",
2807
+ "model_module_version": "1.5.0",
2808
+ "model_name": "HTMLModel",
2809
+ "state": {
2810
+ "_dom_classes": [],
2811
+ "_model_module": "@jupyter-widgets/controls",
2812
+ "_model_module_version": "1.5.0",
2813
+ "_model_name": "HTMLModel",
2814
+ "_view_count": null,
2815
+ "_view_module": "@jupyter-widgets/controls",
2816
+ "_view_module_version": "1.5.0",
2817
+ "_view_name": "HTMLView",
2818
+ "description": "",
2819
+ "description_tooltip": null,
2820
+ "layout": "IPY_MODEL_d7d628d4ef7c4b888fd2f70e472dac10",
2821
+ "placeholder": "​",
2822
+ "style": "IPY_MODEL_c42bbb0b19a544f58a5737acb1d97a85",
2823
+ "value": "Upload 3 LFS files: 100%"
2824
+ }
2825
+ },
2826
+ "f1e2134eb4624735842db7c112b515a0": {
2827
  "model_module": "@jupyter-widgets/controls",
 
2828
  "model_module_version": "1.5.0",
2829
+ "model_name": "ProgressStyleModel",
2830
  "state": {
2831
  "_model_module": "@jupyter-widgets/controls",
2832
  "_model_module_version": "1.5.0",
 
2839
  "description_width": ""
2840
  }
2841
  },
2842
+ "f6362bc7b5b24dd592d35a76a1fbf26b": {
2843
  "model_module": "@jupyter-widgets/base",
 
2844
  "model_module_version": "1.2.0",
2845
+ "model_name": "LayoutModel",
2846
  "state": {
2847
  "_model_module": "@jupyter-widgets/base",
2848
  "_model_module_version": "1.2.0",
 
2852
  "_view_module_version": "1.2.0",
2853
  "_view_name": "LayoutView",
2854
  "align_content": null,
2855
+ "align_items": "center",
2856
  "align_self": null,
2857
  "border": null,
2858
  "bottom": null,
2859
+ "display": "flex",
2860
  "flex": null,
2861
+ "flex_flow": "column",
2862
  "grid_area": null,
2863
  "grid_auto_columns": null,
2864
  "grid_auto_flow": null,
 
2888
  "right": null,
2889
  "top": null,
2890
  "visibility": null,
2891
+ "width": "50%"
2892
  }
2893
  }
2894
  }
 
2896
  },
2897
  "nbformat": 4,
2898
  "nbformat_minor": 0
2899
+ }
Finetune_ColPali.ipynb CHANGED
@@ -1,15 +1,5 @@
1
  {
2
  "cells": [
3
- {
4
- "cell_type": "markdown",
5
- "metadata": {
6
- "colab_type": "text",
7
- "id": "view-in-github"
8
- },
9
- "source": [
10
- "<a href=\"https://colab.research.google.com/github/merveenoyan/smol-vision/blob/main/Finetune_ColPali.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
11
- ]
12
- },
13
  {
14
  "cell_type": "markdown",
15
  "metadata": {
 
1
  {
2
  "cells": [
3
  {
4
  "cell_type": "markdown",
5
  "metadata": {
Fit_in_vision_models_using_quanto.ipynb CHANGED
The diff for this file is too large to render. See raw diff
 
Gemma_3_for_Video_Understanding.ipynb CHANGED
The diff for this file is too large to render. See raw diff
 
Gemma_3n_Video_Vibe_Tests.ipynb CHANGED
@@ -1,475 +1,571 @@
1
  {
2
- "nbformat": 4,
3
- "nbformat_minor": 0,
4
- "metadata": {
5
- "colab": {
6
- "provenance": [],
7
- "machine_shape": "hm",
8
- "gpuType": "A100",
9
- "include_colab_link": true
 
10
  },
11
- "kernelspec": {
12
- "name": "python3",
13
- "display_name": "Python 3"
 
 
 
 
 
14
  },
15
- "language_info": {
16
- "name": "python"
 
 
 
 
 
 
 
 
17
  },
18
- "accelerator": "GPU",
19
- "widgets": {
20
- "application/vnd.jupyter.widget-state+json": {
21
- "542490f74e974451bc44009a6fa174bd": {
22
- "model_module": "@jupyter-widgets/controls",
23
- "model_name": "VBoxModel",
24
- "model_module_version": "1.5.0",
25
- "state": {
26
- "_dom_classes": [],
27
- "_model_module": "@jupyter-widgets/controls",
28
- "_model_module_version": "1.5.0",
29
- "_model_name": "VBoxModel",
30
- "_view_count": null,
31
- "_view_module": "@jupyter-widgets/controls",
32
- "_view_module_version": "1.5.0",
33
- "_view_name": "VBoxView",
34
- "box_style": "",
35
- "children": [],
36
- "layout": "IPY_MODEL_8d0e5abdd7c549f1a66ee198c9fa1430"
37
- }
38
  },
39
- "409f985be1134b468b81136fbdb54408": {
40
- "model_module": "@jupyter-widgets/controls",
41
- "model_name": "HTMLModel",
42
- "model_module_version": "1.5.0",
43
- "state": {
44
- "_dom_classes": [],
45
- "_model_module": "@jupyter-widgets/controls",
46
- "_model_module_version": "1.5.0",
47
- "_model_name": "HTMLModel",
48
- "_view_count": null,
49
- "_view_module": "@jupyter-widgets/controls",
50
- "_view_module_version": "1.5.0",
51
- "_view_name": "HTMLView",
52
- "description": "",
53
- "description_tooltip": null,
54
- "layout": "IPY_MODEL_c72dd3d6a4c246cfa6590c314783c8f0",
55
- "placeholder": "",
56
- "style": "IPY_MODEL_c0e471e664dd41eab98efe08301ef5e1",
57
- "value": "<center> <img\nsrc=https://huggingface.co/front/assets/huggingface_logo-noborder.svg\nalt='Hugging Face'> <br> Copy a token from <a\nhref=\"https://huggingface.co/settings/tokens\" target=\"_blank\">your Hugging Face\ntokens page</a> and paste it below. <br> Immediately click login after copying\nyour token or it might be stored in plain text in this notebook file. </center>"
58
- }
59
  },
60
- "57cb1e931c614980a4147cb125524d7d": {
61
- "model_module": "@jupyter-widgets/controls",
62
- "model_name": "PasswordModel",
63
- "model_module_version": "1.5.0",
64
- "state": {
65
- "_dom_classes": [],
66
- "_model_module": "@jupyter-widgets/controls",
67
- "_model_module_version": "1.5.0",
68
- "_model_name": "PasswordModel",
69
- "_view_count": null,
70
- "_view_module": "@jupyter-widgets/controls",
71
- "_view_module_version": "1.5.0",
72
- "_view_name": "PasswordView",
73
- "continuous_update": true,
74
- "description": "Token:",
75
- "description_tooltip": null,
76
- "disabled": false,
77
- "layout": "IPY_MODEL_868f63ea9455442d837dc2c422918800",
78
- "placeholder": "​",
79
- "style": "IPY_MODEL_5b7b4707b1bf4159a10bf7e289bde435",
80
- "value": ""
81
- }
82
  },
83
- "87dc7aaf52e349a7bb43bb1b8bc137ee": {
84
- "model_module": "@jupyter-widgets/controls",
85
- "model_name": "CheckboxModel",
86
- "model_module_version": "1.5.0",
87
- "state": {
88
- "_dom_classes": [],
89
- "_model_module": "@jupyter-widgets/controls",
90
- "_model_module_version": "1.5.0",
91
- "_model_name": "CheckboxModel",
92
- "_view_count": null,
93
- "_view_module": "@jupyter-widgets/controls",
94
- "_view_module_version": "1.5.0",
95
- "_view_name": "CheckboxView",
96
- "description": "Add token as git credential?",
97
- "description_tooltip": null,
98
- "disabled": false,
99
- "indent": true,
100
- "layout": "IPY_MODEL_889d0d1ed24e4de2b89896511d008e60",
101
- "style": "IPY_MODEL_68fc757825dd44a48ab2383db20958db",
102
- "value": true
103
- }
104
  },
105
- "983ed4cb4eea42daa9ae8c0417021a21": {
106
- "model_module": "@jupyter-widgets/controls",
107
- "model_name": "ButtonModel",
108
- "model_module_version": "1.5.0",
109
- "state": {
110
- "_dom_classes": [],
111
- "_model_module": "@jupyter-widgets/controls",
112
- "_model_module_version": "1.5.0",
113
- "_model_name": "ButtonModel",
114
- "_view_count": null,
115
- "_view_module": "@jupyter-widgets/controls",
116
- "_view_module_version": "1.5.0",
117
- "_view_name": "ButtonView",
118
- "button_style": "",
119
- "description": "Login",
120
- "disabled": false,
121
- "icon": "",
122
- "layout": "IPY_MODEL_cb76f933e6e640d9a688f7838e5fb0b3",
123
- "style": "IPY_MODEL_8704264bff4d46c9813ac9acf92da962",
124
- "tooltip": ""
125
- }
126
- },
127
- "40c381fd7bb04b43a879044a4e988cc6": {
128
- "model_module": "@jupyter-widgets/controls",
129
- "model_name": "HTMLModel",
130
- "model_module_version": "1.5.0",
131
- "state": {
132
- "_dom_classes": [],
133
- "_model_module": "@jupyter-widgets/controls",
134
- "_model_module_version": "1.5.0",
135
- "_model_name": "HTMLModel",
136
- "_view_count": null,
137
- "_view_module": "@jupyter-widgets/controls",
138
- "_view_module_version": "1.5.0",
139
- "_view_name": "HTMLView",
140
- "description": "",
141
- "description_tooltip": null,
142
- "layout": "IPY_MODEL_9b5d87960dde401baeaf8b6144fb8bad",
143
- "placeholder": "​",
144
- "style": "IPY_MODEL_76e06881e5e94197a24944e07fdf3189",
145
- "value": "\n<b>Pro Tip:</b> If you don't already have one, you can create a dedicated\n'notebooks' token with 'write' access, that you can then easily reuse for all\nnotebooks. </center>"
146
- }
147
- },
148
- "8d0e5abdd7c549f1a66ee198c9fa1430": {
149
- "model_module": "@jupyter-widgets/base",
150
- "model_name": "LayoutModel",
151
- "model_module_version": "1.2.0",
152
- "state": {
153
- "_model_module": "@jupyter-widgets/base",
154
- "_model_module_version": "1.2.0",
155
- "_model_name": "LayoutModel",
156
- "_view_count": null,
157
- "_view_module": "@jupyter-widgets/base",
158
- "_view_module_version": "1.2.0",
159
- "_view_name": "LayoutView",
160
- "align_content": null,
161
- "align_items": "center",
162
- "align_self": null,
163
- "border": null,
164
- "bottom": null,
165
- "display": "flex",
166
- "flex": null,
167
- "flex_flow": "column",
168
- "grid_area": null,
169
- "grid_auto_columns": null,
170
- "grid_auto_flow": null,
171
- "grid_auto_rows": null,
172
- "grid_column": null,
173
- "grid_gap": null,
174
- "grid_row": null,
175
- "grid_template_areas": null,
176
- "grid_template_columns": null,
177
- "grid_template_rows": null,
178
- "height": null,
179
- "justify_content": null,
180
- "justify_items": null,
181
- "left": null,
182
- "margin": null,
183
- "max_height": null,
184
- "max_width": null,
185
- "min_height": null,
186
- "min_width": null,
187
- "object_fit": null,
188
- "object_position": null,
189
- "order": null,
190
- "overflow": null,
191
- "overflow_x": null,
192
- "overflow_y": null,
193
- "padding": null,
194
- "right": null,
195
- "top": null,
196
- "visibility": null,
197
- "width": "50%"
198
- }
199
- },
200
- "c72dd3d6a4c246cfa6590c314783c8f0": {
201
- "model_module": "@jupyter-widgets/base",
202
- "model_name": "LayoutModel",
203
- "model_module_version": "1.2.0",
204
- "state": {
205
- "_model_module": "@jupyter-widgets/base",
206
- "_model_module_version": "1.2.0",
207
- "_model_name": "LayoutModel",
208
- "_view_count": null,
209
- "_view_module": "@jupyter-widgets/base",
210
- "_view_module_version": "1.2.0",
211
- "_view_name": "LayoutView",
212
- "align_content": null,
213
- "align_items": null,
214
- "align_self": null,
215
- "border": null,
216
- "bottom": null,
217
- "display": null,
218
- "flex": null,
219
- "flex_flow": null,
220
- "grid_area": null,
221
- "grid_auto_columns": null,
222
- "grid_auto_flow": null,
223
- "grid_auto_rows": null,
224
- "grid_column": null,
225
- "grid_gap": null,
226
- "grid_row": null,
227
- "grid_template_areas": null,
228
- "grid_template_columns": null,
229
- "grid_template_rows": null,
230
- "height": null,
231
- "justify_content": null,
232
- "justify_items": null,
233
- "left": null,
234
- "margin": null,
235
- "max_height": null,
236
- "max_width": null,
237
- "min_height": null,
238
- "min_width": null,
239
- "object_fit": null,
240
- "object_position": null,
241
- "order": null,
242
- "overflow": null,
243
- "overflow_x": null,
244
- "overflow_y": null,
245
- "padding": null,
246
- "right": null,
247
- "top": null,
248
- "visibility": null,
249
- "width": null
250
- }
251
- },
252
- "c0e471e664dd41eab98efe08301ef5e1": {
253
- "model_module": "@jupyter-widgets/controls",
254
- "model_name": "DescriptionStyleModel",
255
- "model_module_version": "1.5.0",
256
- "state": {
257
- "_model_module": "@jupyter-widgets/controls",
258
- "_model_module_version": "1.5.0",
259
- "_model_name": "DescriptionStyleModel",
260
- "_view_count": null,
261
- "_view_module": "@jupyter-widgets/base",
262
- "_view_module_version": "1.2.0",
263
- "_view_name": "StyleView",
264
- "description_width": ""
265
- }
266
  },
267
- "868f63ea9455442d837dc2c422918800": {
268
- "model_module": "@jupyter-widgets/base",
269
- "model_name": "LayoutModel",
270
- "model_module_version": "1.2.0",
271
- "state": {
272
- "_model_module": "@jupyter-widgets/base",
273
- "_model_module_version": "1.2.0",
274
- "_model_name": "LayoutModel",
275
- "_view_count": null,
276
- "_view_module": "@jupyter-widgets/base",
277
- "_view_module_version": "1.2.0",
278
- "_view_name": "LayoutView",
279
- "align_content": null,
280
- "align_items": null,
281
- "align_self": null,
282
- "border": null,
283
- "bottom": null,
284
- "display": null,
285
- "flex": null,
286
- "flex_flow": null,
287
- "grid_area": null,
288
- "grid_auto_columns": null,
289
- "grid_auto_flow": null,
290
- "grid_auto_rows": null,
291
- "grid_column": null,
292
- "grid_gap": null,
293
- "grid_row": null,
294
- "grid_template_areas": null,
295
- "grid_template_columns": null,
296
- "grid_template_rows": null,
297
- "height": null,
298
- "justify_content": null,
299
- "justify_items": null,
300
- "left": null,
301
- "margin": null,
302
- "max_height": null,
303
- "max_width": null,
304
- "min_height": null,
305
- "min_width": null,
306
- "object_fit": null,
307
- "object_position": null,
308
- "order": null,
309
- "overflow": null,
310
- "overflow_x": null,
311
- "overflow_y": null,
312
- "padding": null,
313
- "right": null,
314
- "top": null,
315
- "visibility": null,
316
- "width": null
317
- }
318
  },
319
- "5b7b4707b1bf4159a10bf7e289bde435": {
320
- "model_module": "@jupyter-widgets/controls",
321
- "model_name": "DescriptionStyleModel",
322
- "model_module_version": "1.5.0",
323
- "state": {
324
- "_model_module": "@jupyter-widgets/controls",
325
- "_model_module_version": "1.5.0",
326
- "_model_name": "DescriptionStyleModel",
327
- "_view_count": null,
328
- "_view_module": "@jupyter-widgets/base",
329
- "_view_module_version": "1.2.0",
330
- "_view_name": "StyleView",
331
- "description_width": ""
332
- }
333
  },
334
- "889d0d1ed24e4de2b89896511d008e60": {
335
- "model_module": "@jupyter-widgets/base",
336
- "model_name": "LayoutModel",
337
- "model_module_version": "1.2.0",
338
- "state": {
339
- "_model_module": "@jupyter-widgets/base",
340
- "_model_module_version": "1.2.0",
341
- "_model_name": "LayoutModel",
342
- "_view_count": null,
343
- "_view_module": "@jupyter-widgets/base",
344
- "_view_module_version": "1.2.0",
345
- "_view_name": "LayoutView",
346
- "align_content": null,
347
- "align_items": null,
348
- "align_self": null,
349
- "border": null,
350
- "bottom": null,
351
- "display": null,
352
- "flex": null,
353
- "flex_flow": null,
354
- "grid_area": null,
355
- "grid_auto_columns": null,
356
- "grid_auto_flow": null,
357
- "grid_auto_rows": null,
358
- "grid_column": null,
359
- "grid_gap": null,
360
- "grid_row": null,
361
- "grid_template_areas": null,
362
- "grid_template_columns": null,
363
- "grid_template_rows": null,
364
- "height": null,
365
- "justify_content": null,
366
- "justify_items": null,
367
- "left": null,
368
- "margin": null,
369
- "max_height": null,
370
- "max_width": null,
371
- "min_height": null,
372
- "min_width": null,
373
- "object_fit": null,
374
- "object_position": null,
375
- "order": null,
376
- "overflow": null,
377
- "overflow_x": null,
378
- "overflow_y": null,
379
- "padding": null,
380
- "right": null,
381
- "top": null,
382
- "visibility": null,
383
- "width": null
384
- }
385
  },
386
- "68fc757825dd44a48ab2383db20958db": {
387
- "model_module": "@jupyter-widgets/controls",
388
- "model_name": "DescriptionStyleModel",
389
- "model_module_version": "1.5.0",
390
- "state": {
391
- "_model_module": "@jupyter-widgets/controls",
392
- "_model_module_version": "1.5.0",
393
- "_model_name": "DescriptionStyleModel",
394
- "_view_count": null,
395
- "_view_module": "@jupyter-widgets/base",
396
- "_view_module_version": "1.2.0",
397
- "_view_name": "StyleView",
398
- "description_width": ""
399
- }
400
  },
401
- "cb76f933e6e640d9a688f7838e5fb0b3": {
402
- "model_module": "@jupyter-widgets/base",
403
- "model_name": "LayoutModel",
404
- "model_module_version": "1.2.0",
405
- "state": {
406
- "_model_module": "@jupyter-widgets/base",
407
- "_model_module_version": "1.2.0",
408
- "_model_name": "LayoutModel",
409
- "_view_count": null,
410
- "_view_module": "@jupyter-widgets/base",
411
- "_view_module_version": "1.2.0",
412
- "_view_name": "LayoutView",
413
- "align_content": null,
414
- "align_items": null,
415
- "align_self": null,
416
- "border": null,
417
- "bottom": null,
418
- "display": null,
419
- "flex": null,
420
- "flex_flow": null,
421
- "grid_area": null,
422
- "grid_auto_columns": null,
423
- "grid_auto_flow": null,
424
- "grid_auto_rows": null,
425
- "grid_column": null,
426
- "grid_gap": null,
427
- "grid_row": null,
428
- "grid_template_areas": null,
429
- "grid_template_columns": null,
430
- "grid_template_rows": null,
431
- "height": null,
432
- "justify_content": null,
433
- "justify_items": null,
434
- "left": null,
435
- "margin": null,
436
- "max_height": null,
437
- "max_width": null,
438
- "min_height": null,
439
- "min_width": null,
440
- "object_fit": null,
441
- "object_position": null,
442
- "order": null,
443
- "overflow": null,
444
- "overflow_x": null,
445
- "overflow_y": null,
446
- "padding": null,
447
- "right": null,
448
- "top": null,
449
- "visibility": null,
450
- "width": null
451
  }
452
  },
453
- "8704264bff4d46c9813ac9acf92da962": {
454
  "model_module": "@jupyter-widgets/controls",
455
- "model_name": "ButtonStyleModel",
456
  "model_module_version": "1.5.0",
 
457
  "state": {
458
  "_model_module": "@jupyter-widgets/controls",
459
  "_model_module_version": "1.5.0",
460
- "_model_name": "ButtonStyleModel",
461
  "_view_count": null,
462
  "_view_module": "@jupyter-widgets/base",
463
  "_view_module_version": "1.2.0",
464
  "_view_name": "StyleView",
465
- "button_color": null,
466
- "font_weight": ""
467
  }
468
  },
469
- "9b5d87960dde401baeaf8b6144fb8bad": {
470
  "model_module": "@jupyter-widgets/base",
471
- "model_name": "LayoutModel",
472
  "model_module_version": "1.2.0",
 
473
  "state": {
474
  "_model_module": "@jupyter-widgets/base",
475
  "_model_module_version": "1.2.0",
@@ -518,46 +614,52 @@
518
  "width": null
519
  }
520
  },
521
- "76e06881e5e94197a24944e07fdf3189": {
522
  "model_module": "@jupyter-widgets/controls",
523
- "model_name": "DescriptionStyleModel",
524
  "model_module_version": "1.5.0",
 
525
  "state": {
 
526
  "_model_module": "@jupyter-widgets/controls",
527
  "_model_module_version": "1.5.0",
528
- "_model_name": "DescriptionStyleModel",
529
  "_view_count": null,
530
- "_view_module": "@jupyter-widgets/base",
531
- "_view_module_version": "1.2.0",
532
- "_view_name": "StyleView",
533
- "description_width": ""
 
 
 
 
 
534
  }
535
  },
536
- "f40dd696acc64c6284c6f8f485f3ce9d": {
537
  "model_module": "@jupyter-widgets/controls",
538
- "model_name": "LabelModel",
539
  "model_module_version": "1.5.0",
 
540
  "state": {
541
  "_dom_classes": [],
542
  "_model_module": "@jupyter-widgets/controls",
543
  "_model_module_version": "1.5.0",
544
- "_model_name": "LabelModel",
545
  "_view_count": null,
546
  "_view_module": "@jupyter-widgets/controls",
547
  "_view_module_version": "1.5.0",
548
- "_view_name": "LabelView",
549
  "description": "",
550
  "description_tooltip": null,
551
- "layout": "IPY_MODEL_4488de26dce74cbbb39d99ae09bd21fa",
552
  "placeholder": "​",
553
- "style": "IPY_MODEL_ded62e6c032745ec88ca0ab694b0d397",
554
- "value": "Connecting..."
555
  }
556
  },
557
  "4488de26dce74cbbb39d99ae09bd21fa": {
558
  "model_module": "@jupyter-widgets/base",
559
- "model_name": "LayoutModel",
560
  "model_module_version": "1.2.0",
 
561
  "state": {
562
  "_model_module": "@jupyter-widgets/base",
563
  "_model_module_version": "1.2.0",
@@ -606,165 +708,111 @@
606
  "width": null
607
  }
608
  },
609
- "ded62e6c032745ec88ca0ab694b0d397": {
610
  "model_module": "@jupyter-widgets/controls",
611
- "model_name": "DescriptionStyleModel",
612
  "model_module_version": "1.5.0",
 
613
  "state": {
 
614
  "_model_module": "@jupyter-widgets/controls",
615
  "_model_module_version": "1.5.0",
616
- "_model_name": "DescriptionStyleModel",
617
  "_view_count": null,
618
- "_view_module": "@jupyter-widgets/base",
619
- "_view_module_version": "1.2.0",
620
- "_view_name": "StyleView",
621
- "description_width": ""
 
 
622
  }
623
  },
624
- "be523e956910487ca263d943a7a58395": {
625
  "model_module": "@jupyter-widgets/controls",
626
- "model_name": "HBoxModel",
627
  "model_module_version": "1.5.0",
 
628
  "state": {
629
  "_dom_classes": [],
630
  "_model_module": "@jupyter-widgets/controls",
631
  "_model_module_version": "1.5.0",
632
- "_model_name": "HBoxModel",
633
  "_view_count": null,
634
  "_view_module": "@jupyter-widgets/controls",
635
  "_view_module_version": "1.5.0",
636
- "_view_name": "HBoxView",
637
- "box_style": "",
638
- "children": [
639
- "IPY_MODEL_01dc23faab3d42cda41fdfdd2a7dfed5",
640
- "IPY_MODEL_777d7addfb144fd8896b77a1e0d54f25",
641
- "IPY_MODEL_c518268069244b21810e84380502c190"
642
- ],
643
- "layout": "IPY_MODEL_fee72c1c455549b59092028b855a082a"
 
644
  }
645
  },
646
- "01dc23faab3d42cda41fdfdd2a7dfed5": {
647
  "model_module": "@jupyter-widgets/controls",
648
- "model_name": "HTMLModel",
649
  "model_module_version": "1.5.0",
 
650
  "state": {
651
- "_dom_classes": [],
652
  "_model_module": "@jupyter-widgets/controls",
653
  "_model_module_version": "1.5.0",
654
- "_model_name": "HTMLModel",
655
  "_view_count": null,
656
- "_view_module": "@jupyter-widgets/controls",
657
- "_view_module_version": "1.5.0",
658
- "_view_name": "HTMLView",
659
- "description": "",
660
- "description_tooltip": null,
661
- "layout": "IPY_MODEL_ed0fa93199b94fb486c125d4f322d59f",
662
- "placeholder": "​",
663
- "style": "IPY_MODEL_66f82e7ef3694c699e3d4a2bd826392b",
664
- "value": "Loading checkpoint shards: 100%"
665
  }
666
  },
667
- "777d7addfb144fd8896b77a1e0d54f25": {
668
  "model_module": "@jupyter-widgets/controls",
669
- "model_name": "FloatProgressModel",
670
  "model_module_version": "1.5.0",
 
671
  "state": {
672
- "_dom_classes": [],
673
  "_model_module": "@jupyter-widgets/controls",
674
  "_model_module_version": "1.5.0",
675
- "_model_name": "FloatProgressModel",
676
  "_view_count": null,
677
- "_view_module": "@jupyter-widgets/controls",
678
- "_view_module_version": "1.5.0",
679
- "_view_name": "ProgressView",
680
- "bar_style": "success",
681
- "description": "",
682
- "description_tooltip": null,
683
- "layout": "IPY_MODEL_2bfd51e3ae954008ae83704c24dbd6cb",
684
- "max": 4,
685
- "min": 0,
686
- "orientation": "horizontal",
687
- "style": "IPY_MODEL_f8b84d8c06384680973ef6fe787b5a5d",
688
- "value": 4
689
  }
690
  },
691
- "c518268069244b21810e84380502c190": {
692
  "model_module": "@jupyter-widgets/controls",
693
- "model_name": "HTMLModel",
694
  "model_module_version": "1.5.0",
 
695
  "state": {
696
- "_dom_classes": [],
697
  "_model_module": "@jupyter-widgets/controls",
698
  "_model_module_version": "1.5.0",
699
- "_model_name": "HTMLModel",
700
  "_view_count": null,
701
- "_view_module": "@jupyter-widgets/controls",
702
- "_view_module_version": "1.5.0",
703
- "_view_name": "HTMLView",
704
- "description": "",
705
- "description_tooltip": null,
706
- "layout": "IPY_MODEL_770341dc116148a8b7571cce3a2f2baf",
707
- "placeholder": "​",
708
- "style": "IPY_MODEL_29416122cc0b4a5592668ddced7686ba",
709
- "value": " 4/4 [00:00&lt;00:00,  5.03it/s]"
710
  }
711
  },
712
- "fee72c1c455549b59092028b855a082a": {
713
- "model_module": "@jupyter-widgets/base",
714
- "model_name": "LayoutModel",
715
- "model_module_version": "1.2.0",
716
  "state": {
717
- "_model_module": "@jupyter-widgets/base",
718
- "_model_module_version": "1.2.0",
719
- "_model_name": "LayoutModel",
720
  "_view_count": null,
721
  "_view_module": "@jupyter-widgets/base",
722
  "_view_module_version": "1.2.0",
723
- "_view_name": "LayoutView",
724
- "align_content": null,
725
- "align_items": null,
726
- "align_self": null,
727
- "border": null,
728
- "bottom": null,
729
- "display": null,
730
- "flex": null,
731
- "flex_flow": null,
732
- "grid_area": null,
733
- "grid_auto_columns": null,
734
- "grid_auto_flow": null,
735
- "grid_auto_rows": null,
736
- "grid_column": null,
737
- "grid_gap": null,
738
- "grid_row": null,
739
- "grid_template_areas": null,
740
- "grid_template_columns": null,
741
- "grid_template_rows": null,
742
- "height": null,
743
- "justify_content": null,
744
- "justify_items": null,
745
- "left": null,
746
- "margin": null,
747
- "max_height": null,
748
- "max_width": null,
749
- "min_height": null,
750
- "min_width": null,
751
- "object_fit": null,
752
- "object_position": null,
753
- "order": null,
754
- "overflow": null,
755
- "overflow_x": null,
756
- "overflow_y": null,
757
- "padding": null,
758
- "right": null,
759
- "top": null,
760
- "visibility": null,
761
- "width": null
762
  }
763
  },
764
- "ed0fa93199b94fb486c125d4f322d59f": {
765
  "model_module": "@jupyter-widgets/base",
766
- "model_name": "LayoutModel",
767
  "model_module_version": "1.2.0",
 
768
  "state": {
769
  "_model_module": "@jupyter-widgets/base",
770
  "_model_module_version": "1.2.0",
@@ -813,25 +861,34 @@
813
  "width": null
814
  }
815
  },
816
- "66f82e7ef3694c699e3d4a2bd826392b": {
817
  "model_module": "@jupyter-widgets/controls",
818
- "model_name": "DescriptionStyleModel",
819
  "model_module_version": "1.5.0",
 
820
  "state": {
 
821
  "_model_module": "@jupyter-widgets/controls",
822
  "_model_module_version": "1.5.0",
823
- "_model_name": "DescriptionStyleModel",
824
- "_view_count": null,
825
- "_view_module": "@jupyter-widgets/base",
826
- "_view_module_version": "1.2.0",
827
- "_view_name": "StyleView",
828
- "description_width": ""
829
  }
830
  },
831
- "2bfd51e3ae954008ae83704c24dbd6cb": {
832
  "model_module": "@jupyter-widgets/base",
833
- "model_name": "LayoutModel",
834
  "model_module_version": "1.2.0",
 
835
  "state": {
836
  "_model_module": "@jupyter-widgets/base",
837
  "_model_module_version": "1.2.0",
@@ -880,26 +937,48 @@
880
  "width": null
881
  }
882
  },
883
- "f8b84d8c06384680973ef6fe787b5a5d": {
884
  "model_module": "@jupyter-widgets/controls",
885
- "model_name": "ProgressStyleModel",
886
  "model_module_version": "1.5.0",
 
887
  "state": {
888
  "_model_module": "@jupyter-widgets/controls",
889
  "_model_module_version": "1.5.0",
890
- "_model_name": "ProgressStyleModel",
891
  "_view_count": null,
892
  "_view_module": "@jupyter-widgets/base",
893
  "_view_module_version": "1.2.0",
894
  "_view_name": "StyleView",
895
- "bar_color": null,
896
- "description_width": ""
897
  }
898
  },
899
- "770341dc116148a8b7571cce3a2f2baf": {
900
  "model_module": "@jupyter-widgets/base",
901
- "model_name": "LayoutModel",
902
  "model_module_version": "1.2.0",
 
903
  "state": {
904
  "_model_module": "@jupyter-widgets/base",
905
  "_model_module_version": "1.2.0",
@@ -933,557 +1012,468 @@
933
  "margin": null,
934
  "max_height": null,
935
  "max_width": null,
936
- "min_height": null,
937
- "min_width": null,
938
- "object_fit": null,
939
- "object_position": null,
940
- "order": null,
941
- "overflow": null,
942
- "overflow_x": null,
943
- "overflow_y": null,
944
- "padding": null,
945
- "right": null,
946
- "top": null,
947
- "visibility": null,
948
- "width": null
949
- }
950
- },
951
- "29416122cc0b4a5592668ddced7686ba": {
952
- "model_module": "@jupyter-widgets/controls",
953
- "model_name": "DescriptionStyleModel",
954
- "model_module_version": "1.5.0",
955
- "state": {
956
- "_model_module": "@jupyter-widgets/controls",
957
- "_model_module_version": "1.5.0",
958
- "_model_name": "DescriptionStyleModel",
959
- "_view_count": null,
960
- "_view_module": "@jupyter-widgets/base",
961
- "_view_module_version": "1.2.0",
962
- "_view_name": "StyleView",
963
- "description_width": ""
964
- }
965
- }
966
- }
967
- }
968
- },
969
- "cells": [
970
- {
971
- "cell_type": "markdown",
972
- "metadata": {
973
- "id": "view-in-github",
974
- "colab_type": "text"
975
- },
976
- "source": [
977
- "<a href=\"https://colab.research.google.com/github/merveenoyan/smol-vision/blob/main/Gemma_3n_Video_Vibe_Tests.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
978
- ]
979
- },
980
- {
981
- "cell_type": "markdown",
982
- "source": [
983
- "## Gemma 3n Video with Audio Inference"
984
- ],
985
- "metadata": {
986
- "id": "onFz3_7AqnaB"
987
- }
988
- },
989
- {
990
- "cell_type": "markdown",
991
- "source": [
992
- "In this notebook we'll infer Gemma-3n videos with audios inside."
993
- ],
994
- "metadata": {
995
- "id": "KKUnhy4JqqAg"
996
- }
997
- },
998
- {
999
- "cell_type": "code",
1000
- "source": [
1001
- "!pip install -U -q transformers timm datasets"
1002
- ],
1003
- "metadata": {
1004
- "id": "Vf-VvnrNjuxF"
1005
- },
1006
- "execution_count": null,
1007
- "outputs": []
1008
- },
1009
- {
1010
- "cell_type": "markdown",
1011
- "source": [
1012
- "We will load three examples from FineVideo dataset and Gemma-3n model so make sure you have access to both and provide access token."
1013
- ],
1014
- "metadata": {
1015
- "id": "gcJbxIPLqvjH"
1016
- }
1017
- },
1018
- {
1019
- "cell_type": "code",
1020
- "source": [
1021
- "from huggingface_hub import login\n",
1022
- "login()"
1023
- ],
1024
- "metadata": {
1025
- "id": "bROdG2-Jj9lT",
1026
- "colab": {
1027
- "base_uri": "https://localhost:8080/",
1028
- "height": 17,
1029
- "referenced_widgets": [
1030
- "542490f74e974451bc44009a6fa174bd",
1031
- "409f985be1134b468b81136fbdb54408",
1032
- "57cb1e931c614980a4147cb125524d7d",
1033
- "87dc7aaf52e349a7bb43bb1b8bc137ee",
1034
- "983ed4cb4eea42daa9ae8c0417021a21",
1035
- "40c381fd7bb04b43a879044a4e988cc6",
1036
- "8d0e5abdd7c549f1a66ee198c9fa1430",
1037
- "c72dd3d6a4c246cfa6590c314783c8f0",
1038
- "c0e471e664dd41eab98efe08301ef5e1",
1039
- "868f63ea9455442d837dc2c422918800",
1040
- "5b7b4707b1bf4159a10bf7e289bde435",
1041
- "889d0d1ed24e4de2b89896511d008e60",
1042
- "68fc757825dd44a48ab2383db20958db",
1043
- "cb76f933e6e640d9a688f7838e5fb0b3",
1044
- "8704264bff4d46c9813ac9acf92da962",
1045
- "9b5d87960dde401baeaf8b6144fb8bad",
1046
- "76e06881e5e94197a24944e07fdf3189",
1047
- "f40dd696acc64c6284c6f8f485f3ce9d",
1048
- "4488de26dce74cbbb39d99ae09bd21fa",
1049
- "ded62e6c032745ec88ca0ab694b0d397"
1050
- ]
1051
- },
1052
- "outputId": "1978e9bd-3b52-40b8-e643-418f9872476d"
1053
- },
1054
- "execution_count": null,
1055
- "outputs": [
1056
- {
1057
- "output_type": "display_data",
1058
- "data": {
1059
- "text/plain": [
1060
- "VBox(children=(HTML(value='<center> <img\\nsrc=https://huggingface.co/front/assets/huggingface_logo-noborder.sv…"
1061
- ],
1062
- "application/vnd.jupyter.widget-view+json": {
1063
- "version_major": 2,
1064
- "version_minor": 0,
1065
- "model_id": "542490f74e974451bc44009a6fa174bd"
1066
- }
1067
- },
1068
- "metadata": {}
1069
- }
1070
- ]
1071
- },
1072
- {
1073
- "cell_type": "code",
1074
- "execution_count": null,
1075
- "metadata": {
1076
- "id": "TMiKyRtAjjAc",
1077
- "colab": {
1078
- "base_uri": "https://localhost:8080/",
1079
- "height": 173,
1080
- "referenced_widgets": [
1081
- "be523e956910487ca263d943a7a58395",
1082
- "01dc23faab3d42cda41fdfdd2a7dfed5",
1083
- "777d7addfb144fd8896b77a1e0d54f25",
1084
- "c518268069244b21810e84380502c190",
1085
- "fee72c1c455549b59092028b855a082a",
1086
- "ed0fa93199b94fb486c125d4f322d59f",
1087
- "66f82e7ef3694c699e3d4a2bd826392b",
1088
- "2bfd51e3ae954008ae83704c24dbd6cb",
1089
- "f8b84d8c06384680973ef6fe787b5a5d",
1090
- "770341dc116148a8b7571cce3a2f2baf",
1091
- "29416122cc0b4a5592668ddced7686ba"
1092
- ]
1093
- },
1094
- "outputId": "7351e21a-3c82-4d0c-c827-24b66812f181"
1095
- },
1096
- "outputs": [
1097
- {
1098
- "output_type": "stream",
1099
- "name": "stderr",
1100
- "text": [
1101
- "/usr/local/lib/python3.11/dist-packages/huggingface_hub/utils/_auth.py:94: UserWarning: \n",
1102
- "The secret `HF_TOKEN` does not exist in your Colab secrets.\n",
1103
- "To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.\n",
1104
- "You will be able to reuse this secret in all of your notebooks.\n",
1105
- "Please note that authentication is recommended but still optional to access public models or datasets.\n",
1106
- " warnings.warn(\n"
1107
- ]
1108
- },
1109
- {
1110
- "output_type": "display_data",
1111
- "data": {
1112
- "text/plain": [
1113
- "Loading checkpoint shards: 0%| | 0/4 [00:00<?, ?it/s]"
1114
- ],
1115
- "application/vnd.jupyter.widget-view+json": {
1116
- "version_major": 2,
1117
- "version_minor": 0,
1118
- "model_id": "be523e956910487ca263d943a7a58395"
1119
- }
1120
- },
1121
- "metadata": {}
1122
- }
1123
- ],
1124
- "source": [
1125
- "from transformers import AutoProcessor, Gemma3nForConditionalGeneration\n",
1126
- "import torch\n",
1127
- "model = Gemma3nForConditionalGeneration.from_pretrained(\n",
1128
- " \"google/gemma-3n-E4B-it\", torch_dtype=torch.bfloat16,\n",
1129
- ").to(\"cuda\")\n",
1130
- "processor = AutoProcessor.from_pretrained(\n",
1131
- " \"google/gemma-3n-E4B-it\",\n",
1132
- ")\n",
1133
- "processor.tokenizer.padding_side = \"right\""
1134
- ]
1135
- },
1136
- {
1137
- "cell_type": "markdown",
1138
- "source": [
1139
- "Download video for inference."
1140
- ],
1141
- "metadata": {
1142
- "id": "mQzrURJlNRwW"
1143
- }
1144
- },
1145
- {
1146
- "cell_type": "code",
1147
- "source": [
1148
- "!wget https://huggingface.co/datasets/merve/vlm_test_images/resolve/main/IMG_8137.mp4"
1149
- ],
1150
- "metadata": {
1151
- "colab": {
1152
- "base_uri": "https://localhost:8080/"
1153
  },
1154
- "id": "PAQ1S2uDMIzj",
1155
- "outputId": "c584ee8c-b960-4f82-f2c6-be194709256f"
1156
- },
1157
- "execution_count": null,
1158
- "outputs": [
1159
- {
1160
- "output_type": "stream",
1161
- "name": "stdout",
1162
- "text": [
1163
- "--2025-07-01 13:39:22-- https://huggingface.co/datasets/merve/vlm_test_images/resolve/main/IMG_8137.mp4\n",
1164
- "Resolving huggingface.co (huggingface.co)... 18.172.134.4, 18.172.134.24, 18.172.134.124, ...\n",
1165
- "Connecting to huggingface.co (huggingface.co)|18.172.134.4|:443... connected.\n",
1166
- "HTTP request sent, awaiting response... 302 Found\n",
1167
- "Location: https://cdn-lfs-us-1.hf.co/repos/7b/14/7b14679bb56cefbf7829be71f3f444110ccc308f431bd8596f534e743367ea5c/6331cbb913feb48349e3b7015a7969e04ce3cd594b1bda7278e4e33fe4a3f5f3?response-content-disposition=inline%3B+filename*%3DUTF-8%27%27IMG_8137.mp4%3B+filename%3D%22IMG_8137.mp4%22%3B&response-content-type=video%2Fmp4&Expires=1751380762&Policy=eyJTdGF0ZW1lbnQiOlt7IkNvbmRpdGlvbiI6eyJEYXRlTGVzc1RoYW4iOnsiQVdTOkVwb2NoVGltZSI6MTc1MTM4MDc2Mn19LCJSZXNvdXJjZSI6Imh0dHBzOi8vY2RuLWxmcy11cy0xLmhmLmNvL3JlcG9zLzdiLzE0LzdiMTQ2NzliYjU2Y2VmYmY3ODI5YmU3MWYzZjQ0NDExMGNjYzMwOGY0MzFiZDg1OTZmNTM0ZTc0MzM2N2VhNWMvNjMzMWNiYjkxM2ZlYjQ4MzQ5ZTNiNzAxNWE3OTY5ZTA0Y2UzY2Q1OTRiMWJkYTcyNzhlNGUzM2ZlNGEzZjVmMz9yZXNwb25zZS1jb250ZW50LWRpc3Bvc2l0aW9uPSomcmVzcG9uc2UtY29udGVudC10eXBlPSoifV19&Signature=MsPaMyO17sK%7Eo3U41ncCYEHd2vpjR6Jvv2IiqrhIy45kp-2WPdIGaYg5F7g9ENDJfFqmYavs6VH26AdLbX3HLPBUoR%7EAV8Iew8V1lFK1SpMkyCkh0SMtYNHqSw27jJ1ZSIhMKnHA7hRGi5b8LAhBiGzmlikz4a%7EtZAjjQZ18ZyN8GxCvTironzCp3uKUExWpRQF%7EwEwqurBb%7EKs-uJ6KDLvshYInzF%7Eo1LEoRNlXdxmDk8Q5Q7ZnBFM5m%7EPvBt-OQ4WWDPQZ86qblHwtoAgf483cdviYLPd8PjGzarQxgrjxbqELMvXM-nvUdXcOuAwhbBzpzSwBGQManPZxOFKTFw__&Key-Pair-Id=K24J24Z295AEI9 [following]\n",
1168
- "--2025-07-01 13:39:22-- https://cdn-lfs-us-1.hf.co/repos/7b/14/7b14679bb56cefbf7829be71f3f444110ccc308f431bd8596f534e743367ea5c/6331cbb913feb48349e3b7015a7969e04ce3cd594b1bda7278e4e33fe4a3f5f3?response-content-disposition=inline%3B+filename*%3DUTF-8%27%27IMG_8137.mp4%3B+filename%3D%22IMG_8137.mp4%22%3B&response-content-type=video%2Fmp4&Expires=1751380762&Policy=eyJTdGF0ZW1lbnQiOlt7IkNvbmRpdGlvbiI6eyJEYXRlTGVzc1RoYW4iOnsiQVdTOkVwb2NoVGltZSI6MTc1MTM4MDc2Mn19LCJSZXNvdXJjZSI6Imh0dHBzOi8vY2RuLWxmcy11cy0xLmhmLmNvL3JlcG9zLzdiLzE0LzdiMTQ2NzliYjU2Y2VmYmY3ODI5YmU3MWYzZjQ0NDExMGNjYzMwOGY0MzFiZDg1OTZmNTM0ZTc0MzM2N2VhNWMvNjMzMWNiYjkxM2ZlYjQ4MzQ5ZTNiNzAxNWE3OTY5ZTA0Y2UzY2Q1OTRiMWJkYTcyNzhlNGUzM2ZlNGEzZjVmMz9yZXNwb25zZS1jb250ZW50LWRpc3Bvc2l0aW9uPSomcmVzcG9uc2UtY29udGVudC10eXBlPSoifV19&Signature=MsPaMyO17sK%7Eo3U41ncCYEHd2vpjR6Jvv2IiqrhIy45kp-2WPdIGaYg5F7g9ENDJfFqmYavs6VH26AdLbX3HLPBUoR%7EAV8Iew8V1lFK1SpMkyCkh0SMtYNHqSw27jJ1ZSIhMKnHA7hRGi5b8LAhBiGzmlikz4a%7EtZAjjQZ18ZyN8GxCvTironzCp3uKUExWpRQF%7EwEwqurBb%7EKs-uJ6KDLvshYInzF%7Eo1LEoRNlXdxmDk8Q5Q7ZnBFM5m%7EPvBt-OQ4WWDPQZ86qblHwtoAgf483cdviYLPd8PjGzarQxgrjxbqELMvXM-nvUdXcOuAwhbBzpzSwBGQManPZxOFKTFw__&Key-Pair-Id=K24J24Z295AEI9\n",
1169
- "Resolving cdn-lfs-us-1.hf.co (cdn-lfs-us-1.hf.co)... 3.167.138.114, 3.167.138.90, 3.167.138.39, ...\n",
1170
- "Connecting to cdn-lfs-us-1.hf.co (cdn-lfs-us-1.hf.co)|3.167.138.114|:443... connected.\n",
1171
- "HTTP request sent, awaiting response... 200 OK\n",
1172
- "Length: 5340706 (5.1M) [video/mp4]\n",
1173
- "Saving to: ‘IMG_8137.mp4’\n",
1174
- "\n",
1175
- "IMG_8137.mp4 100%[===================>] 5.09M 27.1MB/s in 0.2s \n",
1176
- "\n",
1177
- "2025-07-01 13:39:22 (27.1 MB/s) - ‘IMG_8137.mp4’ saved [5340706/5340706]\n",
1178
- "\n"
1179
- ]
1180
- }
1181
- ]
1182
- },
1183
- {
1184
- "cell_type": "markdown",
1185
- "source": [
1186
- "Strip audios from video."
1187
- ],
1188
- "metadata": {
1189
- "id": "KXlBj7dVtUFZ"
1190
- }
1191
- },
1192
- {
1193
- "cell_type": "code",
1194
- "source": [
1195
- "import os\n",
1196
- "import subprocess\n",
1197
- "filename = \"IMG_8137.mp4\"\n",
1198
- "audio_path = os.path.join(\"audios\", f\"audio.wav\")\n",
1199
- "\n",
1200
- "subprocess.run([\n",
1201
- " \"ffmpeg\", \"-i\", filename,\n",
1202
- " \"-q:a\", \"0\", \"-map\", \"a\",\n",
1203
- " audio_path,\n",
1204
- " \"-y\"\n",
1205
- "], stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)"
1206
- ],
1207
- "metadata": {
1208
- "colab": {
1209
- "base_uri": "https://localhost:8080/"
1210
  },
1211
- "id": "FQhKimtlMOHe",
1212
- "outputId": "ef05231a-ce56-4733-b0be-d6b423a143ae"
1213
- },
1214
- "execution_count": null,
1215
- "outputs": [
1216
- {
1217
- "output_type": "execute_result",
1218
- "data": {
1219
- "text/plain": [
1220
- "CompletedProcess(args=['ffmpeg', '-i', 'IMG_8137.mp4', '-q:a', '0', '-map', 'a', 'audios/audio.wav', '-y'], returncode=0)"
1221
- ]
1222
- },
1223
- "metadata": {},
1224
- "execution_count": 57
1225
- }
1226
- ]
1227
- },
1228
- {
1229
- "cell_type": "code",
1230
- "source": [
1231
- "import cv2\n",
1232
- "from PIL import Image\n",
1233
- "import numpy as np\n",
1234
- "\n",
1235
- "def downsample_video(video_path):\n",
1236
- " vidcap = cv2.VideoCapture(video_path)\n",
1237
- " total_frames = int(vidcap.get(cv2.CAP_PROP_FRAME_COUNT))\n",
1238
- " fps = vidcap.get(cv2.CAP_PROP_FPS)\n",
1239
- "\n",
1240
- " frames = []\n",
1241
- " frame_indices = np.linspace(0, total_frames - 1, 7, dtype=int)\n",
1242
- "\n",
1243
- " for i in frame_indices:\n",
1244
- " vidcap.set(cv2.CAP_PROP_POS_FRAMES, i)\n",
1245
- " success, image = vidcap.read()\n",
1246
- " if success:\n",
1247
- " image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB) # Convert from BGR to RGB\n",
1248
- " pil_image = Image.fromarray(image)\n",
1249
- " timestamp = round(i / fps, 2)\n",
1250
- " frames.append((pil_image, timestamp))\n",
1251
- "\n",
1252
- " vidcap.release()\n",
1253
- " return frames\n"
1254
- ],
1255
- "metadata": {
1256
- "id": "6e_cExwMjx7v"
1257
- },
1258
- "execution_count": null,
1259
- "outputs": []
1260
- },
1261
- {
1262
- "cell_type": "markdown",
1263
- "source": [
1264
- "We will generate descriptions to videos and compare them to irl description in the metadata for the vibecheck.\n",
1265
- "\n",
1266
- "We need to downsample video to frames."
1267
- ],
1268
- "metadata": {
1269
- "id": "mRKCPRabuMs6"
1270
- }
1271
- },
1272
- {
1273
- "cell_type": "code",
1274
- "source": [
1275
- "frames = downsample_video(filename)"
1276
- ],
1277
- "metadata": {
1278
- "id": "UMJESbFulYTi"
1279
- },
1280
- "execution_count": null,
1281
- "outputs": []
1282
- },
1283
- {
1284
- "cell_type": "code",
1285
- "source": [
1286
- "frames"
1287
- ],
1288
- "metadata": {
1289
- "colab": {
1290
- "base_uri": "https://localhost:8080/"
1291
  },
1292
- "id": "wJKdYXasMfEG",
1293
- "outputId": "2cff578c-df4d-41ca-8d9e-f85b4fed3456"
1294
- },
1295
- "execution_count": null,
1296
- "outputs": [
1297
- {
1298
- "output_type": "execute_result",
1299
- "data": {
1300
- "text/plain": [
1301
- "[(<PIL.Image.Image image mode=RGB size=1080x1920>, np.float64(0.0)),\n",
1302
- " (<PIL.Image.Image image mode=RGB size=1080x1920>, np.float64(1.03)),\n",
1303
- " (<PIL.Image.Image image mode=RGB size=1080x1920>, np.float64(2.09)),\n",
1304
- " (<PIL.Image.Image image mode=RGB size=1080x1920>, np.float64(3.12)),\n",
1305
- " (<PIL.Image.Image image mode=RGB size=1080x1920>, np.float64(4.17)),\n",
1306
- " (<PIL.Image.Image image mode=RGB size=1080x1920>, np.float64(5.21)),\n",
1307
- " (<PIL.Image.Image image mode=RGB size=1080x1920>, np.float64(6.26))]"
1308
- ]
1309
- },
1310
- "metadata": {},
1311
- "execution_count": 52
1312
- }
1313
- ]
1314
- },
1315
- {
1316
- "cell_type": "code",
1317
- "source": [
1318
- "messages = [\n",
1319
- " {\n",
1320
- " \"role\": \"system\",\n",
1321
- " \"content\": [{\"type\": \"text\", \"text\": \"You are a helpful assistant.\"}]\n",
1322
- " },\n",
1323
- " {\n",
1324
- " \"role\": \"user\",\n",
1325
- " \"content\": [\n",
1326
- " {\"type\": \"text\", \"text\": f\"What is happening in this video? Summarize the events.\"}]\n",
1327
- " }\n",
1328
- "]\n",
1329
- "for frame in frames:\n",
1330
- " image, timestamp = frame\n",
1331
- " messages[1][\"content\"].append({\"type\": \"text\", \"text\": f\"Frame {timestamp}: \"})\n",
1332
- " image.save(f\"image_{timestamp}.png\")\n",
1333
- " messages[1][\"content\"].append({\"type\": \"image\", \"url\": f\"./image_{timestamp}.png\"})\n",
1334
- "messages[1][\"content\"].append({\"type\": \"audio\", \"audio\": f\"audios/audio.wav\"})"
1335
- ],
1336
- "metadata": {
1337
- "id": "u8itVHCflZYQ"
1338
- },
1339
- "execution_count": null,
1340
- "outputs": []
1341
- },
1342
- {
1343
- "cell_type": "code",
1344
- "source": [
1345
- "messages"
1346
- ],
1347
- "metadata": {
1348
- "id": "dBX4mNxXxGoC",
1349
- "colab": {
1350
- "base_uri": "https://localhost:8080/"
1351
  },
1352
- "outputId": "b738e828-bf9b-4f13-bbb2-9f38bea50b6a"
1353
- },
1354
- "execution_count": null,
1355
- "outputs": [
1356
- {
1357
- "output_type": "execute_result",
1358
- "data": {
1359
- "text/plain": [
1360
- "[{'role': 'system',\n",
1361
- " 'content': [{'type': 'text', 'text': 'You are a helpful assistant.'}]},\n",
1362
- " {'role': 'user',\n",
1363
- " 'content': [{'type': 'text',\n",
1364
- " 'text': 'What is happening in this video? Summarize the events.'},\n",
1365
- " {'type': 'text', 'text': 'Frame 0.0: '},\n",
1366
- " {'type': 'image', 'url': './image_0.0.png'},\n",
1367
- " {'type': 'text', 'text': 'Frame 1.03: '},\n",
1368
- " {'type': 'image', 'url': './image_1.03.png'},\n",
1369
- " {'type': 'text', 'text': 'Frame 2.09: '},\n",
1370
- " {'type': 'image', 'url': './image_2.09.png'},\n",
1371
- " {'type': 'text', 'text': 'Frame 3.12: '},\n",
1372
- " {'type': 'image', 'url': './image_3.12.png'},\n",
1373
- " {'type': 'text', 'text': 'Frame 4.17: '},\n",
1374
- " {'type': 'image', 'url': './image_4.17.png'},\n",
1375
- " {'type': 'text', 'text': 'Frame 5.21: '},\n",
1376
- " {'type': 'image', 'url': './image_5.21.png'},\n",
1377
- " {'type': 'text', 'text': 'Frame 6.26: '},\n",
1378
- " {'type': 'image', 'url': './image_6.26.png'},\n",
1379
- " {'type': 'audio', 'audio': 'audios/audio.wav'}]}]"
1380
- ]
1381
- },
1382
- "metadata": {},
1383
- "execution_count": 59
1384
- }
1385
- ]
1386
- },
1387
- {
1388
- "cell_type": "code",
1389
- "source": [
1390
- "#processor.tokenizer.padding_side = \"right\"\n",
1391
- "inputs = processor.apply_chat_template(\n",
1392
- " messages, add_generation_prompt=True, tokenize=True,\n",
1393
- " return_dict=True, return_tensors=\"pt\"\n",
1394
- ").to(model.device).to(model.dtype)"
1395
- ],
1396
- "metadata": {
1397
- "id": "e4f0qr67lcjo"
1398
- },
1399
- "execution_count": null,
1400
- "outputs": []
1401
- },
1402
- {
1403
- "cell_type": "code",
1404
- "source": [
1405
- "inputs[\"input_ids\"].shape[-1]"
1406
- ],
1407
- "metadata": {
1408
- "colab": {
1409
- "base_uri": "https://localhost:8080/"
1410
  },
1411
- "id": "EOiBpgkI9kXi",
1412
- "outputId": "911a6013-f76f-4fed-c402-8039d67b1e05"
1413
- },
1414
- "execution_count": null,
1415
- "outputs": [
1416
- {
1417
- "output_type": "execute_result",
1418
- "data": {
1419
- "text/plain": [
1420
- "2087"
1421
- ]
1422
- },
1423
- "metadata": {},
1424
- "execution_count": 61
1425
- }
1426
- ]
1427
- },
1428
- {
1429
- "cell_type": "code",
1430
- "source": [
1431
- "with torch.inference_mode():\n",
1432
- " generation = model.generate(**inputs, max_new_tokens=200, do_sample=False)"
1433
- ],
1434
- "metadata": {
1435
- "id": "yJ95UXBqvXPM",
1436
- "colab": {
1437
- "base_uri": "https://localhost:8080/"
1438
  },
1439
- "outputId": "721839dc-aa78-401b-e802-b858690980da"
1440
- },
1441
- "execution_count": null,
1442
- "outputs": [
1443
- {
1444
- "output_type": "stream",
1445
- "name": "stderr",
1446
- "text": [
1447
- "The following generation flags are not valid and may be ignored: ['top_p', 'top_k']. Set `TRANSFORMERS_VERBOSITY=info` for more details.\n"
1448
- ]
1449
- }
1450
- ]
1451
- },
1452
- {
1453
- "cell_type": "code",
1454
- "source": [
1455
- "input_len = inputs[\"input_ids\"].shape[-1]\n",
1456
- "\n",
1457
- "generation = generation[0][input_len:]\n",
1458
- "\n",
1459
- "decoded = processor.decode(generation, skip_special_tokens=True)\n",
1460
- "print(decoded)"
1461
- ],
1462
- "metadata": {
1463
- "colab": {
1464
- "base_uri": "https://localhost:8080/"
1465
  },
1466
- "id": "3ifVZy9c74St",
1467
- "outputId": "f8ab51c6-e5a3-4a16-875b-d07404041396"
1468
- },
1469
- "execution_count": null,
1470
- "outputs": [
1471
- {
1472
- "output_type": "stream",
1473
- "name": "stdout",
1474
- "text": [
1475
- "Here's a summary of what's happening in the video:\n",
1476
- "\n",
1477
- "The video appears to be taken at a ski resort. The main subject is a person snowboarding down a snowy slope. \n",
1478
- "\n",
1479
- "**Initial Scene (0.0 - 1.03):** The snowboarder is initially positioned on the slope, seemingly having fallen or stopped. Other skiers and snowboarders are visible in the background, waiting at what looks like a lift station.\n",
1480
- "\n",
1481
- "**Mid-Video (1.03 - 6.26):** The snowboarder gets back up and continues down the slope. They navigate past other people, including skiers and snowboarders, and eventually reach a lift station. The video shows the snowboarder interacting with others at the lift, possibly waiting for the lift to start or having just gotten off. There are also other skiers and snowboarders around the lift station.\n",
1482
- "\n",
1483
- "**End Scene (6.26):** The snowboarder is still at the lift station,\n"
1484
- ]
1485
  }
1486
- ]
1487
  }
1488
- ]
1489
- }
 
 
 
1
  {
2
+ "cells": [
3
+ {
4
+ "cell_type": "markdown",
5
+ "metadata": {
6
+ "id": "onFz3_7AqnaB"
7
+ },
8
+ "source": [
9
+ "## Gemma 3n Video with Audio Inference"
10
+ ]
11
  },
12
+ {
13
+ "cell_type": "markdown",
14
+ "metadata": {
15
+ "id": "KKUnhy4JqqAg"
16
+ },
17
+ "source": [
18
+ "In this notebook we'll infer Gemma-3n videos with audios inside."
19
+ ]
20
  },
21
+ {
22
+ "cell_type": "code",
23
+ "execution_count": null,
24
+ "metadata": {
25
+ "id": "Vf-VvnrNjuxF"
26
+ },
27
+ "outputs": [],
28
+ "source": [
29
+ "!pip install -U -q transformers timm datasets"
30
+ ]
31
  },
32
+ {
33
+ "cell_type": "markdown",
34
+ "metadata": {
35
+ "id": "gcJbxIPLqvjH"
36
+ },
37
+ "source": [
38
+ "We will load three examples from FineVideo dataset and Gemma-3n model so make sure you have access to both and provide access token."
39
+ ]
40
+ },
41
+ {
42
+ "cell_type": "code",
43
+ "execution_count": null,
44
+ "metadata": {
45
+ "colab": {
46
+ "base_uri": "https://localhost:8080/",
47
+ "height": 17,
48
+ "referenced_widgets": [
49
+ "542490f74e974451bc44009a6fa174bd",
50
+ "409f985be1134b468b81136fbdb54408",
51
+ "57cb1e931c614980a4147cb125524d7d",
52
+ "87dc7aaf52e349a7bb43bb1b8bc137ee",
53
+ "983ed4cb4eea42daa9ae8c0417021a21",
54
+ "40c381fd7bb04b43a879044a4e988cc6",
55
+ "8d0e5abdd7c549f1a66ee198c9fa1430",
56
+ "c72dd3d6a4c246cfa6590c314783c8f0",
57
+ "c0e471e664dd41eab98efe08301ef5e1",
58
+ "868f63ea9455442d837dc2c422918800",
59
+ "5b7b4707b1bf4159a10bf7e289bde435",
60
+ "889d0d1ed24e4de2b89896511d008e60",
61
+ "68fc757825dd44a48ab2383db20958db",
62
+ "cb76f933e6e640d9a688f7838e5fb0b3",
63
+ "8704264bff4d46c9813ac9acf92da962",
64
+ "9b5d87960dde401baeaf8b6144fb8bad",
65
+ "76e06881e5e94197a24944e07fdf3189",
66
+ "f40dd696acc64c6284c6f8f485f3ce9d",
67
+ "4488de26dce74cbbb39d99ae09bd21fa",
68
+ "ded62e6c032745ec88ca0ab694b0d397"
69
+ ]
70
  },
71
+ "id": "bROdG2-Jj9lT",
72
+ "outputId": "1978e9bd-3b52-40b8-e643-418f9872476d"
73
+ },
74
+ "outputs": [
75
+ {
76
+ "data": {
77
+ "application/vnd.jupyter.widget-view+json": {
78
+ "model_id": "542490f74e974451bc44009a6fa174bd",
79
+ "version_major": 2,
80
+ "version_minor": 0
81
+ },
82
+ "text/plain": [
83
+ "VBox(children=(HTML(value='<center> <img\\nsrc=https://huggingface.co/front/assets/huggingface_logo-noborder.sv…"
84
+ ]
85
+ },
86
+ "metadata": {},
87
+ "output_type": "display_data"
88
+ }
89
+ ],
90
+ "source": [
91
+ "from huggingface_hub import login\n",
92
+ "login()"
93
+ ]
94
+ },
95
+ {
96
+ "cell_type": "code",
97
+ "execution_count": null,
98
+ "metadata": {
99
+ "colab": {
100
+ "base_uri": "https://localhost:8080/",
101
+ "height": 173,
102
+ "referenced_widgets": [
103
+ "be523e956910487ca263d943a7a58395",
104
+ "01dc23faab3d42cda41fdfdd2a7dfed5",
105
+ "777d7addfb144fd8896b77a1e0d54f25",
106
+ "c518268069244b21810e84380502c190",
107
+ "fee72c1c455549b59092028b855a082a",
108
+ "ed0fa93199b94fb486c125d4f322d59f",
109
+ "66f82e7ef3694c699e3d4a2bd826392b",
110
+ "2bfd51e3ae954008ae83704c24dbd6cb",
111
+ "f8b84d8c06384680973ef6fe787b5a5d",
112
+ "770341dc116148a8b7571cce3a2f2baf",
113
+ "29416122cc0b4a5592668ddced7686ba"
114
+ ]
115
  },
116
+ "id": "TMiKyRtAjjAc",
117
+ "outputId": "7351e21a-3c82-4d0c-c827-24b66812f181"
118
+ },
119
+ "outputs": [
120
+ {
121
+ "name": "stderr",
122
+ "output_type": "stream",
123
+ "text": [
124
+ "/usr/local/lib/python3.11/dist-packages/huggingface_hub/utils/_auth.py:94: UserWarning: \n",
125
+ "The secret `HF_TOKEN` does not exist in your Colab secrets.\n",
126
+ "To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.\n",
127
+ "You will be able to reuse this secret in all of your notebooks.\n",
128
+ "Please note that authentication is recommended but still optional to access public models or datasets.\n",
129
+ " warnings.warn(\n"
130
+ ]
131
  },
132
+ {
133
+ "data": {
134
+ "application/vnd.jupyter.widget-view+json": {
135
+ "model_id": "be523e956910487ca263d943a7a58395",
136
+ "version_major": 2,
137
+ "version_minor": 0
138
+ },
139
+ "text/plain": [
140
+ "Loading checkpoint shards: 0%| | 0/4 [00:00<?, ?it/s]"
141
+ ]
142
+ },
143
+ "metadata": {},
144
+ "output_type": "display_data"
145
+ }
146
+ ],
147
+ "source": [
148
+ "from transformers import AutoProcessor, Gemma3nForConditionalGeneration\n",
149
+ "import torch\n",
150
+ "model = Gemma3nForConditionalGeneration.from_pretrained(\n",
151
+ " \"google/gemma-3n-E4B-it\", torch_dtype=torch.bfloat16,\n",
152
+ ").to(\"cuda\")\n",
153
+ "processor = AutoProcessor.from_pretrained(\n",
154
+ " \"google/gemma-3n-E4B-it\",\n",
155
+ ")\n",
156
+ "processor.tokenizer.padding_side = \"right\""
157
+ ]
158
+ },
159
+ {
160
+ "cell_type": "markdown",
161
+ "metadata": {
162
+ "id": "mQzrURJlNRwW"
163
+ },
164
+ "source": [
165
+ "Download video for inference."
166
+ ]
167
+ },
168
+ {
169
+ "cell_type": "code",
170
+ "execution_count": null,
171
+ "metadata": {
172
+ "colab": {
173
+ "base_uri": "https://localhost:8080/"
174
  },
175
+ "id": "PAQ1S2uDMIzj",
176
+ "outputId": "c584ee8c-b960-4f82-f2c6-be194709256f"
177
+ },
178
+ "outputs": [
179
+ {
180
+ "name": "stdout",
181
+ "output_type": "stream",
182
+ "text": [
183
+ "--2025-07-01 13:39:22-- https://huggingface.co/datasets/merve/vlm_test_images/resolve/main/IMG_8137.mp4\n",
184
+ "Resolving huggingface.co (huggingface.co)... 18.172.134.4, 18.172.134.24, 18.172.134.124, ...\n",
185
+ "Connecting to huggingface.co (huggingface.co)|18.172.134.4|:443... connected.\n",
186
+ "HTTP request sent, awaiting response... 302 Found\n",
187
+ "Location: https://cdn-lfs-us-1.hf.co/repos/7b/14/7b14679bb56cefbf7829be71f3f444110ccc308f431bd8596f534e743367ea5c/6331cbb913feb48349e3b7015a7969e04ce3cd594b1bda7278e4e33fe4a3f5f3?response-content-disposition=inline%3B+filename*%3DUTF-8%27%27IMG_8137.mp4%3B+filename%3D%22IMG_8137.mp4%22%3B&response-content-type=video%2Fmp4&Expires=1751380762&Policy=eyJTdGF0ZW1lbnQiOlt7IkNvbmRpdGlvbiI6eyJEYXRlTGVzc1RoYW4iOnsiQVdTOkVwb2NoVGltZSI6MTc1MTM4MDc2Mn19LCJSZXNvdXJjZSI6Imh0dHBzOi8vY2RuLWxmcy11cy0xLmhmLmNvL3JlcG9zLzdiLzE0LzdiMTQ2NzliYjU2Y2VmYmY3ODI5YmU3MWYzZjQ0NDExMGNjYzMwOGY0MzFiZDg1OTZmNTM0ZTc0MzM2N2VhNWMvNjMzMWNiYjkxM2ZlYjQ4MzQ5ZTNiNzAxNWE3OTY5ZTA0Y2UzY2Q1OTRiMWJkYTcyNzhlNGUzM2ZlNGEzZjVmMz9yZXNwb25zZS1jb250ZW50LWRpc3Bvc2l0aW9uPSomcmVzcG9uc2UtY29udGVudC10eXBlPSoifV19&Signature=MsPaMyO17sK%7Eo3U41ncCYEHd2vpjR6Jvv2IiqrhIy45kp-2WPdIGaYg5F7g9ENDJfFqmYavs6VH26AdLbX3HLPBUoR%7EAV8Iew8V1lFK1SpMkyCkh0SMtYNHqSw27jJ1ZSIhMKnHA7hRGi5b8LAhBiGzmlikz4a%7EtZAjjQZ18ZyN8GxCvTironzCp3uKUExWpRQF%7EwEwqurBb%7EKs-uJ6KDLvshYInzF%7Eo1LEoRNlXdxmDk8Q5Q7ZnBFM5m%7EPvBt-OQ4WWDPQZ86qblHwtoAgf483cdviYLPd8PjGzarQxgrjxbqELMvXM-nvUdXcOuAwhbBzpzSwBGQManPZxOFKTFw__&Key-Pair-Id=K24J24Z295AEI9 [following]\n",
188
+ "--2025-07-01 13:39:22-- https://cdn-lfs-us-1.hf.co/repos/7b/14/7b14679bb56cefbf7829be71f3f444110ccc308f431bd8596f534e743367ea5c/6331cbb913feb48349e3b7015a7969e04ce3cd594b1bda7278e4e33fe4a3f5f3?response-content-disposition=inline%3B+filename*%3DUTF-8%27%27IMG_8137.mp4%3B+filename%3D%22IMG_8137.mp4%22%3B&response-content-type=video%2Fmp4&Expires=1751380762&Policy=eyJTdGF0ZW1lbnQiOlt7IkNvbmRpdGlvbiI6eyJEYXRlTGVzc1RoYW4iOnsiQVdTOkVwb2NoVGltZSI6MTc1MTM4MDc2Mn19LCJSZXNvdXJjZSI6Imh0dHBzOi8vY2RuLWxmcy11cy0xLmhmLmNvL3JlcG9zLzdiLzE0LzdiMTQ2NzliYjU2Y2VmYmY3ODI5YmU3MWYzZjQ0NDExMGNjYzMwOGY0MzFiZDg1OTZmNTM0ZTc0MzM2N2VhNWMvNjMzMWNiYjkxM2ZlYjQ4MzQ5ZTNiNzAxNWE3OTY5ZTA0Y2UzY2Q1OTRiMWJkYTcyNzhlNGUzM2ZlNGEzZjVmMz9yZXNwb25zZS1jb250ZW50LWRpc3Bvc2l0aW9uPSomcmVzcG9uc2UtY29udGVudC10eXBlPSoifV19&Signature=MsPaMyO17sK%7Eo3U41ncCYEHd2vpjR6Jvv2IiqrhIy45kp-2WPdIGaYg5F7g9ENDJfFqmYavs6VH26AdLbX3HLPBUoR%7EAV8Iew8V1lFK1SpMkyCkh0SMtYNHqSw27jJ1ZSIhMKnHA7hRGi5b8LAhBiGzmlikz4a%7EtZAjjQZ18ZyN8GxCvTironzCp3uKUExWpRQF%7EwEwqurBb%7EKs-uJ6KDLvshYInzF%7Eo1LEoRNlXdxmDk8Q5Q7ZnBFM5m%7EPvBt-OQ4WWDPQZ86qblHwtoAgf483cdviYLPd8PjGzarQxgrjxbqELMvXM-nvUdXcOuAwhbBzpzSwBGQManPZxOFKTFw__&Key-Pair-Id=K24J24Z295AEI9\n",
189
+ "Resolving cdn-lfs-us-1.hf.co (cdn-lfs-us-1.hf.co)... 3.167.138.114, 3.167.138.90, 3.167.138.39, ...\n",
190
+ "Connecting to cdn-lfs-us-1.hf.co (cdn-lfs-us-1.hf.co)|3.167.138.114|:443... connected.\n",
191
+ "HTTP request sent, awaiting response... 200 OK\n",
192
+ "Length: 5340706 (5.1M) [video/mp4]\n",
193
+ "Saving to: ‘IMG_8137.mp4’\n",
194
+ "\n",
195
+ "IMG_8137.mp4 100%[===================>] 5.09M 27.1MB/s in 0.2s \n",
196
+ "\n",
197
+ "2025-07-01 13:39:22 (27.1 MB/s) - ‘IMG_8137.mp4’ saved [5340706/5340706]\n",
198
+ "\n"
199
+ ]
200
+ }
201
+ ],
202
+ "source": [
203
+ "!wget https://huggingface.co/datasets/merve/vlm_test_images/resolve/main/IMG_8137.mp4"
204
+ ]
205
+ },
206
+ {
207
+ "cell_type": "markdown",
208
+ "metadata": {
209
+ "id": "KXlBj7dVtUFZ"
210
+ },
211
+ "source": [
212
+ "Extract the audio track from the video into a WAV file."
213
+ ]
214
+ },
215
+ {
216
+ "cell_type": "code",
217
+ "execution_count": null,
218
+ "metadata": {
219
+ "colab": {
220
+ "base_uri": "https://localhost:8080/"
221
  },
222
+ "id": "FQhKimtlMOHe",
223
+ "outputId": "ef05231a-ce56-4733-b0be-d6b423a143ae"
224
+ },
225
+ "outputs": [
226
+ {
227
+ "data": {
228
+ "text/plain": [
229
+ "CompletedProcess(args=['ffmpeg', '-i', 'IMG_8137.mp4', '-q:a', '0', '-map', 'a', 'audios/audio.wav', '-y'], returncode=0)"
230
+ ]
231
+ },
232
+ "execution_count": 57,
233
+ "metadata": {},
234
+ "output_type": "execute_result"
235
+ }
236
+ ],
237
+ "source": [
238
+ "import os\n",
239
+ "import subprocess\n",
240
+ "filename = \"IMG_8137.mp4\"\n",
241
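+ "audio_path = os.path.join(\"audios\", \"audio.wav\")\n",
+ "os.makedirs(os.path.dirname(audio_path), exist_ok=True)  # ffmpeg does not create missing output directories\n",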
242
+ "\n",
243
+ "subprocess.run([\n",
244
+ " \"ffmpeg\", \"-i\", filename,\n",
245
+ " \"-q:a\", \"0\", \"-map\", \"a\",\n",
246
+ " audio_path,\n",
247
+ " \"-y\"\n",
248
+ "], stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)"
249
+ ]
250
+ },
251
+ {
252
+ "cell_type": "code",
253
+ "execution_count": null,
254
+ "metadata": {
255
+ "id": "6e_cExwMjx7v"
256
+ },
257
+ "outputs": [],
258
+ "source": [
259
+ "import cv2\n",
260
+ "from PIL import Image\n",
261
+ "import numpy as np\n",
262
+ "\n",
263
+ "def downsample_video(video_path):\n",
264
+ " vidcap = cv2.VideoCapture(video_path)\n",
265
+ " total_frames = int(vidcap.get(cv2.CAP_PROP_FRAME_COUNT))\n",
266
+ " fps = vidcap.get(cv2.CAP_PROP_FPS)\n",
267
+ "\n",
268
+ " frames = []\n",
269
+ " frame_indices = np.linspace(0, total_frames - 1, 7, dtype=int)\n",
270
+ "\n",
271
+ " for i in frame_indices:\n",
272
+ " vidcap.set(cv2.CAP_PROP_POS_FRAMES, i)\n",
273
+ " success, image = vidcap.read()\n",
274
+ " if success:\n",
275
+ " image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB) # Convert from BGR to RGB\n",
276
+ " pil_image = Image.fromarray(image)\n",
277
+ " timestamp = round(i / fps, 2)\n",
278
+ " frames.append((pil_image, timestamp))\n",
279
+ "\n",
280
+ " vidcap.release()\n",
281
+ " return frames\n"
282
+ ]
283
+ },
284
+ {
285
+ "cell_type": "markdown",
286
+ "metadata": {
287
+ "id": "mRKCPRabuMs6"
288
+ },
289
+ "source": [
290
+ "We will generate descriptions of videos and compare them with the real-world descriptions in the metadata as a vibe check.\n",
291
+ "\n",
292
+ "To do that, we first need to downsample the video into frames."
293
+ ]
294
+ },
295
+ {
296
+ "cell_type": "code",
297
+ "execution_count": null,
298
+ "metadata": {
299
+ "id": "UMJESbFulYTi"
300
+ },
301
+ "outputs": [],
302
+ "source": [
303
+ "frames = downsample_video(filename)"
304
+ ]
305
+ },
306
+ {
307
+ "cell_type": "code",
308
+ "execution_count": null,
309
+ "metadata": {
310
+ "colab": {
311
+ "base_uri": "https://localhost:8080/"
312
  },
313
+ "id": "wJKdYXasMfEG",
314
+ "outputId": "2cff578c-df4d-41ca-8d9e-f85b4fed3456"
315
+ },
316
+ "outputs": [
317
+ {
318
+ "data": {
319
+ "text/plain": [
320
+ "[(<PIL.Image.Image image mode=RGB size=1080x1920>, np.float64(0.0)),\n",
321
+ " (<PIL.Image.Image image mode=RGB size=1080x1920>, np.float64(1.03)),\n",
322
+ " (<PIL.Image.Image image mode=RGB size=1080x1920>, np.float64(2.09)),\n",
323
+ " (<PIL.Image.Image image mode=RGB size=1080x1920>, np.float64(3.12)),\n",
324
+ " (<PIL.Image.Image image mode=RGB size=1080x1920>, np.float64(4.17)),\n",
325
+ " (<PIL.Image.Image image mode=RGB size=1080x1920>, np.float64(5.21)),\n",
326
+ " (<PIL.Image.Image image mode=RGB size=1080x1920>, np.float64(6.26))]"
327
+ ]
328
+ },
329
+ "execution_count": 52,
330
+ "metadata": {},
331
+ "output_type": "execute_result"
332
+ }
333
+ ],
334
+ "source": [
335
+ "frames"
336
+ ]
337
+ },
338
+ {
339
+ "cell_type": "code",
340
+ "execution_count": null,
341
+ "metadata": {
342
+ "id": "u8itVHCflZYQ"
343
+ },
344
+ "outputs": [],
345
+ "source": [
346
+ "messages = [\n",
347
+ " {\n",
348
+ " \"role\": \"system\",\n",
349
+ " \"content\": [{\"type\": \"text\", \"text\": \"You are a helpful assistant.\"}]\n",
350
+ " },\n",
351
+ " {\n",
352
+ " \"role\": \"user\",\n",
353
+ " \"content\": [\n",
354
+ " {\"type\": \"text\", \"text\": \"What is happening in this video? Summarize the events.\"}]\n",
355
+ " }\n",
356
+ "]\n",
357
+ "for frame in frames:\n",
358
+ " image, timestamp = frame\n",
359
+ " messages[1][\"content\"].append({\"type\": \"text\", \"text\": f\"Frame {timestamp}: \"})\n",
360
+ " image.save(f\"image_{timestamp}.png\")\n",
361
+ " messages[1][\"content\"].append({\"type\": \"image\", \"url\": f\"./image_{timestamp}.png\"})\n",
362
+ "messages[1][\"content\"].append({\"type\": \"audio\", \"audio\": \"audios/audio.wav\"})"
363
+ ]
364
+ },
365
+ {
366
+ "cell_type": "code",
367
+ "execution_count": null,
368
+ "metadata": {
369
+ "colab": {
370
+ "base_uri": "https://localhost:8080/"
371
  },
372
+ "id": "dBX4mNxXxGoC",
373
+ "outputId": "b738e828-bf9b-4f13-bbb2-9f38bea50b6a"
374
+ },
375
+ "outputs": [
376
+ {
377
+ "data": {
378
+ "text/plain": [
379
+ "[{'role': 'system',\n",
380
+ " 'content': [{'type': 'text', 'text': 'You are a helpful assistant.'}]},\n",
381
+ " {'role': 'user',\n",
382
+ " 'content': [{'type': 'text',\n",
383
+ " 'text': 'What is happening in this video? Summarize the events.'},\n",
384
+ " {'type': 'text', 'text': 'Frame 0.0: '},\n",
385
+ " {'type': 'image', 'url': './image_0.0.png'},\n",
386
+ " {'type': 'text', 'text': 'Frame 1.03: '},\n",
387
+ " {'type': 'image', 'url': './image_1.03.png'},\n",
388
+ " {'type': 'text', 'text': 'Frame 2.09: '},\n",
389
+ " {'type': 'image', 'url': './image_2.09.png'},\n",
390
+ " {'type': 'text', 'text': 'Frame 3.12: '},\n",
391
+ " {'type': 'image', 'url': './image_3.12.png'},\n",
392
+ " {'type': 'text', 'text': 'Frame 4.17: '},\n",
393
+ " {'type': 'image', 'url': './image_4.17.png'},\n",
394
+ " {'type': 'text', 'text': 'Frame 5.21: '},\n",
395
+ " {'type': 'image', 'url': './image_5.21.png'},\n",
396
+ " {'type': 'text', 'text': 'Frame 6.26: '},\n",
397
+ " {'type': 'image', 'url': './image_6.26.png'},\n",
398
+ " {'type': 'audio', 'audio': 'audios/audio.wav'}]}]"
399
+ ]
400
+ },
401
+ "execution_count": 59,
402
+ "metadata": {},
403
+ "output_type": "execute_result"
404
+ }
405
+ ],
406
+ "source": [
407
+ "messages"
408
+ ]
409
+ },
410
+ {
411
+ "cell_type": "code",
412
+ "execution_count": null,
413
+ "metadata": {
414
+ "id": "e4f0qr67lcjo"
415
+ },
416
+ "outputs": [],
417
+ "source": [
418
+ "#processor.tokenizer.padding_side = \"right\"\n",
419
+ "inputs = processor.apply_chat_template(\n",
420
+ " messages, add_generation_prompt=True, tokenize=True,\n",
421
+ " return_dict=True, return_tensors=\"pt\"\n",
422
+ ").to(model.device).to(model.dtype)"
423
+ ]
424
+ },
425
+ {
426
+ "cell_type": "code",
427
+ "execution_count": null,
428
+ "metadata": {
429
+ "colab": {
430
+ "base_uri": "https://localhost:8080/"
431
  },
432
+ "id": "EOiBpgkI9kXi",
433
+ "outputId": "911a6013-f76f-4fed-c402-8039d67b1e05"
434
+ },
435
+ "outputs": [
436
+ {
437
+ "data": {
438
+ "text/plain": [
439
+ "2087"
440
+ ]
441
+ },
442
+ "execution_count": 61,
443
+ "metadata": {},
444
+ "output_type": "execute_result"
445
+ }
446
+ ],
447
+ "source": [
448
+ "inputs[\"input_ids\"].shape[-1]"
449
+ ]
450
+ },
451
+ {
452
+ "cell_type": "code",
453
+ "execution_count": null,
454
+ "metadata": {
455
+ "colab": {
456
+ "base_uri": "https://localhost:8080/"
457
  },
458
+ "id": "yJ95UXBqvXPM",
459
+ "outputId": "721839dc-aa78-401b-e802-b858690980da"
460
+ },
461
+ "outputs": [
462
+ {
463
+ "name": "stderr",
464
+ "output_type": "stream",
465
+ "text": [
466
+ "The following generation flags are not valid and may be ignored: ['top_p', 'top_k']. Set `TRANSFORMERS_VERBOSITY=info` for more details.\n"
467
+ ]
468
+ }
469
+ ],
470
+ "source": [
471
+ "with torch.inference_mode():\n",
472
+ " generation = model.generate(**inputs, max_new_tokens=200, do_sample=False)"
473
+ ]
474
+ },
475
+ {
476
+ "cell_type": "code",
477
+ "execution_count": null,
478
+ "metadata": {
479
+ "colab": {
480
+ "base_uri": "https://localhost:8080/"
481
+ },
482
+ "id": "3ifVZy9c74St",
483
+ "outputId": "f8ab51c6-e5a3-4a16-875b-d07404041396"
484
+ },
485
+ "outputs": [
486
+ {
487
+ "name": "stdout",
488
+ "output_type": "stream",
489
+ "text": [
490
+ "Here's a summary of what's happening in the video:\n",
491
+ "\n",
492
+ "The video appears to be taken at a ski resort. The main subject is a person snowboarding down a snowy slope. \n",
493
+ "\n",
494
+ "**Initial Scene (0.0 - 1.03):** The snowboarder is initially positioned on the slope, seemingly having fallen or stopped. Other skiers and snowboarders are visible in the background, waiting at what looks like a lift station.\n",
495
+ "\n",
496
+ "**Mid-Video (1.03 - 6.26):** The snowboarder gets back up and continues down the slope. They navigate past other people, including skiers and snowboarders, and eventually reach a lift station. The video shows the snowboarder interacting with others at the lift, possibly waiting for the lift to start or having just gotten off. There are also other skiers and snowboarders around the lift station.\n",
497
+ "\n",
498
+ "**End Scene (6.26):** The snowboarder is still at the lift station,\n"
499
+ ]
500
+ }
501
+ ],
502
+ "source": [
503
+ "input_len = inputs[\"input_ids\"].shape[-1]\n",
504
+ "\n",
505
+ "generation = generation[0][input_len:]\n",
506
+ "\n",
507
+ "decoded = processor.decode(generation, skip_special_tokens=True)\n",
508
+ "print(decoded)"
509
+ ]
510
+ }
511
+ ],
512
+ "metadata": {
513
+ "accelerator": "GPU",
514
+ "colab": {
515
+ "gpuType": "A100",
516
+ "include_colab_link": true,
517
+ "machine_shape": "hm",
518
+ "provenance": []
519
+ },
520
+ "kernelspec": {
521
+ "display_name": "Python 3",
522
+ "name": "python3"
523
+ },
524
+ "language_info": {
525
+ "name": "python"
526
+ },
527
+ "widgets": {
528
+ "application/vnd.jupyter.widget-state+json": {
529
+ "01dc23faab3d42cda41fdfdd2a7dfed5": {
530
+ "model_module": "@jupyter-widgets/controls",
531
+ "model_module_version": "1.5.0",
532
+ "model_name": "HTMLModel",
533
+ "state": {
534
+ "_dom_classes": [],
535
+ "_model_module": "@jupyter-widgets/controls",
536
+ "_model_module_version": "1.5.0",
537
+ "_model_name": "HTMLModel",
538
+ "_view_count": null,
539
+ "_view_module": "@jupyter-widgets/controls",
540
+ "_view_module_version": "1.5.0",
541
+ "_view_name": "HTMLView",
542
+ "description": "",
543
+ "description_tooltip": null,
544
+ "layout": "IPY_MODEL_ed0fa93199b94fb486c125d4f322d59f",
545
+ "placeholder": "​",
546
+ "style": "IPY_MODEL_66f82e7ef3694c699e3d4a2bd826392b",
547
+ "value": "Loading checkpoint shards: 100%"
548
  }
549
  },
550
+ "29416122cc0b4a5592668ddced7686ba": {
551
  "model_module": "@jupyter-widgets/controls",
 
552
  "model_module_version": "1.5.0",
553
+ "model_name": "DescriptionStyleModel",
554
  "state": {
555
  "_model_module": "@jupyter-widgets/controls",
556
  "_model_module_version": "1.5.0",
557
+ "_model_name": "DescriptionStyleModel",
558
  "_view_count": null,
559
  "_view_module": "@jupyter-widgets/base",
560
  "_view_module_version": "1.2.0",
561
  "_view_name": "StyleView",
562
+ "description_width": ""
 
563
  }
564
  },
565
+ "2bfd51e3ae954008ae83704c24dbd6cb": {
566
  "model_module": "@jupyter-widgets/base",
 
567
  "model_module_version": "1.2.0",
568
+ "model_name": "LayoutModel",
569
  "state": {
570
  "_model_module": "@jupyter-widgets/base",
571
  "_model_module_version": "1.2.0",
 
614
  "width": null
615
  }
616
  },
617
+ "409f985be1134b468b81136fbdb54408": {
618
  "model_module": "@jupyter-widgets/controls",
 
619
  "model_module_version": "1.5.0",
620
+ "model_name": "HTMLModel",
621
  "state": {
622
+ "_dom_classes": [],
623
  "_model_module": "@jupyter-widgets/controls",
624
  "_model_module_version": "1.5.0",
625
+ "_model_name": "HTMLModel",
626
  "_view_count": null,
627
+ "_view_module": "@jupyter-widgets/controls",
628
+ "_view_module_version": "1.5.0",
629
+ "_view_name": "HTMLView",
630
+ "description": "",
631
+ "description_tooltip": null,
632
+ "layout": "IPY_MODEL_c72dd3d6a4c246cfa6590c314783c8f0",
633
+ "placeholder": "​",
634
+ "style": "IPY_MODEL_c0e471e664dd41eab98efe08301ef5e1",
635
+ "value": "<center> <img\nsrc=https://huggingface.co/front/assets/huggingface_logo-noborder.svg\nalt='Hugging Face'> <br> Copy a token from <a\nhref=\"https://huggingface.co/settings/tokens\" target=\"_blank\">your Hugging Face\ntokens page</a> and paste it below. <br> Immediately click login after copying\nyour token or it might be stored in plain text in this notebook file. </center>"
636
  }
637
  },
638
+ "40c381fd7bb04b43a879044a4e988cc6": {
639
  "model_module": "@jupyter-widgets/controls",
 
640
  "model_module_version": "1.5.0",
641
+ "model_name": "HTMLModel",
642
  "state": {
643
  "_dom_classes": [],
644
  "_model_module": "@jupyter-widgets/controls",
645
  "_model_module_version": "1.5.0",
646
+ "_model_name": "HTMLModel",
647
  "_view_count": null,
648
  "_view_module": "@jupyter-widgets/controls",
649
  "_view_module_version": "1.5.0",
650
+ "_view_name": "HTMLView",
651
  "description": "",
652
  "description_tooltip": null,
653
+ "layout": "IPY_MODEL_9b5d87960dde401baeaf8b6144fb8bad",
654
  "placeholder": "​",
655
+ "style": "IPY_MODEL_76e06881e5e94197a24944e07fdf3189",
656
+ "value": "\n<b>Pro Tip:</b> If you don't already have one, you can create a dedicated\n'notebooks' token with 'write' access, that you can then easily reuse for all\nnotebooks. </center>"
657
  }
658
  },
659
  "4488de26dce74cbbb39d99ae09bd21fa": {
660
  "model_module": "@jupyter-widgets/base",
 
661
  "model_module_version": "1.2.0",
662
+ "model_name": "LayoutModel",
663
  "state": {
664
  "_model_module": "@jupyter-widgets/base",
665
  "_model_module_version": "1.2.0",
 
708
  "width": null
709
  }
710
  },
711
+ "542490f74e974451bc44009a6fa174bd": {
712
  "model_module": "@jupyter-widgets/controls",
 
713
  "model_module_version": "1.5.0",
714
+ "model_name": "VBoxModel",
715
  "state": {
716
+ "_dom_classes": [],
717
  "_model_module": "@jupyter-widgets/controls",
718
  "_model_module_version": "1.5.0",
719
+ "_model_name": "VBoxModel",
720
  "_view_count": null,
721
+ "_view_module": "@jupyter-widgets/controls",
722
+ "_view_module_version": "1.5.0",
723
+ "_view_name": "VBoxView",
724
+ "box_style": "",
725
+ "children": [],
726
+ "layout": "IPY_MODEL_8d0e5abdd7c549f1a66ee198c9fa1430"
727
  }
728
  },
729
+ "57cb1e931c614980a4147cb125524d7d": {
730
  "model_module": "@jupyter-widgets/controls",
 
731
  "model_module_version": "1.5.0",
732
+ "model_name": "PasswordModel",
733
  "state": {
734
  "_dom_classes": [],
735
  "_model_module": "@jupyter-widgets/controls",
736
  "_model_module_version": "1.5.0",
737
+ "_model_name": "PasswordModel",
738
  "_view_count": null,
739
  "_view_module": "@jupyter-widgets/controls",
740
  "_view_module_version": "1.5.0",
741
+ "_view_name": "PasswordView",
742
+ "continuous_update": true,
743
+ "description": "Token:",
744
+ "description_tooltip": null,
745
+ "disabled": false,
746
+ "layout": "IPY_MODEL_868f63ea9455442d837dc2c422918800",
747
+ "placeholder": "​",
748
+ "style": "IPY_MODEL_5b7b4707b1bf4159a10bf7e289bde435",
749
+ "value": ""
750
  }
751
  },
752
+ "5b7b4707b1bf4159a10bf7e289bde435": {
753
  "model_module": "@jupyter-widgets/controls",
 
754
  "model_module_version": "1.5.0",
755
+ "model_name": "DescriptionStyleModel",
756
  "state": {
 
757
  "_model_module": "@jupyter-widgets/controls",
758
  "_model_module_version": "1.5.0",
759
+ "_model_name": "DescriptionStyleModel",
760
  "_view_count": null,
761
+ "_view_module": "@jupyter-widgets/base",
762
+ "_view_module_version": "1.2.0",
763
+ "_view_name": "StyleView",
764
+ "description_width": ""
765
  }
766
  },
767
+ "66f82e7ef3694c699e3d4a2bd826392b": {
768
  "model_module": "@jupyter-widgets/controls",
 
769
  "model_module_version": "1.5.0",
770
+ "model_name": "DescriptionStyleModel",
771
  "state": {
 
772
  "_model_module": "@jupyter-widgets/controls",
773
  "_model_module_version": "1.5.0",
774
+ "_model_name": "DescriptionStyleModel",
775
  "_view_count": null,
776
+ "_view_module": "@jupyter-widgets/base",
777
+ "_view_module_version": "1.2.0",
778
+ "_view_name": "StyleView",
779
+ "description_width": ""
780
  }
781
  },
782
+ "68fc757825dd44a48ab2383db20958db": {
783
  "model_module": "@jupyter-widgets/controls",
 
784
  "model_module_version": "1.5.0",
785
+ "model_name": "DescriptionStyleModel",
786
  "state": {
 
787
  "_model_module": "@jupyter-widgets/controls",
788
  "_model_module_version": "1.5.0",
789
+ "_model_name": "DescriptionStyleModel",
790
  "_view_count": null,
791
+ "_view_module": "@jupyter-widgets/base",
792
+ "_view_module_version": "1.2.0",
793
+ "_view_name": "StyleView",
794
+ "description_width": ""
795
  }
796
  },
797
+ "76e06881e5e94197a24944e07fdf3189": {
798
+ "model_module": "@jupyter-widgets/controls",
799
+ "model_module_version": "1.5.0",
800
+ "model_name": "DescriptionStyleModel",
801
  "state": {
802
+ "_model_module": "@jupyter-widgets/controls",
803
+ "_model_module_version": "1.5.0",
804
+ "_model_name": "DescriptionStyleModel",
805
  "_view_count": null,
806
  "_view_module": "@jupyter-widgets/base",
807
  "_view_module_version": "1.2.0",
808
+ "_view_name": "StyleView",
809
+ "description_width": ""
810
  }
811
  },
812
+ "770341dc116148a8b7571cce3a2f2baf": {
813
  "model_module": "@jupyter-widgets/base",
 
814
  "model_module_version": "1.2.0",
815
+ "model_name": "LayoutModel",
816
  "state": {
817
  "_model_module": "@jupyter-widgets/base",
818
  "_model_module_version": "1.2.0",
 
861
  "width": null
862
  }
863
  },
864
+ "777d7addfb144fd8896b77a1e0d54f25": {
865
  "model_module": "@jupyter-widgets/controls",
 
866
  "model_module_version": "1.5.0",
867
+ "model_name": "FloatProgressModel",
868
  "state": {
869
+ "_dom_classes": [],
870
  "_model_module": "@jupyter-widgets/controls",
871
  "_model_module_version": "1.5.0",
872
+ "_model_name": "FloatProgressModel",
873
+ "_view_count": null,
874
+ "_view_module": "@jupyter-widgets/controls",
875
+ "_view_module_version": "1.5.0",
876
+ "_view_name": "ProgressView",
877
+ "bar_style": "success",
878
+ "description": "",
879
+ "description_tooltip": null,
880
+ "layout": "IPY_MODEL_2bfd51e3ae954008ae83704c24dbd6cb",
881
+ "max": 4,
882
+ "min": 0,
883
+ "orientation": "horizontal",
884
+ "style": "IPY_MODEL_f8b84d8c06384680973ef6fe787b5a5d",
885
+ "value": 4
886
  }
887
  },
888
+ "868f63ea9455442d837dc2c422918800": {
889
  "model_module": "@jupyter-widgets/base",
 
890
  "model_module_version": "1.2.0",
891
+ "model_name": "LayoutModel",
892
  "state": {
893
  "_model_module": "@jupyter-widgets/base",
894
  "_model_module_version": "1.2.0",
 
937
  "width": null
938
  }
939
  },
940
+ "8704264bff4d46c9813ac9acf92da962": {
941
  "model_module": "@jupyter-widgets/controls",
 
942
  "model_module_version": "1.5.0",
943
+ "model_name": "ButtonStyleModel",
944
  "state": {
945
  "_model_module": "@jupyter-widgets/controls",
946
  "_model_module_version": "1.5.0",
947
+ "_model_name": "ButtonStyleModel",
948
  "_view_count": null,
949
  "_view_module": "@jupyter-widgets/base",
950
  "_view_module_version": "1.2.0",
951
  "_view_name": "StyleView",
952
+ "button_color": null,
953
+ "font_weight": ""
954
  }
955
  },
956
+ "87dc7aaf52e349a7bb43bb1b8bc137ee": {
957
+ "model_module": "@jupyter-widgets/controls",
958
+ "model_module_version": "1.5.0",
959
+ "model_name": "CheckboxModel",
960
+ "state": {
961
+ "_dom_classes": [],
962
+ "_model_module": "@jupyter-widgets/controls",
963
+ "_model_module_version": "1.5.0",
964
+ "_model_name": "CheckboxModel",
965
+ "_view_count": null,
966
+ "_view_module": "@jupyter-widgets/controls",
967
+ "_view_module_version": "1.5.0",
968
+ "_view_name": "CheckboxView",
969
+ "description": "Add token as git credential?",
970
+ "description_tooltip": null,
971
+ "disabled": false,
972
+ "indent": true,
973
+ "layout": "IPY_MODEL_889d0d1ed24e4de2b89896511d008e60",
974
+ "style": "IPY_MODEL_68fc757825dd44a48ab2383db20958db",
975
+ "value": true
976
+ }
977
+ },
978
+ "889d0d1ed24e4de2b89896511d008e60": {
979
  "model_module": "@jupyter-widgets/base",
 
980
  "model_module_version": "1.2.0",
981
+ "model_name": "LayoutModel",
982
  "state": {
983
  "_model_module": "@jupyter-widgets/base",
984
  "_model_module_version": "1.2.0",
 
1012
  "margin": null,
1013
  "max_height": null,
1014
  "max_width": null,
1015
+ "min_height": null,
1016
+ "min_width": null,
1017
+ "object_fit": null,
1018
+ "object_position": null,
1019
+ "order": null,
1020
+ "overflow": null,
1021
+ "overflow_x": null,
1022
+ "overflow_y": null,
1023
+ "padding": null,
1024
+ "right": null,
1025
+ "top": null,
1026
+ "visibility": null,
1027
+ "width": null
1028
+ }
1029
  },
1030
+ "8d0e5abdd7c549f1a66ee198c9fa1430": {
1031
+ "model_module": "@jupyter-widgets/base",
1032
+ "model_module_version": "1.2.0",
1033
+ "model_name": "LayoutModel",
1034
+ "state": {
1035
+ "_model_module": "@jupyter-widgets/base",
1036
+ "_model_module_version": "1.2.0",
1037
+ "_model_name": "LayoutModel",
1038
+ "_view_count": null,
1039
+ "_view_module": "@jupyter-widgets/base",
1040
+ "_view_module_version": "1.2.0",
1041
+ "_view_name": "LayoutView",
1042
+ "align_content": null,
1043
+ "align_items": "center",
1044
+ "align_self": null,
1045
+ "border": null,
1046
+ "bottom": null,
1047
+ "display": "flex",
1048
+ "flex": null,
1049
+ "flex_flow": "column",
1050
+ "grid_area": null,
1051
+ "grid_auto_columns": null,
1052
+ "grid_auto_flow": null,
1053
+ "grid_auto_rows": null,
1054
+ "grid_column": null,
1055
+ "grid_gap": null,
1056
+ "grid_row": null,
1057
+ "grid_template_areas": null,
1058
+ "grid_template_columns": null,
1059
+ "grid_template_rows": null,
1060
+ "height": null,
1061
+ "justify_content": null,
1062
+ "justify_items": null,
1063
+ "left": null,
1064
+ "margin": null,
1065
+ "max_height": null,
1066
+ "max_width": null,
1067
+ "min_height": null,
1068
+ "min_width": null,
1069
+ "object_fit": null,
1070
+ "object_position": null,
1071
+ "order": null,
1072
+ "overflow": null,
1073
+ "overflow_x": null,
1074
+ "overflow_y": null,
1075
+ "padding": null,
1076
+ "right": null,
1077
+ "top": null,
1078
+ "visibility": null,
1079
+ "width": "50%"
1080
+ }
1081
  },
1082
+ "983ed4cb4eea42daa9ae8c0417021a21": {
1083
+ "model_module": "@jupyter-widgets/controls",
1084
+ "model_module_version": "1.5.0",
1085
+ "model_name": "ButtonModel",
1086
+ "state": {
1087
+ "_dom_classes": [],
1088
+ "_model_module": "@jupyter-widgets/controls",
1089
+ "_model_module_version": "1.5.0",
1090
+ "_model_name": "ButtonModel",
1091
+ "_view_count": null,
1092
+ "_view_module": "@jupyter-widgets/controls",
1093
+ "_view_module_version": "1.5.0",
1094
+ "_view_name": "ButtonView",
1095
+ "button_style": "",
1096
+ "description": "Login",
1097
+ "disabled": false,
1098
+ "icon": "",
1099
+ "layout": "IPY_MODEL_cb76f933e6e640d9a688f7838e5fb0b3",
1100
+ "style": "IPY_MODEL_8704264bff4d46c9813ac9acf92da962",
1101
+ "tooltip": ""
1102
+ }
1103
+ },
1104
+ "9b5d87960dde401baeaf8b6144fb8bad": {
1105
+ "model_module": "@jupyter-widgets/base",
1106
+ "model_module_version": "1.2.0",
1107
+ "model_name": "LayoutModel",
1108
+ "state": {
1109
+ "_model_module": "@jupyter-widgets/base",
1110
+ "_model_module_version": "1.2.0",
1111
+ "_model_name": "LayoutModel",
1112
+ "_view_count": null,
1113
+ "_view_module": "@jupyter-widgets/base",
1114
+ "_view_module_version": "1.2.0",
1115
+ "_view_name": "LayoutView",
1116
+ "align_content": null,
1117
+ "align_items": null,
1118
+ "align_self": null,
1119
+ "border": null,
1120
+ "bottom": null,
1121
+ "display": null,
1122
+ "flex": null,
1123
+ "flex_flow": null,
1124
+ "grid_area": null,
1125
+ "grid_auto_columns": null,
1126
+ "grid_auto_flow": null,
1127
+ "grid_auto_rows": null,
1128
+ "grid_column": null,
1129
+ "grid_gap": null,
1130
+ "grid_row": null,
1131
+ "grid_template_areas": null,
1132
+ "grid_template_columns": null,
1133
+ "grid_template_rows": null,
1134
+ "height": null,
1135
+ "justify_content": null,
1136
+ "justify_items": null,
1137
+ "left": null,
1138
+ "margin": null,
1139
+ "max_height": null,
1140
+ "max_width": null,
1141
+ "min_height": null,
1142
+ "min_width": null,
1143
+ "object_fit": null,
1144
+ "object_position": null,
1145
+ "order": null,
1146
+ "overflow": null,
1147
+ "overflow_x": null,
1148
+ "overflow_y": null,
1149
+ "padding": null,
1150
+ "right": null,
1151
+ "top": null,
1152
+ "visibility": null,
1153
+ "width": null
1154
+ }
1155
  },
1156
+ "be523e956910487ca263d943a7a58395": {
1157
+ "model_module": "@jupyter-widgets/controls",
1158
+ "model_module_version": "1.5.0",
1159
+ "model_name": "HBoxModel",
1160
+ "state": {
1161
+ "_dom_classes": [],
1162
+ "_model_module": "@jupyter-widgets/controls",
1163
+ "_model_module_version": "1.5.0",
1164
+ "_model_name": "HBoxModel",
1165
+ "_view_count": null,
1166
+ "_view_module": "@jupyter-widgets/controls",
1167
+ "_view_module_version": "1.5.0",
1168
+ "_view_name": "HBoxView",
1169
+ "box_style": "",
1170
+ "children": [
1171
+ "IPY_MODEL_01dc23faab3d42cda41fdfdd2a7dfed5",
1172
+ "IPY_MODEL_777d7addfb144fd8896b77a1e0d54f25",
1173
+ "IPY_MODEL_c518268069244b21810e84380502c190"
1174
+ ],
1175
+ "layout": "IPY_MODEL_fee72c1c455549b59092028b855a082a"
1176
+ }
1177
  },
1178
+ "c0e471e664dd41eab98efe08301ef5e1": {
1179
+ "model_module": "@jupyter-widgets/controls",
1180
+ "model_module_version": "1.5.0",
1181
+ "model_name": "DescriptionStyleModel",
1182
+ "state": {
1183
+ "_model_module": "@jupyter-widgets/controls",
1184
+ "_model_module_version": "1.5.0",
1185
+ "_model_name": "DescriptionStyleModel",
1186
+ "_view_count": null,
1187
+ "_view_module": "@jupyter-widgets/base",
1188
+ "_view_module_version": "1.2.0",
1189
+ "_view_name": "StyleView",
1190
+ "description_width": ""
1191
+ }
1192
+ },
1193
+ "c518268069244b21810e84380502c190": {
1194
+ "model_module": "@jupyter-widgets/controls",
1195
+ "model_module_version": "1.5.0",
1196
+ "model_name": "HTMLModel",
1197
+ "state": {
1198
+ "_dom_classes": [],
1199
+ "_model_module": "@jupyter-widgets/controls",
1200
+ "_model_module_version": "1.5.0",
1201
+ "_model_name": "HTMLModel",
1202
+ "_view_count": null,
1203
+ "_view_module": "@jupyter-widgets/controls",
1204
+ "_view_module_version": "1.5.0",
1205
+ "_view_name": "HTMLView",
1206
+ "description": "",
1207
+ "description_tooltip": null,
1208
+ "layout": "IPY_MODEL_770341dc116148a8b7571cce3a2f2baf",
1209
+ "placeholder": "​",
1210
+ "style": "IPY_MODEL_29416122cc0b4a5592668ddced7686ba",
1211
+ "value": " 4/4 [00:00&lt;00:00,  5.03it/s]"
1212
+ }
1213
+ },
1214
+ "c72dd3d6a4c246cfa6590c314783c8f0": {
1215
+ "model_module": "@jupyter-widgets/base",
1216
+ "model_module_version": "1.2.0",
1217
+ "model_name": "LayoutModel",
1218
+ "state": {
1219
+ "_model_module": "@jupyter-widgets/base",
1220
+ "_model_module_version": "1.2.0",
1221
+ "_model_name": "LayoutModel",
1222
+ "_view_count": null,
1223
+ "_view_module": "@jupyter-widgets/base",
1224
+ "_view_module_version": "1.2.0",
1225
+ "_view_name": "LayoutView",
1226
+ "align_content": null,
1227
+ "align_items": null,
1228
+ "align_self": null,
1229
+ "border": null,
1230
+ "bottom": null,
1231
+ "display": null,
1232
+ "flex": null,
1233
+ "flex_flow": null,
1234
+ "grid_area": null,
1235
+ "grid_auto_columns": null,
1236
+ "grid_auto_flow": null,
1237
+ "grid_auto_rows": null,
1238
+ "grid_column": null,
1239
+ "grid_gap": null,
1240
+ "grid_row": null,
1241
+ "grid_template_areas": null,
1242
+ "grid_template_columns": null,
1243
+ "grid_template_rows": null,
1244
+ "height": null,
1245
+ "justify_content": null,
1246
+ "justify_items": null,
1247
+ "left": null,
1248
+ "margin": null,
1249
+ "max_height": null,
1250
+ "max_width": null,
1251
+ "min_height": null,
1252
+ "min_width": null,
1253
+ "object_fit": null,
1254
+ "object_position": null,
1255
+ "order": null,
1256
+ "overflow": null,
1257
+ "overflow_x": null,
1258
+ "overflow_y": null,
1259
+ "padding": null,
1260
+ "right": null,
1261
+ "top": null,
1262
+ "visibility": null,
1263
+ "width": null
1264
+ }
1265
+ },
1266
+ "cb76f933e6e640d9a688f7838e5fb0b3": {
1267
+ "model_module": "@jupyter-widgets/base",
1268
+ "model_module_version": "1.2.0",
1269
+ "model_name": "LayoutModel",
1270
+ "state": {
1271
+ "_model_module": "@jupyter-widgets/base",
1272
+ "_model_module_version": "1.2.0",
1273
+ "_model_name": "LayoutModel",
1274
+ "_view_count": null,
1275
+ "_view_module": "@jupyter-widgets/base",
1276
+ "_view_module_version": "1.2.0",
1277
+ "_view_name": "LayoutView",
1278
+ "align_content": null,
1279
+ "align_items": null,
1280
+ "align_self": null,
1281
+ "border": null,
1282
+ "bottom": null,
1283
+ "display": null,
1284
+ "flex": null,
1285
+ "flex_flow": null,
1286
+ "grid_area": null,
1287
+ "grid_auto_columns": null,
1288
+ "grid_auto_flow": null,
1289
+ "grid_auto_rows": null,
1290
+ "grid_column": null,
1291
+ "grid_gap": null,
1292
+ "grid_row": null,
1293
+ "grid_template_areas": null,
1294
+ "grid_template_columns": null,
1295
+ "grid_template_rows": null,
1296
+ "height": null,
1297
+ "justify_content": null,
1298
+ "justify_items": null,
1299
+ "left": null,
1300
+ "margin": null,
1301
+ "max_height": null,
1302
+ "max_width": null,
1303
+ "min_height": null,
1304
+ "min_width": null,
1305
+ "object_fit": null,
1306
+ "object_position": null,
1307
+ "order": null,
1308
+ "overflow": null,
1309
+ "overflow_x": null,
1310
+ "overflow_y": null,
1311
+ "padding": null,
1312
+ "right": null,
1313
+ "top": null,
1314
+ "visibility": null,
1315
+ "width": null
1316
+ }
1317
  },
1318
+ "ded62e6c032745ec88ca0ab694b0d397": {
1319
+ "model_module": "@jupyter-widgets/controls",
1320
+ "model_module_version": "1.5.0",
1321
+ "model_name": "DescriptionStyleModel",
1322
+ "state": {
1323
+ "_model_module": "@jupyter-widgets/controls",
1324
+ "_model_module_version": "1.5.0",
1325
+ "_model_name": "DescriptionStyleModel",
1326
+ "_view_count": null,
1327
+ "_view_module": "@jupyter-widgets/base",
1328
+ "_view_module_version": "1.2.0",
1329
+ "_view_name": "StyleView",
1330
+ "description_width": ""
1331
+ }
1332
  },
1333
+ "ed0fa93199b94fb486c125d4f322d59f": {
1334
+ "model_module": "@jupyter-widgets/base",
1335
+ "model_module_version": "1.2.0",
1336
+ "model_name": "LayoutModel",
1337
+ "state": {
1338
+ "_model_module": "@jupyter-widgets/base",
1339
+ "_model_module_version": "1.2.0",
1340
+ "_model_name": "LayoutModel",
1341
+ "_view_count": null,
1342
+ "_view_module": "@jupyter-widgets/base",
1343
+ "_view_module_version": "1.2.0",
1344
+ "_view_name": "LayoutView",
1345
+ "align_content": null,
1346
+ "align_items": null,
1347
+ "align_self": null,
1348
+ "border": null,
1349
+ "bottom": null,
1350
+ "display": null,
1351
+ "flex": null,
1352
+ "flex_flow": null,
1353
+ "grid_area": null,
1354
+ "grid_auto_columns": null,
1355
+ "grid_auto_flow": null,
1356
+ "grid_auto_rows": null,
1357
+ "grid_column": null,
1358
+ "grid_gap": null,
1359
+ "grid_row": null,
1360
+ "grid_template_areas": null,
1361
+ "grid_template_columns": null,
1362
+ "grid_template_rows": null,
1363
+ "height": null,
1364
+ "justify_content": null,
1365
+ "justify_items": null,
1366
+ "left": null,
1367
+ "margin": null,
1368
+ "max_height": null,
1369
+ "max_width": null,
1370
+ "min_height": null,
1371
+ "min_width": null,
1372
+ "object_fit": null,
1373
+ "object_position": null,
1374
+ "order": null,
1375
+ "overflow": null,
1376
+ "overflow_x": null,
1377
+ "overflow_y": null,
1378
+ "padding": null,
1379
+ "right": null,
1380
+ "top": null,
1381
+ "visibility": null,
1382
+ "width": null
1383
+ }
1384
  },
1385
+ "f40dd696acc64c6284c6f8f485f3ce9d": {
1386
+ "model_module": "@jupyter-widgets/controls",
1387
+ "model_module_version": "1.5.0",
1388
+ "model_name": "LabelModel",
1389
+ "state": {
1390
+ "_dom_classes": [],
1391
+ "_model_module": "@jupyter-widgets/controls",
1392
+ "_model_module_version": "1.5.0",
1393
+ "_model_name": "LabelModel",
1394
+ "_view_count": null,
1395
+ "_view_module": "@jupyter-widgets/controls",
1396
+ "_view_module_version": "1.5.0",
1397
+ "_view_name": "LabelView",
1398
+ "description": "",
1399
+ "description_tooltip": null,
1400
+ "layout": "IPY_MODEL_4488de26dce74cbbb39d99ae09bd21fa",
1401
+ "placeholder": "​",
1402
+ "style": "IPY_MODEL_ded62e6c032745ec88ca0ab694b0d397",
1403
+ "value": "Connecting..."
1404
+ }
1405
+ },
1406
+ "f8b84d8c06384680973ef6fe787b5a5d": {
1407
+ "model_module": "@jupyter-widgets/controls",
1408
+ "model_module_version": "1.5.0",
1409
+ "model_name": "ProgressStyleModel",
1410
+ "state": {
1411
+ "_model_module": "@jupyter-widgets/controls",
1412
+ "_model_module_version": "1.5.0",
1413
+ "_model_name": "ProgressStyleModel",
1414
+ "_view_count": null,
1415
+ "_view_module": "@jupyter-widgets/base",
1416
+ "_view_module_version": "1.2.0",
1417
+ "_view_name": "StyleView",
1418
+ "bar_color": null,
1419
+ "description_width": ""
1420
+ }
1421
+ },
1422
+ "fee72c1c455549b59092028b855a082a": {
1423
+ "model_module": "@jupyter-widgets/base",
1424
+ "model_module_version": "1.2.0",
1425
+ "model_name": "LayoutModel",
1426
+ "state": {
1427
+ "_model_module": "@jupyter-widgets/base",
1428
+ "_model_module_version": "1.2.0",
1429
+ "_model_name": "LayoutModel",
1430
+ "_view_count": null,
1431
+ "_view_module": "@jupyter-widgets/base",
1432
+ "_view_module_version": "1.2.0",
1433
+ "_view_name": "LayoutView",
1434
+ "align_content": null,
1435
+ "align_items": null,
1436
+ "align_self": null,
1437
+ "border": null,
1438
+ "bottom": null,
1439
+ "display": null,
1440
+ "flex": null,
1441
+ "flex_flow": null,
1442
+ "grid_area": null,
1443
+ "grid_auto_columns": null,
1444
+ "grid_auto_flow": null,
1445
+ "grid_auto_rows": null,
1446
+ "grid_column": null,
1447
+ "grid_gap": null,
1448
+ "grid_row": null,
1449
+ "grid_template_areas": null,
1450
+ "grid_template_columns": null,
1451
+ "grid_template_rows": null,
1452
+ "height": null,
1453
+ "justify_content": null,
1454
+ "justify_items": null,
1455
+ "left": null,
1456
+ "margin": null,
1457
+ "max_height": null,
1458
+ "max_width": null,
1459
+ "min_height": null,
1460
+ "min_width": null,
1461
+ "object_fit": null,
1462
+ "object_position": null,
1463
+ "order": null,
1464
+ "overflow": null,
1465
+ "overflow_x": null,
1466
+ "overflow_y": null,
1467
+ "padding": null,
1468
+ "right": null,
1469
+ "top": null,
1470
+ "visibility": null,
1471
+ "width": null
1472
+ }
1473
  }
1474
+ }
1475
  }
1476
+ },
1477
+ "nbformat": 4,
1478
+ "nbformat_minor": 0
1479
+ }
PaliGemma_DPO.ipynb CHANGED
The diff for this file is too large to render. See raw diff
 
Reduce_any_model_to_fp16_using_🤗_Optimum_DETR.ipynb CHANGED
The diff for this file is too large to render. See raw diff
 
ShieldGemma_2_for_Vision_LM_Safety.ipynb CHANGED
The diff for this file is too large to render. See raw diff