{ "cells": [ { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "view-in-github" }, "source": [ "\"Open" ] }, { "cell_type": "markdown", "metadata": { "id": "eyuTPRZYl1lH" }, "source": [ "# Optimizing DETR with 🤗 Optimum" ] }, { "cell_type": "markdown", "metadata": { "id": "bl1QkPFlrUHT" }, "source": [ "In this notebook, we will use ONNX and 🤗 Optimum to convert DETR-ResNet-50 to fp16, which cuts the model size roughly in half. Let's start by installing the libraries." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "35-7D7BI146d" }, "outputs": [], "source": [ "!pip install -q optimum onnxruntime-gpu onnx tensorrt timm" ] }, { "cell_type": "markdown", "metadata": { "id": "pLL7H58pQXWJ" }, "source": [ "## Export the Model\n", "\n", "A single command is enough to export a Hugging Face Transformers model to ONNX with fp16 precision. It pulls the model from the Hugging Face Hub, exports it to ONNX, uses the GPU to convert the model weights to half precision, and saves the result." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "AtzOiwXjPasj", "outputId": "985abea9-a28c-482b-93f6-80391e1d1c18" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2024-02-14 15:38:04.863648: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered\n", "2024-02-14 15:38:04.863696: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered\n", "2024-02-14 15:38:04.864980: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered\n", "2024-02-14 15:38:06.090479: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT\n", "Framework not specified. Using pt to export to ONNX.\n", "Some weights of the model checkpoint at facebook/detr-resnet-50 were not used when initializing DetrForObjectDetection: ['model.backbone.conv_encoder.model.layer1.0.downsample.1.num_batches_tracked', 'model.backbone.conv_encoder.model.layer3.0.downsample.1.num_batches_tracked', 'model.backbone.conv_encoder.model.layer2.0.downsample.1.num_batches_tracked', 'model.backbone.conv_encoder.model.layer4.0.downsample.1.num_batches_tracked']\n", "- This IS expected if you are initializing DetrForObjectDetection from the checkpoint of a model trained on another task or with another architecture (e.g. 
initializing a BertForSequenceClassification model from a BertForPreTraining model).\n", "- This IS NOT expected if you are initializing DetrForObjectDetection from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).\n", "Automatic task detection to object-detection.\n", "Could not find image processor class in the image processor config or the model config. Loading based on pattern matching with the model's feature extractor configuration.\n", "The `max_size` parameter is deprecated and will be removed in v4.26. Please specify in `size['longest_edge'] instead`.\n", "/usr/local/lib/python3.10/dist-packages/transformers/models/detr/feature_extraction_detr.py:38: FutureWarning: The class DetrFeatureExtractor is deprecated and will be removed in version 5 of Transformers. Please use DetrImageProcessor instead.\n", " warnings.warn(\n", "Using the export variant default. Available variants are:\n", " - default: The default ONNX variant.\n", "Could not find image processor class in the image processor config or the model config. Loading based on pattern matching with the model's feature extractor configuration.\n", "Using framework PyTorch: 2.1.0+cu121\n", "/usr/local/lib/python3.10/dist-packages/transformers/models/detr/modeling_detr.py:614: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!\n", " if attn_weights.size() != (batch_size * self.num_heads, target_len, source_len):\n", "/usr/local/lib/python3.10/dist-packages/transformers/models/detr/modeling_detr.py:621: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. 
We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!\n", " if attention_mask.size() != (batch_size, 1, target_len, source_len):\n", "/usr/local/lib/python3.10/dist-packages/transformers/models/detr/modeling_detr.py:645: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!\n", " if attn_output.size() != (batch_size * self.num_heads, target_len, self.head_dim):\n", "\u001b[1;31m2024-02-14 15:38:31.444508840 [E:onnxruntime:Default, provider_bridge_ort.cc:1546 TryGetProviderInfo_CUDA] /onnxruntime_src/onnxruntime/core/session/provider_bridge_ort.cc:1209 onnxruntime::Provider& onnxruntime::ProviderLibrary::Get() [ONNXRuntimeError] : 1 : FAIL : Failed to load library libonnxruntime_providers_cuda.so with error: libcublasLt.so.11: cannot open shared object file: No such file or directory\n", "\u001b[m\n", "\u001b[0;93m2024-02-14 15:38:31.444553861 [W:onnxruntime:Default, onnxruntime_pybind_state.cc:861 CreateExecutionProviderInstance] Failed to create CUDAExecutionProvider. Please reference https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html#requirementsto ensure all dependencies are met.\u001b[m\n", "Post-processing the exported models...\n", "Weight deduplication check in the ONNX export requires accelerate. 
Please install accelerate to run it.\n", "Validating models in subprocesses...\n", "2024-02-14 15:38:35.328317: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered\n", "2024-02-14 15:38:35.328373: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered\n", "2024-02-14 15:38:35.329636: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered\n", "2024-02-14 15:38:36.545817: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT\n", "Validating ONNX model detr_onnx_fp16/model.onnx...\n", "\u001b[1;31m2024-02-14 15:38:39.081092751 [E:onnxruntime:Default, provider_bridge_ort.cc:1546 TryGetProviderInfo_CUDA] /onnxruntime_src/onnxruntime/core/session/provider_bridge_ort.cc:1209 onnxruntime::Provider& onnxruntime::ProviderLibrary::Get() [ONNXRuntimeError] : 1 : FAIL : Failed to load library libonnxruntime_providers_cuda.so with error: libcublasLt.so.11: cannot open shared object file: No such file or directory\n", "\u001b[m\n", "\u001b[0;93m2024-02-14 15:38:39.081123400 [W:onnxruntime:Default, onnxruntime_pybind_state.cc:861 CreateExecutionProviderInstance] Failed to create CUDAExecutionProvider. 
Please reference https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html#requirementsto ensure all dependencies are met.\u001b[m\n", "\t-[✓] ONNX model output names match reference model (pred_boxes, logits)\n", "\t- Validating ONNX Model output \"logits\":\n", "\t\t-[✓] (2, 100, 92) matches (2, 100, 92)\n", "\t\t-[x] values not close enough, max diff: 0.09375 (atol: 1e-05)\n", "\t- Validating ONNX Model output \"pred_boxes\":\n", "\t\t-[✓] (2, 100, 4) matches (2, 100, 4)\n", "\t\t-[x] values not close enough, max diff: 0.01171875 (atol: 1e-05)\n", "The ONNX export succeeded with the warning: The maximum absolute difference between the output of the reference model and the ONNX exported model is not within the set tolerance 1e-05:\n", "- logits: max diff = 0.09375\n", "- pred_boxes: max diff = 0.01171875.\n", " The exported model was saved at: detr_onnx_fp16\n" ] } ], "source": [ "!optimum-cli export onnx --model facebook/detr-resnet-50 --device cuda --fp16 detr_onnx_fp16/" ] }, { "cell_type": "markdown", "metadata": { "id": "r4tTUyNQsCIg" }, "source": [ "We can push the model to 🤗 Hub." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 53 }, "id": "CVsRVQiO5yOZ", "outputId": "81cb6db1-0620-419f-f8ab-9d9e7ad0fd47" }, "outputs": [ { "data": { "application/vnd.google.colaboratory.intrinsic+json": { "type": "string" }, "text/plain": [ "RepoUrl('https://huggingface.co/merve/detr-resnet-50-fp16', endpoint='https://huggingface.co', repo_type='model', repo_id='merve/detr-resnet-50-fp16')" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "hf_api.create_repo(\"merve/detr-resnet-50-fp16\", repo_type=\"model\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 85, "referenced_widgets": [ "d64df65bfee247a8b9f92681b3c1640f", "932ba309d8da470692c2358081e0a7bc", "9de9adad9a5842509a4315ec9a067e1f", "b38cdace36ca4a16897f9bbb522f8507", "bb0762b720964ea09c04054aaf77c876", "93b4c9def21f490392dbcb24c861b3bf", "89aff979937e483da2e5a8aa0198a618", "ddc713c2cb05473d87346872febf1da5", "fc593b2201004439bcb9954e1fb3aa03", "51d750229af74ff59fa74487ab9d1266", "ee3fc74003f04d39b4d3894c7ad69441" ] }, "id": "mtdtJJiN55rX", "outputId": "94d5dd73-c74b-4ab7-df86-a0618e441dcb" }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "d64df65bfee247a8b9f92681b3c1640f", "version_major": 2, "version_minor": 0 }, "text/plain": [ "model.onnx: 0%| | 0.00/83.7M [00:00" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "img" ] } ], "metadata": { "accelerator": "GPU", "colab": { "gpuType": "T4", "include_colab_link": true, "provenance": [] }, "kernelspec": { "display_name": "Python 3", "name": "python3" }, "language_info": { "name": "python" }, "widgets": { "application/vnd.jupyter.widget-state+json": { "51d750229af74ff59fa74487ab9d1266": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": 
"LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "89aff979937e483da2e5a8aa0198a618": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DescriptionStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "932ba309d8da470692c2358081e0a7bc": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HTMLModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_93b4c9def21f490392dbcb24c861b3bf", "placeholder": "​", "style": 
"IPY_MODEL_89aff979937e483da2e5a8aa0198a618", "value": "model.onnx: 100%" } }, "93b4c9def21f490392dbcb24c861b3bf": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "9de9adad9a5842509a4315ec9a067e1f": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "FloatProgressModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "FloatProgressModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "ProgressView", "bar_style": "success", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_ddc713c2cb05473d87346872febf1da5", "max": 83664513, "min": 0, "orientation": "horizontal", "style": "IPY_MODEL_fc593b2201004439bcb9954e1fb3aa03", "value": 83664513 } }, "b38cdace36ca4a16897f9bbb522f8507": { "model_module": "@jupyter-widgets/controls", 
"model_module_version": "1.5.0", "model_name": "HTMLModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_51d750229af74ff59fa74487ab9d1266", "placeholder": "​", "style": "IPY_MODEL_ee3fc74003f04d39b4d3894c7ad69441", "value": " 83.7M/83.7M [00:11<00:00, 5.76MB/s]" } }, "bb0762b720964ea09c04054aaf77c876": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "d64df65bfee247a8b9f92681b3c1640f": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HBoxModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HBoxModel", "_view_count": null, 
"_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HBoxView", "box_style": "", "children": [ "IPY_MODEL_932ba309d8da470692c2358081e0a7bc", "IPY_MODEL_9de9adad9a5842509a4315ec9a067e1f", "IPY_MODEL_b38cdace36ca4a16897f9bbb522f8507" ], "layout": "IPY_MODEL_bb0762b720964ea09c04054aaf77c876" } }, "ddc713c2cb05473d87346872febf1da5": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "ee3fc74003f04d39b4d3894c7ad69441": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DescriptionStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "fc593b2201004439bcb9954e1fb3aa03": { "model_module": "@jupyter-widgets/controls", 
"model_module_version": "1.5.0", "model_name": "ProgressStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "ProgressStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "bar_color": null, "description_width": "" } } } } }, "nbformat": 4, "nbformat_minor": 0 }