---
license: apache-2.0
language:
- fr
- en
pipeline_tag: zero-shot-object-detection
library_name: transformers
base_model:
- omlab/omdet-turbo-swin-tiny-hf
tags:
- endpoints-template
---
# Fork of [omlab/omdet-turbo-swin-tiny-hf](https://huggingface.co/omlab/omdet-turbo-swin-tiny-hf) for a `zero-shot-object-detection` Inference endpoint.
This repository implements a `custom` task for `zero-shot-object-detection` for 🤗 Inference Endpoints. The code for the custom handler is in [handler.py](https://huggingface.co/Blueway/inference-endpoint-for-omdet-turbo-swin-tiny-hf/blob/main/handler.py).
To deploy this model as an Inference Endpoint, you have to select `Custom` as the task so that the `handler.py` file is used.
The repository contains a `requirements.txt` that installs the `timm` library.
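For orientation, below is a minimal sketch of the structure 🤗 Inference Endpoints expects from a custom handler: an `EndpointHandler` class exposing `__init__` and `__call__`. The authoritative implementation is the linked `handler.py`; the model loading and post-processing shown here are assumptions based on the usage example of the base model, and the post-processing argument names vary between `transformers` versions.
``` python
# A minimal sketch of a custom handler, NOT the actual handler.py of this repo.
# Inference Endpoints instantiates EndpointHandler(path) once, then calls it per request.
import base64
from io import BytesIO
from typing import Any, Dict

import torch
from PIL import Image
from transformers import AutoProcessor, OmDetTurboForObjectDetection


class EndpointHandler:
    def __init__(self, path: str = ""):
        # `path` points to the repository checkout inside the endpoint container
        self.processor = AutoProcessor.from_pretrained(path)
        self.model = OmDetTurboForObjectDetection.from_pretrained(path)

    def __call__(self, data: Dict[str, Any]) -> Dict[str, Any]:
        inputs = data["inputs"]
        # Decode the base64-encoded image from the request payload
        image = Image.open(BytesIO(base64.b64decode(inputs["image"]))).convert("RGB")
        candidates = inputs["candidates"]

        encoding = self.processor(image, text=candidates, return_tensors="pt")
        with torch.no_grad():
            outputs = self.model(**encoding)

        # Argument names differ across transformers versions
        # (e.g. classes=/score_threshold= vs. text_labels=/threshold=)
        results = self.processor.post_process_grounded_object_detection(
            outputs,
            classes=candidates,
            target_sizes=[image.size[::-1]],
            score_threshold=0.3,
            nms_threshold=0.3,
        )[0]

        return {
            "boxes": results["boxes"].tolist(),
            "scores": results["scores"].tolist(),
            "candidates": results["classes"],
        }
```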
### Expected request payload
```json
{
  "inputs": {
    "image": "/9j/4AAQSkZJRgABAQEBLAEsAAD/2wBDAAMCAgICAgMC....", // base64-encoded image as a UTF-8 string
    "candidates": ["broken curb", "broken road", "broken road sign", "broken sidewalk"]
  }
}
```
Below is an example of how to run a request using Python and `requests`.
## Run Request
``` python
import base64
import json
from typing import List

import requests as r

ENDPOINT_URL = ""  # URL of the deployed Inference Endpoint
HF_TOKEN = ""  # Hugging Face access token with permission to call the endpoint


def predict(path_to_image: str = None, candidates: List[str] = None):
    # Read the image and encode it as a base64 string
    with open(path_to_image, "rb") as i:
        b64 = base64.b64encode(i.read())
    payload = {"inputs": {"image": b64.decode("utf-8"), "candidates": candidates}}
    response = r.post(
        ENDPOINT_URL, headers={"Authorization": f"Bearer {HF_TOKEN}"}, json=payload
    )
    return response.json()


prediction = predict(
    path_to_image="image/brokencurb.jpg",
    candidates=["broken curb", "broken road", "broken road sign", "broken sidewalk"],
)
print(json.dumps(prediction, indent=2))
```
Expected output:
``` json
{
  "boxes": [
    [1.919342041015625, 231.1556396484375, 1011.4019775390625, 680.3773193359375],
    [610.9949951171875, 397.6180419921875, 1019.9259033203125, 510.8144226074219],
    [1.919342041015625, 231.1556396484375, 1011.4019775390625, 680.3773193359375],
    [786.1240234375, 68.618896484375, 916.1265869140625, 225.0513458251953]
  ],
  "scores": [
    0.4329715967178345,
    0.4215811491012573,
    0.3389397859573364,
    0.3133399784564972
  ],
  "candidates": [
    "broken sidewalk",
    "broken road sign",
    "broken road",
    "broken road sign"
  ]
}
```
The boxes are structured as `[x_min, y_min, x_max, y_max]` in pixel coordinates of the input image.
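Note that the same candidate can appear several times in the output when several boxes match it. If you only want confident detections, you can filter the response before plotting; here is a minimal sketch (the `0.4` cutoff is an arbitrary choice, not part of the endpoint):
``` python
# Keep only detections whose score exceeds an arbitrary cutoff
THRESHOLD = 0.4

filtered = [
    (score, box, label)
    for score, box, label in zip(
        prediction["scores"], prediction["boxes"], prediction["candidates"]
    )
    if score >= THRESHOLD
]

for score, box, label in filtered:
    width, height = box[2] - box[0], box[3] - box[1]
    print(f"{label}: score={score:.2f}, size={width:.0f}x{height:.0f}px")
```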
## Visualize result
<figure>
<img src="https://cdn-uploads.huggingface.co/production/uploads/661e3161112a872ebdee8bbc/KFT09GSYWn2gEllATejSZ.png" alt="image/png">
<figcaption>input image</figcaption>
</figure>
To visualize the result of the request, you can use the following code:
``` python
import matplotlib.patches as patches
import matplotlib.pyplot as plt

prediction = predict(
    path_to_image="image/cat_and_remote.jpg", candidates=["cat", "remote", "pot hole"]
)

with open("image/cat_and_remote.jpg", "rb") as i:
    image = plt.imread(i)

# Plot image
fig, ax = plt.subplots(1)
ax.imshow(image)

for score, class_name, box in zip(
    prediction["scores"], prediction["candidates"], prediction["boxes"]
):
    # Create a Rectangle patch for each detected box
    rect = patches.Rectangle(
        (int(box[0]), int(box[1])),
        int(box[2] - box[0]),
        int(box[3] - box[1]),
        linewidth=1,
        edgecolor="r",
        facecolor="none",
    )
    # Add the patch to the Axes, with the score and label as text
    ax.add_patch(rect)
    ax.text(
        int(box[0]),
        int(box[1]),
        f"{round(score, 2)} {class_name}",
        color="white",
        fontsize=6,
        bbox=dict(facecolor="red", alpha=0.5),
    )

plt.savefig("image_result/cat_and_remote_with_bboxes_zero_shot.jpeg")
```
**Result**
<figure>
<img src="https://cdn-uploads.huggingface.co/production/uploads/661e3161112a872ebdee8bbc/8xPoidjVyRQBs990hR4sq.png" alt="image/png">
<figcaption>output image</figcaption>
</figure>
## Credits
This adaptation for Hugging Face Inference Endpoints was inspired by [@philschmid](https://huggingface.co/philschmid)'s work on [philschmid/clip-zero-shot-image-classification](https://huggingface.co/philschmid/clip-zero-shot-image-classification).