|
--- |
|
license: apache-2.0 |
|
language: |
|
- fr |
|
- en |
|
pipeline_tag: zero-shot-object-detection |
|
library_name: transformers |
|
base_model: |
|
- omlab/omdet-turbo-swin-tiny-hf |
|
tags: |
|
- endpoints-template |
|
--- |
|
|
|
# Fork of [omlab/omdet-turbo-swin-tiny-hf](https://huggingface.co/omlab/omdet-turbo-swin-tiny-hf) for a `zero-shot-object-detection` Inference Endpoint.
|
|
|
This repository implements a `custom` task for `zero-shot-object-detection` for 🤗 Inference Endpoints. The code for the customized handler is in the [handler.py](https://huggingface.co/Blueway/inference-endpoint-for-omdet-turbo-swin-tiny-hf/blob/main/handler.py). |
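For context, custom handlers for Inference Endpoints follow the `EndpointHandler` convention: a class with an `__init__` method that loads the model and a `__call__` method that runs inference. The sketch below illustrates roughly what such a handler can look like for this model. It is an illustration only, not a copy of the actual `handler.py`, and the arguments of `post_process_grounded_object_detection` have changed across `transformers` releases, so adjust them to the version you pin.

``` python
import base64
from io import BytesIO
from typing import Any, Dict

import torch
from PIL import Image
from transformers import AutoProcessor, OmDetTurboForObjectDetection


class EndpointHandler:
    def __init__(self, path: str = ""):
        # `path` is the local checkout of this repository inside the endpoint
        self.processor = AutoProcessor.from_pretrained(path)
        self.model = OmDetTurboForObjectDetection.from_pretrained(path)
        self.model.eval()

    def __call__(self, data: Dict[str, Any]) -> Dict[str, Any]:
        inputs = data["inputs"]
        # decode the base64 payload into a PIL image
        image = Image.open(BytesIO(base64.b64decode(inputs["image"]))).convert("RGB")
        candidates = inputs["candidates"]

        encoding = self.processor(image, text=candidates, return_tensors="pt")
        with torch.no_grad():
            outputs = self.model(**encoding)

        # Post-processing maps raw outputs back to pixel coordinates;
        # argument names may differ depending on the transformers version.
        results = self.processor.post_process_grounded_object_detection(
            outputs,
            classes=candidates,
            target_sizes=[image.size[::-1]],
            score_threshold=0.3,
            nms_threshold=0.3,
        )[0]

        return {
            "boxes": results["boxes"].tolist(),
            "scores": results["scores"].tolist(),
            "candidates": results["classes"],
        }
```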
|
|
|
To deploy this model as an Inference Endpoint, you have to select `Custom` as the task so that the `handler.py` file is used.
|
|
|
The repository contains a `requirements.txt` that installs the `timm` library.
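In its simplest form the file only needs to list that extra dependency; check the repository's `requirements.txt` for the exact contents and any pinned version:

```
timm
```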
|
|
|
### Expected request payload
|
|
|
```json |
|
{
  "inputs": {
    "image": "/9j/4AAQSkZJRgABAQEBLAEsAAD/2wBDAAMCAgICAgMC....", // base64-encoded image bytes
    "candidates": ["broken curb", "broken road", "broken road sign", "broken sidewalk"]
  }
}
|
``` |
|
|
|
Below is an example of how to run a request using Python and `requests`.
|
|
|
## Run Request |
|
|
|
``` python |
|
import json |
|
from typing import List |
|
import requests as r |
|
import base64 |
|
|
|
ENDPOINT_URL = ""  # URL of your deployed Inference Endpoint

HF_TOKEN = ""  # Hugging Face access token with access to the endpoint
|
|
|
def predict(path_to_image: str = None, candidates: List[str] = None): |
|
with open(path_to_image, "rb") as i: |
|
b64 = base64.b64encode(i.read()) |
|
|
|
payload = {"inputs": {"image": b64.decode("utf-8"), "candidates": candidates}} |
|
response = r.post( |
|
ENDPOINT_URL, headers={"Authorization": f"Bearer {HF_TOKEN}"}, json=payload |
|
) |
|
return response.json() |
|
|
|
|
|
prediction = predict( |
|
path_to_image="image/brokencurb.jpg", candidates=["broken curb", "broken road", "broken road sign", "broken sidewalk"] |
|
) |
|
print(json.dumps(prediction, indent=2)) |
|
``` |
|
Expected output:
|
|
|
```json
|
{ |
|
"boxes": [ |
|
[ |
|
1.919342041015625, |
|
231.1556396484375, |
|
1011.4019775390625, |
|
680.3773193359375 |
|
], |
|
[ |
|
610.9949951171875, |
|
397.6180419921875, |
|
1019.9259033203125, |
|
510.8144226074219 |
|
], |
|
[ |
|
1.919342041015625, |
|
231.1556396484375, |
|
1011.4019775390625, |
|
680.3773193359375 |
|
], |
|
[ |
|
786.1240234375, |
|
68.618896484375, |
|
916.1265869140625, |
|
225.0513458251953 |
|
] |
|
], |
|
"scores": [ |
|
0.4329715967178345, |
|
0.4215811491012573, |
|
0.3389397859573364, |
|
0.3133399784564972 |
|
], |
|
"candidates": [ |
|
"broken sidewalk", |
|
"broken road sign", |
|
"broken road", |
|
"broken road sign" |
|
] |
|
} |
|
``` |
|
Each box is structured as `[x_min, y_min, x_max, y_max]`, in pixel coordinates.
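If you only want high-confidence detections, you can filter the response by score. Here is a small sketch using the `prediction` dictionary returned by the `predict` helper above (the 0.4 threshold is an arbitrary illustration):

``` python
MIN_SCORE = 0.4  # illustrative cut-off, tune for your use case

for box, score, candidate in zip(
    prediction["boxes"], prediction["scores"], prediction["candidates"]
):
    if score >= MIN_SCORE:
        print(f"{candidate}: {score:.2f} at {box}")
```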
|
|
|
## Visualize the result
|
|
|
<figure> |
|
<img src="https://cdn-uploads.huggingface.co/production/uploads/661e3161112a872ebdee8bbc/KFT09GSYWn2gEllATejSZ.png" alt="image/png"> |
|
<figcaption>input image</figcaption> |
|
</figure> |
|
|
|
To visualize the result of the request, you can use the following code:
|
|
|
``` python |
|
prediction = predict( |
|
path_to_image="image/cat_and_remote.jpg", candidates=["cat", "remote", "pot hole"] |
|
) |
|
|
|
import matplotlib.pyplot as plt |
|
import matplotlib.patches as patches |
|
|
|
with open("image/cat_and_remote.jpg", "rb") as i: |
|
image = plt.imread(i) |
|
|
|
# Plot image |
|
fig, ax = plt.subplots(1) |
|
ax.imshow(image) |
|
for score, class_name, box in zip( |
|
prediction["scores"], prediction["candidates"], prediction["boxes"] |
|
): |
|
# Create a Rectangle patch |
|
rect = patches.Rectangle([int(box[0]), int(box[1])], int(box[2] - box[0]), int(box[3] - box[1]), linewidth=1, edgecolor='r', facecolor='none') |
|
# Add the patch to the Axes |
|
ax.add_patch(rect) |
|
|
|
    # Label the box with its matched candidate and score
    ax.text(int(box[0]), int(box[1]), str(round(score, 2)) + " " + str(class_name), color='white', fontsize=6, bbox=dict(facecolor='red', alpha=0.5))
|
|
|
plt.savefig('image_result/cat_and_remote_with_bboxes_zero_shot.jpeg') |
|
``` |
|
|
|
**Result**
|
|
|
<figure> |
|
<img src="https://cdn-uploads.huggingface.co/production/uploads/661e3161112a872ebdee8bbc/8xPoidjVyRQBs990hR4sq.png" alt="image/png"> |
|
<figcaption>output image</figcaption> |
|
</figure> |
|
|
|
|
|
## Credits |
|
|
|
This adaptation for Hugging Face Inference Endpoints was inspired by [@philschmid](https://huggingface.co/philschmid)'s work on [philschmid/clip-zero-shot-image-classification](https://huggingface.co/philschmid/clip-zero-shot-image-classification).
|
|