	---
	license: other
	license_name: umamusume-derivativework-guidelines
	license_link: https://umamusume.jp/derivativework_guidelines/
	---

	This is the model repository for ULTIMA-YOLOv9, containing the following checkpoints:
	- YOLOv9-E

	# About ULTIMA-YOLO models

	This is part of the [ULTIMA](https://huggingface.co/datasets/UmaDiffusion/ULTIMA) project.

	The ULTIMA-YOLOv9 model is a face detection model for Uma Musume characters in illustrations. It is based on [yolov9-e](https://arxiv.org/abs/2402.13616) and the [ULTIMA-YOLO dataset](https://huggingface.co/datasets/UmaDiffusion/ULTIMA-YOLO).

	The [ULTIMA Dataset](https://huggingface.co/datasets/UmaDiffusion/ULTIMA) is the Uma Musume Labeled Text-Image Multimodal Alignment Dataset.


	### How to Use

	Clone the YOLOv9 repository.

	```
	git clone https://github.com/WongKinYiu/yolov9.git
	cd yolov9
	```

	Download the weights with `hf_hub_download`, then load them with the helper functions from the YOLOv9 repository.

	```python
	from huggingface_hub import hf_hub_download
	hf_hub_download("UmaDiffusion/ULTIMA-YOLOv9", filename="ultima_yolov9-e.pt", local_dir="./")
	```

	Load the model and run inference.

	```python
	# make sure you have the following dependencies
	import torch
	import numpy as np
	from models.common import DetectMultiBackend
	from utils.general import non_max_suppression, scale_boxes
	from utils.torch_utils import select_device, smart_inference_mode
	from utils.augmentations import letterbox
	import PIL.Image

	@smart_inference_mode()
	def predict(image_path, weights='ultima_yolov9-e.pt', imgsz=640, conf_thres=0.1, iou_thres=0.45):
	    # Initialize ('0' selects the first CUDA GPU)
	    device = select_device('0')
	    model = DetectMultiBackend(weights=weights, device=device, fp16=False, data='data/coco.yaml')
	    stride, names, pt = model.stride, model.names, model.pt

	    # Load image as RGB (PIL already returns RGB, so no channel flip is needed)
	    image = np.array(PIL.Image.open(image_path).convert('RGB'))
	    img = letterbox(image, imgsz, stride=stride, auto=True)[0]
	    img = img.transpose(2, 0, 1)  # HWC to CHW
	    img = np.ascontiguousarray(img)
	    img = torch.from_numpy(img).to(device).float()
	    img /= 255.0  # scale 0-255 to 0.0-1.0
	    if img.ndimension() == 3:
	        img = img.unsqueeze(0)  # add batch dimension

	    # Inference
	    pred = model(img, augment=False, visualize=False)

	    # Apply NMS
	    pred = non_max_suppression(pred[0][0], conf_thres, iou_thres, classes=None, max_det=1000)

	    # Rescale boxes from the letterboxed size back to the original image size
	    for det in pred:
	        if len(det):
	            det[:, :4] = scale_boxes(img.shape[2:], det[:, :4], image.shape).round()
	    return pred
	```
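
In the YOLOv9 reference code, the NMS step yields one tensor per image, and each row has the form `(x1, y1, x2, y2, confidence, class)`. The sketch below shows one possible way to turn such rows into face boxes that are safe to crop with. `rows_to_boxes` and its `min_conf` parameter are hypothetical names introduced here for illustration; they are not part of YOLOv9 or of this repository. It assumes the rows were converted to plain lists (for example with `pred[0].tolist()`) and that the coordinates have already been mapped back to the original image size (for example with `scale_boxes`).

```python
def rows_to_boxes(rows, width, height, min_conf=0.25):
    """Convert NMS rows (x1, y1, x2, y2, conf, cls) into clamped integer boxes.

    Hypothetical helper for illustration; not part of the YOLOv9 repository.
    """
    boxes = []
    for x1, y1, x2, y2, conf, cls in rows:
        if conf < min_conf:
            continue  # drop low-confidence detections
        # Clamp to the image bounds so the box can be used directly for cropping
        left = max(0, min(width, int(round(x1))))
        top = max(0, min(height, int(round(y1))))
        right = max(0, min(width, int(round(x2))))
        bottom = max(0, min(height, int(round(y2))))
        # Skip boxes that collapse to zero area after clamping
        if right > left and bottom > top:
            boxes.append({"box": (left, top, right, bottom),
                          "conf": float(conf),
                          "cls": int(cls)})
    return boxes


# Made-up rows standing in for one image's detections
rows = [
    [10.4, 20.6, 110.2, 140.0, 0.91, 0.0],
    [-5.0, 0.0, 30.0, 700.0, 0.30, 0.0],
    [1.0, 1.0, 5.0, 5.0, 0.05, 0.0],
]
faces = rows_to_boxes(rows, width=640, height=480)
```

Each `box` tuple follows the `(left, upper, right, lower)` order that `PIL.Image.Image.crop` expects, so a detected face can be cut out with `image.crop(face["box"])`.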