---
license: openrail
datasets:
  - deepghs/anime_real_cls
metrics:
  - f1
  - accuracy
pipeline_tag: image-classification
tags:
  - art
---
| Name | FLOPS | Params | Accuracy | AUC | Confusion | Labels |
|------|-------|--------|----------|-----|-----------|--------|
| caformer_s36_v1.3_fixed | 22.10G | 37.21M | 99.08% | 0.9989 | confusion | anime, real |
| caformer_s36_v1.3_fp32 | 22.10G | 37.21M | 99.08% | 0.9987 | confusion | anime, real |
| mobilenetv3_v1.3_dist | 0.63G | 4.18M | 98.29% | 0.9977 | confusion | anime, real |
| caformer_s36_v1.3 | 22.10G | 37.21M | 99.06% | 0.9987 | confusion | anime, real |
| caformer_s36_v1.3_ls0.1 | 22.10G | 37.21M | 99.10% | 0.9972 | confusion | anime, real |
| mobilenetv3_v1.2_dist | 0.63G | 4.18M | 98.63% | 0.9984 | confusion | anime, real |
| caformer_s36_v1.2 | 22.10G | 37.21M | 99.08% | 0.999 | confusion | anime, real |
| mobilenetv3_v1.1_dist_ls0.1 | 0.63G | 4.18M | 98.57% | 0.9969 | confusion | anime, real |
| caformer_s36_v1.1_ls0.1 | 22.10G | 37.21M | 99.03% | 0.9979 | confusion | anime, real |
| caformer_s36_v1.1 | 22.10G | 37.21M | 98.51% | 0.9971 | confusion | anime, real |
| caformer_s36_v1 | 22.10G | 37.21M | 98.90% | 0.9986 | confusion | anime, real |
| mobilenetv3_v1_dist_ls0.1 | 0.63G | 4.18M | 98.77% | 0.998 | confusion | anime, real |
| caformer_s36_v1_ls0.1 | 22.10G | 37.21M | 99.18% | 0.9981 | confusion | anime, real |
| mobilenetv3_v0_dist | 0.63G | 4.18M | 99.14% | 0.9986 | confusion | anime, real |
| caformer_s36_v0 | 22.10G | 37.21M | 99.34% | 0.9988 | confusion | anime, real |
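
Each model directory contains a `model.onnx` file and a `meta.json` file listing the class labels. As a minimal sketch of fetching one variant with `huggingface_hub` (the repository id below is a placeholder, not confirmed by this card; replace it with the actual model repository id):

```python
# Hedged sketch: the repo id is a placeholder and must be replaced with the real one.
import os

from huggingface_hub import snapshot_download

REPO_ID = "your-namespace/anime_real_cls"  # placeholder repo id
local_root = snapshot_download(
    repo_id=REPO_ID,
    allow_patterns=["caformer_s36_v1.3_fixed/*"],  # download only this variant
)
model_dir = os.path.join(local_root, "caformer_s36_v1.3_fixed")
```

The following script loads such a model directory with ONNX Runtime and classifies an image as `anime` or `real`: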
```python
import json

import numpy as np
from PIL import Image
from imgutils.data import load_image, rgb_encode
from onnxruntime import InferenceSession, SessionOptions, GraphOptimizationLevel


class Anime_Real_Cls:
    def __init__(self, model_dir):
        model_path = f'{model_dir}/model.onnx'
        self.model = self.load_local_onnx_model(model_path)
        # meta.json stores the class labels in the same order as the model outputs.
        with open(f'{model_dir}/meta.json', 'r') as f:
            self.labels = json.load(f)['labels']

    def _img_encode(self, image_path, size=(384, 384), normalize=(0.5, 0.5)):
        # Load the image, force RGB, resize to the model input resolution,
        # and encode it as a CHW float array.
        image = Image.open(image_path)
        image = load_image(image, mode='RGB')
        image = image.resize(size, Image.BILINEAR)
        data = rgb_encode(image, order_='CHW')
        if normalize:
            # Per-channel normalization; mean/std of 0.5 maps [0, 1] values to [-1, 1].
            mean_, std_ = normalize
            mean = np.asarray([mean_]).reshape((-1, 1, 1))
            std = np.asarray([std_]).reshape((-1, 1, 1))
            data = (data - mean) / std
        return data.astype(np.float32)

    def load_local_onnx_model(self, model_path: str) -> InferenceSession:
        options = SessionOptions()
        options.graph_optimization_level = GraphOptimizationLevel.ORT_ENABLE_ALL
        return InferenceSession(model_path, options)

    def __call__(self, image_path):
        # Add the batch dimension and run the ONNX session.
        input_ = self._img_encode(image_path, size=(384, 384))[None, ...]
        output, = self.model.run(['output'], {'input': input_})
        # Map each label to its score and return the highest-scoring label.
        values = dict(zip(self.labels, map(lambda x: x.item(), output[0])))
        print("values: ", values)
        max_key = max(values, key=values.get)
        return max_key


if __name__ == "__main__":
    classifier = Anime_Real_Cls(model_dir="./caformer_s36_v1.3_fixed")
    image_path = '1.webp'
    class_result = classifier(image_path)
    print("class_result: ", class_result)
```