This SDK enables efficient open-vocabulary object detection using YOLO-Worldv2 Large, optimized for Axera's NPU-based SoC platforms, including the AX650 Series, AX630C Series, and AX8850 Series, as well as Axera's dedicated AI accelerator.
For those who are interested in model conversion, you can try to export axmodel through
| Model | Input Shape | Latency (ms) | CMM Usage (MB) |
|---|---|---|---|
| yolo_u16_ax650.axmodel | 1 x 640 x 640 x 3 | 9.522 | 21 |
| clip_b1_u16_ax650.axmodel | 1 x 77 | 2.997 | 137 |
| yolo_u16_ax630c.axmodel | 1 x 640 x 640 x 3 | 43.450 | 31 |
| clip_b1_u16_ax630c.axmodel | 1 x 77 | 10.703 | 134 |
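To put the latencies in the table above into perspective, the following minimal sketch converts them into an upper bound on inferences per second. It only covers the NPU run time listed in the table; image pre-processing, post-processing, and host-to-device transfers are not included, so real end-to-end throughput will be lower. The helper name `max_fps` is hypothetical and not part of the SDK.

```python
# Latencies in milliseconds, copied from the table above.
LATENCY_MS = {
    "yolo_u16_ax650.axmodel": 9.522,
    "clip_b1_u16_ax650.axmodel": 2.997,
    "yolo_u16_ax630c.axmodel": 43.450,
    "clip_b1_u16_ax630c.axmodel": 10.703,
}


def max_fps(latency_ms: float) -> float:
    """Upper bound on back-to-back inferences per second for one model."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms


if __name__ == "__main__":
    for name, ms in LATENCY_MS.items():
        print(f"{name}: at most {max_fps(ms):.1f} inferences/s")
```

By this estimate the detector alone tops out at roughly 105 inferences per second on AX650 and roughly 23 on AX630C.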
Download all files from this repository to the device
(py312) axera@raspberrypi:~/samples/yoloworldv2 $ tree
.
├── config.json
├── football.jpg
├── install
│   ├── bin
│   │   ├── axcl_aarch64
│   │   │   └── test_detect_by_text
│   │   ├── axcl_x86
│   │   │   └── test_detect_by_text
│   │   └── host_650
│   │       └── test_detect_by_text
│   └── lib
│       ├── axcl_aarch64
│       │   └── libyoloworld.so
│       ├── axcl_x86
│       │   └── libyoloworld.so
│       └── host_650
│           └── libyoloworld.so
├── models
│   ├── clip_b1_u16_ax630c.axmodel
│   ├── clip_b1_u16_ax650.axmodel
│   ├── yolo_u16_ax630c.axmodel
│   └── yolo_u16_ax650.axmodel
├── pyyoloworld
│   ├── example.py
│   ├── gardio_example.jpg
│   ├── gradio_example.py
│   ├── libyoloworld.so
│   ├── pyaxdev.py
│   ├── __pycache__
│   │   ├── pyaxdev.cpython-312.pyc
│   │   └── pyyoloworld.cpython-312.pyc
│   ├── pyyoloworld.py
│   └── requirements.txt
├── README.md
└── vocab.txt

13 directories, 23 files
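The tree above ships three builds of `libyoloworld.so` under `install/lib`. As a rough sketch of how a script could pick the right one, the snippet below maps the host CPU architecture to the matching `axcl_*` directory. The mapping and the helper `pick_library` are assumptions made for illustration, not part of the SDK. In particular, `host_650` is assumed to be the build for running directly on an AX650 board, which cannot be detected from the CPU architecture alone and therefore has to be requested explicitly.

```python
import platform
from pathlib import Path

# Assumed mapping from platform.machine() values to the AXCL build
# directories shown in the tree (illustrative, not defined by the SDK).
AXCL_DIRS = {
    "aarch64": "axcl_aarch64",
    "arm64": "axcl_aarch64",
    "x86_64": "axcl_x86",
    "AMD64": "axcl_x86",
}


def pick_library(repo_root, machine=None, on_board=False):
    """Return the path of the libyoloworld.so build to use.

    machine  -- CPU architecture string; defaults to platform.machine()
    on_board -- True to select the host_650 build (assumed to target the
                AX650 board itself)
    """
    if on_board:
        subdir = "host_650"
    else:
        machine = machine or platform.machine()
        if machine not in AXCL_DIRS:
            raise RuntimeError(f"no prebuilt library for architecture {machine!r}")
        subdir = AXCL_DIRS[machine]
    return Path(repo_root) / "install" / "lib" / subdir / "libyoloworld.so"


if __name__ == "__main__":
    print(pick_library("."))
```

On a Raspberry Pi 5 with an M.2 card, this resolves to the `axcl_aarch64` build, which is the file the demo copies into `pyyoloworld/`.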
pip install -r pyyoloworld/requirements.txt
TODO
What is an M.2 accelerator card? The demo below runs on a Raspberry Pi 5.
(py312) axera@raspberrypi:~/samples/yoloworldv2-new.hg $ export LD_PRELOAD=/usr/lib/aarch64-linux-gnu/libstdc++.so.6
(py312) axera@raspberrypi:~/samples/yoloworldv2-new.hg $ cp install/lib/axcl_aarch64/libyoloworld.so pyyoloworld/
(py312) axera@raspberrypi:~/samples/yoloworldv2-new.hg $ cd pyyoloworld/
(py312) axera@raspberrypi:~/samples/yoloworldv2-new.hg/pyyoloworld $ python gradio_example.py --yoloworld ../models/yolo_u16_ax650.axmodel --tenc ../models/clip_b1_u16_ax650.axmodel --vocab ../vocab.txt
Trying to load: /home/axera/samples/yoloworldv2-new.hg/pyyoloworld/aarch64/libyoloworld.so
Successfully loaded: /home/axera/samples/yoloworldv2-new.hg/pyyoloworld/libyoloworld.so
[I][ run][ 31]: AXCLWorker start with devid 0
input size: 2
name: images [unknown] [unknown]
1 x 640 x 640 x 3 size: 1228800
name: txt_feats [unknown] [unknown]
1 x 4 x 512 size: 8192
output size: 3
name: stride8
1 x 80 x 80 x 68 size: 1740800
name: stride16
1 x 40 x 40 x 68 size: 435200
name: stride32
1 x 20 x 20 x 68 size: 108800
[I][ yw_create][ 408]: num_classes: 4, num_features: 512, input w: 640, h: 640
is_output_nhwc: 1
input size: 1
name: text_token [unknown] [unknown]
1 x 77 size: 308
output size: 1
name: 2202
1 x 1 x 512 size: 2048
[I][ load_text_encoder][ 44]: text feature len 512
[I][ load_tokenizer][ 60]: text token len 77
* Running on local URL: http://0.0.0.0:7860
* To create a public link, set `share=True` in `launch()`.
If your Raspberry Pi 5's IP address is 192.168.1.100, open http://192.168.1.100:7860 in a browser to use the web app.
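The shapes and byte sizes printed in the startup log can be cross-checked with a few lines of Python. The sketch below divides each reported size by the number of elements in the shape, which suggests a 1-byte image input and 4-byte elements everywhere else. The data types themselves are not stated in the log, so reading these as 8-bit pixels and 32-bit values is an inference; the helper `bytes_per_element` is hypothetical.

```python
from math import prod

# (tensor name, shape, reported size in bytes), copied from the startup log.
TENSORS = [
    ("images", (1, 640, 640, 3), 1228800),
    ("txt_feats", (1, 4, 512), 8192),
    ("stride8", (1, 80, 80, 68), 1740800),
    ("stride16", (1, 40, 40, 68), 435200),
    ("stride32", (1, 20, 20, 68), 108800),
    ("text_token", (1, 77), 308),
    ("2202", (1, 1, 512), 2048),
]


def bytes_per_element(shape, size_bytes):
    """Infer the element width from a tensor's shape and its byte size."""
    elements = prod(shape)
    if size_bytes % elements != 0:
        raise ValueError("size is not a whole multiple of the element count")
    return size_bytes // elements


if __name__ == "__main__":
    for name, shape, size in TENSORS:
        width = bytes_per_element(shape, size)
        print(f"{name}: {prod(shape)} elements x {width} byte(s)")
```

The `txt_feats` shape of 1 x 4 x 512 also lines up with the `num_classes: 4, num_features: 512` line in the log: one 512-value text feature per prompt.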
Input: man, shoes, ball, person, and the test image
Result: