---
pipeline_tag: any-to-any
datasets:
  - openbmb/RLAIF-V-Dataset
library_name: transformers
language:
  - multilingual
tags:
  - minicpm-o
  - omni
  - vision
  - ocr
  - multi-image
  - video
  - custom_code
  - audio
  - speech
  - voice cloning
  - live Streaming
  - realtime speech conversation
  - asr
  - tts
---

Samantha-Omni is my fine-tune of the MiniCPM-o (omni) model.

A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming


@article{yao2024minicpm,
  title={MiniCPM-V: A GPT-4V Level MLLM on Your Phone},
  author={Yao, Yuan and Yu, Tianyu and Zhang, Ao and Wang, Chongyi and Cui, Junbo and Zhu, Hongji and Cai, Tianchi and Li, Haoyu and Zhao, Weilin and He, Zhihui and others},
  journal={arXiv preprint arXiv:2408.01800},
  year={2024}
}