Nanonets / Qwen2-VL OCR / RolmOCR / Aya Vision
Generate audio from text and reference voices
Demo of GOT-OCR 2.0's Transformers implementation