google/gemma-3n-E4B-it
What about multimodal benchmarks such as VideoBench?
#9 by LukeAlan, opened about 18 hours ago
For an omni-style model, performance on these benchmarks is important.