license: mit

M4-Audio-LongVA-7B-Qwen2

Enhancing Interactive Capabilities in VideoLLM

M4-Audio-LongVA-7B-Qwen2 (M4-Audio-7B for short) extends LongVA-7B with further training on the M4-IT dataset, which comprises 9,963 visual-audio instruction-tuning instances. Training used the existing training pipeline without any special modifications.

Usage

For more information about the interaction inference pipeline, please visit the M4 GitHub repository.