---
base_model:
- mistralai/Mistral-7B-v0.1
- m-a-p/MERT-v1-95M
library_name: peft
license: apache-2.0
datasets:
- amaai-lab/MusicBench
language:
- en
metrics:
- bertscore
- bleu
pipeline_tag: audio-text-to-text
---

# Model Card for SonicVerse

SonicVerse is a music captioning model. It is trained with concrete music feature labels that guide the captioning process, so the generated captions include features such as key, vocals, vocal gender, instrument, mood/theme, and genre.

The model is trained on 10-second snippets of music and produces a detailed caption for each snippet. The [Spaces demo](https://huggingface.co/spaces/annabeth97c/SonicVerse) chains the captions of multiple 10-second chunks to generate one long, detailed caption.

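
The chunk-and-chain idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `caption_chunk` is a hypothetical callable standing in for SonicVerse inference, and joining captions with time ranges is a simplification, not necessarily how the Spaces demo combines them.

```python
CHUNK_SECONDS = 10  # snippet length the model is trained on, per this card


def split_into_chunks(waveform, sample_rate, chunk_seconds=CHUNK_SECONDS):
    """Split a mono waveform (a sequence of samples) into consecutive chunks.

    The final chunk may be shorter than `chunk_seconds`.
    """
    chunk_len = int(chunk_seconds * sample_rate)
    return [waveform[i:i + chunk_len] for i in range(0, len(waveform), chunk_len)]


def caption_long_audio(waveform, sample_rate, caption_chunk):
    """Caption every chunk and join the captions, prefixed by time ranges.

    `caption_chunk` is a hypothetical callable (chunk, sample_rate) -> str;
    it is an assumption of this sketch, not part of the SonicVerse API.
    """
    parts = []
    for idx, chunk in enumerate(split_into_chunks(waveform, sample_rate)):
        start = idx * CHUNK_SECONDS
        end = start + len(chunk) / sample_rate
        parts.append(f"[{start:.0f}s-{end:.0f}s] {caption_chunk(chunk, sample_rate)}")
    return " ".join(parts)
```

Any function that maps one audio chunk to a caption string can be passed as `caption_chunk`, which keeps the chunking logic independent of how inference is actually run.
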
## Model Details

### Model Description

SonicVerse is trained with a multi-task projector that outputs aligned language tokens from music input. In addition, feature extractors (e.g. key classification, vocals classification) are trained, and their outputs are projected to language tokens that guide the captioning.

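
To make the description above more concrete, here is a toy NumPy sketch of a multi-task projector. Everything in it is an assumption made for illustration: the class name, the layer shapes, the mean pooling, and the random weights are hypothetical, and the actual SonicVerse architecture in the repository may differ.

```python
import numpy as np

rng = np.random.default_rng(0)


def linear(in_dim, out_dim):
    """Random weights for one linear layer (stand-in for trained parameters)."""
    return rng.normal(scale=0.02, size=(in_dim, out_dim)), np.zeros(out_dim)


class MultiTaskProjector:
    """Hypothetical sketch: content projection plus per-task feature heads."""

    def __init__(self, audio_dim, llm_dim, tasks):
        # tasks maps a task name (e.g. "key") to its number of classes.
        self.content = linear(audio_dim, llm_dim)
        self.heads = {name: linear(audio_dim, n) for name, n in tasks.items()}
        self.to_token = {name: linear(n, llm_dim) for name, n in tasks.items()}

    def __call__(self, frames):
        # frames: (num_frames, audio_dim) features from an audio encoder.
        w, b = self.content
        content_tokens = frames @ w + b          # (num_frames, llm_dim)
        pooled = frames.mean(axis=0)             # clip-level summary
        logits, feature_tokens = {}, []
        for name, (hw, hb) in self.heads.items():
            logits[name] = pooled @ hw + hb      # auxiliary classification output
            tw, tb = self.to_token[name]
            feature_tokens.append(logits[name] @ tw + tb)  # one token per task
        # Feature tokens are placed before the content tokens.
        tokens = np.vstack([np.stack(feature_tokens), content_tokens])
        return tokens, logits
```

In this sketch the auxiliary logits could be supervised with feature labels, while the stacked tokens would be passed to the language model as a prefix for caption generation.
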
- **Developed by:** AMAAI Lab
- **Funded by [optional]:** [More Information Needed]
- **Shared by [optional]:** [More Information Needed]
- **Model type:** Multimodal audio-text-to-text model
- **Language(s) (NLP):** English
- **License:** Apache-2.0
- **Finetuned from model:** mistralai/Mistral-7B-v0.1

### Model Sources

- **Repository:** https://github.com/annabeth97c/sonicverse
- **Paper [optional]:** [More Information Needed]
- **Demo:** https://annabeth97c.github.io/sonicverse/

## Uses

The model can be used to generate music-text paired datasets.

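
As a rough sketch of that use case, the snippet below pairs audio file paths with captions and serializes them as JSON Lines. The `caption_fn` argument is a hypothetical callable standing in for SonicVerse inference, and the `audio`/`caption` field names are illustrative assumptions, not a format prescribed by this model.

```python
import json


def build_records(audio_paths, caption_fn):
    """Pair each audio path with a caption.

    `caption_fn` is a hypothetical callable (path) -> str that wraps inference.
    """
    return [{"audio": str(path), "caption": caption_fn(path)} for path in audio_paths]


def to_jsonl(records):
    """Serialize records as JSON Lines, one music-text pair per line."""
    return "\n".join(json.dumps(record, ensure_ascii=False) for record in records)
```

The resulting string can be written to a `.jsonl` file and loaded later by most dataset tooling.
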
## How to Get Started with the Model

Follow the instructions in the [repository](https://github.com/annabeth97c/sonicverse) to run inference locally. Alternatively, try the model on the [Spaces page](https://huggingface.co/spaces/annabeth97c/SonicVerse).

## Citation [optional]

**BibTeX:**

[More Information Needed]

**APA:**

[More Information Needed]

## Glossary [optional]

[More Information Needed]

## More Information [optional]

[More Information Needed]

## Model Card Authors [optional]

[More Information Needed]

## Model Card Contact

[More Information Needed]

### Framework versions

- PEFT 0.10.0