amaai-lab/SonicVerse

	---
	base_model:
	- mistralai/Mistral-7B-v0.1
	- m-a-p/MERT-v1-95M
	library_name: peft
	license: apache-2.0
	datasets:
	- amaai-lab/MusicBench
	language:
	- en
	metrics:
	- bertscore
	- bleu
	pipeline_tag: audio-text-to-text
	---

	# Model Card for SonicVerse

	SonicVerse is a music captioning model. It is trained with concrete music feature labels that guide the captioning process, so the generated captions include features such as key, vocals, vocal gender, instruments, mood/theme, and genre.
	The model is trained on 10-second snippets of music to produce detailed captions. The [Spaces demo](https://huggingface.co/spaces/annabeth97c/SonicVerse) chains the captions of multiple 10-second chunks to generate a long, detailed caption.
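
The chunk-and-chain idea can be sketched in a few lines of Python. This is a minimal illustration, not the code used by the Spaces demo: `caption_fn` is a hypothetical stand-in for the actual model call, and joining captions with a space is an assumption about how chaining might work.

```python
CHUNK_SECONDS = 10  # snippet length the model is trained on, per this card


def split_into_chunks(waveform, sample_rate, chunk_seconds=CHUNK_SECONDS):
    """Split a mono waveform (a sequence of samples) into consecutive chunks.

    The final chunk may be shorter than chunk_seconds.
    """
    chunk_len = int(sample_rate * chunk_seconds)
    return [waveform[i:i + chunk_len] for i in range(0, len(waveform), chunk_len)]


def chain_captions(waveform, sample_rate, caption_fn):
    """Caption each chunk in order and join the results into one long caption.

    caption_fn is a hypothetical callable (chunk, sample_rate) -> str that
    stands in for SonicVerse inference.
    """
    chunks = split_into_chunks(waveform, sample_rate)
    return " ".join(caption_fn(chunk, sample_rate) for chunk in chunks)


if __name__ == "__main__":
    sr = 16000
    audio = [0.0] * (25 * sr)  # 25 s of silence as dummy input

    def dummy_caption(chunk, rate):
        # Placeholder that only reports the chunk duration.
        return f"[{len(chunk) / rate:.0f}s chunk]"

    print(chain_captions(audio, sr, dummy_caption))
```

In practice, `caption_fn` would wrap the inference code from the repository, and a short trailing chunk might need padding or dropping.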

	## Model Details

	### Model Description

	SonicVerse is trained with a multi-task projector that outputs aligned language tokens from music input. In addition, feature extraction tasks (e.g., key classification, vocals classification) are trained, and their outputs are projected to language tokens that guide the captioning.
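
The data flow described above can be pictured with a toy, dependency-free sketch. It is not the authors' implementation: the mean pooling, the plain linear layers, the token ordering, and all dimensions and class counts are assumptions chosen only to make the two paths (content tokens and feature tokens) visible.

```python
import random


def linear(vec, weight):
    """Multiply a length-d_in vector by a d_in x d_out weight matrix (list of rows)."""
    return [sum(v * w for v, w in zip(vec, col)) for col in zip(*weight)]


def random_matrix(rows, cols, rng):
    """Small random weight matrix, used here in place of trained parameters."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def build_prefix(audio_frames, w_content, w_feature_heads, w_feature_proj):
    """Return language-space token embeddings: feature tokens, then content tokens."""
    # Content path: project every audio frame into the language embedding space.
    content_tokens = [linear(frame, w_content) for frame in audio_frames]

    # Feature path (assumed design): mean-pool the frames, run one classifier
    # head per music feature, then project each head's logits to a single token.
    d_audio = len(audio_frames[0])
    pooled = [sum(f[i] for f in audio_frames) / len(audio_frames) for i in range(d_audio)]
    feature_tokens = []
    for name, head in w_feature_heads.items():
        logits = linear(pooled, head)
        feature_tokens.append(linear(logits, w_feature_proj[name]))
    return feature_tokens + content_tokens


if __name__ == "__main__":
    rng = random.Random(0)
    d_audio, d_lm = 8, 16  # toy sizes, far smaller than real encoder/LLM widths
    frames = [[rng.random() for _ in range(d_audio)] for _ in range(5)]
    # Hypothetical class counts for two of the features named in this card.
    heads = {"key": random_matrix(d_audio, 24, rng), "vocals": random_matrix(d_audio, 2, rng)}
    projs = {"key": random_matrix(24, d_lm, rng), "vocals": random_matrix(2, d_lm, rng)}
    tokens = build_prefix(frames, random_matrix(d_audio, d_lm, rng), heads, projs)
    print(len(tokens), len(tokens[0]))
```

The returned list would play the role of a prefix of embeddings fed to the language model ahead of the text prompt; see the repository for the actual architecture.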

	- Developed by: AMAAI Lab
	- Funded by [optional]: [More Information Needed]
	- Shared by [optional]: [More Information Needed]
	- Model type: Multimodal audio-text-to-text model
	- Language(s) (NLP): English
	- License: Apache-2.0
	- Finetuned from model: mistralai/Mistral-7B-v0.1

	### Model Sources

	- Repository: https://github.com/annabeth97c/sonicverse
	- Paper [optional]: [More Information Needed]
	- Demo: https://annabeth97c.github.io/sonicverse/

	## Uses

	The model can be used to caption music audio, for example to generate paired music-text datasets.
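
One way to assemble such a paired dataset is to write one JSON record per audio file. The sketch below assumes a hypothetical `caption_fn` that wraps SonicVerse inference. The JSON Lines layout and the `audio`/`caption` field names are illustrative choices, not a format prescribed by the project.

```python
import json
from pathlib import Path


def build_caption_dataset(audio_dir, output_path, caption_fn, pattern="*.wav"):
    """Write one JSON line per audio file: {"audio": <path>, "caption": <text>}.

    caption_fn is a hypothetical callable (Path) -> str standing in for a
    SonicVerse inference call. Returns the number of records written.
    """
    count = 0
    with open(output_path, "w", encoding="utf-8") as out:
        # Sort for a deterministic record order across runs.
        for audio_path in sorted(Path(audio_dir).glob(pattern)):
            record = {"audio": str(audio_path), "caption": caption_fn(audio_path)}
            out.write(json.dumps(record, ensure_ascii=False) + "\n")
            count += 1
    return count
```

A call such as `build_caption_dataset("clips/", "captions.jsonl", my_caption_fn)` would then produce a file that can be loaded line by line, with `clips/` and `my_caption_fn` being placeholders.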

	## How to Get Started with the Model

	Follow the instructions in the [repository](https://github.com/annabeth97c/sonicverse) to run inference locally. Alternatively, try out the model on the [Spaces page](https://huggingface.co/spaces/annabeth97c/SonicVerse).

	## Citation [optional]

	BibTeX:

	[More Information Needed]

	APA:

	[More Information Needed]

	## Glossary [optional]

	[More Information Needed]

	## More Information [optional]

	[More Information Needed]

	## Model Card Authors [optional]

	[More Information Needed]

	## Model Card Contact

	[More Information Needed]

	### Framework versions

	- PEFT 0.10.0