---
license: cc-by-nc-4.0
inference: true
tags:
- Blaze
pipeline_tag: text-to-audio
widget:
- text: A relaxing piano piece with soft melodies
  example_title: Prompt 1
- text: An acoustic guitar ballad with emotional depth
  example_title: Prompt 2
- text: An upbeat electronic dance track with heavy bass
  example_title: Prompt 3
---

# Blaze - Mini


**Blaze** is a text-to-music generation model that produces high-quality music samples from natural language prompts.
It is a single-stage, auto-regressive Transformer trained over a 32 kHz EnCodec tokenizer with 4 audio codebooks sampled at 50 Hz.

Unlike earlier methods that depend on intermediate semantic representations, Blaze predicts all 4 codebooks in a single forward pass.
By introducing a small delay between codebooks, Blaze can predict them in parallel, which reduces generation to just 50 autoregressive steps per second of audio.

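
The codebook delay described above can be illustrated with a small NumPy sketch. This is not Blaze's actual implementation: it assumes a delay of exactly one step per codebook, and the helper names (`apply_delay_pattern`, `undo_delay_pattern`) and the `PAD` value are hypothetical. It shows why one second of audio needs about 50 steps instead of the 200 that predicting the four codebooks one after another would take.

```python
import numpy as np

PAD = -1  # hypothetical placeholder meaning "no token at this position yet"


def apply_delay_pattern(codes: np.ndarray, pad: int = PAD) -> np.ndarray:
    """Shift codebook k right by k steps: (K, T) -> (K, T + K - 1)."""
    num_codebooks, num_frames = codes.shape
    delayed = np.full(
        (num_codebooks, num_frames + num_codebooks - 1), pad, dtype=codes.dtype
    )
    for k in range(num_codebooks):
        delayed[k, k:k + num_frames] = codes[k]
    return delayed


def undo_delay_pattern(delayed: np.ndarray, num_frames: int) -> np.ndarray:
    """Realign the delayed codebooks into the original (K, T) layout."""
    return np.stack(
        [delayed[k, k:k + num_frames] for k in range(delayed.shape[0])]
    )


frame_rate_hz, num_codebooks = 50, 4

# One second of dummy tokens: 4 codebooks x 50 frames.
codes = np.arange(num_codebooks * frame_rate_hz).reshape(num_codebooks, frame_rate_hz)
delayed = apply_delay_pattern(codes)

# Each column of `delayed` is one autoregressive step predicting 4 tokens at once.
steps_delayed = delayed.shape[1]
# Predicting every token one at a time would need one step per token.
steps_flattened = codes.size

print(steps_delayed, steps_flattened)
```

Under this assumption the delay adds only `num_codebooks - 1` extra steps per clip, a one-time cost, so the step count stays close to 50 per second of audio for clips of any practical length.
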

---

## 🤗 Transformers Usage

You can use Blaze via the 🤗 Transformers `text-to-audio` pipeline:

### 1. Install required packages:
```bash
pip install --upgrade pip
pip install --upgrade transformers scipy
```

### 2. Run text-to-audio inference:
```python
from scipy.io import wavfile
from transformers import pipeline

# Load the text-to-audio pipeline with the Blaze checkpoint.
synthesizer = pipeline("text-to-audio", "SVECTOR-CORPORATION/Blaze")

# forward_params are passed on to generation; do_sample=True enables sampling.
music = synthesizer("lo-fi music with a soothing melody", forward_params={"do_sample": True})

# Write the generated waveform to disk at the model's sampling rate.
wavfile.write("blaze_output.wav", rate=music["sampling_rate"], data=music["audio"])
```

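
Because the tokenizer runs at 50 Hz, the number of generated tokens roughly determines the clip length. The sketch below is an assumption-laden example rather than documented Blaze behavior: `tokens_for_duration` and `generate_clip` are hypothetical helpers, and it assumes that `max_new_tokens` passed through `forward_params` reaches the model's generation call, as it does for other generative models in the Transformers `text-to-audio` pipeline.

```python
import math


def tokens_for_duration(seconds: float, frame_rate_hz: int = 50) -> int:
    """Approximate number of autoregressive steps for `seconds` of audio."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return math.ceil(seconds * frame_rate_hz)


def generate_clip(prompt: str, seconds: float, out_path: str = "blaze_clip.wav") -> str:
    """Generate roughly `seconds` of audio for `prompt` and save it as a WAV file."""
    # Heavy imports stay local so tokens_for_duration works without them installed.
    from scipy.io import wavfile
    from transformers import pipeline

    synthesizer = pipeline("text-to-audio", "SVECTOR-CORPORATION/Blaze")
    music = synthesizer(
        prompt,
        forward_params={
            "do_sample": True,
            # Assumption: forwarded to the model's generate() call.
            "max_new_tokens": tokens_for_duration(seconds),
        },
    )
    wavfile.write(out_path, rate=music["sampling_rate"], data=music["audio"])
    return out_path


# Token budget for a 10-second clip at 50 Hz.
print(tokens_for_duration(10))
```

Calling `generate_clip("lo-fi music with a soothing melody", seconds=10)` would download the checkpoint on first use, so it is left out of the snippet's top level.
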
## Intended Use

**Primary Use:**

* Research on generative AI in music
* Music prototyping guided by text prompts
* Exploring transformer models for creative generation

**Out of Scope:**

* Commercial deployment without a license
* Generating harmful, biased, or culturally disrespectful content