---
license: cc-by-nc-4.0
inference: true
tags:
  - Blaze
pipeline_tag: text-to-audio
widget:
  - text: A relaxing piano piece with soft melodies
    example_title: Prompt 1
  - text: An acoustic guitar ballad with emotional depth
    example_title: Prompt 2
  - text: An upbeat electronic dance track with heavy bass
    example_title: Prompt 3
---

# Blaze - Mini

Blaze is a text-to-music generation model that produces high-quality music samples from natural-language prompts.
It is a single-stage autoregressive Transformer trained over a 32 kHz EnCodec tokenizer with 4 audio codebooks sampled at 50 Hz.

Unlike earlier approaches that rely on intermediate semantic representations, Blaze predicts all 4 codebooks in a single forward pass.
By introducing a slight delay between the codebooks, Blaze generates them efficiently in parallel, reducing generation to just 50 autoregressive steps per second of audio.
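To make the delay idea concrete, here is a minimal, illustrative sketch of how such a delay pattern can be laid out. It assumes a MusicGen-style interleaving (codebook *k* shifted right by *k* steps); the names and padding value are purely for illustration and are not part of Blaze's actual implementation.

```python
# Sketch of a codebook delay pattern: codebook k is delayed by k steps,
# so at generation step t the model emits codebook 0 for frame t,
# codebook 1 for frame t-1, codebook 2 for frame t-2, and so on.
NUM_CODEBOOKS = 4
PAD = -1  # placeholder before a codebook's first real frame

def apply_delay(num_frames: int) -> list[list[int]]:
    """Return grid[codebook][step] holding the frame index emitted there."""
    num_steps = num_frames + NUM_CODEBOOKS - 1
    grid = [[PAD] * num_steps for _ in range(NUM_CODEBOOKS)]
    for k in range(NUM_CODEBOOKS):
        for t in range(num_frames):
            grid[k][t + k] = t  # codebook k emits frame t at step t + k
    return grid

grid = apply_delay(3)
# step:        0    1    2    3    4    5
# codebook 0:  f0   f1   f2   .    .    .
# codebook 1:  .    f0   f1   f2   .    .
# codebook 2:  .    .    f0   f1   f2   .
# codebook 3:  .    .    .    f0   f1   f2
```

Because every codebook is predicted at every step (just for slightly different frames), the model needs only one autoregressive step per audio frame rather than one per codebook per frame.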


## 🤗 Transformers Usage

You can use Blaze via the 🤗 Transformers text-to-audio pipeline:

1. Install required packages:

```bash
pip install --upgrade pip
pip install --upgrade transformers scipy
```

2. Run text-to-audio inference:

```python
from transformers import pipeline
import scipy.io.wavfile  # import the submodule explicitly; `import scipy` alone may not expose it

synthesizer = pipeline("text-to-audio", "SVECTOR-CORPORATION/Blaze")

music = synthesizer("lo-fi music with a soothing melody", forward_params={"do_sample": True})

scipy.io.wavfile.write("blaze_output.wav", rate=music["sampling_rate"], data=music["audio"])
```
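Since the tokenizer runs at 50 Hz, you can translate a target clip duration into a token budget with simple arithmetic. The helper below is a sketch; the `max_new_tokens` generation parameter is an assumption based on the standard 🤗 Transformers `generate` interface and may behave differently for this model.

```python
# Blaze's EnCodec tokenizer is sampled at 50 Hz, so one second of audio
# corresponds to roughly 50 autoregressive steps (token frames).
FRAME_RATE_HZ = 50

def max_new_tokens_for(seconds: float) -> int:
    """Approximate token budget for `seconds` of generated audio."""
    return int(seconds * FRAME_RATE_HZ)

budget = max_new_tokens_for(10)  # 500 tokens for ~10 s of audio
```

If the model follows the usual MusicGen-style interface, this budget could be passed as `forward_params={"do_sample": True, "max_new_tokens": budget}`, but treat that parameter name as an assumption rather than a documented Blaze API.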

## Intended Use

**Primary use:**

- Research on generative AI in music
- Text-guided music prototyping
- Exploring Transformer models for creative generation

**Out of scope:**

- Commercial deployment without a license
- Generation of harmful, biased, or culturally disrespectful content