add doc for gpt2 embellish model
README.md
CHANGED
@@ -32,8 +32,12 @@ Generates **melody and chord progression** from scratch.
 - Training sequence length: 2400
 ### Stage 2: "Embellish" model
 Generates **accompaniment, timing and dynamics** conditioned on Stage 1 outputs.
-
-
+- `embellish_model_gpt2_pop1k7_loss0.398.bin`
+  - Model backbone: 12-layer **GPT-2 Transformer** ([implementation](https://huggingface.co/docs/transformers/en/model_doc/gpt2))
+  - Num trainable params: 38.2M
+- `embellish_model_pop1k7_loss0.399.bin` (requires `fast-transformers` package, which is outdated as of Jul. 2024)
+  - Model backbone: 12-layer **Performer** ([paper](https://arxiv.org/abs/2009.14794), [implementation](https://github.com/idiap/fast-transformers))
+  - Num trainable params: 38.2M
 - Token vocabulary: [Revamped MIDI-derived events](https://arxiv.org/abs/2002.00212) (**REMI**) w/ slight modifications
 - Training dataset: [AILabs.tw Pop1K7](https://github.com/YatingMusic/compound-word-transformer) (**Pop1K7**), 1747 songs
 - Training sequence length: 3072
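
For orientation, below is a minimal sketch of how one might instantiate a 12-layer GPT-2 backbone with the linked `transformers` implementation and load the `embellish_model_gpt2_pop1k7_loss0.398.bin` checkpoint. The hidden size, head count, REMI vocabulary size, and checkpoint key layout are assumptions, not taken from this commit; the project's own config and training code are authoritative.

```python
# Sketch only: load the GPT-2-backbone "Embellish" checkpoint.
# Sizes below are assumptions (not stated in the README) except n_layer and n_positions.
import torch
from transformers import GPT2Config, GPT2LMHeadModel

config = GPT2Config(
    n_layer=12,        # 12-layer GPT-2 Transformer backbone (from the README)
    n_positions=3072,  # training sequence length: 3072 (from the README)
    n_embd=512,        # assumed embedding width, roughly consistent with ~38.2M params
    n_head=8,          # assumed number of attention heads
    vocab_size=400,    # assumed size of the modified REMI vocabulary
)
model = GPT2LMHeadModel(config)

# Checkpoint keys may not match Hugging Face module names exactly,
# so load non-strictly and inspect what was skipped.
state_dict = torch.load("embellish_model_gpt2_pop1k7_loss0.398.bin", map_location="cpu")
missing, unexpected = model.load_state_dict(state_dict, strict=False)
model.eval()
```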