---
library_name: transformers
tags:
- generated_from_trainer
- music
metrics:
- accuracy
datasets:
- sandernotenbaert/lmd_matched
training_config:
  vocab_size: 30000
  hidden_size: 256
  intermediate_size: 512
  num_hidden_layers: 4
  num_attention_heads: 4
  num_key_value_heads: 4
  sliding_window: 4
  max_position_embeddings: 1024
  pad_token_id: 0
  bos_token_id: 1
  eos_token_id: 2
pipeline_tag: other
model-index:
- name: OKAI-midi-gen-v-001
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# OKAI-midi-gen-v-001

This model was trained with the Transformers Trainer; no base model is recorded. The card metadata lists `sandernotenbaert/lmd_matched` as the dataset.
It achieves the following results on the evaluation set:
- Loss: 10.1912
- Accuracy: 0.0008
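
To put the evaluation numbers in context, the sketch below compares them with what a model guessing uniformly over the vocabulary would score. It assumes the reported loss is a mean cross-entropy in nats per token, which the card does not state explicitly; `vocab_size` comes from the `training_config` in this card.

```python
import math

vocab_size = 30000      # from training_config in this card
eval_loss = 10.1912     # final evaluation loss reported above
eval_accuracy = 0.0008  # final evaluation accuracy reported above

# Assumption: the loss is mean cross-entropy in nats per token.
uniform_loss = math.log(vocab_size)   # loss of a uniform guess over the vocabulary
uniform_accuracy = 1 / vocab_size     # accuracy of a uniform guess
perplexity = math.exp(eval_loss)      # perplexity implied by the evaluation loss

print(f"uniform-guess loss:     {uniform_loss:.4f}")
print(f"reported eval loss:     {eval_loss:.4f}")
print(f"implied perplexity:     {perplexity:,.0f}")
print(f"uniform-guess accuracy: {uniform_accuracy:.6f}")
print(f"reported accuracy:      {eval_accuracy:.4f}")
```

Under that assumption, the evaluation loss sits only slightly below the uniform-guess value, which suggests the model has learned little that transfers to the evaluation set.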

## Model description

This is a first test run, trained on a small subset of the data on an M1 Pro. It generates valid files, but the notes are tightly clustered and separated by long gaps.

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

The model was configured with the following `training_config` values:
- vocab_size: 30000
- hidden_size: 256
- intermediate_size: 512
- num_hidden_layers: 4
- num_attention_heads: 4
- num_key_value_heads: 4
- sliding_window: 4
- max_position_embeddings: 1024
- pad_token_id: 0
- bos_token_id: 1
- eos_token_id: 2
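
As a rough size estimate, the sketch below counts parameters from the configuration values above. The card does not name the architecture, so this assumes a Mistral-style decoder (bias-free attention projections, a gated three-matrix MLP, two RMSNorm layers per block, and untied input and output embeddings). All of those are assumptions, and the real count may differ.

```python
# Values from training_config in this card.
vocab_size = 30000
hidden_size = 256
intermediate_size = 512
num_hidden_layers = 4
num_attention_heads = 4
num_key_value_heads = 4

# Assumption: Mistral-style decoder block without biases (not stated in the card).
head_dim = hidden_size // num_attention_heads
q_proj = hidden_size * num_attention_heads * head_dim
k_proj = hidden_size * num_key_value_heads * head_dim
v_proj = hidden_size * num_key_value_heads * head_dim
o_proj = num_attention_heads * head_dim * hidden_size
attention = q_proj + k_proj + v_proj + o_proj

mlp = 3 * hidden_size * intermediate_size  # gate, up, and down projections
norms = 2 * hidden_size                    # two RMSNorm weight vectors per block
per_layer = attention + mlp + norms

embeddings = vocab_size * hidden_size      # input token embeddings
lm_head = vocab_size * hidden_size         # assumption: output head not tied to embeddings
final_norm = hidden_size

total = embeddings + num_hidden_layers * per_layer + final_norm + lm_head
print(f"per-layer parameters: {per_layer:,}")
print(f"estimated total:      {total:,}")
```

Under these assumptions the embedding and output matrices account for roughly 85% of the parameters, so the transformer blocks themselves are very small relative to the 30,000-token vocabulary.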

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 8
- eval_batch_size: 16
- seed: 444
- gradient_accumulation_steps: 3
- total_train_batch_size: 24
- optimizer: adamw_torch with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine_with_restarts
- lr_scheduler_warmup_ratio: 0.3
- training_steps: 2000
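
The derived quantities in this list can be checked with a few lines of arithmetic. The sketch below recomputes the total batch size and the warmup length, and draws an approximate learning-rate curve with linear warmup followed by cosine decay with hard restarts. `num_devices` and `num_cycles` are assumptions (a single device and a single cycle), since the card states neither, and `lr_at` is a hypothetical helper rather than a library function.

```python
import math

# Values from the hyperparameter list in this card.
learning_rate = 1e-4
train_batch_size = 8
gradient_accumulation_steps = 3
training_steps = 2000
warmup_ratio = 0.3

num_devices = 1  # assumption: single device (the card mentions an M1 Pro)
num_cycles = 1   # assumption: one cosine cycle; not stated in the card

total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices
warmup_steps = round(training_steps * warmup_ratio)


def lr_at(step: int) -> float:
    """Approximate learning rate at an optimizer step (hypothetical helper)."""
    if step < warmup_steps:
        # Linear warmup from 0 to the peak learning rate.
        return learning_rate * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, training_steps - warmup_steps)
    if progress >= 1.0:
        return 0.0
    # Cosine decay that restarts num_cycles times over the remaining steps.
    cycle_pos = (num_cycles * progress) % 1.0
    return learning_rate * max(0.0, 0.5 * (1.0 + math.cos(math.pi * cycle_pos)))


print("total_train_batch_size:", total_train_batch_size)
print("warmup_steps:", warmup_steps)
for step in (0, 300, 600, 1300, 2000):
    print(step, f"{lr_at(step):.2e}")
```

With these assumptions the learning rate peaks at step 600 and decays toward zero by step 2000, which is consistent with the training loss flattening over the last few hundred steps.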

### Training results

| Training Loss | Epoch   | Step | Accuracy | Validation Loss |
|:-------------:|:-------:|:----:|:--------:|:---------------:|
| 10.2727       | 3.2283  | 100  | 0.0000   | 10.3284         |
| 9.7582        | 6.4565  | 200  | 0.0026   | 10.0966         |
| 9.2052        | 9.6848  | 300  | 0.0037   | 9.9513          |
| 8.8216        | 12.9130 | 400  | 0.0034   | 9.9538          |
| 8.406         | 16.1304 | 500  | 0.0029   | 9.9524          |
| 7.8326        | 19.3587 | 600  | 0.0021   | 9.9458          |
| 7.1956        | 22.5870 | 700  | 0.0017   | 9.9864          |
| 6.5659        | 25.8152 | 800  | 0.0015   | 9.9258          |
| 5.9719        | 29.0326 | 900  | 0.0015   | 9.9710          |
| 5.4031        | 32.2609 | 1000 | 0.0011   | 9.9116          |
| 4.9784        | 35.4891 | 1100 | 0.0012   | 9.9819          |
| 4.6684        | 38.7174 | 1200 | 0.0009   | 10.0142         |
| 4.3184        | 41.9783 | 1300 | 0.0010   | 10.0483         |
| 4.1251        | 45.1957 | 1400 | 0.0008   | 10.0964         |
| 3.909         | 48.4239 | 1500 | 0.0009   | 10.1322         |
| 3.7535        | 51.6522 | 1600 | 0.0009   | 10.1587         |
| 3.681         | 54.8804 | 1700 | 0.0008   | 10.1785         |
| 3.688         | 58.0978 | 1800 | 0.0008   | 10.1871         |
| 3.6685        | 61.3261 | 1900 | 0.0008   | 10.1912         |
| 3.6326        | 64.5543 | 2000 | 0.0008   | 10.1912         |
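
The training and validation losses in the table diverge sharply, which is the usual signature of overfitting on a small subset. The sketch below copies the loss values from the table (reading the values near 10 as validation loss throughout), finds the evaluation step with the lowest validation loss, and reports the final train/validation gap. The variable names are illustrative only.

```python
# Values copied from the training results table (one entry per 100 steps).
steps = list(range(100, 2001, 100))
train_loss = [
    10.2727, 9.7582, 9.2052, 8.8216, 8.406, 7.8326, 7.1956, 6.5659, 5.9719, 5.4031,
    4.9784, 4.6684, 4.3184, 4.1251, 3.909, 3.7535, 3.681, 3.688, 3.6685, 3.6326,
]
val_loss = [
    10.3284, 10.0966, 9.9513, 9.9538, 9.9524, 9.9458, 9.9864, 9.9258, 9.9710, 9.9116,
    9.9819, 10.0142, 10.0483, 10.0964, 10.1322, 10.1587, 10.1785, 10.1871, 10.1912, 10.1912,
]

# Index of the evaluation with the lowest validation loss.
best_idx = min(range(len(val_loss)), key=val_loss.__getitem__)
best_step = steps[best_idx]
best_val_loss = val_loss[best_idx]

# Gap between validation and training loss at the last evaluation.
final_gap = val_loss[-1] - train_loss[-1]

print(f"lowest validation loss: {best_val_loss} at step {best_step}")
print(f"final train/validation gap: {final_gap:.4f}")
```

By this measure an earlier checkpoint would have been the better one to keep, since validation loss rises steadily in the second half of training while training loss continues to fall.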


### Framework versions

- Transformers 4.52.3
- Pytorch 2.6.0
- Datasets 3.6.0
- Tokenizers 0.21.1