akanatas committed · verified
Commit 7c970c8 · 1 Parent(s): 2921a27

Update README.md

Files changed (1)
1. README.md +7 -4
README.md CHANGED
@@ -15,7 +15,8 @@ library_name: transformers
  # CultureMERT: Continual Pre-Training for Cross-Cultural Music Representation Learning
  📑 [**Read the full paper (to be presented at ISMIR 2025)**](...TODO)

- **CultureMERT-TA-95M** is a multi-culturally adapted 95M-parameter music foundation model created by merging single-culture adaptations of [MERT-v1-95M](https://huggingface.co/m-a-p/MERT-v1-95M) through **task arithmetic**. Instead of continual pre-training on a multi-cultural data mix, as done in [CultureMERT-95M](https://huggingface.co/ntua-slp/CultureMERT-95M), we merge multiple single-culture adapted models, each continually pre-trained via our two-stage strategy on a distinct musical tradition:
+ **CultureMERT-TA-95M** is a 95M-parameter music foundation model adapted to diverse musical cultures through **task arithmetic**. Instead of direct continual pre-training on a multi-cultural mixture, as in [CultureMERT-95M](https://huggingface.co/ntua-slp/CultureMERT-95M), this model merges multiple **single-culture adapted** variants of [MERT-v1-95M](https://huggingface.co/m-a-p/MERT-v1-95M)—each continually pre-trained via our two-stage strategy on a distinct musical tradition:
+

  | Dataset | Music Tradition | Hours Used |
  |-----------------|-----------------------------|------------|
@@ -24,9 +25,10 @@ library_name: transformers
  | [*Hindustani*](https://dunya.compmusic.upf.edu/hindustani/) | North Indian classical | 200h |
  | [*Carnatic*](https://dunya.compmusic.upf.edu/carnatic/) | South Indian classical | 200h |

- > 🧪 The final model was merged using a scaling factor of **λ = 0.2**, which yielded the best performance across all variants tested.
+ > 🧪 The final model was merged using a scaling factor of **λ = 0.2**, which yielded the best overall performance across all task arithmetic variants evaluated.
+

- 🔀 This is an alternative variant of [**CultureMERT-95M**](https://huggingface.co/ntua-slp/CultureMERT-95M), where culturally specialized models are merged in weight space via task arithmetic to form a unified multi-cultural model.
+ 🔀 This is an alternative variant of [**CultureMERT-95M**](https://huggingface.co/ntua-slp/CultureMERT-95M), where culturally specialized models are merged in weight space via task arithmetic to form a unified multi-cultural model. It builds directly on [CultureMERT-95M](https://huggingface.co/ntua-slp/CultureMERT-95M), using the same two-stage continual pre-training strategy applied individually to each musical tradition before merging.

  ---

@@ -34,6 +36,7 @@ library_name: transformers

  We follow the exact same evaluation protocol as in [CultureMERT-95M](https://huggingface.co/ntua-slp/CultureMERT-95M). Below are the evaluation results, along with comparisons to both [CultureMERT-95M](https://huggingface.co/ntua-slp/CultureMERT-95M) and the original [MERT-v1-95M](https://huggingface.co/m-a-p/MERT-v1-95M):

+
  ## ROC-AUC / mAP

  | Model | Turkish-makam | Hindustani | Carnatic | Lyra | FMA-medium | MTAT | **Avg.** |
@@ -52,7 +55,7 @@ We follow the exact same evaluation protocol as in [CultureMERT-95M](https://hug
  | **CultureMERT-TA-95M** | 76.9% / 45.4% | 74.2% / 45.0% | 82.5% / 32.1% | 73.0% / **45.3%** | **59.1%** / **38.2%** | 35.7% / 21.5% | 52.4% |


- 📈 **CultureMERT-TA-95M** performs comparably to [CultureMERT-95M](https://huggingface.co/ntua-slp/CultureMERT-95M) on non-Western datasets and even surpasses it on *Lyra* and Western benchmarks. Notably, it also outperforms the [MERT-v1-95M](https://huggingface.co/m-a-p/MERT-v1-95M) on Western tasks by a **+0.7%** average improvement.
+ 📈 **CultureMERT-TA-95M** performs comparably to [CultureMERT-95M](https://huggingface.co/ntua-slp/CultureMERT-95M) on non-Western datasets, while surpassing it on *Lyra* and Western benchmarks. It also outperforms [MERT-v1-95M](https://huggingface.co/m-a-p/MERT-v1-95M) on Western tasks (MTAT and FMA-medium) by an average margin of **+0.7%** across all metrics.

  ---

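The merging step this commit documents is weight-space task arithmetic: each single-culture model contributes a task vector (its parameter delta from the base MERT-v1-95M), the vectors are summed, scaled by λ, and added back to the base weights. Below is a minimal sketch of that formula, θ_merged = θ_base + λ · Σᵢ (θᵢ − θ_base), with λ = 0.2 as quoted in the README; the `merge_task_vectors` helper and its checkpoint handling are illustrative assumptions, not the authors' released code.

```python
import torch

def merge_task_vectors(base_state, adapted_states, lam=0.2):
    """Task arithmetic merge of PyTorch state dicts (illustrative sketch).

    Implements theta_merged = theta_base + lam * sum_i (theta_i - theta_base),
    where each adapted state dict shares the base model's parameter names.
    lam=0.2 mirrors the scaling factor the README reports as best.
    """
    merged = {}
    for name, base_param in base_state.items():
        # Task vector per culture: element-wise delta from the base weights.
        delta_sum = torch.zeros_like(base_param)
        for adapted in adapted_states:
            delta_sum += adapted[name] - base_param
        merged[name] = base_param + lam * delta_sum
    return merged
```

A smaller λ keeps the merged model close to the base weights, while a larger λ emphasizes the cultural adaptations; per the README, λ = 0.2 gave the best overall trade-off among the variants evaluated.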
 
 
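The "ROC-AUC / mAP" pairing in the results table refers to the two standard multi-label auto-tagging metrics, macro-averaged over tags. As a hedged illustration of how such scores are commonly computed (the actual protocol is inherited from CultureMERT-95M; the scikit-learn calls and toy data below are an assumed tooling choice, not necessarily the authors'):

```python
import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score

# Toy shapes: y_true holds binary tag annotations, y_score the model's
# per-tag probabilities for 4 clips and 3 tags (illustrative data only).
y_true = np.array([[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]])
y_score = np.array([[0.9, 0.2, 0.7], [0.1, 0.8, 0.3],
                    [0.6, 0.7, 0.2], [0.2, 0.1, 0.9]])

# Macro-averaged ROC-AUC and mAP (mean average precision) over tags,
# the usual convention for auto-tagging benchmarks such as MTAT.
roc_auc = roc_auc_score(y_true, y_score, average="macro")
m_ap = average_precision_score(y_true, y_score, average="macro")
print(f"ROC-AUC: {roc_auc:.3f} | mAP: {m_ap:.3f}")
```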