---
license: apache-2.0
datasets:
- pkupie/mc2_corpus
language:
- bo
- ug
- kk
- mn
- zh
base_model:
- hfl/cino-base-v2
---

# XLM-SWCM: Multilingual Encoder with Shared Weights Pretraining

## Overview

XLM-SWCM (Cross-lingual Language Model with Shared Weights Cross-lingual Modeling) is a sequence-to-sequence model designed for extremely low-resource languages. Our framework introduces a weight-sharing mechanism between the encoder and decoder, enabling effective knowledge transfer from pretrained multilingual encoders to generation tasks.
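
As a rough illustration of the warm-starting idea (not the released XLM-SWCM code, which instead shares encoder weights into the custom decoder layers described below), a generic `transformers` encoder-decoder can be seeded from the same CINO encoder listed in the metadata:

```python
# Sketch only: seed a standard encoder-decoder from the CINO encoder that
# XLM-SWCM builds on. The actual model uses the shared-weight decoder
# described under "Key Innovations" below.
from transformers import AutoTokenizer, EncoderDecoderModel

tokenizer = AutoTokenizer.from_pretrained("hfl/cino-base-v2")

# Both sides start from the multilingual encoder; transformers adds
# cross-attention and causal masking to the decoder copy automatically.
model = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "hfl/cino-base-v2", "hfl/cino-base-v2"
)
model.config.decoder_start_token_id = tokenizer.cls_token_id
model.config.pad_token_id = tokenizer.pad_token_id
```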

## Key Innovations

* **Shared Weight Framework**: Strategic weight reuse between encoder and decoder layers
* **Hybrid Decoder Architecture**: Combines:
  * Standard transformer decoder layers
  * Custom decoder layers with a dual FFN structure
  * Optimized layer insertion pattern (1 normal layer per 3 custom layers; see the sketch after this list)
* **Efficient Adaptation**: Enables effective text generation with minimal training data
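
A minimal sketch of that insertion pattern (the factory functions, and the default number of custom layers, are hypothetical stand-ins for the real implementation):

```python
import torch.nn as nn

def build_decoder_stack(config, make_custom, make_normal, num_custom=9):
    """Interleave decoder layers: one normal layer after every 3 custom ones.

    `make_custom` / `make_normal` are hypothetical factories for the two
    layer types this card calls CustomDecoderLayer and NormalDecoderLayer;
    `num_custom=9` is illustrative, not the released configuration.
    """
    layers = []
    for i in range(1, num_custom + 1):
        layers.append(make_custom(config))      # weight-shared, dual-FFN layer
        if i % 3 == 0:                          # 1 normal layer per 3 custom
            layers.append(make_normal(config))  # randomly initialized layer
    return nn.ModuleList(layers)
```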

## Model Architecture

| Component      | Description                                                                          |
| -------------- | ------------------------------------------------------------------------------------ |
| **Encoder**    | XLM-RoBERTa base (CINO v2 variant)                                                   |
| **Decoder**    | Hybrid transformer with:                                                             |
|                | • NormalDecoderLayer: Randomly initialized standard layers                           |
|                | • CustomDecoderLayer: Weight-shared layers with dual FFN structure (sketched below)  |
| **Parameters** | 492M total                                                                           |
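
A hypothetical PyTorch sketch of the dual-FFN layer described above (an assumption based on this card, not the released code): the self-attention block and first FFN would be initialized from a pretrained encoder layer, while cross-attention and the second FFN start fresh.

```python
import torch.nn as nn

class DualFFNDecoderLayer(nn.Module):
    """Hypothetical stand-in for CustomDecoderLayer: self- and
    cross-attention, each followed by its own feed-forward block."""

    def __init__(self, d_model=768, n_heads=12, d_ff=3072):
        super().__init__()
        # These two blocks would be copied from a pretrained encoder layer.
        self.self_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ffn1 = nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
        # These two are new decoder-side modules, randomly initialized.
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ffn2 = nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
        self.norms = nn.ModuleList([nn.LayerNorm(d_model) for _ in range(4)])

    def forward(self, x, memory):
        # Self-attention (causal mask omitted for brevity) + first FFN.
        x = self.norms[0](x + self.self_attn(x, x, x, need_weights=False)[0])
        x = self.norms[1](x + self.ffn1(x))
        # Cross-attention over encoder outputs + second FFN.
        x = self.norms[2](x + self.cross_attn(x, memory, memory, need_weights=False)[0])
        x = self.norms[3](x + self.ffn2(x))
        return x
```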

### Advanced Features

* Beam search decoding (usage sketch below)
* Mixed-precision training
* Cross-lingual transfer learning
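
Assuming the checkpoint exposes the standard `transformers` generation API (an assumption; the model may require custom code), beam search with mixed precision would look like:

```python
import torch

# Hypothetical usage, reusing `model` and `tokenizer` from the sketch above
# and assuming a CUDA device is available.
model = model.to("cuda")
inputs = tokenizer("An example source sentence.", return_tensors="pt").to("cuda")

with torch.autocast("cuda", dtype=torch.float16):  # mixed precision
    output_ids = model.generate(
        **inputs,
        num_beams=4,          # beam search decoding
        max_new_tokens=128,
        early_stopping=True,
    )

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```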

For detailed usage instructions, see our [GitHub repository](https://github.com/asd765973346/xlm-swcm).

## Supported Languages

Primary focus on Chinese minority languages:

* Tibetan (bo)
* Uyghur (ug)
* Kazakh (kk)
* Mongolian (mn)
* Chinese (zh)

## Citation

```bibtex
@article{swcm,
  author = {Zeli Su and Ziyin Zhang and Guixian Xu and Jianing Liu and XU Han and Ting Zhang and Yushuang Dong},
  title  = {Multilingual Encoder Knows more than You Realize: Shared Weights Pretraining for Extremely Low-Resource Languages},
  year   = {2025},
  url    = {http://dx.doi.org/10.13140/RG.2.2.11262.09285},
}
```