# XLM-SWCM: Multilingual Encoder with Shared Weights Pretraining

## Overview

XLM-SWCM (Cross-lingual Language Model with Shared Weights Cross-lingual Modeling) is a sequence-to-sequence model designed for extremely low-resource languages. The framework introduces a weight-sharing mechanism between the encoder and decoder, enabling effective transfer of knowledge from a pretrained multilingual encoder to text generation tasks.

## Key Innovations

* **Shared Weight Framework**: Strategic reuse of encoder weights in the decoder layers
* **Hybrid Decoder Architecture**: Combines:
  * Standard transformer decoder layers
  * Custom decoder layers with a dual FFN structure
  * An insertion pattern of 1 normal layer per 3 custom layers (sketched below)
* **Efficient Adaptation**: Enables effective text generation with minimal training data

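As a quick illustration of the insertion pattern above, the sketch below lays out which layer type occupies each decoder position under a 1 normal : 3 custom ratio. The layer class names follow the architecture table in the next section; the total depth of 12 layers is an assumption made only for this example.

```python
# One normal (randomly initialized) layer after every three custom
# (weight-shared) layers; a 12-layer depth is assumed for illustration.
def decoder_layer_plan(num_layers: int = 12) -> list[str]:
    plan = []
    for i in range(num_layers):
        is_normal = (i + 1) % 4 == 0   # positions 4, 8, 12, ...
        plan.append("NormalDecoderLayer" if is_normal else "CustomDecoderLayer")
    return plan

print(decoder_layer_plan())
# ['CustomDecoderLayer', 'CustomDecoderLayer', 'CustomDecoderLayer',
#  'NormalDecoderLayer', 'CustomDecoderLayer', ...]
```
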
## Model Architecture

| Component      | Description                                                           |
| -------------- | --------------------------------------------------------------------- |
| **Encoder**    | XLM-RoBERTa base (CINO v2 variant)                                     |
| **Decoder**    | Hybrid transformer with:                                               |
|                | • NormalDecoderLayer: randomly initialized standard layers             |
|                | • CustomDecoderLayer: weight-shared layers with a dual FFN structure   |
| **Parameters** | 492M total                                                             |

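For a concrete picture of the table's CustomDecoderLayer, the sketch below shows one plausible way a decoder layer could reuse a pretrained encoder layer's modules while adding a second feed-forward network. The module names, dimensions, and residual wiring are assumptions made for illustration, not the released implementation.

```python
import copy
import torch
import torch.nn as nn

class CustomDecoderLayer(nn.Module):
    """Hedged sketch of a weight-shared decoder layer with a dual FFN structure:
    self-attention and the first FFN are copied from a pretrained encoder layer,
    while cross-attention and the second FFN are newly initialized."""

    def __init__(self, shared_self_attn: nn.MultiheadAttention,
                 shared_ffn: nn.Module, d_model: int = 768, n_heads: int = 12):
        super().__init__()
        self.self_attn = copy.deepcopy(shared_self_attn)  # reused encoder weights
        self.ffn1 = copy.deepcopy(shared_ffn)             # reused encoder weights
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ffn2 = nn.Sequential(                        # the second, new FFN
            nn.Linear(d_model, 4 * d_model), nn.GELU(), nn.Linear(4 * d_model, d_model))
        self.norm = nn.LayerNorm(d_model)

    def forward(self, x, memory):
        x = x + self.self_attn(x, x, x, need_weights=False)[0]
        x = x + self.cross_attn(x, memory, memory, need_weights=False)[0]
        x = x + self.ffn1(x)
        x = x + self.ffn2(x)
        return self.norm(x)

# Tiny smoke test with stand-in "pretrained" encoder modules.
d = 768
enc_attn = nn.MultiheadAttention(d, 12, batch_first=True)
enc_ffn = nn.Sequential(nn.Linear(d, 4 * d), nn.GELU(), nn.Linear(4 * d, d))
layer = CustomDecoderLayer(enc_attn, enc_ffn)
out = layer(torch.randn(2, 5, d), torch.randn(2, 7, d))   # (target states, encoder memory)
```
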
### Advanced Features

* Beam search decoding (see the example below)
* Mixed-precision training
* Cross-lingual transfer learning

For detailed usage instructions, see our [GitHub repository](https://github.com/asd765973346/xlm-swcm).

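The GitHub repository above is the authoritative reference for usage; the snippet below is only a hedged sketch of what beam-search generation might look like if the checkpoint is published as a Hugging Face seq2seq model. The repository id is hypothetical, and the custom decoder may require `trust_remote_code=True`.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "KEVVVV/xlm-swcm"  # hypothetical checkpoint name; replace as needed

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id, trust_remote_code=True)

text = "Example source text in one of the supported languages."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, num_beams=4, max_new_tokens=128)  # beam search decoding
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
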
## Supported Languages

Primary focus on Chinese minority languages:

* Tibetan (bo)
* Uyghur (ug)
* Kazakh (kk)
* Mongolian (mn)
* Chinese (zh)

## Citation

```bibtex
@article{swcm,
  author = {Zeli Su and Ziyin Zhang and Guixian Xu and Jianing Liu and Xu Han and Ting Zhang and Yushuang Dong},
  title  = {Multilingual Encoder Knows more than You Realize: Shared Weights Pretraining for Extremely Low-Resource Languages},
  year   = {2025},
  url    = {http://dx.doi.org/10.13140/RG.2.2.11262.09285},
}
```