---
license: apache-2.0
datasets:
- pkupie/mc2_corpus
language:
- bo
- ug
- kk
- mn
- zh
base_model:
- hfl/cino-base-v2
---

# XLM-SWCM: Multilingual Encoder with Shared Weights Pretraining

## Overview

XLM-SWCM is a sequence-to-sequence model designed for extremely low-resource languages. It introduces a weight-sharing mechanism between encoder and decoder, enabling effective knowledge transfer from a pretrained multilingual encoder to text generation tasks.
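
The core idea is reusing pretrained encoder weights to initialize decoder modules. A minimal generic sketch; the module names and shapes below are illustrative, not the released code:

```python
import torch.nn as nn

encoder_ffn = nn.Linear(768, 3072)  # stands in for a pretrained encoder sub-module
decoder_ffn = nn.Linear(768, 3072)  # matching decoder slot, randomly initialized

# Copy (not tie) the encoder weights into the decoder slot, so the decoder
# starts from the encoder's multilingual knowledge instead of random init.
decoder_ffn.load_state_dict(encoder_ffn.state_dict())
```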

## Key Innovations

* **Shared Weight Framework**: Strategic weight reuse between encoder and decoder layers
* **Hybrid Decoder Architecture**, combining (see the sketch after this list):
  * Standard transformer decoder layers
  * Custom decoder layers with a dual FFN structure
  * An optimized insertion pattern: one normal layer for every three custom layers
* **Efficient Adaptation**: Enables effective text generation with minimal training data
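
To make the hybrid decoder concrete, here is a minimal PyTorch sketch of a custom decoder layer with a dual FFN structure. It is an illustration under assumptions, not the authors' implementation: dimensions follow XLM-R base, the self-attention and first FFN are the slots that would be initialized from the corresponding encoder layer, and the cross-attention and second FFN are new decoder-specific modules.

```python
import torch
import torch.nn as nn

D_MODEL, N_HEADS, D_FF = 768, 12, 3072  # XLM-R / CINO base dimensions

class CustomDecoderLayer(nn.Module):
    """Sketch of a weight-shared decoder layer with a dual FFN structure."""

    def __init__(self):
        super().__init__()
        # These two would receive the encoder layer's pretrained weights;
        # randomly initialized here to keep the sketch self-contained.
        self.self_attn = nn.MultiheadAttention(D_MODEL, N_HEADS, batch_first=True)
        self.ffn1 = self._ffn()
        # Decoder-specific additions: cross-attention and a second FFN.
        self.cross_attn = nn.MultiheadAttention(D_MODEL, N_HEADS, batch_first=True)
        self.ffn2 = self._ffn()
        self.norms = nn.ModuleList(nn.LayerNorm(D_MODEL) for _ in range(4))

    @staticmethod
    def _ffn():
        return nn.Sequential(
            nn.Linear(D_MODEL, D_FF), nn.GELU(), nn.Linear(D_FF, D_MODEL)
        )

    def forward(self, x, memory, tgt_mask=None):
        # Pre-norm residual blocks; the exact block order is an assumption.
        h = self.norms[0](x)
        x = x + self.self_attn(h, h, h, attn_mask=tgt_mask, need_weights=False)[0]
        x = x + self.ffn1(self.norms[1](x))
        h = self.norms[2](x)
        x = x + self.cross_attn(h, memory, memory, need_weights=False)[0]
        x = x + self.ffn2(self.norms[3](x))
        return x
```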

## Model Architecture

| Component      | Description |
| -------------- | ----------- |
| **Encoder**    | XLM-RoBERTa base (CINO v2 variant) |
| **Decoder**    | Hybrid transformer with:<br>• `NormalDecoderLayer`: randomly initialized standard layers<br>• `CustomDecoderLayer`: weight-shared layers with a dual FFN structure |
| **Parameters** | 492M total |
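
Continuing the sketch above (reusing `CustomDecoderLayer`, `D_MODEL`, `N_HEADS`, and `D_FF`), the one-normal-per-three-custom insertion pattern could be assembled as follows. Here `nn.TransformerDecoderLayer` stands in for `NormalDecoderLayer`, and the layer count is an assumption, not the released configuration:

```python
import torch.nn as nn

def build_decoder_stack(num_custom: int = 9) -> nn.ModuleList:
    """Insert one randomly initialized standard layer after every
    three weight-shared custom layers (the 1:3 pattern above)."""
    layers = nn.ModuleList()
    for i in range(num_custom):
        layers.append(CustomDecoderLayer())
        if (i + 1) % 3 == 0:
            layers.append(nn.TransformerDecoderLayer(
                d_model=D_MODEL, nhead=N_HEADS, dim_feedforward=D_FF,
                batch_first=True, norm_first=True))
    return layers

decoder = build_decoder_stack()
print(len(decoder))  # 12 layers: 9 custom + 3 normal
```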

### Advanced Features

* Beam search decoding (usage sketch below)
* Mixed-precision training
* Cross-lingual transfer learning
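
As a usage sketch, beam search decoding might look like the following with the Hugging Face `transformers` API. The model path is a placeholder and loading via `AutoModelForSeq2SeqLM` is an assumption; see the GitHub repository linked below for the actual checkpoint and any custom modeling code it requires.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

MODEL_PATH = "path/to/xlm-swcm"  # placeholder, not a published repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
model = AutoModelForSeq2SeqLM.from_pretrained(
    MODEL_PATH,
    torch_dtype=torch.float16,  # mixed precision, assuming GPU support
)

inputs = tokenizer("Example source text.", return_tensors="pt")
outputs = model.generate(
    **inputs,
    num_beams=5,        # beam search decoding
    max_new_tokens=128,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```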

For detailed usage instructions, see our [GitHub repository](https://github.com/asd765973346/xlm-swcm).

## Supported Languages

Primary focus on Chinese minority languages:

* Tibetan (bo)
* Uyghur (ug)
* Kazakh (kk)
* Mongolian (mn)
* Chinese (zh)

## Citation

```bibtex
@article{su2025multilingualencoderknowsrealize,
  author     = {Zeli Su and Ziyin Zhang and Guixian Xu and Jianing Liu and Xu Han and Ting Zhang and Yushuang Dong},
  title      = {Multilingual Encoder Knows more than You Realize: Shared Weights Pretraining for Extremely Low-Resource Languages},
  journal    = {CoRR},
  volume     = {abs/2502.10852},
  year       = {2025},
  url        = {https://doi.org/10.48550/arXiv.2502.10852},
  doi        = {10.48550/ARXIV.2502.10852},
  eprinttype = {arXiv},
  eprint     = {2502.10852}
}
```