---
license: mit
language:
- vi
pipeline_tag: text-generation
---

# TheSyx-V1-7B-Base

## Introduction

TheSyx-V1-7B-Base is the first LLM released by [thehosy](https://huggingface.co/thehosy).

**Features**:

- Type: Causal Language Model
- Training Stage: Pretraining
- Architecture: Qwen3 MoE
- Number of Parameters: 7.52B
- Number of Layers: 28
- Number of Attention Heads (GQA): 24 for Q and 4 for KV
- Context Length: 16,384 tokens (full context), with generation up to 8,192 tokens

**We do not recommend using base language models for conversations.** Instead, apply post-training, e.g., SFT, RLHF, or continued pretraining, to this model.

## Requirements

The code for TheSyx-V1-7B-Base is included in the latest Hugging Face `transformers`, and we advise you to use the latest version of `transformers`. A minimal loading sketch is shown in the Quickstart section below.
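## Quickstart

The following is a minimal sketch of loading the model for plain text completion, assuming the checkpoint works with the standard `transformers` causal-LM API; the prompt and sampling settings are illustrative only.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "thehosy/TheSyx-V1-7B-Base"

# Load the tokenizer and model; device_map="auto" places weights on available devices.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Base models are completion models: feed plain text, not a chat template.
prompt = "Trí tuệ nhân tạo là"  # illustrative Vietnamese prompt ("Artificial intelligence is")
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=128,  # well under the 8,192-token generation limit noted above
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Because this is a base (pretrained-only) checkpoint, sampled continuations can drift; for chat-style behavior, apply post-training as noted in the Introduction.

## Citation

...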