
TheSyx-V1-7B-Base

Introduction

TheSyx-V1-7B-Base is the first LLM released by thehosy.

Features:

  • Type: Causal Language Model
  • Training Stage: Pretraining
  • Architecture: Qwen3 MoE
  • Number of Parameters: 7.52B
  • Number of Layers: 28
  • Number of Attention Heads (GQA): 24 for Q and 4 for KV
  • Context Length: 16,384 tokens, with generation up to 8,192 tokens
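
You can verify these architecture details from the model config. A minimal sketch, assuming the repository id `thehosy/TheSyx-V1-7B-Base` (inferred from the model name; the actual id may differ):

```python
from transformers import AutoConfig

# Hypothetical repo id inferred from the model name; the repo is gated,
# so you may need to log in with an access token first.
config = AutoConfig.from_pretrained("thehosy/TheSyx-V1-7B-Base")

print(config.num_hidden_layers)        # expected: 28
print(config.num_attention_heads)      # expected: 24 (query heads)
print(config.num_key_value_heads)      # expected: 4 (KV heads, GQA)
print(config.max_position_embeddings)  # context length
```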

We do not recommend using base language models for conversations. Instead, apply post-training (e.g., SFT, RLHF, or continued pretraining) to this model first.

Requirements

The code for TheSyx-V1-7B-Base is included in the latest Hugging Face transformers, and we advise you to use the latest version of transformers.
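
As a quick start, here is a minimal loading sketch with transformers. The repository id is an assumption based on the model name, and since this is a base model the example does plain text continuation rather than chat:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "thehosy/TheSyx-V1-7B-Base"  # hypothetical repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # checkpoint is stored in BF16
    device_map="auto",
)

# Base models continue text; they are not instruction-tuned for dialogue.
inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```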

Citation

...
