jinliuxi/deepmini-it
Safetensors
Chinese
English
deepseek_v3
deepseek
Mixture of Experts
instruction-tuning
bilingual
reasoning
code
math
arxiv: 2405.04434
License: apache-2.0
a022d31
deepmini-it/generation_config.json
jinliuxi: Upload DeepseekV3ForCausalLM (749267d, verified), about 1 month ago
132 Bytes
{
  "_from_model_config": true,
  "bos_token_id": 0,
  "eos_token_id": 1,
  "pad_token_id": 1,
  "transformers_version": "4.48.3"
}
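The special-token ids in this generation_config.json can be inspected with a few lines of Python. The sketch below is only an illustration: it embeds the file contents as a string so it runs offline, and the helper name `summarize_special_tokens` is hypothetical, not part of any library. It shows that the padding id and the end-of-sequence id are the same value (1) in this config.

```python
import json

# Contents of generation_config.json, embedded as a string so the example
# runs without network access or extra dependencies.
GENERATION_CONFIG_JSON = """
{
  "_from_model_config": true,
  "bos_token_id": 0,
  "eos_token_id": 1,
  "pad_token_id": 1,
  "transformers_version": "4.48.3"
}
"""


def summarize_special_tokens(raw: str) -> dict:
    """Parse the config text and report its special-token ids.

    Hypothetical helper written for this example only.
    """
    cfg = json.loads(raw)
    return {
        "bos": cfg["bos_token_id"],
        "eos": cfg["eos_token_id"],
        "pad": cfg["pad_token_id"],
        # True when padding reuses the end-of-sequence id.
        "pad_is_eos": cfg["pad_token_id"] == cfg["eos_token_id"],
    }


summary = summarize_special_tokens(GENERATION_CONFIG_JSON)
print(summary)
```

Because padding and end-of-sequence share an id here, code that batches prompts would typically need an explicit attention mask to tell padding apart from a real end-of-sequence token.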