base_model:
- Qwen/Qwen3-4B
tags:
- text-generation-inference
- transformers
- unsloth
- qwen3
license: other
license_name: anvdl-1.0
license_link: https://huggingface.co/apexion-ai/Nous-V1-8B/blob/main/LICENSE.md
language:
- en
- fr
- pt
- de
- ro
- sv
- da
- bg
- ru
- cs
- el
- uk
- es
- nl
- sk
- hr
- pl
- lt
- nb
- nn
- fa
- sl
- gu
- lv
- it
- oc
- ne
- mr
- be
- sr
- lb
- vec
- as
- cy
- szl
- ast
- hne
- awa
- mai
- bho
- sd
- ga
- fo
- hi
- pa
- bn
- or
- tg
- yi
- lmo
- lij
- scn
- fur
- sc
- gl
- ca
- is
- sq
- li
- prs
- af
- mk
- si
- ur
- mag
- bs
- hy
- zh
- yue
- my
- ar
- he
- mt
- id
- ms
- tl
- ceb
- jv
- su
- min
- ban
- pag
- ilo
- war
- ta
- te
- kn
- ml
- tr
- az
- uz
- kk
- ba
- tt
- th
- lo
- fi
- et
- hu
- vi
- km
- ja
- ko
- ka
- eu
- ht
- pap
- kea
- tpi
- sw
Nous-V1 4B
Overview
Nous-V1 4B is a 4-billion-parameter language model developed by Apexion AI and based on the Qwen3-4B architecture. Designed for versatility across diverse NLP tasks, it targets conversational AI, knowledge reasoning, code generation, and content creation.
Key Features:
- Efficient 4B Parameter Scale: Balances model capability with practical deployment on modern hardware
- Enhanced Contextual Understanding: Supports a 128k token context window, enabling complex multi-turn conversations and document analysis
- Multilingual & Multi-domain: Trained on a diverse dataset for broad language and domain coverage
- Instruction-Following & Adaptability: Fine-tuned to respond accurately and adaptively across tasks
- Optimized Inference: Suitable for GPU environments such as NVIDIA A100, T4, and P100 for low-latency applications
Why Choose Nous-V1 4B?
While larger models can offer more raw power, Nous-V1 4B strikes a practical balance: it is optimized for deployment efficiency without significant compromise on language understanding or generation quality. It's ideal for applications requiring:
- Real-time conversational agents
- Code completion and programming assistance
- Content generation and summarization
- Multilingual natural language understanding
How to Run Locally
You can easily integrate Nous-V1 4B via the Hugging Face Transformers library or deploy it on popular serving platforms.
Using Hugging Face Transformers
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "apexion-ai/Nous-1-4B"

# load the tokenizer and the model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)

# prepare the model input
prompt = "Give me a short introduction to large language model."
messages = [
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True  # Switches between thinking and non-thinking modes. Default is True.
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# conduct text completion
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=32768
)
# keep only the newly generated tokens (drop the prompt)
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()

# parsing thinking content
try:
    # rindex finding 151668 (</think>)
    index = len(output_ids) - output_ids[::-1].index(151668)
except ValueError:
    # no </think> token found: treat the whole output as the final answer
    index = 0

thinking_content = tokenizer.decode(output_ids[:index], skip_special_tokens=True).strip("\n")
content = tokenizer.decode(output_ids[index:], skip_special_tokens=True).strip("\n")

print("thinking content:", thinking_content)
print("content:", content)
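The thinking/answer split in the example can be tried without loading the model. The sketch below isolates that step in a small function and runs it on toy token ids. The helper name `split_thinking` is hypothetical (it is not part of Transformers), and the id 151668 is taken from the example, which treats it as the closing `</think>` token.

```python
# Token id that the example above treats as the closing </think> marker.
END_THINK_ID = 151668


def split_thinking(output_ids, end_think_id=END_THINK_ID):
    """Split generated ids into (thinking_ids, answer_ids).

    Hypothetical helper: splits just after the LAST occurrence of
    end_think_id, mirroring the reversed-list search in the example.
    """
    try:
        # Index just past the last end-of-thinking token.
        index = len(output_ids) - output_ids[::-1].index(end_think_id)
    except ValueError:
        # Marker absent (e.g. thinking disabled): everything is the answer.
        index = 0
    return output_ids[:index], output_ids[index:]


# Toy ids standing in for real model output.
thinking_ids, answer_ids = split_thinking([11, 12, END_THINK_ID, 21, 22])
print("thinking ids:", thinking_ids)
print("answer ids:", answer_ids)
```

The thinking slice still contains the marker id itself. In the full example it disappears on decoding because `skip_special_tokens=True` is passed to `tokenizer.decode`.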
Deployment Options
Recommended Sampling Parameters
Temperature: 0.7
Top-p: 0.9
Top-k: 40
Min-p: 0.0
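One possible way to apply these values with Transformers is to collect them as keyword arguments for `generate()`. This is a minimal sketch, not an official recipe: the function name `recommended_sampling_kwargs` and the `max_new_tokens` default are illustrative assumptions, `do_sample=True` is added because sampling parameters are otherwise ignored, and `min_p` is assumed to be available, which requires a reasonably recent Transformers release.

```python
def recommended_sampling_kwargs(max_new_tokens=1024):
    """Return the recommended sampling settings as generate() keyword arguments.

    The function name and the max_new_tokens default are illustrative
    assumptions; only the four sampling values come from the model card.
    """
    return {
        "do_sample": True,  # enable sampling so the values below take effect
        "temperature": 0.7,
        "top_p": 0.9,
        "top_k": 40,
        "min_p": 0.0,
        "max_new_tokens": max_new_tokens,
    }


gen_kwargs = recommended_sampling_kwargs()
print(gen_kwargs)

# With `model` and `model_inputs` from the Transformers example:
# generated_ids = model.generate(**model_inputs, **gen_kwargs)
```

Passing the settings as a dictionary keeps them in one place, so the same values can be reused across calls or adapted for a serving stack that accepts similarly named parameters.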
FAQ
Q: Can I fine-tune Nous-V1 4B on my custom data?
A: Yes, the model supports fine-tuning workflows via Hugging Face Trainer or custom scripts.
Q: What hardware is recommended?
A: NVIDIA GPUs with at least 16 GB VRAM (e.g., A100, 3090) are optimal for inference and fine-tuning.
Q: Is the model safe to use for production?
A: Nous-V1 4B includes safety mitigations but should be used with human oversight and proper filtering for sensitive content.
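As a rough sanity check on the hardware answer, the memory needed for the weights alone can be estimated as parameter count times bytes per parameter. The sketch below assumes a nominal 4 billion parameters (taken from the model name, so the exact count is an assumption) and ignores the KV cache, activations, and any optimizer state used during fine-tuning, all of which add to the total.

```python
# Bytes used to store one parameter in each numeric format.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2, "int8": 1}

# Nominal "4B" from the model name; the exact parameter count is an assumption.
NUM_PARAMS = 4_000_000_000


def weight_memory_gib(num_params, dtype):
    """Estimate memory for model weights only, in GiB (hypothetical helper)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


for dtype in ("float32", "bfloat16", "int8"):
    print(f"{dtype}: {weight_memory_gib(NUM_PARAMS, dtype):.1f} GiB for weights")
```

Under these assumptions, 16-bit weights take roughly half of a 16 GB card, leaving the remainder for the KV cache and activations. Long contexts and fine-tuning need considerably more headroom than this weights-only figure suggests.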
Citation
@misc{apexion2025nousv14b,
title={Nous-V1 4B: Efficient Large Language Model for Versatile NLP Applications},
author={Apexion AI Team},
year={2025},
url={https://huggingface.co/apexion-ai/Nous-V1-4B}
}
Nous-V1 4B: Powering practical AI applications with intelligent language understanding.