---
language:
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- shining-valiant
- shining-valiant-3
- valiant
- valiant-labs
- qwen
- qwen-3
- qwen-3-4b
- 4b
- reasoning
- code
- code-reasoning
- science
- science-reasoning
- physics
- biology
- chemistry
- earth-science
- astronomy
- machine-learning
- artificial-intelligence
- compsci
- computer-science
- information-theory
- ML-Ops
- math
- cuda
- deep-learning
- transformers
- agentic
- LLM
- neuromorphic
- self-improvement
- complex-systems
- cognition
- linguistics
- philosophy
- logic
- epistemology
- simulation
- game-theory
- knowledge-management
- creativity
- problem-solving
- architect
- engineer
- developer
- creative
- analytical
- expert
- rationality
- conversational
- chat
- instruct
base_model: Qwen/Qwen3-4B
datasets:
- sequelbox/Celestia3-DeepSeek-R1-0528
- sequelbox/Mitakihara-DeepSeek-R1-0528
- sequelbox/Raiden-DeepSeek-R1
license: apache-2.0
---
**[Support our open-source dataset and model releases!](https://huggingface.co/spaces/sequelbox/SupportOpenSource)**

Shining Valiant 3: [Qwen3-1.7B](https://huggingface.co/ValiantLabs/Qwen3-1.7B-ShiningValiant3), [Qwen3-4B](https://huggingface.co/ValiantLabs/Qwen3-4B-ShiningValiant3), [Qwen3-8B](https://huggingface.co/ValiantLabs/Qwen3-8B-ShiningValiant3)

Shining Valiant 3 is a science, AI design, and general reasoning specialist built on Qwen 3.
- Finetuned on our newest [science reasoning](https://huggingface.co/datasets/sequelbox/Celestia3-DeepSeek-R1-0528) data generated with [DeepSeek R1 0528!](https://huggingface.co/deepseek-ai/DeepSeek-R1-0528)
- AI to build AI: our [high-difficulty AI reasoning](https://huggingface.co/datasets/sequelbox/Mitakihara-DeepSeek-R1-0528) data makes Shining Valiant 3 your friend for building with current AI tech and discovering new innovations and improvements!
- Improved [general and creative reasoning](https://huggingface.co/datasets/sequelbox/Raiden-DeepSeek-R1) to supplement problem-solving and general chat performance.
- Small model sizes allow running on local desktop and mobile, plus super-fast server inference!

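
As a rough illustration of why these sizes suit local hardware, the sketch below estimates the memory needed for the weights alone. The parameter counts are the nominal figures from the model names (1.7B, 4B, 8B), and the bytes-per-parameter values are assumptions for common precisions (2 bytes for bf16, about 0.5 bytes for 4-bit quantization). Activations, the KV cache, and runtime overhead are not included, so real usage will be higher.

```python
# Back-of-the-envelope weight memory estimate.
# Assumption: nominal parameter counts taken from the model names.
NOMINAL_PARAMS = {
    "Qwen3-1.7B": 1.7e9,
    "Qwen3-4B": 4.0e9,
    "Qwen3-8B": 8.0e9,
}

# Assumption: approximate storage cost per parameter for common precisions.
BYTES_PER_PARAM = {
    "bf16": 2.0,
    "int8": 1.0,
    "int4": 0.5,
}


def weight_memory_gb(num_params: float, precision: str) -> float:
    """Approximate size of the weights in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


if __name__ == "__main__":
    for name, params in NOMINAL_PARAMS.items():
        sizes = ", ".join(
            f"{p}: ~{weight_memory_gb(params, p):.1f} GB" for p in BYTES_PER_PARAM
        )
        print(f"{name} -> {sizes}")
```

Treat these numbers as a lower bound when choosing hardware, since long reasoning traces add noticeably to the KV cache.
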
## Prompting Guide

Shining Valiant 3 uses the [Qwen 3](https://huggingface.co/Qwen/Qwen3-4B) prompt format.

Shining Valiant 3 is a reasoning finetune; **we recommend `enable_thinking=True` for all chats.**

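
For readers who want to see the prompt format as plain text, here is a minimal sketch of the ChatML-style layout used by Qwen 3 models. The helper `build_chatml_prompt` is hypothetical and written only for illustration; in practice `tokenizer.apply_chat_template` should be treated as the source of truth, since the real template also handles system prompts, tool calls, and the thinking switch.

```python
# Illustrative only: a hand-rolled approximation of the ChatML-style layout.
# build_chatml_prompt is a hypothetical helper, not a Transformers API.
IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def build_chatml_prompt(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts as a ChatML-style string."""
    parts = []
    for message in messages:
        parts.append(f"{IM_START}{message['role']}\n{message['content']}{IM_END}\n")
    if add_generation_prompt:
        # Open the assistant turn so the model continues from here.
        parts.append(f"{IM_START}assistant\n")
    return "".join(parts)


if __name__ == "__main__":
    print(build_chatml_prompt([{"role": "user", "content": "What is entropy?"}]))
```

This is useful mainly for debugging what the model actually sees; for real inference, prefer the tokenizer's own chat template.
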
Example inference script to get started:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "ValiantLabs/Qwen3-4B-ShiningValiant3"

# load the tokenizer and the model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)

# prepare the model input
prompt = "Propose a novel cognitive architecture where the primary memory component is a Graph Neural Network (GNN). How would this GNN represent working, declarative, and procedural memory? How would the \"cognitive cycle\" be implemented as operations on this graph?"
messages = [
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True  # Switches between thinking and non-thinking modes. Default is True.
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# conduct text completion
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=32768
)
# keep only the newly generated tokens (drop the prompt tokens)
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()

# parsing thinking content
try:
    # rindex finding 151668 (</think>)
    index = len(output_ids) - output_ids[::-1].index(151668)
except ValueError:
    # no </think> token found: treat the whole output as the final answer
    index = 0

thinking_content = tokenizer.decode(output_ids[:index], skip_special_tokens=True).strip("\n")
content = tokenizer.decode(output_ids[index:], skip_special_tokens=True).strip("\n")

print("thinking content:", thinking_content)
print("content:", content)
```
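
The token-ID search in the script can also be expressed on decoded text. The following is a minimal sketch, assuming the decoded output still contains the literal `</think>` marker (for example, raw text returned by a serving stack). `split_thinking` is a hypothetical helper name, not part of Transformers, and unlike the script it drops the `<think>` and `</think>` markers from the returned reasoning.

```python
THINK_START = "<think>"
THINK_END = "</think>"


def split_thinking(decoded: str, marker: str = THINK_END):
    """Split decoded model output into (thinking, answer).

    Uses the last occurrence of the marker, mirroring the reverse search
    in the token-ID version. If the marker is absent, the whole text is
    treated as the answer.
    """
    before, sep, after = decoded.rpartition(marker)
    if not sep:
        return "", decoded.strip("\n")
    # Remove the opening marker, if present, and surrounding newlines.
    thinking = before.replace(THINK_START, "", 1).strip("\n")
    return thinking, after.strip("\n")


if __name__ == "__main__":
    sample = "<think>\nCompare both options.\n</think>\n\nOption B is better."
    thinking, answer = split_thinking(sample)
    print("thinking content:", thinking)
    print("content:", answer)
```

Splitting on text is convenient when only strings are available, while the token-ID approach in the script avoids any dependence on how the marker is decoded.
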

Shining Valiant 3 is created by [Valiant Labs.](http://valiantlabs.ca/)

[Check out our HuggingFace page to see all of our models!](https://huggingface.co/ValiantLabs)

We care about open source. For everyone to use.