ThreatFlux 0.6B YARA - The Best Open-Source YARA Rule Generation Model
State-of-the-art open-source YARA generation achieving 60% compile rate (99% for simple rules) - outperforming models 50x larger
Model Overview
ThreatFlux 0.6B YARA is the industry-leading open-source model for YARA rule generation. This v21 release represents a breakthrough in automated threat detection rule creation, developed by Wyatt Roersma at ThreatFlux.
Key Achievements
- Industry-best 60% overall compile rate for generated YARA rules
- 99% compile rate for simple rules - virtually perfect for common use cases
- Outperforms models 50x larger including Qwen3 30B MoE
- Sub-second response times (<1s) for rule generation
- Best-in-class open-source YARA model available today
- Optimized for production cybersecurity workflows
- Available in FP16 GGUF and Ollama formats for easy deployment
Training Details
- Base Model: Qwen/Qwen3-0.6B
- Fine-tuning Method: LoRA (Low-Rank Adaptation)
- Training Data: 557 SFT samples (~3 million tokens) of curated YARA rules
- Hardware: Single NVIDIA RTX 3090
- Training Duration: ~3 hours
- Optimization: Unsloth, Flash Attention 2, BF16 precision
Training Configuration
The model was fine-tuned using the v21 configuration with these optimized settings:
# Model Configuration
model_name_or_path: Qwen/Qwen3-0.6B
finetuning_type: lora
lora_target: all
lora_rank: 64
lora_alpha: 128
# Training Hyperparameters
per_device_train_batch_size: 2
gradient_accumulation_steps: 16 # Effective batch size: 32
num_train_epochs: 10.0
learning_rate: 2.0e-4
lr_scheduler_type: cosine
warmup_ratio: 0.03
cutoff_len: 40960
# Optimizations
use_unsloth: true
flash_attn: fa2
gradient_checkpointing: true
bf16: true
optim: paged_adamw_8bit
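As a rough sanity check on the settings above, the sketch below works out the effective batch size, the approximate number of optimizer steps, and the average sample length. It assumes all 557 samples are used for training (no validation split) on the single GPU listed under Hardware, and that partial batches are rounded up. The trainer's exact rounding may differ by a step or two.

```python
import math

# Values taken from the training details and configuration above.
num_samples = 557
approx_total_tokens = 3_000_000   # "~3 million tokens", so only approximate
per_device_batch = 2
grad_accum_steps = 16
num_epochs = 10
warmup_ratio = 0.03
cutoff_len = 40960
num_gpus = 1                      # single RTX 3090

# Assumption: no held-out split, partial batches rounded up.
effective_batch = per_device_batch * grad_accum_steps * num_gpus
steps_per_epoch = math.ceil(num_samples / effective_batch)
total_steps = steps_per_epoch * num_epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)
avg_tokens_per_sample = approx_total_tokens / num_samples

print(f"effective batch size : {effective_batch}")
print(f"steps per epoch      : {steps_per_epoch}")
print(f"total optimizer steps: {total_steps}")
print(f"warmup steps         : {warmup_steps}")
print(f"avg tokens per sample: {avg_tokens_per_sample:.0f} (cutoff_len {cutoff_len})")
```

With only a few hundred optimizer steps in total, the warmup phase covers just a handful of updates, and the average sample is far shorter than the 40,960-token cutoff.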
Model Specifications
Based on Qwen3-0.6B with the following architecture:
- Type: Causal Language Model (LoRA fine-tuned)
- Number of Parameters: 0.6B
- Number of Parameters (Non-Embedding): 0.44B
- Number of Layers: 28
- Number of Attention Heads (GQA): 16 for Q and 8 for KV
- Context Length: 32,768 (extended to 40,960 during training)
Available Files
This repository contains:
- threatflux-0.6B-fp16.gguf: FP16 GGUF model file (1.2GB)
- Modelfile: Ollama configuration with optimized parameters
- README.md: This documentation
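The 1.2GB file size is consistent with the parameter count: FP16 stores each weight in two bytes. The helper below is a back-of-the-envelope sketch. The function name is illustrative, and it ignores GGUF metadata and the rounding in the nominal 0.6B figure.

```python
def estimate_weight_size_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate on-disk size of the weights alone, in decimal gigabytes."""
    bytes_total = num_params * bits_per_weight / 8
    return bytes_total / 1e9


# 0.6B parameters at 16 bits per weight (FP16).
fp16_size = estimate_weight_size_gb(0.6e9, 16)
print(f"Estimated FP16 size: {fp16_size:.1f} GB")
```

The same helper gives a rough idea of what a lower-bit quantization would occupy, although this repository ships only the FP16 file.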
Quick Start
Using with Transformers
from transformers import AutoModelForCausalLM, AutoTokenizer
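model_name = "vtriple/threatflux-0.6B-gguf"
gguf_file = "threatflux-0.6B-fp16.gguf"  # the repository ships GGUF weights only
# Load the tokenizer and model from the GGUF file (requires the `gguf` package)
tokenizer = AutoTokenizer.from_pretrained(model_name, gguf_file=gguf_file)
model = AutoModelForCausalLM.from_pretrained(
model_name,
gguf_file=gguf_file,
torch_dtype="auto",
device_map="auto"
)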
# Generate a YARA rule
prompt = "Create a YARA rule to detect a ransomware payload with encryption routines"
messages = [
{"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
messages,
tokenize=False,
add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
# Generate
generated_ids = model.generate(
**model_inputs,
max_new_tokens=2048,
do_sample=True,  # sampling must be enabled for temperature/top_p to take effect
temperature=0.7,
top_p=0.9
)
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):]
response = tokenizer.decode(output_ids, skip_special_tokens=True)
print("Generated YARA Rule:")
print(response)
Using with Ollama
The model includes an optimized Ollama configuration file with the recommended parameters:
# Pull the model directly from HuggingFace
ollama pull vtriple/threatflux-0.6B
# Generate a YARA rule
ollama run vtriple/threatflux-0.6B "Create a YARA rule for detecting suspicious PowerShell commands"
The Ollama configuration uses optimized parameters:
- Temperature: 0.4
- Top-K: 40
- Top-P: 0.9
- Context: 16384 tokens
- Max output: 2048 tokens
- Repetition penalty: 1.05
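If you call Ollama over its local REST API instead of the CLI, the same parameters can be passed explicitly in the request's `options` field. The sketch below uses only the standard library. It assumes Ollama is listening on its default local port and that the model was pulled under the tag shown above. The helper names are illustrative.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"

# Recommended parameters from the list above, using Ollama's option names.
RECOMMENDED_OPTIONS = {
    "temperature": 0.4,
    "top_k": 40,
    "top_p": 0.9,
    "num_ctx": 16384,
    "num_predict": 2048,
    "repeat_penalty": 1.05,
}


def build_request(prompt: str, model: str = "vtriple/threatflux-0.6B") -> dict:
    """Build the JSON body for a non-streaming generate request."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": dict(RECOMMENDED_OPTIONS),  # copy so callers can tweak safely
    }


def generate(prompt: str, url: str = OLLAMA_URL) -> str:
    """Send the request to a running Ollama server and return the generated text."""
    body = json.dumps(build_request(prompt)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]


# Inspect the payload without contacting a server.
print(json.dumps(build_request("Create a YARA rule for detecting suspicious PowerShell commands"), indent=2))
```

Calling `generate(...)` requires a running Ollama instance. `build_request(...)` can be used on its own to check what would be sent.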
Using the GGUF File
The repository includes an FP16 GGUF file optimized for fast inference:
# Direct download
wget https://huggingface.co/vtriple/threatflux-0.6B-gguf/resolve/main/threatflux-0.6B-fp16.gguf
# Using with llama.cpp
./main -m threatflux-0.6B-fp16.gguf \
--temp 0.4 --top-k 40 --top-p 0.9 \
--repeat-penalty 1.05 -n 2048 -c 16384 \
-p "Generate a YARA rule for detecting base64 encoded payloads"
# Using with llama-cpp-python
from llama_cpp import Llama
llm = Llama(
model_path="threatflux-0.6B-fp16.gguf",
n_ctx=16384,
n_threads=8
)
response = llm(
"Create a YARA rule for detecting malicious macros",
max_tokens=2048,
temperature=0.4,
top_k=40,
top_p=0.9,
repeat_penalty=1.05
)
print(response['choices'][0]['text'])
Use Cases
This model excels at:
- YARA rule generation from natural language descriptions
- Rule optimization and refinement
- Threat pattern translation into YARA syntax
- Quick prototyping of detection rules
- Bulk rule generation for threat hunting
- Educational purposes for learning YARA syntax
Example Outputs
Example: Ransomware Detection Rule
rule Ransomware_Encryption_Routine {
meta:
description = "Detects potential ransomware encryption routines"
author = "Generated by ThreatFlux"
date = "2025-01-01"
strings:
$crypto1 = "CryptEncrypt" ascii wide
$crypto2 = "AES_encrypt" ascii
$crypto3 = "RSA" ascii
$extension = /\.[a-z0-9]{5,10}$/
$ransom_note = "Your files have been encrypted" nocase
$bitcoin = /[13][a-km-zA-HJ-NP-Z1-9]{25,34}/ ascii
condition:
uint16(0) == 0x5A4D and
(2 of ($crypto*) or $ransom_note or ($extension and $bitcoin)) and
filesize < 5MB
}
Example: Webshell Detection
rule PHP_Webshell_Generic {
meta:
description = "Detects common PHP webshell patterns"
author = "Generated by ThreatFlux"
strings:
$php = "<?php" ascii
$eval = "eval(" nocase
$base64 = "base64_decode" nocase
$system = "system(" nocase
$exec = "exec(" nocase
$shell = "shell_exec" nocase
$passthru = "passthru(" nocase
condition:
$php and (
(#eval > 2) or
($base64 and any of ($system, $exec, $shell, $passthru))
) and filesize < 100KB
}
Example: Cryptominer Detection
rule Cryptominer_XMRig {
meta:
description = "Detects XMRig cryptominer variants"
author = "Generated by ThreatFlux"
strings:
$pool1 = "pool.minexmr.com" ascii
$pool2 = "xmrpool.eu" ascii
$wallet = /4[0-9AB][0-9a-zA-Z]{93}/ ascii
$algo = "randomx" ascii nocase
$cpu = "cpu-priority" ascii
$donate = "donate-level" ascii
condition:
uint16(0) == 0x5A4D and
(any of ($pool*) or $wallet) and
2 of ($algo, $cpu, $donate)
}
Best Practices
Optimal Sampling Parameters
These parameters have been extensively tested to produce the highest quality YARA rules:
- Temperature: 0.4 (optimized for accuracy)
- Top-P: 0.9
- Top-K: 40
- Max Tokens: 2048 (num_predict)
- Context Length: 16384 (num_ctx)
- Repetition Penalty: 1.05
Prompting Guidelines
Be Specific: Include threat type, target platform, and detection objectives
"Create a YARA rule to detect Windows ransomware that uses AES encryption and creates .locked file extensions"
Request Structure: Specify if you need specific sections
"Generate a YARA rule with comprehensive meta information, string patterns, and conditions for detecting..."
Context Matters: Provide behavioral patterns or IOCs when available
"Create a YARA rule based on these IOCs: [list of hashes, strings, behaviors]"
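The guidelines above can be applied consistently with a small prompt builder. This is a minimal sketch: the function name, its parameters, and the exact wording of the assembled prompt are illustrative choices, not a format the model is known to require.

```python
from typing import Sequence


def build_prompt(
    threat: str,
    platform: str = "",
    behaviors: Sequence[str] = (),
    iocs: Sequence[str] = (),
    sections: Sequence[str] = (),
) -> str:
    """Assemble a specific, structured prompt for YARA rule generation."""
    # Be specific: threat type and target platform come first.
    subject = f"{platform} {threat}".strip()
    prompt = f"Create a YARA rule to detect {subject}"
    if behaviors:
        prompt += " that " + " and ".join(behaviors)
    prompt += "."
    # Request structure: name the sections the rule should contain.
    if sections:
        prompt += " Include these sections: " + ", ".join(sections) + "."
    # Context matters: append any known indicators.
    if iocs:
        prompt += " Base it on these IOCs: " + "; ".join(iocs) + "."
    return prompt


example = build_prompt(
    "ransomware",
    platform="Windows",
    behaviors=["uses AES encryption", "creates .locked file extensions"],
)
print(example)
```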
Post-Processing Workflow
Validation: Always compile generated rules with yarac
yarac generated_rule.yar compiled_rule.yarc
Testing: Test against known samples
yara generated_rule.yar /path/to/samples/
Optimization: Refine for performance and accuracy
- Remove redundant conditions
- Optimize regex patterns
- Adjust file size constraints
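For batches of generated rules, the validation step can be automated so that only compiling rules move on to testing. The sketch below splits a batch into passing and failing rules and reports the compile rate. The default checker assumes the `yara-python` package is installed. The function names are illustrative, and any other checker (for example, one that shells out to `yarac`) can be passed in instead.

```python
from typing import Callable, List, Optional, Tuple


def compile_with_yara(rule_text: str) -> Optional[str]:
    """Return None if the rule compiles, otherwise the compiler's error message."""
    import yara  # yara-python; imported lazily so this module loads without it

    try:
        yara.compile(source=rule_text)
        return None
    except yara.Error as exc:  # base class that also covers yara.SyntaxError
        return str(exc)


def triage(
    rules: List[str],
    check: Callable[[str], Optional[str]] = compile_with_yara,
) -> Tuple[List[str], List[Tuple[str, str]], float]:
    """Split rules into (passed, failed-with-error) and compute the compile rate."""
    passed: List[str] = []
    failed: List[Tuple[str, str]] = []
    for rule in rules:
        error = check(rule)
        if error is None:
            passed.append(rule)
        else:
            failed.append((rule, error))
    rate = len(passed) / len(rules) if rules else 0.0
    return passed, failed, rate
```

Failed rules keep their error message, which makes it easier to fix them by hand or to re-prompt the model.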
Integration Examples
Python Integration
import yara
from transformers import pipeline
# Generate rule (return_full_text=False keeps the prompt out of the output)
generator = pipeline("text-generation", model="vtriple/threatflux-0.6B-gguf")
rule_text = generator(
"Create a YARA rule for detecting malicious PowerShell",
max_new_tokens=1024,
return_full_text=False
)[0]['generated_text']
# Compile and use
rules = yara.compile(source=rule_text)
matches = rules.match(filepath="/path/to/file")
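Model output may wrap the rule in explanatory text or code fences, which would make compilation fail. The sketch below pulls out rule bodies by matching a rule header at the start of a line and then counting braces. It is a heuristic under stated assumptions: braces inside the rule are balanced (as they are in hex strings and regex quantifiers), and the function name is illustrative.

```python
import re
from typing import List

# A rule header at the start of a line: optional modifiers, name, optional tags, "{".
RULE_HEADER = re.compile(
    r"^[ \t]*((?:(?:private|global)[ \t]+)*rule[ \t]+\w+[ \t]*"
    r"(?::[ \t]*\w+(?:[ \t]+\w+)*)?\s*\{)",
    re.MULTILINE,
)


def extract_rules(text: str) -> List[str]:
    """Return every brace-balanced YARA rule found in the text."""
    rules: List[str] = []
    pos = 0
    while True:
        match = RULE_HEADER.search(text, pos)
        if match is None:
            break
        depth = 0
        end = None
        # Start counting at the opening brace that ends the header.
        for index in range(match.end(1) - 1, len(text)):
            if text[index] == "{":
                depth += 1
            elif text[index] == "}":
                depth -= 1
                if depth == 0:
                    end = index + 1
                    break
        if end is None:
            break  # unbalanced braces, e.g. truncated output
        rules.append(text[match.start(1):end])
        pos = end
    return rules


sample = (
    "Here is the rule:\n"
    "rule Demo {\n"
    "strings:\n"
    "$mz = { 4D 5A }\n"
    "$re = /ab{2,3}/\n"
    "condition:\n"
    "$mz at 0 and $re\n"
    "}\n"
    "Hope it helps."
)
print(extract_rules(sample)[0])
```

A string literal containing an unmatched brace would defeat the brace counting, so the extracted text should still be compiled before use.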
CI/CD Pipeline Integration
# .github/workflows/yara-generation.yml
- name: Generate YARA Rules
  run: |
    python generate_rules.py --model vtriple/threatflux-0.6B-gguf
    yarac output/*.yar compiled_rules.yarc
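The workflow step calls a `generate_rules.py` script that is not included in the repository. The sketch below shows one hypothetical shape for it: the `--prompts` and `--out-dir` options, the default paths, and the file naming are all assumptions, and the actual generation call is passed in as a function so that any of the backends shown earlier can be used.

```python
import argparse
from pathlib import Path
from typing import Callable, Iterable, List, Optional, Sequence


def write_rules(
    prompts: Iterable[str],
    generate: Callable[[str], str],
    out_dir: Path,
) -> List[Path]:
    """Generate one rule per prompt and write each to its own .yar file."""
    out_dir.mkdir(parents=True, exist_ok=True)
    paths: List[Path] = []
    for index, prompt in enumerate(prompts):
        path = out_dir / f"rule_{index:03d}.yar"
        path.write_text(generate(prompt), encoding="utf-8")
        paths.append(path)
    return paths


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    """Parse the command line; only --model appears in the workflow above."""
    parser = argparse.ArgumentParser(description="Generate YARA rules from prompts")
    parser.add_argument("--model", required=True, help="model identifier to load")
    parser.add_argument("--prompts", type=Path, default=Path("prompts.txt"),
                        help="text file with one prompt per line (hypothetical)")
    parser.add_argument("--out-dir", type=Path, default=Path("output"),
                        help="directory for generated .yar files")
    return parser.parse_args(argv)
```

Writing one rule per file under `output/` matches the `output/*.yar` glob that the workflow passes to `yarac`.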
Limitations
- Compile Rate: At a 60% overall compile rate, roughly 4 in 10 generated rules fail to compile, so validate every rule and review it manually before use
- Complex Logic: Multi-condition rules with complex boolean logic may need refinement
- Module Support: Limited support for YARA modules (PE, ELF, etc.)
- Performance: Generated rules should be optimized for production scanning
Citation
If you use this model in your research or applications, please cite:
@misc{threatflux2025yara,
title={ThreatFlux 0.6B v21: State-of-the-Art Open-Source YARA Rule Generation},
author={Roersma, Wyatt},
organization={ThreatFlux},
year={2025},
howpublished={\url{https://huggingface.co/vtriple/threatflux-0.6B-gguf}},
note={Best performing open-source YARA generation model with 60% compile rate}
}
License
This model inherits the Apache 2.0 license from the base Qwen3-0.6B model. See LICENSE for details.
Acknowledgments
- The Qwen team for the excellent base model
- The cybersecurity community for YARA rule datasets
- LLaMA-Factory for the training framework
- Unsloth for optimization techniques
Contact
For questions, feedback, or collaboration:
- Organization: ThreatFlux
- Developer: Wyatt Roersma
- Model Repository: https://huggingface.co/vtriple/threatflux-0.6B-gguf
- Issues & Feedback: Open an issue on the HuggingFace repository