ThreatFlux 0.6B YARA - The Best Open-Source YARA Rule Generation Model

🏆 State-of-the-art open-source YARA generation, achieving a 60% compile rate (99% for simple rules) and outperforming models 50x larger

Model Overview

ThreatFlux 0.6B YARA is the industry-leading open-source model for YARA rule generation. This v21 release represents a breakthrough in automated threat detection rule creation, developed by Wyatt Roersma at ThreatFlux.

Key Achievements

  • Industry-best 60% overall compile rate for generated YARA rules
  • 99% compile rate for simple rules - virtually perfect for common use cases
  • Outperforms models 50x larger, including Qwen3 30B MoE
  • Sub-second response times for rule generation
  • Best-in-class open-source YARA model available today
  • Optimized for production cybersecurity workflows
  • Available in FP16 GGUF and Ollama formats for easy deployment

Training Details

  • Base Model: Qwen/Qwen3-0.6B
  • Fine-tuning Method: LoRA (Low-Rank Adaptation)
  • Training Data: 557 SFT samples (~3 million tokens) of curated YARA rules
  • Hardware: Single NVIDIA RTX 3090
  • Training Duration: ~3 hours
  • Optimization: Unsloth, Flash Attention 2, BF16 precision

Training Configuration

The model was fine-tuned using the v21 configuration with these optimized settings:

# Model Configuration
model_name_or_path: Qwen/Qwen3-0.6B
finetuning_type: lora
lora_target: all
lora_rank: 64
lora_alpha: 128

# Training Hyperparameters
per_device_train_batch_size: 2
gradient_accumulation_steps: 16  # Effective batch size: 32
num_train_epochs: 10.0
learning_rate: 2.0e-4
lr_scheduler_type: cosine
warmup_ratio: 0.03
cutoff_len: 40960

# Optimizations
use_unsloth: true
flash_attn: fa2
gradient_checkpointing: true
bf16: true
optim: paged_adamw_8bit
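
The arithmetic implied by these settings can be checked with a short sketch. It assumes a single GPU (as listed under Training Details), that all 557 samples are used for training, and that partial batches and warmup steps are rounded up. Exact step counts depend on how the trainer rounds, so treat the numbers as approximate.

```python
import math

# Values copied from the configuration and Training Details above
samples = 557
per_device_batch = 2
grad_accum = 16
epochs = 10
warmup_ratio = 0.03
lora_rank, lora_alpha = 64, 128

effective_batch = per_device_batch * grad_accum         # single GPU assumed
steps_per_epoch = math.ceil(samples / effective_batch)  # assumes the last partial batch is kept
total_steps = steps_per_epoch * epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)    # assumes warmup is rounded up
lora_scale = lora_alpha / lora_rank                     # standard LoRA scaling factor alpha / r

print(effective_batch, steps_per_epoch, total_steps, warmup_steps, lora_scale)
```

With so few optimizer steps per epoch, the 3% warmup covers only a handful of updates before the cosine decay begins.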

Model Specifications

Based on Qwen3-0.6B with the following architecture:

  • Type: Causal Language Model (LoRA fine-tuned)
  • Number of Parameters: 0.6B
  • Number of Parameters (Non-Embedding): 0.44B
  • Number of Layers: 28
  • Number of Attention Heads (GQA): 16 for Q and 8 for KV
  • Context Length: 32,768 (extended to 40,960 during training)

Available Files

This repository contains:

  • threatflux-0.6B-fp16.gguf - FP16 GGUF model file (1.2GB)
  • Modelfile - Ollama configuration with optimized parameters
  • README.md - This documentation
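
The stated file size follows from the parameter count. A rough check, assuming two bytes per weight for FP16 and ignoring GGUF metadata overhead:

```python
params = 0.6e9        # parameter count from Model Specifications
bytes_per_param = 2   # FP16 stores each weight in 16 bits
size_gb = params * bytes_per_param / 1e9

print(f"~{size_gb:.1f} GB")
```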

Quick Start

Using with Transformers

from transformers import AutoModelForCausalLM, AutoTokenizer

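model_name = "vtriple/threatflux-0.6B-gguf"
gguf_file = "threatflux-0.6B-fp16.gguf"  # the repository ships weights only as GGUF

# Load the tokenizer and model (GGUF loading requires the `gguf` package
# and a Transformers version with Qwen3 GGUF support)
tokenizer = AutoTokenizer.from_pretrained(model_name, gguf_file=gguf_file)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    gguf_file=gguf_file,
    torch_dtype="auto",
    device_map="auto"
)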

# Generate a YARA rule
prompt = "Create a YARA rule to detect a ransomware payload with encryption routines"
messages = [
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Generate (sampling must be enabled for temperature and top_p to take effect)
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=2048,
    do_sample=True,
    temperature=0.7,
    top_p=0.9
)
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):]
response = tokenizer.decode(output_ids, skip_special_tokens=True)

print("Generated YARA Rule:")
print(response)

Using with Ollama

The model includes an optimized Ollama configuration file with the recommended parameters:

# Pull the model directly from HuggingFace
ollama pull vtriple/threatflux-0.6B

# Generate a YARA rule
ollama run vtriple/threatflux-0.6B "Create a YARA rule for detecting suspicious PowerShell commands"

The Ollama configuration uses optimized parameters:

  • Temperature: 0.4
  • Top-K: 40
  • Top-P: 0.9
  • Context: 16384 tokens
  • Max output: 2048 tokens
  • Repetition penalty: 1.05
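
The same parameters can also be passed explicitly when calling a local Ollama server from Python. The sketch below uses only the standard library and Ollama's `/api/generate` endpoint. The helper names `build_payload` and `generate` are illustrative, and the host assumes Ollama's default port on the local machine.

```python
import json
import urllib.request

# Sampling options mirroring the list above, using Ollama's option names
OPTIONS = {
    "temperature": 0.4,
    "top_k": 40,
    "top_p": 0.9,
    "num_ctx": 16384,
    "num_predict": 2048,
    "repeat_penalty": 1.05,
}

def build_payload(prompt, model="vtriple/threatflux-0.6B"):
    """Build the JSON body for Ollama's /api/generate endpoint."""
    return {"model": model, "prompt": prompt, "stream": False, "options": OPTIONS}

def generate(prompt, host="http://localhost:11434"):
    """Send the request to a locally running Ollama server (default port assumed)."""
    req = urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Calling `generate("Create a YARA rule for detecting suspicious PowerShell commands")` returns the generated text. Since the bundled Modelfile already sets these values, passing them again is redundant but makes the request self-documenting.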

Using the GGUF File

The repository includes an FP16 GGUF file optimized for fast inference:

# Direct download
wget https://huggingface.co/vtriple/threatflux-0.6B-gguf/resolve/main/threatflux-0.6B-fp16.gguf

# Using with llama.cpp (recent builds name this binary llama-cli instead of main)
./main -m threatflux-0.6B-fp16.gguf \
  --temp 0.4 --top-k 40 --top-p 0.9 \
  --repeat-penalty 1.05 -n 2048 -c 16384 \
  -p "Generate a YARA rule for detecting base64 encoded payloads"

# Using with llama-cpp-python
from llama_cpp import Llama

llm = Llama(
    model_path="threatflux-0.6B-fp16.gguf",
    n_ctx=16384,
    n_threads=8
)
response = llm(
    "Create a YARA rule for detecting malicious macros",
    max_tokens=2048,
    temperature=0.4,
    top_k=40,
    top_p=0.9,
    repeat_penalty=1.05
)
print(response['choices'][0]['text'])

Use Cases

This model excels at:

  • YARA rule generation from natural language descriptions
  • Rule optimization and refinement
  • Threat pattern translation into YARA syntax
  • Quick prototyping of detection rules
  • Bulk rule generation for threat hunting
  • Educational purposes for learning YARA syntax

Example Outputs

Example: Ransomware Detection Rule

rule Ransomware_Encryption_Routine {
    meta:
        description = "Detects potential ransomware encryption routines"
        author = "Generated by ThreatFlux"
        date = "2025-01-01"
        
    strings:
        $crypto1 = "CryptEncrypt" ascii wide
        $crypto2 = "AES_encrypt" ascii
        $crypto3 = "RSA" ascii
        $extension = /\.[a-z0-9]{5,10}$/ 
        $ransom_note = "Your files have been encrypted" nocase
        $bitcoin = /[13][a-km-zA-HJ-NP-Z1-9]{25,34}/ ascii
        
    condition:
        uint16(0) == 0x5A4D and
        (2 of ($crypto*) or $ransom_note or ($bitcoin and $extension)) and
        filesize < 5MB
}

Example: Webshell Detection

rule PHP_Webshell_Generic {
    meta:
        description = "Detects common PHP webshell patterns"
        author = "Generated by ThreatFlux"
        
    strings:
        $php = "<?php" ascii
        $eval = "eval(" nocase
        $base64 = "base64_decode" nocase
        $system = "system(" nocase
        $exec = "exec(" nocase
        $shell = "shell_exec" nocase
        $passthru = "passthru(" nocase
        
    condition:
        $php and (
            (#eval > 2) or
            ($base64 and any of ($system, $exec, $shell, $passthru))
        ) and filesize < 100KB
}

Example: Cryptominer Detection

rule Cryptominer_XMRig {
    meta:
        description = "Detects XMRig cryptominer variants"
        author = "Generated by ThreatFlux"
        
    strings:
        $pool1 = "pool.minexmr.com" ascii
        $pool2 = "xmrpool.eu" ascii
        $wallet = /4[0-9AB][0-9a-zA-Z]{93}/ ascii
        $algo = "randomx" ascii nocase
        $cpu = "cpu-priority" ascii
        $donate = "donate-level" ascii
        
    condition:
        uint16(0) == 0x5A4D and
        (any of ($pool*) or $wallet) and
        2 of ($algo, $cpu, $donate)
}

Best Practices

Optimal Sampling Parameters

These parameters have been extensively tested to produce the highest quality YARA rules:

  • Temperature: 0.4 (optimized for accuracy)
  • Top-P: 0.9
  • Top-K: 40
  • Max Tokens: 2048 (num_predict)
  • Context Length: 16384 (num_ctx)
  • Repetition Penalty: 1.05

Prompting Guidelines

  1. Be Specific: Include threat type, target platform, and detection objectives

    "Create a YARA rule to detect Windows ransomware that uses AES encryption and creates .locked file extensions"
    
  2. Request Structure: Specify if you need specific sections

    "Generate a YARA rule with comprehensive meta information, string patterns, and conditions for detecting..."
    
  3. Context Matters: Provide behavioral patterns or IOCs when available

    "Create a YARA rule based on these IOCs: [list of hashes, strings, behaviors]"
    

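The three guidelines can be combined in a small prompt builder. This is a minimal sketch: `build_prompt` and its parameters are hypothetical helpers rather than part of the model or any library, and the wording of the assembled prompt is only one reasonable choice.

```python
def build_prompt(threat, platform=None, sections=None, iocs=None):
    """Assemble a prompt from threat type, platform, sections, and IOCs."""
    target = f"{platform} {threat}" if platform else threat
    parts = [f"Create a YARA rule to detect {target}."]
    if sections:
        parts.append("Include these sections: " + ", ".join(sections) + ".")
    if iocs:
        parts.append("Base the rule on these IOCs: " + "; ".join(iocs) + ".")
    return " ".join(parts)

prompt = build_prompt(
    "ransomware that uses AES encryption",
    platform="Windows",
    sections=["meta", "strings", "condition"],
    iocs=["creates .locked file extensions"],
)
print(prompt)
```
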
Post-Processing Workflow

  1. Validation: Always compile generated rules with yarac

    yarac generated_rule.yar compiled_rule.yarc
    
  2. Testing: Test against known samples

    yara generated_rule.yar /path/to/samples/
    
  3. Optimization: Refine for performance and accuracy

    • Remove redundant conditions
    • Optimize regex patterns
    • Adjust file size constraints
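
The validation step can also be scripted. Model output often wraps the rule in extra prose, so the sketch below first pulls out each `rule Name { ... }` block and then checks it with yara-python. `extract_rules` and `compiles` are hypothetical helpers. The brace counting is a heuristic that would be confused by an unbalanced brace inside a quoted string.

```python
import re

# Matches the start of a rule, with optional tags: `rule Name : tag1 tag2 {`
RULE_START = re.compile(r"\brule\s+\w+\s*(?::[\w\s]+)?\{")

def extract_rules(text):
    """Return each complete top-level rule block found in model output."""
    rules, pos = [], 0
    while (m := RULE_START.search(text, pos)):
        depth, end = 0, None
        # Walk forward from the opening brace until it is closed again
        for i in range(m.end() - 1, len(text)):
            if text[i] == "{":
                depth += 1
            elif text[i] == "}":
                depth -= 1
                if depth == 0:
                    end = i + 1
                    break
        if end is None:  # unbalanced braces: output was probably truncated
            break
        rules.append(text[m.start():end])
        pos = end
    return rules

def compiles(rule_text):
    """Check a rule with yara-python (imported lazily so extraction works without it)."""
    import yara
    try:
        yara.compile(source=rule_text)
        return True
    except yara.SyntaxError:
        return False
```

A typical use is `valid = [r for r in extract_rules(response) if compiles(r)]`, keeping only the rules that pass compilation before they are tested against samples.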

Integration Examples

Python Integration

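import yara
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

repo = "vtriple/threatflux-0.6B-gguf"
gguf = "threatflux-0.6B-fp16.gguf"  # the repository ships weights only as GGUF

# Generate rule
tokenizer = AutoTokenizer.from_pretrained(repo, gguf_file=gguf)
model = AutoModelForCausalLM.from_pretrained(repo, gguf_file=gguf)
generator = pipeline("text-generation", model=model, tokenizer=tokenizer)
prompt = "Create a YARA rule for detecting malicious PowerShell"
# return_full_text=False drops the prompt so that only the generated rule is compiled
rule_text = generator(prompt, max_new_tokens=1024, return_full_text=False)[0]["generated_text"]

# Compile and use; generated rules do not always compile, so handle syntax errors
try:
    rules = yara.compile(source=rule_text)
    matches = rules.match(filepath="/path/to/file")
except yara.SyntaxError as err:
    print(f"Generated rule failed to compile: {err}")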

CI/CD Pipeline Integration

# .github/workflows/yara-generation.yml
- name: Generate YARA Rules
  run: |
    python generate_rules.py --model vtriple/threatflux-0.6B-gguf
    yarac output/*.yar compiled_rules.yarc

Limitations

  • Compile Rate: While 60% is a significant improvement, manual review is recommended
  • Complex Logic: Multi-condition rules with complex boolean logic may need refinement
  • Module Support: Limited support for YARA modules (PE, ELF, etc.)
  • Performance: Generated rules should be optimized for production scanning
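
One way to work with the 60% overall compile rate is to regenerate until a rule compiles. The sketch below assumes attempts are independent, which is optimistic because a prompt that fails once is more likely to fail again. Under that assumption the chance that at least one of n attempts compiles is 1 - 0.4^n, or about 94% after three attempts. Both callables passed to `generate_until_valid` are hypothetical hooks supplied by the caller.

```python
def p_any_compiles(compile_rate, attempts):
    """Probability that at least one of `attempts` independent generations compiles."""
    return 1 - (1 - compile_rate) ** attempts

def generate_until_valid(generate, compiles, prompt, max_attempts=3):
    """Call `generate(prompt)` until `compiles(rule)` is true or attempts run out.

    `generate` wraps whichever backend is used and `compiles` wraps a YARA
    compile check. Returns (rule, attempts_used), or (None, max_attempts).
    """
    for attempt in range(1, max_attempts + 1):
        rule = generate(prompt)
        if compiles(rule):
            return rule, attempt
    return None, max_attempts

for n in (1, 2, 3, 5):
    print(n, round(p_any_compiles(0.60, n), 3))
```

A rule that compiles is not necessarily a good detection, so manual review still applies to whatever this loop returns.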

Citation

If you use this model in your research or applications, please cite:

@misc{threatflux2025yara,
  title={ThreatFlux 0.6B v21: State-of-the-Art Open-Source YARA Rule Generation},
  author={Roersma, Wyatt},
  organization={ThreatFlux},
  year={2025},
  howpublished={\url{https://huggingface.co/vtriple/threatflux-0.6B-gguf}},
  note={Best performing open-source YARA generation model with 60% compile rate}
}

License

This model inherits the Apache 2.0 license from the base Qwen3-0.6B model. See LICENSE for details.

Acknowledgments

  • The Qwen team for the excellent base model
  • The cybersecurity community for YARA rule datasets
  • LLaMA-Factory for the training framework
  • Unsloth for optimization techniques

Contact

For questions, feedback, or collaboration:
