LLMSIEM/logem-win

LLMSIEM/logem-win is a language model fine-tuned for Windows Event Log (EVTX) analysis and field extraction. It is built for Windows-centric security operations and SIEM workflows.

Model Details

Model Description

LLMSIEM/logem-win is a domain-specific fine-tune of Qwen3-0.6B, optimized exclusively for parsing Windows XML Event Logs (EVTX format) and extracting structured data from them. It targets the nested XML structures found in Windows Security, System, and Application event logs.

  • Developed by: Hassan Shehata
  • Model type: Causal Language Model (Fine-tuned for Windows EVTX)
  • Language(s): English
  • License: Apache 2.0
  • Finetuned from model: Qwen/Qwen3-0.6B
  • Specialization: Windows XML Event Logs (EVTX)
  • Model size: ~1.2 GB (FP16), ~396 MB (Q4_K_M quantized)
  • Parameters: 0.6B

Model Sources

  • General logem model: LLMSIEM/logem
  • Research Series: [LinkedIn/Blog Series Link]

Performance Highlights

Windows EVTX specialist. Results on the Windows EVTX test set:

  • 69.2% perfect matches (9 of 13 Windows EVTX test cases)
  • 0.830 F1 score - on par with the general logem model (0.833)
  • 0.846 recall - excellent field detection capability
  • 1.34s average response time for complex XML parsing
  • Handles nested XML structures in Windows events

Windows Event Log Coverage

Supported Event Categories

Security Events (Channel: Security)

  • Logon/Logoff Events (4624, 4634, 4647)
  • Account Management (4720, 4722, 4726, 4728)
  • Privilege Use (4672, 4673, 4674)
  • Process and Object Access (4656, 4658, 4688)
  • Policy Changes (4719, 4739)
  • Authentication Events (4768, 4769, 4771)

System Events (Channel: System)

  • Service Control Manager events
  • System startup and shutdown events
  • Driver installation and loading
  • Hardware events and errors
  • Time service synchronization

Application Events (Channel: Application)

  • Application errors and crashes
  • Software installation events
  • MSI installer logs
  • .NET runtime events
  • Custom application event logs

Uses

Direct Use

Ideal for Windows security teams and SIEM engineers who need to:

  • Parse Windows Event Logs (EVTX format)
  • Extract structured fields from Windows security events
  • Automate Windows event log analysis
  • Normalize Windows events for SIEM ingestion
  • Analyze Domain Controller and Active Directory events

Example Use Cases

# Example: Parse Windows Security Event 4624 (Successful Logon)
input_text = """Extract fields from this Windows Security Event:
<Event xmlns='http://schemas.microsoft.com/win/2004/08/events/event'>
  <System>
    <Provider Name='Microsoft-Windows-Security-Auditing'/>
    <EventID>4624</EventID>
    <TimeCreated SystemTime='2024-01-15T10:30:45.123456Z'/>
    <Computer>DC01.contoso.com</Computer>
  </System>
  <EventData>
    <Data Name='SubjectUserName'>DC01$</Data>
    <Data Name='TargetUserName'>john.doe</Data>
    <Data Name='LogonType'>2</Data>
    <Data Name='IpAddress'>192.168.1.100</Data>
  </EventData>
</Event>"""

# Expected model output (structured JSON):
# {
#   "event_id": "4624",
#   "event_type": "successful_logon",
#   "timestamp": "2024-01-15T10:30:45.123456Z",
#   "computer": "DC01.contoso.com",
#   "target_user": "john.doe",
#   "logon_type": "2",
#   "source_ip": "192.168.1.100"
# }

Downstream Use

  • Windows SIEM Integration: Splunk, Microsoft Sentinel, QRadar
  • Active Directory Monitoring: Domain controller event analysis
  • Incident Response: Automated Windows event triage
  • Compliance Reporting: PCI DSS, SOX, HIPAA Windows event parsing
  • Threat Hunting: Windows-specific IOC extraction
  • SOAR Workflows: Windows event enrichment and normalization

Out-of-Scope Use

  • Non-Windows log formats (use LLMSIEM/logem instead)
  • Unix/Linux system logs
  • Network device logs (firewalls, routers)
  • Web server logs
  • General text generation tasks

Model Selection Guide

Use Case                        | Recommended Model
Windows-only environment        | logem-win
Mixed Windows/Linux environment | logem (general) + logem-win
Pure Linux/Unix environment     | logem (general)
Network security focus          | logem (general)
Domain Controller monitoring    | logem-win
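
For mixed environments, the selection guide above could be applied per log entry. The following is a minimal sketch, assuming raw log text is available before prompting; `choose_model` and its prefix/namespace heuristic are hypothetical helpers for illustration, not part of either model's tooling.

```python
# Namespace used by Windows Event XML, as seen in the examples in this card.
WINDOWS_EVENT_NS = "http://schemas.microsoft.com/win/2004/08/events/event"

def choose_model(log_text):
    """Pick a model ID by checking whether the input looks like Windows Event XML."""
    # Only inspect the beginning of the entry; the root element appears first.
    head = log_text.lstrip()[:512]
    looks_like_evtx = head.startswith("<Event") or WINDOWS_EVENT_NS in head
    return "LLMSIEM/logem-win" if looks_like_evtx else "LLMSIEM/logem"
```

A heuristic this simple will misroute Windows events that arrive in other renderings (for example, flattened key-value text), so a production router would likely need stricter checks.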

Bias, Risks, and Limitations

Technical Limitations

  • Windows-specific: Only optimized for Windows EVTX format
  • XML complexity: May struggle with heavily nested or malformed XML
  • Custom events: Performance may vary on non-standard Windows events
  • Processing time: Slower than the general model (1.34s vs. 1.00s average response time), attributed to XML complexity

Security Considerations

  • Windows event expertise required: Users should understand Windows event log structure
  • XML validation needed: Malformed EVTX input may produce unexpected results
  • Context dependency: Some Windows events require additional context for full interpretation

Recommendations

  • Validate XML structure before processing with the model
  • Combine with general logem for comprehensive multi-platform coverage
  • Implement fallbacks for unsupported or malformed EVTX entries
  • Use alongside Windows expertise for production security operations
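
The first and third recommendations can be sketched with the Python standard library. This is an illustrative sketch under assumptions, not the author's pipeline: `parse_event_xml` and `fallback_extract` are hypothetical helper names, and the fallback simply copies the event ID and named `<Data>` values without any model involvement.

```python
import xml.etree.ElementTree as ET

EVENT_NS = "{http://schemas.microsoft.com/win/2004/08/events/event}"

def parse_event_xml(xml_text):
    """Return the root element if xml_text is a well-formed Windows Event, else None."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return None
    # Accept both namespaced and bare <Event> roots.
    if root.tag not in (EVENT_NS + "Event", "Event"):
        return None
    return root

def fallback_extract(root):
    """Deterministic fallback: collect the EventID and named <Data> values."""
    fields = {}
    for elem in root.iter():
        # Strip the namespace prefix, if any, before comparing tag names.
        tag = elem.tag.rsplit("}", 1)[-1]
        if tag == "EventID" and elem.text:
            fields["EventID"] = elem.text.strip()
        elif tag == "Data" and elem.get("Name"):
            fields[elem.get("Name")] = (elem.text or "").strip()
    return fields
```

An entry for which `parse_event_xml` returns `None` can be logged and skipped instead of being sent to the model; a well-formed entry whose model output fails downstream checks can fall back to `fallback_extract`.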

How to Get Started with the Model

Using with Transformers

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load the model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("LLMSIEM/logem-win")
model = AutoModelForCausalLM.from_pretrained("LLMSIEM/logem-win")

# Example: Parse Windows Security Event
prompt = """Extract fields from this Windows Event Log:
<Event xmlns='http://schemas.microsoft.com/win/2004/08/events/event'>
  <EventData>
    <Data Name='SubjectUserName'>alice.smith</Data>
    <Data Name='NewProcessName'>C:\\Windows\\System32\\cmd.exe</Data>
  </EventData>
</Event>

Extract the following fields as JSON:"""

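inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    outputs = model.generate(
        **inputs,  # pass input_ids together with the attention mask
        max_new_tokens=1024,  # limit on generated tokens, excluding the prompt
        do_sample=False,  # greedy decoding; temperature has no effect in this mode
        pad_token_id=tokenizer.eos_token_id
    )

# Decode only the newly generated tokens, not the echoed prompt
prompt_length = inputs["input_ids"].shape[1]
result = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)
print(result)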
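
The decoded text may contain more than the JSON object itself, such as the prompt or surrounding prose. A minimal sketch for pulling out the first JSON object is shown here; `extract_first_json` is a hypothetical helper, and whether the model emits extra text around the JSON is an assumption to verify against real outputs.

```python
import json

def extract_first_json(text):
    """Return the first JSON object found in text, or None if there is none."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start`
            # and ignores anything that follows it.
            obj, _ = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict):
            return obj
        start = text.find("{", start + 1)
    return None
```

A `None` result is a natural trigger for the fallback handling suggested under Recommendations.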

Using with Ollama (Recommended for Production)

# Pull the model
ollama pull LLMSIEM/logem-win

# Process Windows Event Log
ollama run LLMSIEM/logem-win "Extract fields from Windows Event ID 4625 failed logon attempt..."
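
For programmatic use, a locally running Ollama server can also be called over its REST API. The sketch below assumes Ollama's default endpoint (`http://localhost:11434/api/generate`) and that the model is available under the name used in the pull command above; `build_generate_request` and `generate` are hypothetical helper names.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # assumed default local endpoint

def build_generate_request(prompt, model="LLMSIEM/logem-win"):
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

def generate(prompt, timeout=60):
    """Send the prompt and return the generated text from the response body."""
    request = build_generate_request(prompt)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

Calling `generate(...)` requires the server to be running; the request builder is kept separate so it can be inspected or tested without network access.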

Training Details

Training Data

The model was fine-tuned on a comprehensive dataset of Windows Event Logs including:

  • Security Events: Authentication, account management, privilege escalation
  • System Events: Service control, startup/shutdown, driver events
  • Application Events: Application crashes, software installation, custom logs

Training Procedure

  • Base model: Qwen3-0.6B
  • Training regime: Mixed precision (fp16)
  • Specialization focus: Windows XML Event Log parsing
  • Fine-tuning approach: Supervised learning on EVTX-to-JSON extraction

Evaluation

Results

Metric                | Score
Perfect Matches       | 9/13 (69.2%)
Average F1 Score      | 0.830
Average Precision     | 0.817
Average Recall        | 0.846
Average Response Time | 1.34s
Complete Failures     | 2 (complex nested XML)
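
The card does not state how these metrics were computed. One plausible reading, shown here only as an assumption, is field-level scoring per test case (a predicted field counts as correct when both its key and value match the reference), averaged over the 13 cases. `field_scores` is a hypothetical helper.

```python
def field_scores(predicted, expected):
    """Field-level precision, recall, and F1 for one test case (assumed metric)."""
    # A field is correct only if the key exists in the reference with the same value.
    correct = sum(1 for k, v in predicted.items() if k in expected and expected[k] == v)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(expected) if expected else 0.0
    f1 = 2 * precision * recall / (precision + recall) if correct else 0.0
    return precision, recall, f1

def is_perfect_match(predicted, expected):
    """A case is a perfect match when every field agrees and none is missing or extra."""
    return predicted == expected

# The reported perfect-match rate follows directly from the counts in the table.
perfect_match_rate = 9 / 13
```

Under this reading, the average F1 is the mean of per-case F1 values, which need not equal the harmonic mean of the average precision and average recall.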

Comparison with General Model

Model           | Perfect Matches | F1 Score | Speed | Use Case
logem-win       | 69.2%           | 0.830    | 1.34s | Windows EVTX
logem (general) | 66.7%           | 0.833    | 1.00s | Multi-platform

Citation

@misc{llmsiem-logem-win-2025,
  title={LLMSIEM/logem-win: A Windows EVTX Specialized Language Model for Security Log Analysis},
  author={Hassan Shehata},
  year={2025},
  url={https://huggingface.co/LLMSIEM/logem-win},
  note={Fine-tuned from Qwen3-0.6B for Windows Event Log parsing}
}

Model Card Authors

Hassan Shehata / LLMSIEM

Model Card Contact

For questions about this model:


Part of the LLMSIEM model family. For general security log parsing, see LLMSIEM/logem. For comprehensive Windows security operations, deploy both models together.
