
LLMSIEM/logem-win
LLMSIEM/logem-win is a language model fine-tuned for Windows Event Log (EVTX) analysis and field extraction. It is built for Windows-centric security operations and SIEM workflows.
Model Details
Model Description
LLMSIEM/logem-win is a domain-specific fine-tuned version of Qwen3-0.6B, optimized exclusively for parsing and extracting structured data from Windows XML Event Logs (EVTX format). It targets the nested XML structures found in Windows Security, System, and Application event logs.
- Developed by: Hassan Shehata
- Model type: Causal Language Model (Fine-tuned for Windows EVTX)
- Language(s): English
- License: Apache 2.0
- Finetuned from model: Qwen/Qwen3-0.6B
- Specialization: Windows XML Event Logs (EVTX)
- Model size: ~1.2 GB (FP16), ~396 MB (Q4_K_M quantized)
- Parameters: 0.6B
Model Sources
- General logem model: LLMSIEM/logem
- Research Series: [LinkedIn/Blog Series Link]
Performance Highlights
Windows EVTX specialist. Results on the Windows EVTX test cases:
- 69.2% perfect matches (9 of 13 test cases)
- 0.830 average F1 score, on par with the general logem model (0.833)
- 0.846 average recall
- 1.34s average response time
- Handles nested XML structures in Windows events, with 2 complete failures on complex nested XML
Windows Event Log Coverage
Supported Event Categories
Security Events (Channel: Security)
- Logon/Logoff Events (4624, 4634, 4647)
- Account Management (4720, 4722, 4726, 4728)
- Privilege Use (4672, 4673, 4674)
- Process and Object Access (4656, 4658, 4688)
- Policy Changes (4719, 4739)
- Authentication Events (4768, 4769, 4771)
System Events (Channel: System)
- Service Control Manager events
- System startup and shutdown events
- Driver installation and loading
- Hardware events and errors
- Time service synchronization
Application Events (Channel: Application)
- Application errors and crashes
- Software installation events
- MSI installer logs
- .NET runtime events
- Custom application event logs
Uses
Direct Use
Ideal for Windows security teams and SIEM engineers who need to:
- Parse Windows Event Logs (EVTX format)
- Extract structured fields from Windows security events
- Automate Windows event log analysis
- Normalize Windows events for SIEM ingestion
- Analyze Domain Controller and Active Directory events
Example Use Cases
# Example: Parse Windows Security Event 4624 (Successful Logon)
input_text = """Extract fields from this Windows Security Event:
<Event xmlns='http://schemas.microsoft.com/win/2004/08/events/event'>
<System>
<Provider Name='Microsoft-Windows-Security-Auditing'/>
<EventID>4624</EventID>
<TimeCreated SystemTime='2024-01-15T10:30:45.123456Z'/>
<Computer>DC01.contoso.com</Computer>
</System>
<EventData>
<Data Name='SubjectUserName'>DC01$</Data>
<Data Name='TargetUserName'>john.doe</Data>
<Data Name='LogonType'>2</Data>
<Data Name='IpAddress'>192.168.1.100</Data>
</EventData>
</Event>"""
# Model will output structured JSON:
# {
# "event_id": "4624",
# "event_type": "successful_logon",
# "timestamp": "2024-01-15T10:30:45.123456Z",
# "computer": "DC01.contoso.com",
# "target_user": "john.doe",
# "logon_type": "2",
# "source_ip": "192.168.1.100"
# }
Downstream Use
- Windows SIEM Integration: Splunk, Microsoft Sentinel, QRadar
- Active Directory Monitoring: Domain controller event analysis
- Incident Response: Automated Windows event triage
- Compliance Reporting: PCI DSS, SOX, HIPAA Windows event parsing
- Threat Hunting: Windows-specific IOC extraction
- SOAR Workflows: Windows event enrichment and normalization
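For SIEM ingestion, the extracted fields usually need an envelope that the target platform accepts. The following is a minimal sketch, assuming the key names from the example output above and Splunk's HTTP Event Collector JSON envelope (`time`, `host`, `sourcetype`, `event`). The helper `to_hec_event` and the sourcetype value `logem:win` are hypothetical and not part of the model or of Splunk.

```python
from datetime import datetime, timezone


def to_hec_event(fields: dict, sourcetype: str = "logem:win") -> dict:
    """Wrap extracted fields in a Splunk HEC-style event envelope.

    Assumptions: `fields` uses the key names from the example output, and
    `timestamp` is ISO 8601 in UTC with at most 6 fractional digits.
    The default sourcetype is a placeholder chosen for this sketch.
    """
    # Replace the trailing "Z" so fromisoformat also works before Python 3.11.
    ts = datetime.fromisoformat(fields["timestamp"].replace("Z", "+00:00"))
    return {
        "time": ts.timestamp(),  # epoch seconds
        "host": fields.get("computer", ""),
        "sourcetype": sourcetype,
        "event": fields,
    }
```

Other platforms in the list above expect different envelopes, so only the timestamp conversion is likely to carry over unchanged.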
Out-of-Scope Use
- Non-Windows log formats (use LLMSIEM/logem instead)
- Unix/Linux system logs
- Network device logs (firewalls, routers)
- Web server logs
- General text generation tasks
Model Selection Guide
Use Case | Recommended Model |
---|---|
Windows-only environment | logem-win |
Mixed Windows/Linux environment | logem (general) + logem-win |
Pure Linux/Unix environment | logem (general) |
Network security focus | logem (general) |
Domain Controller monitoring | logem-win |
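In a mixed environment, the table above amounts to a per-entry routing rule. A minimal sketch, assuming logs arrive as raw strings and that a Windows event can be recognized by its `<Event` root element (optionally preceded by an XML declaration). `pick_model` is a hypothetical helper, not part of either model's tooling.

```python
import re

# Matches an optional XML declaration followed by an <Event ...> root element.
# "<EventData>" on its own does not match, because the tag name must end there.
_EVENT_ROOT = re.compile(r"^\s*(<\?xml[^>]*\?>\s*)?<Event[\s>]")


def pick_model(raw_log: str) -> str:
    """Return the model name to use for one raw log entry."""
    if _EVENT_ROOT.match(raw_log):
        return "LLMSIEM/logem-win"
    return "LLMSIEM/logem"
```

For example, `pick_model("<Event xmlns='...'>...</Event>")` selects logem-win, while a syslog line falls through to the general model.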
Bias, Risks, and Limitations
Technical Limitations
- Windows-specific: Only optimized for Windows EVTX format
- XML complexity: May struggle with heavily nested or malformed XML
- Custom events: Performance may vary on non-standard Windows events
- Processing time: Slower than the general model (1.34s vs. 1.00s average), attributed to XML complexity
Security Considerations
- Windows event expertise required: Users should understand Windows event log structure
- XML validation needed: Malformed EVTX input may produce unexpected results
- Context dependency: Some Windows events require additional context for full interpretation
Recommendations
- Validate XML structure before processing with the model
- Combine with general logem for comprehensive multi-platform coverage
- Implement fallbacks for unsupported or malformed EVTX entries
- Use alongside Windows expertise for production security operations
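The first and third recommendations can be sketched with the standard library alone. This is an illustration under stated assumptions, not the author's pipeline: `parse_event_xml` and `fallback_fields` are hypothetical helpers, and the fallback returns the raw `Data` names from the event rather than the normalized keys the model produces.

```python
import xml.etree.ElementTree as ET


def _local(tag: str) -> str:
    """Strip the '{namespace}' prefix ElementTree adds to namespaced tags."""
    return tag.rsplit("}", 1)[-1]


def parse_event_xml(raw: str):
    """Return the root element if raw is well-formed XML with an Event root, else None."""
    try:
        root = ET.fromstring(raw)
    except ET.ParseError:
        return None
    return root if _local(root.tag) == "Event" else None


def fallback_fields(root) -> dict:
    """Deterministically collect EventID and named <Data> values.

    Intended for entries where the model output is missing or unusable.
    """
    fields = {}
    for elem in root.iter():
        name = _local(elem.tag)
        if name == "EventID":
            fields["EventID"] = (elem.text or "").strip()
        elif name == "Data" and elem.get("Name"):
            fields[elem.get("Name")] = (elem.text or "").strip()
    return fields
```

Entries for which `parse_event_xml` returns `None` can be skipped or logged instead of being sent to the model. For untrusted input, a hardened parser such as `defusedxml` may be a safer choice than `xml.etree.ElementTree`.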
How to Get Started with the Model
Using with Transformers
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
# Load the model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("LLMSIEM/logem-win")
model = AutoModelForCausalLM.from_pretrained("LLMSIEM/logem-win")
# Example: Parse Windows Security Event
prompt = """Extract fields from this Windows Event Log:
<Event xmlns='http://schemas.microsoft.com/win/2004/08/events/event'>
<EventData>
<Data Name='SubjectUserName'>alice.smith</Data>
<Data Name='NewProcessName'>C:\\Windows\\System32\\cmd.exe</Data>
</EventData>
</Event>
Extract the following fields as JSON:"""
# Tokenize the prompt; the result holds input_ids and attention_mask
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    outputs = model.generate(
        **inputs,  # pass input_ids together with attention_mask
        max_length=1024,  # limit on prompt plus generated tokens
        do_sample=False,  # greedy decoding; temperature is ignored when sampling is off
        pad_token_id=tokenizer.eos_token_id,
    )
# The decoded text contains the prompt followed by the generated fields
result = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(result)
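Because the decoded text contains the prompt as well as the generated fields, the JSON object has to be located before it can be parsed. A minimal sketch using only the standard library; `first_json_object` is a hypothetical helper, and it assumes the model emits a single JSON object somewhere in its output.

```python
import json


def first_json_object(text: str):
    """Return the first parseable JSON object found in text, or None."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and
            # ignores whatever follows it.
            value, _ = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            value = None
        if isinstance(value, dict):
            return value
        start = text.find("{", start + 1)
    return None
```

Calling `first_json_object(result)` on the decoded output returns a dictionary of fields, or `None` when no valid object is present, which is a natural point to trigger a fallback.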
Using with Ollama (Recommended for Production)
# Pull the model
ollama pull LLMSIEM/logem-win
# Process Windows Event Log
ollama run LLMSIEM/logem-win "Extract fields from Windows Event ID 4625 failed logon attempt..."
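A running Ollama server can also be called from Python over its local HTTP API. The sketch below assumes the server listens on Ollama's default address (`http://localhost:11434`) and that the model is available locally under the name used in the pull command above. `build_request` and `extract_with_ollama` are hypothetical helpers, and the prompt wording is borrowed from the Transformers example.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # assumed default local endpoint


def build_request(event_xml: str, model: str = "LLMSIEM/logem-win") -> dict:
    """Build the JSON body for a single non-streaming generate call."""
    return {
        "model": model,
        "prompt": f"Extract fields from this Windows Event Log:\n{event_xml}",
        "stream": False,  # return one JSON response instead of a token stream
        "options": {"temperature": 0},
    }


def extract_with_ollama(event_xml: str, timeout: float = 30.0) -> str:
    """Send one event to the local Ollama server and return the generated text."""
    body = json.dumps(build_request(event_xml)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

Keeping request construction separate from the network call makes the payload easy to inspect and test without a server.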
Training Details
Training Data
The model was fine-tuned on a comprehensive dataset of Windows Event Logs including:
- Security Events: Authentication, account management, privilege escalation
- System Events: Service control, startup/shutdown, driver events
- Application Events: Application crashes, software installation, custom logs
Training Procedure
- Base model: Qwen3-0.6B
- Training regime: Mixed precision (fp16)
- Specialization focus: Windows XML Event Log parsing
- Fine-tuning approach: Supervised learning on EVTX-to-JSON extraction
Evaluation
Results
Metric | Score |
---|---|
Perfect Matches | 9/13 (69.2%) |
Average F1 Score | 0.830 |
Average Precision | 0.817 |
Average Recall | 0.846 |
Average Response Time | 1.34s |
Complete Failures | 2 (complex nested XML) |
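The card does not state how these scores were computed. One common way to score field extraction, shown here as an assumption rather than the author's method, is to compare predicted and expected key/value pairs per test case and average the per-case results. The sketch assumes flat dictionaries with string values; `score_case` and `summarize` are hypothetical names.

```python
def score_case(predicted: dict, expected: dict) -> dict:
    """Score one test case by exact key/value agreement."""
    pred, gold = set(predicted.items()), set(expected.items())
    tp = len(pred & gold)  # fields extracted with exactly the expected value
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "perfect": pred == gold}


def summarize(cases: list) -> dict:
    """Average per-case scores over a list of (predicted, expected) pairs."""
    if not cases:
        raise ValueError("no test cases")
    scores = [score_case(p, e) for p, e in cases]
    n = len(scores)
    return {
        "perfect_matches": sum(s["perfect"] for s in scores),
        "avg_precision": sum(s["precision"] for s in scores) / n,
        "avg_recall": sum(s["recall"] for s in scores) / n,
        "avg_f1": sum(s["f1"] for s in scores) / n,
    }
```

Under per-case averaging, the average F1 need not equal the harmonic mean of the average precision and average recall, which is consistent with the figures in the table.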
Comparison with General Model
Model | Perfect Matches | F1 Score | Speed | Use Case |
---|---|---|---|---|
logem-win | 69.2% | 0.830 | 1.34s | Windows EVTX |
logem (general) | 66.7% | 0.833 | 1.00s | Multi-platform |
Citation
@misc{llmsiem-logem-win-2025,
title={LLMSIEM/logem-win: A Windows EVTX Specialized Language Model for Security Log Analysis},
author={Hassan Shehata},
year={2025},
url={https://huggingface.co/LLMSIEM/logem-win},
note={Fine-tuned from Qwen3-0.6B for Windows Event Log parsing}
}
Model Card Authors
Hassan Shehata / LLMSIEM
Model Card Contact
For questions about this model:
- Email: [[email protected]]
- LinkedIn: https://www.linkedin.com/in/hassan-shehata-503272172/
Part of the LLMSIEM model family. For general security log parsing, see LLMSIEM/logem. For comprehensive Windows security operations, deploy both models together.