Model Card: Thoth-v2.5
Summary
Thoth-v2.5 is a fine-tuned model based on Salesforce/codet5p-220m, specifically designed to extract two key pieces of information from a payload string:
- attack_syntax: The suspicious attack pattern (e.g., <script>, UNION SELECT, etc.)
- attack_type: The type of attack (e.g., SQL Injection, XSS, Command Injection)
The model expects each input to begin with the prefix "analysis:", as shown below:
Input Format
analysis: ```payload```
Output Format
{
"attack_syntax": "...",
"attack_type": "..."
}
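The input and output formats above can be wrapped in two small helpers. This is a minimal sketch, not part of the released model: the names build_input and parse_output are hypothetical, and it assumes the model emits strictly valid JSON with exactly the two keys shown, which the card does not guarantee. Anything that fails to parse is returned as None so that callers can handle it explicitly.

```python
import json
from typing import Optional

PREFIX = "analysis: "   # task prefix required by the model card
FENCE = "`" * 3         # the payload is wrapped in triple backticks

def build_input(payload: str) -> str:
    """Format a raw payload the way the model card's Input Format shows."""
    return f"{PREFIX}{FENCE}{payload}{FENCE}"

def parse_output(text: str) -> Optional[dict]:
    """Parse model output into a dict, or return None if it is malformed.

    Assumption: the output is valid JSON containing both expected keys.
    """
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    if "attack_syntax" not in data or "attack_type" not in data:
        return None
    return {
        "attack_syntax": data["attack_syntax"],
        "attack_type": data["attack_type"],
    }
```

Returning None rather than raising keeps malformed generations from being silently treated as a classification.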
Model Details
- Model Name: Thoth-v2.5
- Base Model: Salesforce/codet5p-220m
- Architecture: T5-style Encoder-Decoder
- Prefix Used: analysis:
- Primary Language: English (based on payloads, not natural language)
Intended Use
This model is intended for research and educational purposes in the domain of payload analysis and attack pattern extraction. It can be used as a preprocessing step in security pipelines or as part of exploratory security tools.
Out-of-Scope Use
- Final Decision-Making in Security Systems: This model should not be used as the sole basis for blocking or mitigating attacks in production environments without additional verification.
- General Natural Language Processing: The model is not trained for tasks involving natural language understanding beyond code and payload patterns.
Example Usage
Input
analysis: ```<script>alert('x')</script>```
Output
{
"attack_syntax": "<script>alert('x')</script>",
"attack_type": "Cross Site Scripting (XSS)"
}
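The example above could be reproduced with the Hugging Face transformers library roughly as follows. This is a sketch under several assumptions: that the checkpoint is published as madcows/Thoth-v2.5 with its tokenizer files, that it loads with T5ForConditionalGeneration like its base model, and that 512 input tokens and 128 generated tokens are adequate limits (none of these values come from the card). The function names load_generator and analyze are hypothetical.

```python
from typing import Callable

MODEL_ID = "madcows/Thoth-v2.5"  # assumed repository id
FENCE = "`" * 3                  # payload delimiter from the Input Format

def load_generator(model_id: str = MODEL_ID,
                   max_new_tokens: int = 128) -> Callable[[str], str]:
    """Load the checkpoint and return a plain text-to-text function."""
    # Imported lazily so analyze() can be used and tested without transformers.
    from transformers import AutoTokenizer, T5ForConditionalGeneration

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = T5ForConditionalGeneration.from_pretrained(model_id)
    model.eval()

    def generate(text: str) -> str:
        # max_length=512 is an assumed input limit, not a documented one.
        inputs = tokenizer(text, return_tensors="pt",
                           truncation=True, max_length=512)
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
        return tokenizer.decode(output_ids[0], skip_special_tokens=True)

    return generate

def analyze(payload: str, generate: Callable[[str], str]) -> str:
    """Prefix and fence the payload, then return the raw model output."""
    return generate(f"analysis: {FENCE}{payload}{FENCE}")

# Usage (downloads the model on first call):
# generate = load_generator()
# print(analyze("<script>alert('x')</script>", generate))
```

Passing the generator in as a parameter keeps the prompt formatting separate from model loading, so the formatting can be checked without downloading weights.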
Training Details
- Training Data: Proprietary dataset curated by Seculayer, containing annotated payloads and attack types.
- Fine-Tuning Base: Salesforce/codet5p-220m
Limitations & Risks
- False Positives/Negatives: The model may misclassify benign strings as attacks or fail to detect obfuscated or novel attack patterns.
- Pattern-Based Only: Thoth-v2.5 relies solely on pattern recognition and does not infer intent or contextual meaning.
- Single-Payload Input: The model operates on isolated payload strings and does not process broader request/response context.
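Given these limitations, one simple form of additional verification is to check that the extracted attack_syntax actually occurs in the payload before passing a finding on for review. The sketch below is illustrative only: Finding and triage are hypothetical names, the substring check is an assumed heuristic rather than anything the model authors prescribe, and a grounded finding is still not evidence of a real attack.

```python
from dataclasses import dataclass

@dataclass
class Finding:
    attack_type: str
    attack_syntax: str
    grounded: bool  # True if attack_syntax appears verbatim in the payload

def triage(payload: str, prediction: dict) -> Finding:
    """Attach a grounding flag to a parsed model prediction.

    Assumption: `prediction` has the two keys from the Output Format.
    Missing keys are treated as empty strings and marked ungrounded.
    """
    syntax = str(prediction.get("attack_syntax", ""))
    attack_type = str(prediction.get("attack_type", ""))
    # An empty or absent pattern can never count as grounded.
    grounded = bool(syntax) and syntax in payload
    return Finding(attack_type, syntax, grounded)
```

Ungrounded findings can be logged or routed to manual review instead of being acted on, in line with the Out-of-Scope Use guidance.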
License & Usage Restrictions
- License: Non-commercial use only.
- Restrictions: This model and its outputs must not be used for commercial purposes, including integration into commercial security solutions, products, or services, without explicit written permission from Seculayer.
Model tree for madcows/Thoth-v2.5
Base model
Salesforce/codet5p-220m