Model Card: Thoth-v2.5

Summary

Thoth-v2.5 is a fine-tuned model based on Salesforce/codet5p-220m, specifically designed to extract two key pieces of information from a payload string:

attack_syntax: The suspicious attack pattern (e.g., <script>, UNION SELECT, etc.)
attack_type: The type of attack (e.g., SQL Injection, XSS, Command Injection)

The model expects each input to start with the prefix analysis: followed by the payload wrapped in triple backticks, as shown below:

Input Format

analysis: ```payload```

Output Format

{
  "attack_syntax": "...",
  "attack_type": "..."
}
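
The two formats above can be wrapped in a pair of small helpers. This is a minimal sketch, not part of the released model: the names `build_input` and `parse_output` are hypothetical, and it assumes the generated text is valid JSON containing both keys, which the model is not guaranteed to produce for every payload.

```python
import json
from typing import Optional

PREFIX = "analysis: "  # task prefix described in this card
FENCE = "`" * 3        # the payload is wrapped in triple backticks


def build_input(payload: str) -> str:
    """Wrap a raw payload in the input format described above."""
    return f"{PREFIX}{FENCE}{payload}{FENCE}"


def parse_output(text: str) -> Optional[dict]:
    """Parse generated text into the two documented fields.

    Returns None when the text is not a JSON object containing both
    keys (assumption: malformed generations are possible).
    """
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    if "attack_syntax" not in data or "attack_type" not in data:
        return None
    return {
        "attack_syntax": str(data["attack_syntax"]),
        "attack_type": str(data["attack_type"]),
    }
```

Returning None instead of raising keeps a malformed generation from interrupting a batch run; callers can count or log those cases separately.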

Model Details

Model Name: Thoth-v2.5
Base Model: Salesforce/codet5p-220m
Architecture: T5-style Encoder-Decoder
Prefix Used: analysis:
Primary Language: English (inputs are attack payloads rather than natural-language text)

Intended Use

This model is intended for research and educational purposes in the domain of payload analysis and attack pattern extraction. It can be used as a preprocessing step in security pipelines or as part of exploratory security tools.

Out-of-Scope Use

Final Decision-Making in Security Systems: This model should not be used as the sole basis for blocking or mitigating attacks in production environments without additional verification.
General Natural Language Processing: The model is not trained for tasks involving natural language understanding beyond code and payload patterns.

Example Usage

Input

analysis: ```<script>alert('x')</script>```

Output

{
  "attack_syntax": "<script>alert('x')</script>",
  "attack_type": "Cross Site Scripting (XSS)"
}
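
A sketch of how the example above might be reproduced with the Hugging Face `transformers` library. Several things here are assumptions rather than documented facts: that the checkpoint is hosted under the repository id `madcows/Thoth-v2.5`, that it loads with `T5ForConditionalGeneration` in the same way as its base model Salesforce/codet5p-220m, and that access to the repository has already been granted. The function name `analyze` and the `max_new_tokens` value are illustrative choices.

```python
def build_input(payload: str) -> str:
    """Apply the 'analysis:' prefix and triple-backtick wrapping."""
    fence = "`" * 3
    return f"analysis: {fence}{payload}{fence}"


def analyze(payload: str,
            model_id: str = "madcows/Thoth-v2.5",  # assumed repository id
            max_new_tokens: int = 128) -> str:     # illustrative limit
    """Generate the raw model output for one payload."""
    # Imported lazily so build_input can be used without transformers installed.
    from transformers import AutoTokenizer, T5ForConditionalGeneration

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = T5ForConditionalGeneration.from_pretrained(model_id)

    inputs = tokenizer(build_input(payload), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    # Downloads the checkpoint on first use; requires network access.
    print(analyze("<script>alert('x')</script>"))
```

Loading the tokenizer and model on every call keeps the sketch short; a real pipeline would load them once and reuse them across payloads.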

Training Details

Training Data: Proprietary dataset curated by Seculayer, containing annotated payloads and attack types.
Fine-Tuning Base: Salesforce/codet5p-220m

Limitations & Risks

False Positives/Negatives: The model may misclassify benign strings as attacks or fail to detect obfuscated or novel attack patterns.
Pattern-Based Only: Thoth-v2.5 relies solely on pattern recognition and does not infer intent or contextual meaning.
Single-Payload Input: The model operates on isolated payload strings and does not process broader request/response context.
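
Given these limitations, one way to treat a prediction as an advisory signal rather than a verdict is to check that the extracted syntax actually occurs in the payload before passing it on for review. The sketch below is a hypothetical heuristic, not part of the model: the function name `review_prediction`, the `grounded` flag, and the action labels are all illustrative, and the substring check assumes the model copies the pattern verbatim, which may not hold if it normalizes the text.

```python
def review_prediction(payload: str, prediction: dict) -> dict:
    """Attach a simple grounding check to a model prediction.

    The result never recommends blocking; it only says whether the
    prediction looks consistent enough to hand to a human or to a
    second detection stage.
    """
    syntax = prediction.get("attack_syntax", "")
    attack_type = prediction.get("attack_type", "")

    # Grounded means the extracted pattern is non-empty and literally
    # present in the payload (assumption: verbatim extraction).
    grounded = bool(syntax) and syntax in payload

    return {
        "attack_syntax": syntax,
        "attack_type": attack_type,
        "grounded": grounded,
        "action": "flag_for_review" if grounded else "discard",
    }
```

For example, a prediction whose `attack_syntax` is `UNION SELECT` for the payload `id=1 UNION SELECT password FROM users` would be marked as grounded and flagged for review, while a pattern that does not appear in the payload would be discarded.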

License & Usage Restrictions

License: Non-commercial use only.
Restrictions: This model and its outputs must not be used for commercial purposes, including integration into commercial security solutions, products, or services, without explicit written permission from Seculayer.
