Model Card: Thoth-v2.5

Summary

Thoth-v2.5 is a fine-tuned model based on Salesforce/codet5p-220m, specifically designed to extract two key pieces of information from a payload string:

  • attack_syntax: The suspicious attack pattern (e.g., <script>, UNION SELECT)
  • attack_type: The type of attack (e.g., SQL Injection, XSS, Command Injection)

Inputs must be prefixed with analysis: and the payload wrapped in triple backticks, as shown below:

Input Format

analysis: ```payload```

Output Format

{
  "attack_syntax": "...",
  "attack_type": "..."
}
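
The input and output formats above can be handled with two small helpers. This is a minimal sketch, not part of the released model: the names build_input and parse_output are hypothetical, and it assumes the model's decoded output is a valid JSON object with exactly the two documented keys.

```python
import json

PREFIX = "analysis: "
FENCE = "`" * 3  # the three backticks that wrap the payload
EXPECTED_KEYS = ("attack_syntax", "attack_type")


def build_input(payload: str) -> str:
    """Wrap a raw payload in the documented prefix and backtick fence."""
    return f"{PREFIX}{FENCE}{payload}{FENCE}"


def parse_output(text: str) -> dict:
    """Parse decoded model output and keep only the two documented fields.

    Raises ValueError if the text is not a JSON object or a field is missing
    (json.JSONDecodeError is a subclass of ValueError).
    """
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = [key for key in EXPECTED_KEYS if key not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return {key: data[key] for key in EXPECTED_KEYS}
```

Raising on malformed output, rather than returning a partial result, keeps downstream code from treating an incomplete answer as a clean classification.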

Model Details

  • Model Name: Thoth-v2.5
  • Base Model: Salesforce/codet5p-220m
  • Architecture: T5-style Encoder-Decoder
  • Prefix Used: analysis:
  • Primary Language: English (inputs are payload strings rather than natural-language text)

Intended Use

This model is intended for research and educational purposes in the domain of payload analysis and attack pattern extraction. It can be used as a preprocessing step in security pipelines or as part of exploratory security tools.

Out-of-Scope Use

  • Final Decision-Making in Security Systems: This model should not be used as the sole basis for blocking or mitigating attacks in production environments without additional verification.
  • General Natural Language Processing: The model is not trained for tasks involving natural language understanding beyond code and payload patterns.
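
One way to add the verification mentioned above is to treat the model's answer as advisory and check it against the payload before anyone acts on it. The sketch below is an illustration only: is_grounded and triage are hypothetical names, the labels "discard" and "needs_review" are assumptions, and the substring check is just one possible sanity test, not a method described by the model authors.

```python
def is_grounded(payload: str, result: dict) -> bool:
    """Return True if the reported attack_syntax actually occurs in the payload."""
    syntax = result.get("attack_syntax")
    if not isinstance(syntax, str) or not syntax:
        return False
    # Case-insensitive match, since payload keywords often vary in case.
    return syntax.lower() in payload.lower()


def triage(payload: str, result: dict) -> str:
    """Map a model result to an advisory label; never to a blocking decision."""
    if not is_grounded(payload, result):
        # The extracted pattern is not present in the input, so ignore the result.
        return "discard"
    # Even a grounded result is only a hint and goes to a further check.
    return "needs_review"
```

Neither label blocks traffic on its own, which matches the guidance that the model should not be the sole basis for mitigation.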

Example Usage

Input

analysis: ```<script>alert('x')</script>```

Output

{
  "attack_syntax": "<script>alert('x')</script>",
  "attack_type": "Cross Site Scripting (XSS)"
}
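
The example above could be reproduced with the Hugging Face transformers library roughly as follows. This is a sketch under several assumptions: that the checkpoint loads with T5ForConditionalGeneration like its base model Salesforce/codet5p-220m, that the access conditions on the repository page have been accepted and a Hugging Face login token is available locally, and that 128 new tokens is enough for the JSON output. The function names are hypothetical.

```python
MODEL_ID = "madcows/Thoth-v2.5"  # repository name as shown on the model page


def format_input(payload: str) -> str:
    """Apply the documented prefix and triple-backtick wrapping."""
    fence = "`" * 3
    return f"analysis: {fence}{payload}{fence}"


def analyze(payload: str, max_new_tokens: int = 128) -> str:
    """Run one payload through the model and return the decoded text.

    max_new_tokens=128 is an assumed limit, not a value from the model card.
    """
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import AutoTokenizer, T5ForConditionalGeneration

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = T5ForConditionalGeneration.from_pretrained(MODEL_ID)

    inputs = tokenizer(format_input(payload), return_tensors="pt", truncation=True)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    print(analyze("<script>alert('x')</script>"))
```

Loading the tokenizer and model inside analyze keeps the sketch short. A real pipeline would load them once and reuse them across payloads.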

Training Details

  • Training Data: Proprietary dataset curated by Seculayer, containing annotated payloads and attack types.
  • Fine-Tuning Base: Salesforce/codet5p-220m

Limitations & Risks

  • False Positives/Negatives: The model may misclassify benign strings as attacks or fail to detect obfuscated or novel attack patterns.
  • Pattern-Based Only: Thoth-v2.5 relies solely on pattern recognition and does not infer intent or contextual meaning.
  • Single-Payload Input: The model operates on isolated payload strings and does not process broader request/response context.

License & Usage Restrictions

  • License: Non-commercial use only.
  • Restrictions: This model and its outputs must not be used for commercial purposes, including integration into commercial security solutions, products, or services, without explicit written permission from Seculayer.

Repository Details

  • Repository: madcows/Thoth-v2.5
  • Model Size: 223M params
  • Weights Format: Safetensors
  • Tensor Type: F32