Model Card for web-attack-detection

Model Details

Model Description

This model combines a Convolutional Neural Network (CNN) branch with a Gated Recurrent Unit (GRU) branch. It is designed for sequence-based tasks such as time series analysis, natural language processing (NLP), and anomaly detection.

1. Input Layer

Shape: (None, 384) — Variable batch size, input dimension of 384.
Reshape: Converts input to (None, 384, 1) to add a channel dimension for Conv1D layers.

2. Two Parallel Branches

a) CNN Branch

Conv1D Layers:
- Filters: 32, 64, 128, 256 (increasing depth)
- Kernel size: not shown in the model summary (probably small, e.g. 3)
MaxPooling1D: Applied after each Conv1D layer to reduce dimensionality.
GlobalMaxPooling1D: Final pooling layer reducing output to shape (None, 256).
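
The shape flow through the CNN branch can be traced with a few lines of plain Python. This is only a sketch: the kernel size of 3, the pool size of 2, stride 1, and "valid" padding are assumptions (the model summary does not state them), and `cnn_branch_shapes` is a hypothetical helper, not part of the released code.

```python
def cnn_branch_shapes(length=384, filters=(32, 64, 128, 256), kernel=3, pool=2):
    """Trace per-sample output shapes of Conv1D -> MaxPooling1D blocks.

    Assumed (not stated in the model card): kernel size 3, pool size 2,
    stride 1, and 'valid' padding for both layer types.
    """
    shapes = [("Reshape", (length, 1))]
    for f in filters:
        # 'valid' convolution with stride 1 shortens the sequence by kernel - 1.
        length = length - kernel + 1
        shapes.append((f"Conv1D({f})", (length, f)))
        # Non-overlapping max pooling divides the length by the pool size (floored).
        length = length // pool
        shapes.append(("MaxPooling1D", (length, f)))
    # Global max pooling keeps one value per filter, whatever the length is.
    shapes.append(("GlobalMaxPooling1D", (filters[-1],)))
    return shapes


for name, shape in cnn_branch_shapes():
    print(f"{name:<20} {shape}")
```

Whatever the real kernel and pool sizes are, the last line of the trace stays at 256 features, because global max pooling depends only on the number of filters in the final Conv1D layer.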

b) GRU Branch

GRU Layers:
- Units: 32, 64, 128, 256 (increasing capacity)
- Stacked for hierarchical feature extraction.
- Final GRU outputs shape (None, 256).
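
A similar trace for the GRU branch is sketched below. It assumes the branch reads the reshaped (384, 1) input and that every GRU layer except the last returns the full sequence, which stacked recurrent layers need in order to receive 3D input. `gru_branch_shapes` is a hypothetical helper for illustration.

```python
def gru_branch_shapes(timesteps=384, units=(32, 64, 128, 256)):
    """Trace per-sample output shapes of a stack of GRU layers.

    Assumption: all layers but the last return full sequences, and the
    last one returns only its final hidden state.
    """
    shapes = []
    for i, u in enumerate(units):
        is_last = i == len(units) - 1
        # Intermediate layers emit (timesteps, units); the last emits (units,).
        shapes.append((f"GRU({u})", (u,) if is_last else (timesteps, u)))
    return shapes


for name, shape in gru_branch_shapes():
    print(f"{name:<10} {shape}")
```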

3. Fusion Layer

Multiply: Element-wise multiplication of outputs from CNN and GRU branches.
Shape: (None, 256)
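
Element-wise multiplication acts like a gate: a fused feature is large only when both branches respond at that position. A toy sketch with short vectors standing in for the two 256-dimensional outputs (`multiply_fusion` is a hypothetical name):

```python
def multiply_fusion(cnn_features, gru_features):
    """Element-wise product of two equally sized feature vectors."""
    if len(cnn_features) != len(gru_features):
        raise ValueError("both branches must produce the same number of features")
    return [c * g for c, g in zip(cnn_features, gru_features)]


# Toy 4-dimensional example; the real branches each emit 256 values.
cnn_out = [1.0, 2.0, 0.0, 0.5]
gru_out = [0.5, -1.0, 3.0, 0.0]
print(multiply_fusion(cnn_out, gru_out))
```

A zero in either branch zeroes the fused feature, so the two branches have to agree for a signal to reach the dense layers.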

4. Dense Layers

Dropout: Applied for regularization.
Fully Connected Layers:
- 256 → 128 → 64 → 32 → 1
- Gradually reducing dimensions for feature compression.
Output: A single value — suitable for regression or binary classification.
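
To get a feel for the size of this head, the sketch below counts trainable parameters. It assumes plain fully connected layers with bias terms and reads the leading 256 as the width of the fused input rather than as a separate layer; both are assumptions, and `dense_param_count` is a hypothetical helper.

```python
def dense_param_count(widths):
    """Return (inputs, outputs, parameters) for each fully connected layer.

    Each layer has inputs * outputs weights plus one bias per output.
    """
    return [
        (n_in, n_out, n_in * n_out + n_out)
        for n_in, n_out in zip(widths, widths[1:])
    ]


layers = dense_param_count([256, 128, 64, 32, 1])
total = sum(params for _, _, params in layers)

for n_in, n_out, params in layers:
    print(f"{n_in:>3} -> {n_out:<3} {params} parameters")
print("total:", total)
```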

5. Likely Use Cases

Web attack detection
Sequence classification
Anomaly detection in time series

This architecture captures both spatial features (CNN) and temporal dependencies (GRU), making it well-suited for complex sequential data.

Developed by: noobpk

Model Sources

Article: Research and Development of a Smart Solution for Runtime Web Application Self-Protection

Uses

Direct Use

Intrusion Detection: Identify suspicious activity in network traffic data.
Sentiment Analysis: Analyze sequential text data to determine sentiment polarity.
Time Series Forecasting: Predict future values based on historical data trends.

Out-of-Scope Use

Image classification: This model is not optimized for handling spatial features in images.
Tabular data analysis: It’s designed for sequential data and may not capture non-temporal relationships well.

Bias, Risks, and Limitations

Data Bias: The model’s performance heavily depends on the quality and diversity of training data. Biased or imbalanced datasets could lead to unfair or inaccurate predictions.
Overfitting: With its depth and complexity, the model may overfit smaller datasets, capturing noise rather than meaningful patterns.
Interpretability: CNN-GRU models can be seen as black boxes, making it difficult to interpret why specific predictions are made.
Computational Costs: The parallel CNN-GRU architecture can demand significant resources during training and inference, potentially leading to longer processing times.

Recommendations

Balanced Dataset: Ensure training data represents diverse and balanced samples to mitigate bias.
Regularization: Apply dropout and early stopping to prevent overfitting.
Hyperparameter Tuning: Experiment with layer configurations, learning rates, and optimization techniques to enhance generalization.
Explainability Tools: Use SHAP or LIME libraries to interpret model predictions and understand feature importance.
Infrastructure: Deploy the model on systems with sufficient computational power, especially for real-time or large-scale applications.
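
The early stopping rule mentioned above can be sketched in plain Python. The patience of 3 matches the training setup described later in this card; the function name, the `min_delta` parameter, and the loss values are illustrative assumptions, not the author's code.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return the 1-based epoch at which training would stop.

    Training stops once the validation loss has failed to improve on the
    best value seen so far for `patience` consecutive epochs.
    """
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            best = loss   # new best: reset the counter
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return len(val_losses)  # never triggered: all epochs ran


# Made-up validation losses for illustration only.
print(early_stopping_epoch([0.90, 0.70, 0.60, 0.61, 0.62, 0.60, 0.50]))
```

In this made-up run the loss never drops below 0.60 during the three epochs after it is reached, so training stops before the later improvement. A larger patience trades longer training for fewer premature stops.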

How to Get Started with the Model

Use the code below to get started with the model.

import os

# Select the Keras backend before TensorFlow/Keras is imported.
os.environ["KERAS_BACKEND"] = "tensorflow"

from huggingface_hub import hf_hub_download
from sentence_transformers import SentenceTransformer
from tensorflow.keras.models import load_model


def load_modeler():
    # Download model.h5 from the Hugging Face Hub (cached locally) and load it.
    local_model_path = hf_hub_download(
        repo_id="noobpk/web-attack-detection",
        filename="model.h5",
    )
    return load_model(local_model_path)


def load_encoder():
    # all-MiniLM-L6-v2 yields 384-dimensional embeddings, matching the model input.
    model_name_or_path = os.environ.get(
        "model_name_or_path", "sentence-transformers/all-MiniLM-L6-v2"
    )
    return SentenceTransformer(model_name_or_path)


if __name__ == "__main__":
    model = load_modeler()
    encoder = load_encoder()

    payload = input("Enter payload: ")
    print("Processing...")

    # Encode the payload and add a batch dimension: (384,) -> (1, 384).
    embeddings = encoder.encode(payload).reshape((1, 384))
    prediction = model.predict(embeddings)

    # The model emits a single value; report it as a percentage score.
    score = float(prediction[0][0]) * 100
    print(f"Prediction score: {score:.2f}%")
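
The single value in `prediction[0][0]` is a model score, not an accuracy figure. One way to turn it into a decision is a simple threshold, sketched below. The sketch assumes the output is sigmoid-style in [0, 1] with higher values meaning "attack", which fits the binary cross-entropy loss but is not confirmed by this card. The 0.5 cut-off and the `label_payload` name are illustrative.

```python
def label_payload(score, threshold=0.5):
    """Map a score in [0, 1] to a label.

    Assumptions: scores closer to 1 indicate an attack, and 0.5 is only an
    illustrative cut-off that should be tuned on validation data.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie in [0, 1]")
    return "attack" if score >= threshold else "benign"


print(label_payload(0.93))
print(label_payload(0.12))
```

Raising the threshold reduces false alarms at the cost of missing more attacks, so the right value depends on the deployment.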

Training Details

Training Data

Dataset: web-attack-detection

70% of the dataset is used for training.
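
A 70% training share can be produced with a shuffled split like the one below. This is a generic sketch, not the author's preprocessing: the fixed seed, the function name, and treating the remaining 30% as a single held-out set are all assumptions.

```python
import random


def train_holdout_split(samples, train_fraction=0.7, seed=42):
    """Shuffle the samples and split them into training and held-out parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Integers stand in for dataset rows.
train, holdout = train_holdout_split(range(100))
print(len(train), len(holdout))
```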

Training Hyperparameters

Optimizer: Adam with initial learning rate 0.001
Learning Rate Schedule: InverseTimeDecay with decay steps of 1000 and decay rate of 0.1
Batch Size: 256
Epochs: Configurable, with early stopping after 3 epochs of no improvement
Dropout Rates:
- 0.1 after CNN and GRU branches
- 0.3 after feature fusion
Cross-Validation: K-fold cross-validation with k=5 by default (configurable)
Loss Function: Binary cross-entropy
Metrics: Accuracy
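
With the values above, the inverse time decay schedule follows lr(step) = initial_lr / (1 + decay_rate * step / decay_steps). The sketch below evaluates that formula directly. It assumes the continuous (non-staircase) variant, which is the Keras default; the card does not say which variant was used.

```python
def inverse_time_decay(step, initial_lr=0.001, decay_steps=1000, decay_rate=0.1):
    """Learning rate after `step` optimizer updates (continuous variant)."""
    return initial_lr / (1.0 + decay_rate * step / decay_steps)


for step in (0, 1000, 10000, 100000):
    print(f"step {step:>6}: lr = {inverse_time_decay(step):.6f}")
```

The decay is gentle: with these settings it takes 10,000 updates for the learning rate to halve.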

Evaluation

Testing Data, Factors & Metrics

Testing Data