Model Card for cisco-ai/SecureBERT2.0-base

SecureBERT 2.0 Base is a domain-specific transformer model optimized for cybersecurity tasks. It extends the ModernBERT architecture with cybersecurity-focused pretraining to produce contextualized embeddings for both technical text and code. SecureBERT 2.0 supports tasks like masked language modeling, semantic search, named entity recognition, vulnerability detection, and code analysis.


Model Details

Model Description

SecureBERT 2.0 Base is designed for deep contextual understanding of cybersecurity language and code. It leverages domain-specific pretraining on a large, heterogeneous corpus covering threat reports, blogs, documentation, and codebases, making it effective for reasoning across natural language and programming syntax.

  • Developed by: Cisco AI
  • Model type: Transformer (ModernBERT architecture)
  • Language: English
  • License: Apache 2.0
  • Finetuned from model: answerdotai/ModernBERT-base

Model Sources

  • Paper: arXiv:2510.00240

Uses

Direct Use

  • Masked language modeling for cybersecurity text and code
  • Embedding generation for semantic search and retrieval (see the sketch after this list)
  • Code and text feature extraction for downstream classification or clustering
  • Named entity recognition (NER) on security-related entities
  • Vulnerability detection in source code
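
A minimal sketch of the embedding-generation use listed above, assuming mean pooling over the final hidden states; the pooling strategy, example sentences, and cosine-similarity comparison are illustrative choices, not a prescribed recipe from the model authors.

import torch
from transformers import AutoModel, AutoTokenizer

model_name = "cisco-ai/SecureBERT2.0-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)
model.eval()

sentences = [
    "The attacker used a phishing email to deliver the payload.",
    "A SQL injection flaw was found in the login endpoint.",
]

inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    hidden_states = model(**inputs).last_hidden_state  # (batch, seq_len, hidden)

# Mean-pool over non-padding tokens to get one vector per sentence
mask = inputs["attention_mask"].unsqueeze(-1)
embeddings = (hidden_states * mask).sum(dim=1) / mask.sum(dim=1)

# Cosine similarity between the two sentence embeddings
similarity = torch.nn.functional.cosine_similarity(embeddings[0], embeddings[1], dim=0)
print(embeddings.shape, similarity.item())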

Downstream Use

Fine-tuning for:

  • Threat intelligence extraction
  • Security question answering
  • Incident analysis and summarization
  • Automated code review and vulnerability prediction (a fine-tuning sketch follows this list)
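
A minimal fine-tuning sketch for the vulnerability-prediction item above, assuming a binary sequence-classification setup with the Hugging Face Trainer; the two-example dataset, label scheme, and hyperparameters are placeholders for illustration and do not come from the SecureBERT 2.0 paper.

from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

model_name = "cisco-ai/SecureBERT2.0-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Toy dataset: label 1 = vulnerable snippet, 0 = safer variant (illustrative only)
examples = {
    "text": [
        "query = \"SELECT * FROM users WHERE name = '\" + user_input + \"'\"",
        "query = db.execute(\"SELECT * FROM users WHERE name = ?\", (user_input,))",
    ],
    "label": [1, 0],
}
dataset = Dataset.from_dict(examples).map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=1024),
    batched=True,
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="securebert2-vuln", num_train_epochs=3,
                           per_device_train_batch_size=8, learning_rate=2e-5),
    train_dataset=dataset,
    tokenizer=tokenizer,  # enables dynamic padding via the default data collator
)
trainer.train()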

Out-of-Scope Use

  • Non-English or non-technical text
  • General-purpose conversational AI
  • Decision-making in real-time security systems without human oversight

Bias, Risks, and Limitations

The model reflects biases in the cybersecurity sources it was trained on, which may include:

  • Overrepresentation of certain threat actors, technologies, or organizations
  • Inconsistent code or documentation quality
  • Limited exposure to non-public or proprietary data formats

Recommendations

Users should evaluate outputs in their specific context and avoid automated high-stakes decisions without expert validation.


How to Get Started with the Model

from transformers import AutoModelForMaskedLM, AutoTokenizer
import torch

model_name = "cisco-ai/SecureBERT2.0-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

text = "The malware exploits a vulnerability in the [MASK] system."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Predict only at the [MASK] position instead of decoding the whole sequence
mask_index = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
predicted_token_id = outputs.logits[0, mask_index].argmax(-1)
predicted_word = tokenizer.decode(predicted_token_id)
print(predicted_word)
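
For quick experimentation, the same prediction can also be obtained through the fill-mask pipeline, which locates the [MASK] token and decodes the top candidates automatically:

from transformers import pipeline

fill_mask = pipeline("fill-mask", model="cisco-ai/SecureBERT2.0-base")
for prediction in fill_mask("The malware exploits a vulnerability in the [MASK] system."):
    print(prediction["token_str"], round(prediction["score"], 4))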

Training Details

Training Procedure

Preprocessing

Hybrid tokenization for text and code (natural language + structured syntax).
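
The exact preprocessing pipeline is not reproduced in this card; the snippet below only illustrates that a single tokenizer is applied to both natural-language prose and code syntax (both example strings are made up):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("cisco-ai/SecureBERT2.0-base")

# One tokenizer covers prose and structured code syntax alike
text = "The exploit calls system() with attacker-controlled input."
code = 'if (strcmp(user, "admin") == 0) { run(cmd); }'

print(tokenizer.tokenize(text))
print(tokenizer.tokenize(code))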

Training Hyperparameters

  • Objective: Masked Language Modeling (MLM)
  • Masking probability: 0.10
  • Optimizer: AdamW
  • Learning rate: 5e-5
  • Weight decay: 0.01
  • Epochs: 20
  • Batch size: 16 per GPU × 8 GPUs
  • Curriculum: Microannealing (gradual dataset diversification)
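
As a rough sketch (not the actual training script), the hyperparameters above map onto the Transformers Trainer roughly as follows; the pretraining corpus and the microannealing curriculum are not reproduced here, so the dataset is left as a placeholder.

from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

model_name = "answerdotai/ModernBERT-base"  # starting checkpoint per this card
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

# 10% masking probability, as listed above
collator = DataCollatorForLanguageModeling(tokenizer, mlm=True, mlm_probability=0.10)

args = TrainingArguments(
    output_dir="securebert2-mlm",
    learning_rate=5e-5,              # listed learning rate
    weight_decay=0.01,               # listed weight decay
    num_train_epochs=20,             # listed epochs
    per_device_train_batch_size=16,  # 16 per GPU across 8 GPUs
)

# train_dataset stands in for the tokenized cybersecurity/code corpus,
# which is not distributed with this card.
# trainer = Trainer(model=model, args=args, data_collator=collator,
#                   train_dataset=train_dataset, tokenizer=tokenizer)
# trainer.train()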

Evaluation

Testing Data, Factors & Metrics

Testing Data

Internal held-out subset of cybersecurity and code corpora.

Factors

Evaluated across token categories:

  • Objects (nouns)
  • Actions (verbs)
  • Code tokens

Metrics

Top-n accuracy on masked token prediction.
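
A minimal sketch of top-n accuracy over masked positions, assuming the logits at each masked position and the ground-truth token ids have already been collected; the categorization into objects, actions, and code tokens and the evaluation corpus itself are not reproduced, and the random tensors are purely illustrative.

import torch

def top_n_accuracy(logits: torch.Tensor, true_token_ids: torch.Tensor, n: int = 5) -> float:
    """Fraction of masked positions whose true token appears in the model's top-n predictions.

    logits: (num_masked_positions, vocab_size) scores at the masked positions
    true_token_ids: (num_masked_positions,) ground-truth token ids
    """
    top_n = logits.topk(n, dim=-1).indices                       # (num_masked, n)
    hits = (top_n == true_token_ids.unsqueeze(-1)).any(dim=-1)   # (num_masked,)
    return hits.float().mean().item()

# Illustration with random scores and a made-up vocabulary size
logits = torch.randn(100, 50_000)
labels = torch.randint(0, 50_000, (100,))
print(top_n_accuracy(logits, labels, n=5))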

Results

Top-n   Objects (Nouns)   Verbs (Actions)   Code Tokens
 1      56.20%            45.02%            39.27%
 2      69.73%            60.00%            46.90%
 3      75.85%            66.68%            50.87%
 4      80.01%            71.56%            53.36%
 5      82.72%            74.12%            55.41%
10      88.80%            81.64%            60.03%

A comparative study of SecureBERT 2.0, the original SecureBERT, and ModernBERT on the masked language modeling (MLM) task (figure not reproduced here) shows that SecureBERT 2.0 outperforms both, particularly in code understanding and domain-specific terminology.

Summary

SecureBERT 2.0 outperforms both the original SecureBERT and ModernBERT on cybersecurity-specific and code-related tasks.


Environmental Impact

  • Hardware Type: 8× GPU cluster
  • Hours used: [Information Not Available]
  • Cloud Provider: [Information Not Available]
  • Compute Region: [Information Not Available]
  • Carbon Emitted: [Estimate Not Available]

Carbon footprint can be estimated using Lacoste et al. (2019).


Technical Specifications

Model Architecture and Objective

  • Architecture: ModernBERT
  • Max sequence length: 1024 tokens
  • Parameters: 150 M
  • Objective: Masked Language Modeling (MLM)
  • Tensor type: F32
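
These figures can be sanity-checked locally with a short snippet like the one below; the printed values depend on the released configuration rather than on anything added here.

from transformers import AutoConfig, AutoModelForMaskedLM

config = AutoConfig.from_pretrained("cisco-ai/SecureBERT2.0-base")
print(config.model_type)               # architecture family
print(config.max_position_embeddings)  # maximum sequence length

model = AutoModelForMaskedLM.from_pretrained("cisco-ai/SecureBERT2.0-base")
print(f"{sum(p.numel() for p in model.parameters()) / 1e6:.0f}M parameters")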

Compute Infrastructure

  • Framework: Transformers (PyTorch)
  • Precision: fp32
  • Hardware: 8 GPUs
  • Checkpoint Format: Safetensors

Citation

BibTeX:

@article{aghaei2025securebert,
  title={SecureBERT 2.0: Advanced Language Model for Cybersecurity Intelligence},
  author={Aghaei, Ehsan and Jain, Sarthak and Arun, Prashanth and Sambamoorthy, Arjun},
  journal={arXiv preprint arXiv:2510.00240},
  year={2025}
}

APA:

Aghaei, E., Jain, S., Arun, P., & Sambamoorthy, A. (2025). SecureBERT 2.0: Advanced language model for cybersecurity intelligence. arXiv preprint arXiv:2510.00240.


Model Card Authors

Cisco AI

Model Card Contact

For inquiries, please contact [email protected]
