Model Overview

Developers Microsoft
Architecture 14B parameters, dense decoder-only Transformer model
Inputs Text, best suited for prompts in the chat format
Context length 16K tokens
Outputs Generated text in response to input
License MIT
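
The context length bounds the prompt and the generated text together. A minimal sketch of that budget, assuming "16K" means 16,384 tokens (the card does not give the exact figure) and using a hypothetical helper name:

```python
# Assumption: "16K tokens" is read as 16 * 1024 = 16,384 tokens.
CONTEXT_LENGTH = 16 * 1024


def max_new_tokens_for(prompt_tokens: int, context_length: int = CONTEXT_LENGTH) -> int:
    """Return how many tokens can still be generated after a prompt of the given size."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    # Prompt and completion share one window, so the remainder is the budget.
    return max(context_length - prompt_tokens, 0)


if __name__ == "__main__":
    # A 1,000-token prompt leaves the rest of the window for generation.
    print(max_new_tokens_for(1000))
```

In practice the prompt token count would come from the model's tokenizer; the helper only does the subtraction.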

Training Datasets

Our training data is an extension of the data used for cyber-llm-14b and draws on a wide variety of sources:

  1. Publicly available blogs, papers, and references from https://github.com/PEASEC/cybersecurity_dataset.

  2. Newly created synthetic, "textbook-like" data for teaching cybersecurity, generated with GPT-4o.

  3. Acquired academic books and Q&A datasets.

Usage

Input Formats

Given the nature of the training data, cyber-llm-14b is best suited for prompts using the chat format as follows:

<|begin_of_text|><|start_header_id|>user<|end_header_id|>
Hello!<|eot_id|><|start_header_id|>assistant<|end_header_id|>
Hey there! How are you?<|eot_id|><|start_header_id|>user<|end_header_id|>
I'm great thanks!<|eot_id|>
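
The layout above can be reproduced with a few lines of string handling. The following is a minimal sketch, not the model's official template: the function name `format_chat` is hypothetical, and the single newline after each header is an assumption taken from the example as printed. If the repository ships a chat template, the tokenizer's `apply_chat_template` method is the safer choice.

```python
from typing import Dict, List

BOS = "<|begin_of_text|>"


def format_chat(messages: List[Dict[str, str]], add_generation_prompt: bool = False) -> str:
    """Render role/content messages in the chat layout shown in this card."""
    parts = [BOS]
    for message in messages:
        # Each turn: role header, newline, content, end-of-turn marker.
        parts.append(
            f"<|start_header_id|>{message['role']}<|end_header_id|>\n"
            f"{message['content']}<|eot_id|>"
        )
    if add_generation_prompt:
        # Open an assistant turn so the model continues from there.
        parts.append("<|start_header_id|>assistant<|end_header_id|>\n")
    return "".join(parts)


if __name__ == "__main__":
    print(format_chat([
        {"role": "user", "content": "Hello!"},
        {"role": "assistant", "content": "Hey there! How are you?"},
        {"role": "user", "content": "I'm great thanks!"},
    ]))
```

Passing `add_generation_prompt=True` appends an open assistant header, which is the usual way to ask for the next reply.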

With transformers

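import transformers

# Build a text-generation pipeline; dtype and device placement are picked automatically.
pipeline = transformers.pipeline(
    "text-generation",
    model="viettelsecurity-ai/cyber-llm-14b",
    model_kwargs={"torch_dtype": "auto"},
    device_map="auto",
)

# Chat-style input: a system prompt followed by the user's question.
messages = [
    {"role": "system", "content": "You are a Tier 3 SOC analyst."},
    {"role": "user", "content": "What is URL phishing?"},
]

outputs = pipeline(messages, max_new_tokens=2048)
# The last entry of the returned conversation is the assistant's reply.
print(outputs[0]["generated_text"][-1])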
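
The final `print` shows the whole last message as a dictionary. To get only the reply text, a small helper like the sketch below can be used. It assumes the pipeline returns the conversation as a list of role/content dictionaries with the new assistant message last, which is how recent transformers releases behave for chat input. The helper name and the mock output are hypothetical, so the block runs without loading the model.

```python
from typing import Any, Dict, List


def last_assistant_reply(outputs: List[Dict[str, Any]]) -> str:
    """Return the text of the final assistant message from a chat pipeline result."""
    conversation = outputs[0]["generated_text"]
    last_message = conversation[-1]
    if last_message.get("role") != "assistant":
        raise ValueError("the last message is not an assistant reply")
    return last_message["content"]


if __name__ == "__main__":
    # Mock result with the same shape as the pipeline output (content is made up).
    mock_outputs = [{
        "generated_text": [
            {"role": "system", "content": "You are a Tier 3 SOC analyst."},
            {"role": "user", "content": "What is URL phishing?"},
            {"role": "assistant", "content": "(model reply would appear here)"},
        ]
    }]
    print(last_assistant_reply(mock_outputs))
```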

Model size 14.7B params (Safetensors)
Tensor type BF16
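
These two figures give a rough lower bound on the memory needed just to hold the weights. A back-of-the-envelope sketch, assuming 2 bytes per BF16 parameter and ignoring activations, the KV cache, and framework overhead (the function name is hypothetical):

```python
def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Estimate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


if __name__ == "__main__":
    # 14.7B parameters stored as BF16 (2 bytes each).
    print(f"{weight_memory_gb(14.7e9):.1f} GB")
```

For this model the estimate comes to roughly 29 GB, so actual usage during inference will be higher than that.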

Base model microsoft/phi-4 (this model is a fine-tune)