---
datasets:
- priamai/AnnoCTR
base_model:
- urchade/gliner_small-v1
tags:
- Security
- NER
- CTI
language:
- en
---

# AITSecNER - Entity Recognition for Cybersecurity

This repository demonstrates how to use the **AITSecNER** model hosted on Hugging Face, based on the powerful GLiNER library, to extract cybersecurity-related entities from text.

## Installation

Install GLiNER via pip:

```bash
pip install gliner
```

## Usage

### Import and Load Model

Load the pretrained AITSecNER model directly from Hugging Face:

```python
from gliner import GLiNER

model = GLiNER.from_pretrained("selfconstruct3d/AITSecNER", load_tokenizer=True)
```

### Predict Entities

Define the input text and entity labels you wish to extract:

```python
# Example input text
text = """
Upon opening Emotet maldocs, victims are greeted with fake Microsoft 365 prompt that states “THIS DOCUMENT IS PROTECTED,” and instructs victims on how to enable macros.
"""

# Entity labels
labels = [
    'CLICommand/CodeSnippet', 'CON', 'DATE', 'GROUP', 'LOC',
    'MALWARE', 'ORG', 'SECTOR', 'TACTIC', 'TECHNIQUE', 'TOOL'
]

# Predict entities
entities = model.predict_entities(text, labels, threshold=0.5)

# Display results
for entity in entities:
    print(f"{entity['text']} => {entity['label']}")
```

### Sample Output

```bash
Emotet => MALWARE
Microsoft => ORG
```

## Model Details

The **AITSecNER** model was fine-tuned using the [urchade/gliner_small](https://huggingface.co/urchade/gliner_small) model from Hugging Face on the [priamai/AnnoCTR dataset](https://huggingface.co/datasets/priamai/AnnoCTR).

For more details about the dataset, see the paper ["AnnoCTR: A Dataset for Detecting and Linking Entities, Tactics, and Techniques in Cyber Threat Reports"](https://arxiv.org/abs/2305.10472).

GLiNER is described in detail in the paper ["GLiNER: Generalist Model for Named Entity Recognition using Bidirectional Transformer"](https://arxiv.org/abs/2311.08526).

## About

**AITSecNER** uses GLiNER to extract cybersecurity-specific entities from free text, which makes it suitable for tasks such as:

- Cyber threat intelligence analysis
- Incident response documentation
- Automated cybersecurity reporting
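
For reporting tasks like these, it can help to group the predicted entities by label before writing them out. The following is a minimal sketch, not part of the model or of GLiNER: `group_entities` and `min_score` are hypothetical names, the sample list is a hand-written stand-in for the output of `model.predict_entities(...)`, and the scores in it are illustrative values only. It assumes each prediction is a dictionary with `text` and `label` keys, as in the usage example, and optionally a `score` key.

```python
from collections import defaultdict


def group_entities(entities, min_score=0.0):
    """Group entity texts by label, skipping duplicates and low-score hits.

    Hypothetical helper: assumes each entity is a dict with "text" and
    "label" keys and, optionally, a "score" key.
    """
    grouped = defaultdict(list)
    for entity in entities:
        # Treat a missing score as 1.0 so that unscored entities are kept.
        if entity.get("score", 1.0) < min_score:
            continue
        texts = grouped[entity["label"]]
        if entity["text"] not in texts:
            texts.append(entity["text"])
    return dict(grouped)


# Hand-written stand-in for model.predict_entities(...) output;
# the scores are made up for illustration.
sample_entities = [
    {"text": "Emotet", "label": "MALWARE", "score": 0.93},
    {"text": "Microsoft", "label": "ORG", "score": 0.71},
    {"text": "Emotet", "label": "MALWARE", "score": 0.88},
]

for label, texts in group_entities(sample_entities, min_score=0.5).items():
    print(f"{label}: {', '.join(texts)}")
```

In practice, the `entities` list returned by `model.predict_entities` would be passed in place of `sample_entities`. Raising `min_score` trades recall for precision in the resulting summary.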