MutazYoune/Arabic-NER-PII2

Model Description

This is an Arabic Named Entity Recognition (NER) model for detecting personally identifiable information (PII) and related entities in Arabic text. It is fine-tuned for token classification from the BERT-based MutazYoune/ARAB_BERT checkpoint.

Model Details

  • Model Type: Token Classification (NER)
  • Language: Arabic (ar)
  • Base Model: MutazYoune/ARAB_BERT
  • Dataset: augmented_pattern2
  • Task: Named Entity Recognition

Training Configuration

  • Epochs: 30
  • Batch Size: 16
  • Learning Rate: 3e-05
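
These hyperparameters correspond to a standard fine-tuning run. The snippet below is an illustrative sketch only, assuming the Hugging Face Trainer API was used; the output directory name is a placeholder and any arguments not listed above are unknown.

from transformers import TrainingArguments

# Illustrative reconstruction of the reported hyperparameters (not the original training script).
training_args = TrainingArguments(
    output_dir="arabic-ner-pii2",  # placeholder directory name
    num_train_epochs=30,
    per_device_train_batch_size=16,
    learning_rate=3e-5,
)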

Supported Entity Types

  • CONTACT
  • IDENTIFIER
  • NETWORK
  • NUMERIC_ID
  • PII
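
These are the aggregated entity groups. The underlying token-level label set (presumably B-/I- prefixed tags in a BIO scheme, which is an assumption rather than something documented here) can be inspected directly from the model config:

from transformers import AutoModelForTokenClassification

# Print the raw token-level labels stored in the checkpoint.
model = AutoModelForTokenClassification.from_pretrained("MutazYoune/Arabic-NER-PII2")
print(model.config.id2label)  # exact mapping depends on the checkpoint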

Usage

from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("MutazYoune/Arabic-NER-PII2")
model = AutoModelForTokenClassification.from_pretrained("MutazYoune/Arabic-NER-PII2")

# Create NER pipeline
ner_pipeline = pipeline(
    "ner",
    model=model,
    tokenizer=tokenizer,
    aggregation_strategy="simple",
)

# Example usage
text = "أحمد محمد يعمل في شركة جوجل في الرياض"  # "Ahmed Mohammed works at Google in Riyadh"
entities = ner_pipeline(text)
print(entities)
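
With aggregation_strategy="simple", each returned item is a dict with entity_group, score, word, start, and end keys, so results can be filtered or formatted easily. A small follow-up sketch (the 0.5 threshold is arbitrary, chosen only for illustration):

# Keep only reasonably confident predictions and print them in a readable form.
for ent in entities:
    if ent["score"] >= 0.5:
        print(f'{ent["entity_group"]}: {ent["word"]} ({ent["score"]:.2f})')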

Model Performance

This model was trained on the complete dataset without a validation split, for final production use; as a result, no held-out evaluation metrics are reported.

Training Data

The model was trained on a custom Arabic NER dataset:

  • Dataset type: augmented_pattern2
  • Training and test data were combined for the final model

Citation

@misc{arabic-ner-bert,
  title={Arabic BERT NER Model},
  author={Trained on Kaggle},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/MutazYoune/Arabic-NER-PII2}
}