MutazYoune/Arabic-NER-PII2

Model Description

This is an Arabic Named Entity Recognition (NER) model for detecting personally identifiable information (PII) and related entities in Arabic text. It is fine-tuned for token classification from the BERT-based MutazYoune/ARAB_BERT checkpoint.

Model Details

  • Model Type: Token Classification (NER)
  • Language: Arabic (ar)
  • Base Model: MutazYoune/ARAB_BERT
  • Dataset: augmented_pattern2
  • Task: Named Entity Recognition

Training Configuration

  • Epochs: 30
  • Batch Size: 16
  • Learning Rate: 3e-05
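
These hyperparameters correspond to a standard fine-tuning run. The snippet below is an illustrative sketch only, assuming the Hugging Face Trainer API was used; the output directory name is a placeholder and any arguments not listed above are unknown.

from transformers import TrainingArguments

# Illustrative reconstruction of the reported hyperparameters (not the original training script).
training_args = TrainingArguments(
    output_dir="arabic-ner-pii2",  # placeholder directory name
    num_train_epochs=30,
    per_device_train_batch_size=16,
    learning_rate=3e-5,
)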

Supported Entity Types

  • CONTACT
  • IDENTIFIER
  • NETWORK
  • NUMERIC_ID
  • PII
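
These are the aggregated entity groups. The underlying token-level label set (presumably B-/I- prefixed tags in a BIO scheme, which is an assumption rather than something documented here) can be inspected directly from the model config:

from transformers import AutoModelForTokenClassification

# Print the raw token-level labels stored in the checkpoint.
model = AutoModelForTokenClassification.from_pretrained("MutazYoune/Arabic-NER-PII2")
print(model.config.id2label)  # exact mapping depends on the checkpoint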

Usage

from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("MutazYoune/Arabic-NER-PII2")
model = AutoModelForTokenClassification.from_pretrained("MutazYoune/Arabic-NER-PII2")

# Create NER pipeline
ner_pipeline = pipeline(
    "ner",
    model=model,
    tokenizer=tokenizer,
    aggregation_strategy="simple",
)

# Example usage
text = "أحمد محمد يعمل في شركة جوجل في الرياض"  # "Ahmed Mohammed works at Google in Riyadh"
entities = ner_pipeline(text)
print(entities)
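
With aggregation_strategy="simple", each returned item is a dict with entity_group, score, word, start, and end keys, so results can be filtered or formatted easily. A small follow-up sketch (the 0.5 threshold is arbitrary, chosen only for illustration):

# Keep only reasonably confident predictions and print them in a readable form.
for ent in entities:
    if ent["score"] >= 0.5:
        print(f'{ent["entity_group"]}: {ent["word"]} ({ent["score"]:.2f})')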

Model Performance

This model was trained on the complete dataset without a validation split, for final production use; as a result, no held-out evaluation metrics are reported.

Training Data

The model was trained on a custom Arabic NER dataset:

  • Dataset type: augmented_pattern2
  • Training and test data were combined for the final model

Citation

@misc{arabic-ner-bert,
  title={Arabic BERT NER Model},
  author={Trained on Kaggle},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/MutazYoune/Arabic-NER-PII2}
}