
🧠 Resume-Parsing-NER-AI-Model

A custom Named Entity Recognition (NER) model fine-tuned on annotated resume data using a pre-trained BERT architecture. This model extracts structured information such as names, emails, phone numbers, skills, job titles, education, and companies from raw resume text.


✨ Model Highlights

  • 📌 Base Model: bert-base-cased
  • 📚 Dataset: Custom annotated resume dataset (BIO format)
  • 🏷️ Entity Labels: Name, Email, Phone, Education, Skills, Company, Job Title
  • 🔧 Framework: Hugging Face Transformers + PyTorch
  • 💾 Format: Transformers model directory (with tokenizer and config)

🧠 Intended Uses

  • ✅ Resume parsing and candidate data extraction
  • ✅ Applicant Tracking Systems (ATS)
  • ✅ Automated HR screening tools
  • ✅ Resume data analytics and visualization
  • ✅ Chatbots and document understanding applications

🚫 Limitations

  • ❌ Performance may degrade on resumes with non-standard formatting
  • ❌ Might not capture entities in handwritten or image-based resumes
  • ❌ May not generalize to other document types without re-training

πŸ‹οΈβ€β™‚οΈ Training Details

| Attribute     | Value                                       |
|---------------|---------------------------------------------|
| Base Model    | bert-base-cased                             |
| Dataset       | Custom annotated resume dataset (BIO format) |
| Task Type     | Token Classification (NER)                  |
| Epochs        | 3                                           |
| Batch Size    | 16                                          |
| Optimizer     | AdamW                                       |
| Loss Function | CrossEntropyLoss                            |
| Framework     | PyTorch + Transformers                      |
| Hardware      | CUDA-enabled GPU                            |
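
For reference, a minimal fine-tuning sketch matching these settings. It assumes you already have a tokenized dataset with BIO labels aligned to sub-word tokens; `train_ds` and `eval_ds` are placeholders for those splits, and dataset loading and label alignment are omitted:

from transformers import (
    AutoModelForTokenClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = AutoModelForTokenClassification.from_pretrained(
    "bert-base-cased", num_labels=15  # 15 BIO tags, see label_list in the Usage section
)

args = TrainingArguments(
    output_dir="resume-ner-model",
    num_train_epochs=3,
    per_device_train_batch_size=16,
)

# Trainer uses AdamW and cross-entropy loss by default for token classification
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_ds,  # placeholder: tokenized dataset with aligned BIO labels
    eval_dataset=eval_ds,    # placeholder
    tokenizer=tokenizer,
)
trainer.train()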

📊 Evaluation Metrics

| Metric    | Score |
|-----------|-------|
| Accuracy  | 0.98  |
| F1-Score  | 0.98  |
| Precision | 0.97  |
| Recall    | 0.98  |
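
Entity-level NER metrics like these are typically computed with seqeval. A minimal sketch, using hypothetical gold and predicted BIO tag sequences for illustration:

from seqeval.metrics import f1_score, precision_score, recall_score

# Hypothetical gold and predicted tag sequences for two sentences
y_true = [["B-NAME", "I-NAME", "O", "B-COMPANY"], ["B-JOB", "O"]]
y_pred = [["B-NAME", "I-NAME", "O", "B-COMPANY"], ["O", "O"]]

print("Precision:", precision_score(y_true, y_pred))
print("Recall:   ", recall_score(y_true, y_pred))
print("F1:       ", f1_score(y_true, y_pred))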

🚀 Usage

from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

# Load the fine-tuned model and tokenizer
model_name = "AventIQ-AI/Resume-Parsing-NER-AI-Model"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(model_name)

# Build an NER pipeline that merges sub-word tokens into whole entities
ner_pipe = pipeline("ner", model=model, tokenizer=tokenizer, aggregation_strategy="simple")

text = "John worked at Infosys as an Analyst. Email: [email protected]"
ner_results = ner_pipe(text)

for entity in ner_results:
    print(f"{entity['word']} → {entity['entity_group']} ({entity['score']:.2f})")

Label mapping used during training (BIO scheme):

label_list = [
    "O",           # 0
    "B-NAME",      # 1
    "I-NAME",      # 2
    "B-EMAIL",     # 3
    "I-EMAIL",     # 4
    "B-PHONE",     # 5
    "I-PHONE",     # 6
    "B-EDUCATION", # 7
    "I-EDUCATION", # 8
    "B-SKILL",     # 9
    "I-SKILL",     # 10
    "B-COMPANY",   # 11
    "I-COMPANY",   # 12
    "B-JOB",       # 13
    "I-JOB"        # 14
]
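
If you work with raw model logits instead of the pipeline, this mapping can be attached to the model config so predicted label IDs decode to readable tags (a minimal sketch; the published checkpoint's config.json may already include it):

model.config.id2label = {i: label for i, label in enumerate(label_list)}
model.config.label2id = {label: i for i, label in enumerate(label_list)}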

🧩 Quantization

Post-training static quantization was applied using PyTorch to reduce model size and accelerate inference on edge devices.
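
As an illustration, the sketch below uses PyTorch's dynamic quantization, a simpler post-training variant than the static flow described above; it converts the Linear layers to int8 while activations remain in float:

import torch
from transformers import AutoModelForTokenClassification

model = AutoModelForTokenClassification.from_pretrained("AventIQ-AI/Resume-Parsing-NER-AI-Model")

# Replace Linear layers with int8 dynamically-quantized versions
quantized_model = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)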

🗂 Repository Structure

Resume-Parsing-NER-AI-Model/
├── config.json               ✅ Model configuration
├── pytorch_model.bin         ✅ Fine-tuned model weights
├── tokenizer_config.json     ✅ Tokenizer configuration
├── vocab.txt                 ✅ BERT vocabulary
├── training_args.bin         ✅ Training parameters
├── preprocessor_config.json  ✅ Optional tokenizer pre-processing info
└── README.md                 ✅ Model card

🤝 Contributing

Open to improvements and feedback! Feel free to submit a pull request or open an issue if you find any bugs or want to enhance the model.
