---
library_name: pytorch
pipeline_tag: image-classification
tags:
- vision-transformer
- age-estimation
- gender-classification
- face-analysis
- computer-vision
- pytorch
- transformers
- multi-task-learning
language:
- en
license: apache-2.0
datasets:
- UTKFace
metrics:
- accuracy
- mae
model-index:
- name: ViT-Age-Gender-Elite
  results:
  - task:
      type: image-classification
      name: Gender Classification
    dataset:
      name: UTKFace
      type: face-analysis
    metrics:
    - type: accuracy
      value: 94.3
      name: Gender Accuracy
    - type: mae
      value: 4.5
      name: Age MAE (years)
---
# ViT-Age-Gender-Elite: Vision Transformer for Facial Analysis

**✅ MODEL WEIGHTS NOW AVAILABLE** - trained model weights are uploaded and ready for use!

## 🎯 Quick Usage
```python
import torch
from PIL import Image
from transformers import ViTImageProcessor
from model import AgeGenderViTModel  # use the model.py from this repo

# Load the model weights
model = AgeGenderViTModel()
model.load_state_dict(torch.load("pytorch_model.bin", map_location="cpu"))
model.eval()

# Load the image processor
processor = ViTImageProcessor.from_pretrained("google/vit-base-patch16-224")

# Predict on an image
image = Image.open("your_image.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    age_pred, gender_pred = model(inputs["pixel_values"])

age = int(age_pred.item())
gender = "Female" if gender_pred.item() > 0.5 else "Male"
confidence = gender_pred.item() if gender_pred.item() > 0.5 else 1 - gender_pred.item()
print(f"Age: {age} years, Gender: {gender}, Confidence: {confidence:.1%}")
```
## Performance Achievements

- ✅ 94.3% Gender Accuracy on the UTKFace validation split
- ✅ 4.5 Years Age MAE
- ✅ 86.8M Parameters - fine-tuned Vision Transformer
- ✅ Production Ready - stable, consistent results
## Dataset & Training Details

### Training Dataset: UTKFace
- Total Images: 23,687 facial images
- Age Range: 1-100 years
- Demographics: Balanced gender distribution (52.3% Male, 47.7% Female)
- Quality: High-resolution, diverse lighting and pose conditions
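
For context, UTKFace encodes its labels directly in the filenames as `[age]_[gender]_[race]_[date].jpg`, with gender 0 = male and 1 = female (matching the Female-above-0.5 convention in Quick Usage). A minimal PyTorch `Dataset` sketch along these lines, assuming the same `ViTImageProcessor` as above; the class itself is illustrative and not part of this repo:

```python
import os

import torch
from PIL import Image
from torch.utils.data import Dataset


class UTKFaceDataset(Dataset):
    """Minimal UTKFace loader (illustrative). Filenames encode labels as
    [age]_[gender]_[race]_[date].jpg, with gender 0 = male, 1 = female."""

    def __init__(self, root, processor):
        self.root = root
        self.files = [f for f in os.listdir(root) if f.endswith(".jpg")]
        self.processor = processor

    def __len__(self):
        return len(self.files)

    def __getitem__(self, idx):
        name = self.files[idx]
        age, gender = name.split("_")[:2]  # labels come from the filename
        image = Image.open(os.path.join(self.root, name)).convert("RGB")
        pixel_values = self.processor(images=image, return_tensors="pt")["pixel_values"][0]
        return pixel_values, torch.tensor(float(age)), torch.tensor(float(gender))
```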
### ⚠️ Important Dataset Characteristics

The UTKFace dataset has a skewed age distribution:
- Adults (21-50 years): ~70% of data (majority)
- Young Adults (16-30 years): ~20% of data
- Children (0-15 years): ~5% of data (limited)
- Seniors (50+ years): ~5% of data
### 🎯 Model Performance by Age Group
- Excellent: Adults and young adults (16-60 years) - 94.3% gender accuracy
- Good: Teenagers (13-20 years) - ~90% accuracy
- Limited: Children (0-12 years) - Reduced accuracy due to limited training data
- Good: Seniors (60+ years) - ~85% accuracy
## Upcoming Improvements

### Version 2.0 - Enhanced Children Support (In Development)
- 🎯 Training on the FairFace dataset - better age distribution
- 👶 Children-specific fine-tuning - focused training on 0-15 years
- APPA-REAL integration - inclusion of an apparent-age dataset
- 🎨 Data augmentation - synthetic generation of children's faces

### Planned Enhancements
- Multi-Age Ensemble: Specialized models for different age ranges
- Cross-Cultural Training: Enhanced performance across ethnicities
- Age-Specific Confidence: Different confidence thresholds per age group
- Real-time Optimization: Mobile and edge device deployment
## Current Model Strengths

### Best Use Cases
- ✅ Adult demographic analysis (primary strength)
- ✅ Social media content filtering (teen/adult classification)
- ✅ Marketing analytics (adult age segmentation)
- ✅ Security applications (adult age verification)

### Architecture Advantages
- Vision Transformer: outperforms comparable CNN-based approaches on this task
- Multi-task Learning: joint optimization of age and gender objectives
- Transfer Learning: built on google/vit-base-patch16-224
- Robust Features: handles varied lighting and pose conditions
## Technical Specifications

### Model Architecture
- Base: google/vit-base-patch16-224
- Parameters: 86.8M total
- Input: 224×224 RGB images
- Outputs: Age (regression) + Gender (binary classification)
- Attention Heads: 12
- Transformer Layers: 12
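
The authoritative architecture lives in this repo's `model.py`. Purely as an illustration of the two-head design listed above, a ViT backbone with separate regression and classification heads might look like this (the class and attribute names are assumptions, not the repo's code):

```python
import torch
import torch.nn as nn
from transformers import ViTModel


class AgeGenderViTSketch(nn.Module):
    """Illustrative two-head ViT; see model.py for the real implementation."""

    def __init__(self, backbone="google/vit-base-patch16-224"):
        super().__init__()
        self.vit = ViTModel.from_pretrained(backbone)
        hidden = self.vit.config.hidden_size    # 768 for ViT-Base
        self.age_head = nn.Linear(hidden, 1)    # age regression
        self.gender_head = nn.Linear(hidden, 1) # gender logit

    def forward(self, pixel_values):
        # Use the [CLS] token embedding as a global face representation
        feats = self.vit(pixel_values=pixel_values).last_hidden_state[:, 0]
        age = self.age_head(feats).squeeze(-1)
        gender = torch.sigmoid(self.gender_head(feats)).squeeze(-1)
        return age, gender
```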
### Training Configuration
- Epochs: 15 (fully converged)
- Optimizer: AdamW (lr=2e-5)
- Batch Size: 32
- Training Time: 2.95 hours on GPU
- Validation Split: 80/20 stratified
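
A hedged sketch of what one training step with this configuration could look like, assuming a two-head model as above and an unweighted sum of an L1 age loss and a BCE gender loss (the actual loss formulation used for this checkpoint is not documented here; see training_logs.json for the recorded history):

```python
import torch
from torch.optim import AdamW

# `model` is any two-head network returning (age_pred, gender_prob),
# e.g. the illustrative AgeGenderViTSketch above.
age_loss_fn = torch.nn.L1Loss()      # L1 directly matches the reported MAE metric
gender_loss_fn = torch.nn.BCELoss()  # expects probabilities (post-sigmoid)
optimizer = AdamW(model.parameters(), lr=2e-5)

def training_step(pixel_values, age_targets, gender_targets):
    age_pred, gender_prob = model(pixel_values)
    # Unweighted sum of both task losses (the weighting is an assumption)
    loss = age_loss_fn(age_pred, age_targets) + gender_loss_fn(gender_prob, gender_targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```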
## Files Included

- `pytorch_model.bin` - trained model weights (331 MB)
- `config.json` - model configuration and metadata
- `training_logs.json` - complete training history and metrics
- `model.py` - model architecture and usage code
## ⚠️ Usage Recommendations

### Optimal Performance
- Primary Use: Adults and young adults (16-60 years)
- High Confidence: Gender classification across all ages
- Reasonable Accuracy: Age estimation for adults
### Limitations to Consider
- Children (0-12 years): Limited training data may affect accuracy
- Very elderly (70+ years): Fewer training examples
- Extreme poses/lighting: May reduce performance
### Best Practices
- Face Detection: Ensure clear, front-facing faces
- Image Quality: Use good lighting and resolution
- Age Context: Consider model strengths for your use case
- Confidence Thresholds: Adjust based on your application needs
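
As a concrete example of the face-detection and confidence-threshold points above, a small pre-processing and decision sketch; the Haar-cascade detector and the 0.8 cutoff are illustrative choices, not part of this model:

```python
import cv2
from PIL import Image

# Haar-cascade face detector bundled with OpenCV (any detector works here)
detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
)

def crop_largest_face(path, margin=0.2):
    """Crop the largest detected face with some margin; fall back to the full image."""
    bgr = cv2.imread(path)
    gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)
    faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return Image.open(path).convert("RGB")
    x, y, w, h = max(faces, key=lambda f: f[2] * f[3])  # largest box by area
    pad = int(margin * max(w, h))
    crop = bgr[max(y - pad, 0):y + h + pad, max(x - pad, 0):x + w + pad]
    return Image.fromarray(cv2.cvtColor(crop, cv2.COLOR_BGR2RGB))

GENDER_THRESHOLD = 0.8  # hypothetical cutoff; tune for your application

def classify_gender(prob):
    """Defer instead of forcing a low-confidence call."""
    if prob > GENDER_THRESHOLD:
        return "Female"
    if prob < 1 - GENDER_THRESHOLD:
        return "Male"
    return "Uncertain"
```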
## 🔬 Research & Citation

```bibtex
@misc{vit-age-gender-elite-2025,
  title={ViT-Age-Gender-Elite: Vision Transformer for Facial Analysis},
  author={Abhilash Sahoo},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/abhilash88/ViT-Age-Gender-Elite}
}
```
## 🤝 Contributing & Feedback
We welcome contributions and feedback, especially:
- Children dataset suggestions for Version 2.0
- Performance evaluations on diverse datasets
- Use case feedback for model improvements
- Technical optimizations and enhancements
## Roadmap
- Q1 2025: Children-focused fine-tuning (Version 2.0)
- Q2 2025: Multi-cultural dataset integration
- Q3 2025: Mobile optimization and edge deployment
- Q4 2025: Real-time video analysis capabilities
Current Version: 1.0 (Adult-focused) | Next Version: 2.0 (Children-enhanced) | Status: Production Ready\*

\*Best performance on adults (16-60 years). Children support will be improved in the upcoming Version 2.0.