abhilash88 committed on
Commit
4136722
·
verified ·
1 Parent(s): d7091cc

Update model card with trained weights info

Files changed (1)
  1. README.md +27 -252
README.md CHANGED
@@ -6,22 +6,10 @@ tags:
6
  - age-estimation
7
  - gender-classification
8
  - face-analysis
9
- - facial-recognition
10
  - computer-vision
11
- - multi-task-learning
12
  - pytorch
13
  - transformers
14
- - deep-learning
15
- - artificial-intelligence
16
- - machine-learning
17
- - age-prediction
18
- - gender-detection
19
- - demographic-analysis
20
- - biometric-analysis
21
- - sota-model
22
- - elite-performance
23
- - production-ready
24
- - state-of-the-art
25
  language:
26
  - en
27
  license: apache-2.0
@@ -48,266 +36,53 @@ model-index:
48
  name: Age MAE (years)
49
  ---
50
 
51
- # 🏆 ViT-Age-Gender-Elite: World-Class Age & Gender Prediction Model
52
-
53
- > **State-of-the-Art Vision Transformer for Facial Demographics Analysis | 94.3% Gender Accuracy | 4.5 Years Age MAE**
54
-
55
- ## 🌟 **WORLD-CLASS ACHIEVEMENTS & BREAKTHROUGH PERFORMANCE**
56
- - 🎯 **94.3% Gender Classification Accuracy** - **ELITE TIER Performance**
57
- - 🎯 **4.5 Years Age MAE** - **Research-Grade Precision**
58
- - 🎯 **EXCEEDS** previous State-of-the-Art by **1.3 percentage points**
59
- - 🎯 **Production-Ready** Vision Transformer with stable, consistent performance
60
- - 🎯 **86M+ Parameters** optimally fine-tuned for facial analysis
61
-
62
- ## 📊 **COMPREHENSIVE BENCHMARKS vs State-of-the-Art Models**
63
-
64
- | Model | Gender Accuracy | Age MAE (Years) | Architecture | Year | Status |
65
- |-------|-----------------|-----------------|--------------|------|---------|
66
- | **ViT-Age-Gender-Elite (Ours)** | **94.3%** | **4.5** | **Vision Transformer** | **2025** | **🏆 SOTA** |
67
- | ScienceDirect SOTA | 96.3% | ~8.0* | CNN | 2024 | Research |
68
- | LisanneH/AgeEstimation | N/A | 5.2 | CNN | 2023 | HuggingFace |
69
- | Traditional ViT (Fine-tuned) | ~91.0%* | ~6.0* | ViT | 2023 | Academic |
70
- | Original Repository Claim | 93.0% | ~8.0* | CNN | 2022 | GitHub |
71
- | DeepFace Models | ~90.0%* | ~7.0* | CNN | 2023 | Library |
72
-
73
- *Estimated based on typical performance ranges and literature reports
74
-
75
- ### 🎯 **Performance Advantages**
76
- - ✅ **Best-in-class age precision**: 4.5 years vs industry standard 6-8 years
77
- - ✅ **Superior gender accuracy**: 94.3% vs typical 90-93%
78
- - ✅ **Vision Transformer architecture**: More robust than CNN-based models
79
- - ✅ **Multi-task optimization**: Joint training for better feature learning
80
-
81
- ## 🚀 **Why This Model Dominates: Technical Superiority**
82
-
83
- ### **1. Advanced Architecture Innovation**
84
- - ✅ **Google ViT-Base Foundation** - Built on `google/vit-base-patch16-224`
85
- - ✅ **Multi-Head Attention Mechanism** - 12 attention heads for comprehensive feature extraction
86
- - ✅ **Dual-Task Architecture** - Specialized heads for age regression and gender classification
87
- - ✅ **Advanced Regularization** - Dropout layers preventing overfitting
88
- - ✅ **Optimized Layer Depth** - 12 transformer layers for optimal complexity-performance balance
89
-
90
- ### **2. Superior Training Methodology**
91
- - ✅ **Large-Scale Dataset**: 23,687 high-quality UTKFace images
92
- - ✅ **Perfect Learning Curves** - No overfitting, exceptional convergence
93
- - ✅ **Advanced Data Augmentation** - Horizontal flips, rotations, color jittering
94
- - ✅ **Stratified Validation** - Balanced 80/20 split ensuring demographic representation
95
- - ✅ **Multi-Task Loss Optimization** - Weighted MSE + BCE for balanced learning
96
- - ✅ **Learning Rate Scheduling** - ReduceLROnPlateau for optimal convergence
97
-
98
- ### **3. Production-Grade Performance**
99
- - ✅ **Consistent Accuracy**: 94.3% gender classification across diverse demographics
100
- - ✅ **Precise Age Estimation**: 4.5 years MAE outperforming academic benchmarks
101
- - ✅ **Robust Generalization** - Stable performance across age groups and ethnicities
102
- - ✅ **Real-World Tested** - Validated on challenging real-world facial variations
103
- - ✅ **Inference Optimized** - Efficient GPU utilization for production deployment
104
-
105
- ## 📈 **TRAINING PERFORMANCE EVOLUTION**
106
-
107
- Our model shows exceptional learning progression:
108
-
109
- **Gender Accuracy Progression:**
110
- - Epoch 1: 68.5% → Epoch 15: **94.3%**
111
- - **+25.8 percentage points improvement**
112
 
113
- **Age MAE Progression:**
114
- - Epoch 1: 10.07 years → Epoch 15: **4.61 years**
115
- - **-54% error reduction**
116
 
117
- ## 🔧 **Model Architecture**
118
 
119
- ```python
120
- AgeGenderViTModel(
121
-   (vit): ViTModel - google/vit-base-patch16-224
122
-   (age_head): Sequential(
123
-     (0): Linear(768 → 256)
124
-     (1): ReLU()
125
-     (2): Dropout(0.3)
126
-     (3): Linear(256 → 64)
127
-     (4): ReLU()
128
-     (5): Dropout(0.2)
129
-     (6): Linear(64 → 1) # Age prediction
130
-   )
131
-   (gender_head): Sequential(
132
-     (0): Linear(768 → 256)
133
-     (1): ReLU()
134
-     (2): Dropout(0.3)
135
-     (3): Linear(256 → 64)
136
-     (4): ReLU()
137
-     (5): Dropout(0.2)
138
-     (6): Linear(64 → 1) # Gender prediction
139
-     (7): Sigmoid()
140
-   )
141
- )
142
- ```
143
-
144
- ## 🎯 **Quick Start: Age & Gender Prediction**
145
-
146
- ### **Basic Usage**
147
  ```python
148
  import torch
149
  from transformers import ViTImageProcessor
150
- from PIL import Image
151
- import requests
152
-
153
- # Load the elite model
154
- model_name = "abhilash88/ViT-Age-Gender-Elite"
155
- processor = ViTImageProcessor.from_pretrained("google/vit-base-patch16-224")
156
-
157
- # Load your custom model architecture
158
- class AgeGenderViTModel(torch.nn.Module):
159
-     # ... (model definition from repository)
160
-     pass
161
 
 
162
  model = AgeGenderViTModel()
163
  model.load_state_dict(torch.load("pytorch_model.bin"))
164
  model.eval()
165
 
166
- # Process any face image
167
- image = Image.open("path/to/face/image.jpg")
168
  inputs = processor(images=image, return_tensors="pt")
169
 
170
- # Get predictions
171
  with torch.no_grad():
172
      age_pred, gender_pred = model(inputs["pixel_values"])
173
 
174
- predicted_age = int(age_pred.item())
175
- predicted_gender = "Female" if gender_pred.item() > 0.5 else "Male"
176
  confidence = gender_pred.item() if gender_pred.item() > 0.5 else 1 - gender_pred.item()
177
 
178
- print(f"🎂 Predicted Age: {predicted_age} years")
179
- print(f"👤 Predicted Gender: {predicted_gender} ({confidence:.1%} confidence)")
180
  ```
181
 
182
- ### **Batch Processing**
183
- ```python
184
- # Process multiple images efficiently
185
- images = [Image.open(f"face_{i}.jpg") for i in range(10)]
186
- inputs = processor(images=images, return_tensors="pt")
187
 
188
- with torch.no_grad():
189
-     age_preds, gender_preds = model(inputs["pixel_values"])
190
-
191
- for i, (age, gender) in enumerate(zip(age_preds, gender_preds)):
192
-     print(f"Image {i}: {int(age.item())} years, {'Female' if gender.item() > 0.5 else 'Male'}")
193
- ```
194
 
195
- ### **API Integration Example**
196
- ```python
197
- from fastapi import FastAPI, UploadFile
198
- import torch
199
- from PIL import Image
200
-
201
- app = FastAPI(title="Elite Age Gender API")
202
- model = load_model() # Your model loading function
203
-
204
- @app.post("/predict/")
205
- async def predict_age_gender(file: UploadFile):
206
-     image = Image.open(file.file)
206
-     age, gender = predict(model, image)
207
-     return {
208
-         "age": int(age),
209
-         "gender": "Female" if gender > 0.5 else "Male",
210
-         "confidence": float(gender if gender > 0.5 else 1 - gender),
211
-         "model": "ViT-Age-Gender-Elite",
212
-         "accuracy": "94.3%"
213
-     }
215
- ```
216
-
217
- ## 📊 **Dataset & Training Details**
218
-
219
- - **Dataset**: UTKFace (23,687 images)
220
- - **Age Range**: 1-100 years
221
- - **Gender Distribution**: 52.3% Male, 47.7% Female
222
- - **Image Resolution**: 224x224 (ViT standard)
223
- - **Training Time**: 2.95 hours on GPU
224
- - **Validation Split**: 80/20 stratified
225
-
226
- ## 🏆 **Key Innovations**
227
-
228
- 1. **First ViT-based model** to achieve 94%+ gender accuracy on UTKFace
229
- 2. **Multi-task optimization** with balanced loss weighting
230
- 3. **Advanced regularization** preventing overfitting
231
- 4. **Production-ready architecture** with consistent performance
232
-
233
- ## 🔬 **Technical Specifications**
234
-
235
- - **Base Model**: google/vit-base-patch16-224
236
- - **Parameters**: 86,816,002 (86.8M)
237
- - **Model Size**: ~331 MB
238
- - **Input Size**: 224×224×3
239
- - **Patch Size**: 16×16
240
- - **Attention Heads**: 12
241
- - **Layers**: 12
242
-
243
- ## 📈 **Performance Metrics**
244
-
245
- ### **Gender Classification**
246
- - **Accuracy**: 94.3%
247
- - **Precision**: ~94.5%
248
- - **Recall**: ~94.1%
249
- - **F1-Score**: ~94.3%
250
-
251
- ### **Age Estimation**
252
- - **MAE**: 4.5 years
253
- - **RMSE**: ~6.2 years
254
- - **R²**: ~0.89
255
- - **95% Confidence**: ±8.8 years
256
-
257
- ## 🌍 **Real-World Applications & Use Cases**
258
-
259
- ### **Enterprise & Commercial Applications**
260
- - 🏢 **Security & Surveillance**: Automated demographic analysis for access control
261
- - 📱 **Social Media Platforms**: Age-appropriate content filtering and recommendations
262
- - 🛒 **Retail & Marketing**: Targeted advertising and customer demographic insights
263
- - 🎮 **Gaming & Entertainment**: Age verification and personalized content delivery
264
- - 🏥 **Healthcare Systems**: Age-related health assessments and patient analytics
265
-
266
- ### **Research & Academic Applications**
267
- - 🔬 **Computer Vision Research**: Benchmark model for facial analysis studies
268
- - 📊 **Demographic Studies**: Population analysis and social research
269
- - 🧠 **AI/ML Education**: Teaching advanced transformer architectures
270
- - 📈 **Performance Baselines**: Comparison standard for new model development
271
-
272
- ### **Developer & Technical Applications**
273
- - ⚡ **API Integration**: RESTful services for age/gender prediction
274
- - 🔄 **Batch Processing**: Large-scale image analysis pipelines
275
- - 📱 **Mobile Applications**: On-device demographic analysis
276
- - ☁️ **Cloud Services**: Scalable facial analysis microservices
277
-
278
- ## 🚀 **Future Improvements**
279
-
280
- - [ ] Fine-tuning on additional datasets
281
- - [ ] Optimization for mobile deployment
282
- - [ ] Multi-ethnic performance enhancement
283
- - [ ] Real-time inference optimization
284
-
285
- ## 📝 **Citation**
286
-
287
- ```bibtex
288
- @misc{vit-age-gender-elite-2025,
289
- title={ViT-Age-Gender-Elite: World-Class Facial Analysis with Vision Transformers},
290
- author={Abhilash Sahoo},
291
- year={2025},
292
- publisher={Hugging Face},
293
- url={https://huggingface.co/abhilash88/ViT-Age-Gender-Elite}
294
- }
295
- ```
296
-
297
- ## 🤝 **Contributing**
298
-
299
- This model represents cutting-edge research in facial analysis. Contributions and feedback are welcome!
300
-
301
- ## ⚖️ **Ethics & Bias Considerations**
302
-
303
- - Model trained on diverse demographic data
304
- - Regular bias testing recommended
305
- - Use responsibly in accordance with privacy laws
306
- - Not recommended for critical decision-making without human oversight
307
 
308
  ---
309
-
310
- **Developed by**: Abhilash Sahoo
311
- **License**: Apache 2.0
312
- **Model Type**: Multi-task Vision Transformer
313
- **Performance Tier**: 🏆 ELITE (94.3% accuracy)
 
6
  - age-estimation
7
  - gender-classification
8
  - face-analysis
 
9
  - computer-vision
 
10
  - pytorch
11
  - transformers
12
+ - multi-task-learning
13
  language:
14
  - en
15
  license: apache-2.0
 
36
  name: Age MAE (years)
37
  ---
38
 
39
+ # 🏆 ViT-Age-Gender-Elite: World-Class Facial Analysis Model
40
 
41
+ **✅ MODEL WEIGHTS NOW AVAILABLE** - Trained model weights uploaded and ready for use!
42
 
43
+ ## 🎯 **Quick Usage**
44
 
45
  ```python
46
  import torch
47
  from transformers import ViTImageProcessor
48
+ from model import AgeGenderViTModel # Use the model.py from this repo
49
 
50
+ # Load model
51
  model = AgeGenderViTModel()
52
  model.load_state_dict(torch.load("pytorch_model.bin"))
53
  model.eval()
54
 
55
+ # Load processor
56
+ processor = ViTImageProcessor.from_pretrained("google/vit-base-patch16-224")
57
+
58
+ # Predict on image
59
+ from PIL import Image
60
+ image = Image.open("your_image.jpg")
61
  inputs = processor(images=image, return_tensors="pt")
62
 
 
63
  with torch.no_grad():
64
      age_pred, gender_pred = model(inputs["pixel_values"])
65
 
66
+ age = int(age_pred.item())
67
+ gender = "Female" if gender_pred.item() > 0.5 else "Male"
68
  confidence = gender_pred.item() if gender_pred.item() > 0.5 else 1 - gender_pred.item()
69
 
70
+ print(f"Age: {age} years, Gender: {gender}, Confidence: {confidence:.1%}")
 
71
  ```
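
The post-processing at the end of the snippet above can be factored into a small helper. This is a minimal sketch and not part of the repository: `decode_prediction` is a hypothetical name, and it assumes, as the snippet does, that the gender head emits a sigmoid probability where values above 0.5 mean "Female".

```python
from typing import Tuple


def decode_prediction(age_value: float, gender_prob: float) -> Tuple[int, str, float]:
    """Turn raw head outputs into (age, gender label, confidence).

    Hypothetical helper for illustration; the threshold and label order
    mirror the README snippet, not a documented API.
    """
    if not 0.0 <= gender_prob <= 1.0:
        raise ValueError("gender_prob must be a probability in [0, 1]")

    # int() truncates toward zero, the same behaviour as the README snippet.
    age = int(age_value)

    # Confidence is the probability assigned to whichever label was chosen.
    if gender_prob > 0.5:
        return age, "Female", gender_prob
    return age, "Male", 1.0 - gender_prob
```

With the tensors from the snippet, it would be called as `decode_prediction(age_pred.item(), gender_pred.item())`. Note that a probability of exactly 0.5 falls on the "Male" side because the comparison is strict.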
72
 
73
+ ## 🏆 **Performance Achievements**
74
+ - ✅ **94.3% Gender Accuracy** - ELITE tier performance
75
+ - **4.5 Years Age MAE** - Research-grade precision
76
+ - **86.8M Parameters** - Optimally fine-tuned
77
+ - **Production Ready** - Stable, consistent results
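
The 86.8M figure can be sanity-checked against the head layout listed in the previous version of this card (two heads, each Linear 768→256, 256→64, 64→1) and the total of 86,816,002 parameters it stated. The sketch below only does the arithmetic: `linear_params` is a hypothetical helper, and the backbone size is inferred by subtraction rather than measured from the weights.

```python
def linear_params(in_features: int, out_features: int) -> int:
    """Weights plus biases of one fully connected layer."""
    return in_features * out_features + out_features


# Layer shapes taken from the architecture summary in the earlier card.
HEAD_LAYERS = [(768, 256), (256, 64), (64, 1)]

head_params = sum(linear_params(i, o) for i, o in HEAD_LAYERS)
both_heads = 2 * head_params  # age head + gender head (Sigmoid adds no parameters)

STATED_TOTAL = 86_816_002  # total quoted in the earlier card
implied_backbone = STATED_TOTAL - both_heads

print(head_params, both_heads, implied_backbone)  # 213377 426754 86389248
```

Under these assumptions the two task heads contribute well under 1% of the parameters, so nearly all of the model size comes from the ViT backbone.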
78
 
79
+ ## 📊 **Files Included**
80
+ - `pytorch_model.bin` - Trained model weights (331MB)
81
+ - `config.json` - Model configuration and metadata
82
+ - `training_logs.json` - Complete training history and metrics
 
 
83
 
84
+ ## 🚀 **Interactive Demo**
85
+ Try the model instantly: [Hugging Face Space Demo](https://huggingface.co/spaces/abhilash88/ViT-Age-Gender-Elite-Demo)
86
 
87
  ---
88
+ *Updated with actual trained weights | Ready for production use*