---
license: mit
---

# facegen-facenet-unet-gan-embedding

## Model Overview

![image/png](https://cdn-uploads.huggingface.co/production/uploads/666c3d6489e21df7d4a02805/hAhkaJMaC1s122PudZXv-.png)

This repository hosts a generative model trained to synthesize 128×128 face images conditioned on facial embeddings. The architecture jointly trains a **FaceNet encoder** and a **UNet-based GAN generator** to produce high-fidelity images from identity embeddings.

## Codebase, Dataset, and Model Artifacts

| Category           | Description                                  | Link |
|--------------------|----------------------------------------------|------|
| GitHub Repo        | Full training and inference codebase         | [GitHub Repository](https://github.com/Mayankpratapsingh022/Conditional_Face_Synthesis_with_Embedding_Conditioned_Generative_Model) |
| Dataset            | Cropped faces (128×128) for training         | [Hugging Face Dataset](https://huggingface.co/datasets/Mayank022/Cropped_Face_Dataset_128x128) |
| Trained Model      | Final GAN model with FaceNet encoder         | [Hugging Face Model](https://huggingface.co/Mayank022/facegen-facenet-unet-gan-embedding) |
| Training Notebook  | End-to-end model training pipeline in Colab  | [Colab Notebook](https://colab.research.google.com/drive/16vafB_pVNk_QJpquXwxMJXNme3BCGFqS?usp=sharing) |
| Inference Notebook | Generate images from embeddings              | [Colab Notebook](https://colab.research.google.com/drive/1Y1s7fmyVfT2jnEL9l23jmkhISNYastds?usp=sharing) |

---

## Experimental Summary

Find detailed metrics, loss trends, and inference samples in the full Weights & Biases report:

> [Weights & Biases Report – Training, Metrics, and Sample Outputs](https://api.wandb.ai/links/mayankpratapsingh0022-other/x8zkffzn)

## Use Case

The model can be used for:

- Face reconstruction from embeddings
- Conditional face generation
- Evaluating zero-shot generalization for unseen identities

## Architecture

- **Encoder**: Pretrained FaceNet (jointly fine-tuned)
- **Generator**: Modified UNet with residual upsampling blocks and conditional embedding injection at multiple resolutions
- **Embedding Injection**: Done at 8×8 resolution with skip connections

## Training Details

- **Training Time**: ~6 hours on A100
- **Dataset**: 12,000 human face images
- **Losses**: MSE + Perceptual + Adversarial
- **Optimizers**: AdamW with learning rate scheduling

## Checkpoints

- `epoch_100/`
- `epoch_200/`
- `epoch_300/` (final)

## Intended Use

This model is intended for research purposes, particularly for understanding conditional generation and face embedding interpretability.

## Limitations

- Trained on a limited dataset
- May not generalize well to non-human faces or distorted embeddings
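
## Illustrative Sketch

To make the Architecture and Training Details bullets a little more concrete, here is a minimal, framework-free Python sketch. It works out how many downsampling stages separate the 128×128 input from the 8×8 resolution where the embedding is injected, and shows one way the three loss terms (MSE, perceptual, adversarial) could be combined into a single generator objective. Several things here are assumptions made for illustration and are not taken from the repository: that each encoder stage halves the spatial size, the helper names `downsample_schedule` and `generator_loss`, and the loss weights. The actual layer layout and weighting are defined in the linked training code.

```python
def downsample_schedule(input_size, target_size):
    """Return the spatial sizes visited when halving input_size down to target_size.

    Assumption (not from the repository): every encoder stage halves the
    height and width, as a stride-2 convolution or 2x2 pooling would.
    """
    if input_size <= 0 or target_size <= 0:
        raise ValueError("sizes must be positive")
    sizes = [input_size]
    while sizes[-1] > target_size:
        if sizes[-1] % 2 != 0:
            raise ValueError("cannot halve odd size %d" % sizes[-1])
        sizes.append(sizes[-1] // 2)
    if sizes[-1] != target_size:
        raise ValueError("target_size is not reachable by repeated halving")
    return sizes


def generator_loss(mse, perceptual, adversarial, w_mse=1.0, w_perc=1.0, w_adv=1.0):
    """Weighted sum of the three loss terms named in Training Details.

    The default weights are placeholders (hypothetical), not the values
    used to train the released checkpoints.
    """
    return w_mse * mse + w_perc * perceptual + w_adv * adversarial


# 128x128 input, embedding injected at 8x8 (both figures come from this card).
sizes = downsample_schedule(128, 8)
num_stages = len(sizes) - 1

print(sizes)       # [128, 64, 32, 16, 8]
print(num_stages)  # 4

# Example with made-up per-batch loss values.
print(generator_loss(0.5, 0.25, 0.125))  # 0.875
```

Under the halving assumption, four downsampling stages are needed before the embedding can be injected at 8×8, and a mirrored decoder would need four upsampling stages to return to 128×128. In practice the loss weights are hyperparameters and are usually tuned so that the adversarial term does not overwhelm the reconstruction terms.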