---
license: mit
---
# facegen-facenet-unet-gan-embedding
## Model Overview

This repository hosts a generative model trained to synthesize 128×128 face images conditioned on facial embeddings. The architecture jointly trains a **FaceNet encoder** and a **UNet-based GAN generator** to produce high-fidelity images from identity embeddings.
## Codebase, Dataset, and Model Artifacts
| Category | Description | Link |
|------------------|------------------------------------------------------|----------------------------------------------------------------------|
| GitHub Repo | Full training and inference codebase | [GitHub Repository](https://github.com/Mayankpratapsingh022/Conditional_Face_Synthesis_with_Embedding_Conditioned_Generative_Model) |
| Dataset | Cropped faces (128x128) for training | [Hugging Face Dataset](https://huggingface.co/datasets/Mayank022/Cropped_Face_Dataset_128x128) |
| Trained Model | Final GAN model with FaceNet encoder | [Hugging Face Model](https://huggingface.co/Mayank022/facegen-facenet-unet-gan-embedding) |
| Training Notebook| End-to-end model training pipeline in Colab | [Colab Notebook](https://colab.research.google.com/drive/16vafB_pVNk_QJpquXwxMJXNme3BCGFqS?usp=sharing) |
| Inference Notebook| Generate images from embeddings | [Colab Notebook](https://colab.research.google.com/drive/1Y1s7fmyVfT2jnEL9l23jmkhISNYastds?usp=sharing) |
---
## Experimental Summary
Find detailed metrics, loss trends, and inference samples in the full Weights & Biases report:
> [Weights & Biases Report – Training, Metrics, and Sample Outputs](https://api.wandb.ai/links/mayankpratapsingh0022-other/x8zkffzn)
## Use Case
The model can be used for:
- Face reconstruction from embeddings
- Conditional face generation
- Evaluating zero-shot generalization for unseen identities
## Architecture
- **Encoder**: Pretrained FaceNet (jointly fine-tuned)
- **Generator**: Modified UNet with residual upsampling blocks and conditional embedding injection at multiple resolutions
- **Embedding Injection**: Applied at the 8×8 resolution, with skip connections
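
As a rough illustration of what injecting an embedding at the 8×8 resolution can look like, the NumPy sketch below projects an identity embedding, tiles it over the spatial grid, and concatenates it onto the feature map. This is only a shape-level sketch, not the repository's code: the function `inject_embedding`, the channel count (256), the embedding size (512), the projection width (64), and the choice of concatenation over other conditioning schemes are all assumptions made for the example.

```python
import numpy as np


def inject_embedding(features, embedding, proj_weight):
    """Concatenate a projected embedding onto a feature map (hypothetical helper).

    features:    (C, H, W) feature map, e.g. H = W = 8 at the bottleneck
    embedding:   (D,) identity embedding
    proj_weight: (P, D) linear projection (assumed; learned in a real model)
    """
    projected = proj_weight @ embedding  # shape (P,)
    _, h, w = features.shape
    # Repeat the same projected vector at every spatial location.
    tiled = np.broadcast_to(projected[:, None, None], (projected.shape[0], h, w))
    # Stack along the channel axis: result has C + P channels.
    return np.concatenate([features, tiled], axis=0)


# All sizes below are illustrative assumptions, not values from the model.
rng = np.random.default_rng(0)
features = rng.standard_normal((256, 8, 8))
embedding = rng.standard_normal(512)
proj_weight = rng.standard_normal((64, 512)) * 0.01

out = inject_embedding(features, embedding, proj_weight)
print(out.shape)  # (320, 8, 8)
```

Concatenation keeps the original feature channels untouched and lets later convolutions decide how to mix in the identity signal; additive or modulation-based conditioning would be equally plausible alternatives.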
## Training Details
- **Training Time**: ~6 hours on an NVIDIA A100 GPU
- **Dataset**: 12,000 human face images
- **Losses**: MSE + Perceptual + Adversarial
- **Optimizers**: AdamW with learning rate scheduling
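
The three loss terms listed above are typically combined as a weighted sum for the generator. The sketch below shows one way to do that in NumPy. It is an assumption-laden illustration, not the training code: the function names, the loss weights (`w_mse`, `w_perc`, `w_adv`), the use of feature-space MSE as the perceptual term, and the non-saturating form of the adversarial term are all hypothetical choices.

```python
import numpy as np


def mse(a, b):
    """Mean squared error between two arrays of the same shape."""
    return float(np.mean((a - b) ** 2))


def generator_adversarial_loss(fake_logits):
    """Non-saturating GAN loss for the generator (assumed form).

    Equals mean(-log(sigmoid(logit))), computed stably as softplus(-logit).
    """
    return float(np.mean(np.logaddexp(0.0, -fake_logits)))


def generator_loss(fake, real, fake_feats, real_feats, fake_logits,
                   w_mse=1.0, w_perc=0.1, w_adv=0.01):
    """Weighted sum of pixel, perceptual, and adversarial terms.

    fake_feats / real_feats stand in for activations of a fixed feature
    network; the weights are illustrative, not the values used in training.
    """
    pixel = mse(fake, real)
    perceptual = mse(fake_feats, real_feats)
    adversarial = generator_adversarial_loss(fake_logits)
    return w_mse * pixel + w_perc * perceptual + w_adv * adversarial


# Toy inputs: a perfect reconstruction with an undecided discriminator.
real = np.zeros((3, 128, 128))
fake = real.copy()
feats = np.ones((64, 16, 16))
logits = np.zeros(4)

total = generator_loss(fake, real, feats, feats, logits)
print(total)
```

With identical images and features, the pixel and perceptual terms vanish and only the adversarial term remains, which makes the relative weighting easy to inspect.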
## Checkpoints
- `epoch_100/`
- `epoch_200/`
- `epoch_300/` (final)
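
When working with several checkpoint folders, it can help to select the most recent one programmatically. The sketch below parses the `epoch_N/` naming pattern shown above and returns the folder with the highest epoch. The helper `latest_checkpoint` is hypothetical, and the code assumes nothing about the files inside each folder.

```python
import re

# Matches names such as "epoch_300" or "epoch_300/".
_EPOCH_PATTERN = re.compile(r"^epoch_(\d+)/?$")


def latest_checkpoint(names):
    """Return the name with the highest epoch number, or None if none match."""
    best_name, best_epoch = None, -1
    for name in names:
        match = _EPOCH_PATTERN.match(name)
        if match is None:
            continue  # Ignore anything that is not an epoch folder.
        epoch = int(match.group(1))
        if epoch > best_epoch:
            best_name, best_epoch = name, epoch
    return best_name


folders = ["epoch_100/", "epoch_300/", "README.md", "epoch_200/"]
print(latest_checkpoint(folders))  # epoch_300/
```

The selected folder name could then be used to restrict a download to a single checkpoint, for example through the `allow_patterns` argument of `huggingface_hub.snapshot_download`.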
## Intended Use
This model is intended for research purposes, particularly for understanding conditional generation and face embedding interpretability.
## Limitations
- Trained on a limited dataset (12,000 face images)
- May not generalize well to non-human faces or distorted embeddings