karthik-2905 committed on
Commit 42c78f7 · verified · 1 Parent(s): da54531

Rename readme.md to README.md

Files changed (2)
  1. README.md +157 -0
  2. readme.md +0 -92
README.md ADDED
@@ -0,0 +1,157 @@
+ # CIFAR-10 Diffusion Model
+
+ A lightweight diffusion model trained from scratch on the CIFAR-10 dataset in just 14.5 minutes using PyTorch.
+
+ ## Model Description
+
+ This is a **SimpleUNet-based diffusion model** trained to generate 32×32 RGB images similar to those in the CIFAR-10 dataset. The model demonstrates the fundamentals of diffusion-based image generation with a compact architecture suitable for educational purposes and quick experimentation.
+
+ ### Key Features
+ - 🚀 **Fast Training**: Complete training in under 15 minutes on an RTX 3060
+ - 💾 **Lightweight**: Only 16.8M parameters (~64MB model size)
+ - 🎯 **Educational**: Clean, well-documented code for learning diffusion models
+ - ⚡ **Efficient Inference**: Generate images in seconds on consumer GPUs
+
+ ## Model Details
+
+ | Attribute | Value |
+ |-----------|-------|
+ | **Architecture** | SimpleUNet with ResNet blocks + Attention |
+ | **Parameters** | 16,808,835 |
+ | **Dataset** | CIFAR-10 (50,000 training images) |
+ | **Image Size** | 32×32 RGB |
+ | **Training Steps** | 7,820 (20 epochs × 391 batches) |
+ | **Training Time** | 14.54 minutes |
+ | **Hardware** | NVIDIA RTX 3060 (0.43GB VRAM used) |
+ | **Framework** | PyTorch 2.0+ |
+
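The figures in the table can be cross-checked with a little arithmetic. The sketch below assumes the batch size of 128 listed under Training Details, that the final partial batch of each epoch is kept, and that weights are stored as 32-bit floats; the last two points are assumptions, not statements from the README.

```python
import math

TRAIN_IMAGES = 50_000
BATCH_SIZE = 128          # from the Training Details section
EPOCHS = 20
NUM_PARAMS = 16_808_835
BYTES_PER_PARAM = 4       # assumption: float32 weights

# 50,000 / 128 is not a whole number, so the last batch of each epoch is partial.
batches_per_epoch = math.ceil(TRAIN_IMAGES / BATCH_SIZE)
total_steps = batches_per_epoch * EPOCHS
size_mib = NUM_PARAMS * BYTES_PER_PARAM / 2**20

print(batches_per_epoch)   # 391
print(total_steps)         # 7820
print(round(size_mib, 1))  # 64.1
```

Both results agree with the table: 7,820 training steps and a checkpoint of roughly 64MB.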
+ ## Quick Start
+
+ ### Installation
+ ```bash
+ pip install torch torchvision matplotlib tqdm pillow numpy
+ ```
+
+ ### Basic Usage
+ ```python
+ import torch
+ import matplotlib.pyplot as plt
+
+ # Assumption: SimpleUNet and DDPMScheduler are importable from inference_example.py,
+ # which the file table below describes as containing the model classes
+ from inference_example import SimpleUNet, DDPMScheduler
+
+ # Load model
+ checkpoint = torch.load('complete_diffusion_model.pth', map_location='cpu')
+ model = SimpleUNet(**checkpoint['model_config'])
+ model.load_state_dict(checkpoint['model_state_dict'])
+ model.eval()
+
+ # Initialize scheduler
+ scheduler = DDPMScheduler(**checkpoint['diffusion_config'])
+
+ # Generate images
+ @torch.no_grad()
+ def generate_images(model, scheduler, num_images=4):
+     device = next(model.parameters()).device
+     images = torch.randn(num_images, 3, 32, 32, device=device)
+
+     for t in range(999, -1, -20):  # 50 denoising steps
+         timestep = torch.full((num_images,), t, device=device)
+         noise_pred = model(images, timestep)
+
+         # Simplified deterministic (DDIM-style) step
+         alpha_t = scheduler.alpha_cumprod[t]
+         # On the last step no noise remains, so alpha_prev is 1 (a tensor, as torch.sqrt requires)
+         alpha_prev = scheduler.alpha_cumprod[t - 20] if t >= 20 else torch.ones_like(alpha_t)
+
+         pred_x0 = (images - torch.sqrt(1 - alpha_t) * noise_pred) / torch.sqrt(alpha_t)
+         images = torch.sqrt(alpha_prev) * pred_x0 + torch.sqrt(1 - alpha_prev) * noise_pred
+
+     return images
+
+ # Generate and display
+ generated = generate_images(model, scheduler)
+ # Assumes outputs lie in [-1, 1] (not stated in this README); rescale to [0, 1] for display
+ plt.imshow(((generated[0].clamp(-1, 1) + 1) / 2).permute(1, 2, 0).cpu().numpy())
+ plt.axis('off')
+ plt.show()
+ ```
+
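The update inside the sampling loop can be checked on plain numbers. This sketch is not taken from the repository: it rebuilds a cumulative-alpha table from a linear beta schedule with 1000 steps (as listed under Training Details), assuming the commonly used endpoints 1e-4 and 0.02, which the README does not state. `linear_alpha_cumprod` and `ddim_step` are hypothetical helper names.

```python
import math

def linear_alpha_cumprod(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for a linear beta schedule.

    beta_start and beta_end are assumed values, not read from the checkpoint.
    """
    table, running = [], 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        running *= 1.0 - beta
        table.append(running)
    return table

def ddim_step(x_t, noise_pred, alpha_t, alpha_prev):
    """One deterministic update on a scalar, mirroring the loop body above."""
    pred_x0 = (x_t - math.sqrt(1 - alpha_t) * noise_pred) / math.sqrt(alpha_t)
    return math.sqrt(alpha_prev) * pred_x0 + math.sqrt(1 - alpha_prev) * noise_pred

alpha_cumprod = linear_alpha_cumprod()
timesteps = list(range(999, -1, -20))  # same stride as the sampling loop

# Noise a known value to the last visited timestep, then take the final step
# with alpha_prev = 1 and a perfect noise prediction.
x0, eps, t = 0.5, -1.2, timesteps[-1]
x_t = math.sqrt(alpha_cumprod[t]) * x0 + math.sqrt(1 - alpha_cumprod[t]) * eps
recovered = ddim_step(x_t, eps, alpha_cumprod[t], 1.0)

print(len(timesteps), timesteps[0], timesteps[-1])  # 50 999 19
print(abs(recovered - x0) < 1e-9)                   # True
```

With a perfect noise prediction the last step returns the clean value, which is why the loop sets `alpha_prev` to 1 once `t` drops below the stride of 20.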
+ ## Training Details
+
+ - **Loss Function**: MSE between predicted and actual noise
+ - **Optimizer**: AdamW (lr=1e-4, weight_decay=1e-6)
+ - **Scheduler**: CosineAnnealingLR
+ - **Batch Size**: 128
+ - **Final Loss**: 0.0363 (73% reduction from initial)
+ - **Diffusion Steps**: 1000 (linear beta schedule)
+
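For reference, the learning rate that CosineAnnealingLR produces can be written in closed form. The sketch below assumes the schedule is stepped once per epoch with `T_max=20` and `eta_min=0`; the README gives only the scheduler name and the base rate of 1e-4, so those two settings are illustrative. `cosine_lr` is a hypothetical helper.

```python
import math

def cosine_lr(epoch, base_lr=1e-4, t_max=20, eta_min=0.0):
    """Closed-form cosine annealing.

    lr = eta_min + (base_lr - eta_min) * (1 + cos(pi * epoch / t_max)) / 2
    """
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + math.cos(math.pi * epoch / t_max))

# Print the rate at a few points of the assumed 20-epoch run.
for epoch in (0, 5, 10, 15, 20):
    print(epoch, f"{cosine_lr(epoch):.2e}")
```

Under these assumptions the rate starts at 1e-4, passes through half that value at epoch 10, and reaches zero at epoch 20.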
+ ## Performance
+
+ ### Training Loss Curve
+ The training loss falls steadily over the 20 epochs:
+ - **Epoch 1**: 0.1349 → **Epoch 20**: 0.0363
+ - **Best Loss**: 0.0358 (Epoch 19)
+ - **Stable convergence** without overfitting
+
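The 73% reduction quoted under Training Details follows directly from the first and last epoch losses:

```python
initial_loss = 0.1349  # epoch 1
final_loss = 0.0363    # epoch 20

# Relative drop in loss between the first and last epoch.
reduction = (initial_loss - final_loss) / initial_loss
print(f"{reduction:.1%}")  # 73.1%
```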
+ ### Generation Quality
+ - ✅ Captures CIFAR-10 color distributions
+ - ✅ Generates diverse, non-repetitive outputs
+ - ⚠️ Outputs are abstract patterns (longer training is needed for recognizable objects)
+ - 🎯 Suitable for color/texture generation tasks
+
+ ## Files in this Repository
+
+ | File | Description | Size |
+ |------|-------------|------|
+ | `complete_diffusion_model.pth` | Full model with config and weights | ~64MB |
+ | `diffusion_model_final.pth` | Training checkpoint (epoch 20) | ~64MB |
+ | `model_info.json` | Training metadata and hyperparameters | <1KB |
+ | `inference_example.py` | Complete inference script with model classes | ~5KB |
+
+ ## Model Architecture
+
+ ```
+ SimpleUNet(
+   time_embedding: TimeEmbedding(128)
+   encoder: 3 ResNet blocks with downsampling
+   middle: ResNet + Self-Attention + ResNet
+   decoder: 3 ResNet blocks with upsampling
+   output: GroupNorm → SiLU → Conv2d
+ )
+ ```
+
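The listing shows a `TimeEmbedding(128)` module but not how it encodes the timestep. A common choice in DDPM-style U-Nets is a sinusoidal embedding, sketched below in plain Python. Whether this model uses exactly this form is an assumption, and `sinusoidal_embedding` is a hypothetical name.

```python
import math

def sinusoidal_embedding(t, dim=128, max_period=10_000.0):
    """Map an integer timestep to `dim` values.

    The first half holds sines of t at dim/2 geometrically spaced
    frequencies, the second half the matching cosines.
    """
    half = dim // 2
    freqs = [math.exp(-math.log(max_period) * i / half) for i in range(half)]
    return [math.sin(t * f) for f in freqs] + [math.cos(t * f) for f in freqs]

emb = sinusoidal_embedding(999)
print(len(emb))  # 128
```

In a network like the one outlined above, such a vector would typically pass through a small MLP before being added to the feature maps of each ResNet block.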
+ ## Use Cases
+
+ - 🎓 **Educational**: Learn diffusion model fundamentals
+ - 🔬 **Research**: Baseline for diffusion experiments
+ - 🎨 **Art**: Generate abstract textures and patterns
+ - ⚡ **Prototyping**: Quick diffusion model testing
+
+ ## Limitations & Improvements
+
+ ### Current Limitations
+ - Generates abstract patterns rather than recognizable objects
+ - Trained only at 32×32 resolution
+ - Trained for only 20 epochs
+
+ ### Suggested Improvements
+ 1. **Extended Training**: 50-100 epochs for better object generation
+ 2. **Larger Architecture**: Increase model capacity
+ 3. **Advanced Sampling**: Implement DDIM or DPM-Solver++
+ 4. **Higher Resolution**: Train on 64×64 or 128×128 images
+ 5. **Better Datasets**: Use CelebA-HQ or custom datasets
+
+ ## Citation
+
+ ```bibtex
+ @misc{cifar10-diffusion-2025,
+   title={CIFAR-10 Diffusion Model: Fast Training Implementation},
+   author={Karthik},
+   year={2025},
+   publisher={Hugging Face},
+   howpublished={\url{https://huggingface.co/karthik-2905/DiffusionPretrained}}
+ }
+ ```
+
+ ## License
+
+ MIT License - Free for research and commercial use.
+
+ ---
+
+ **🚀 Want to train your own?** Check out the [full implementation](https://github.com/GruheshKurra/DiffusionModelPretrained) with Jupyter notebooks and step-by-step training code!
+
+ **📊 Training Stats**: 16.8M params • 14.5min training • RTX 3060 • PyTorch 2.0
readme.md DELETED
@@ -1,92 +0,0 @@
- # CIFAR-10 Diffusion Model
-
- 🎨 **A diffusion model trained from scratch on CIFAR-10 dataset**
-
- ## Model Details
- - **Architecture**: SimpleUNet with 16.8M parameters
- - **Dataset**: CIFAR-10 (50,000 training images)
- - **Training Time**: 14.54 minutes on RTX 3060
- - **Final Loss**: 0.0363
- - **Image Size**: 32x32 RGB
- - **Framework**: PyTorch
-
- ## Quick Start
-
- ```python
- import torch
- from model import SimpleUNet, DDPMScheduler, generate_images
-
- # Load the trained model
- checkpoint = torch.load('complete_diffusion_model.pth')
- model = SimpleUNet(**checkpoint['model_config'])
- model.load_state_dict(checkpoint['model_state_dict'])
- model.eval()
-
- # Initialize scheduler
- scheduler = DDPMScheduler(**checkpoint['diffusion_config'])
-
- # Generate images
- generated_images = generate_images(model, scheduler, num_images=8)
- ```
-
- ## Installation
-
- ```bash
- pip install torch>=2.0.0 torchvision>=0.15.0 matplotlib tqdm pillow numpy
- ```
-
- ## Files Included
- - `complete_diffusion_model.pth` - Complete model with config (64MB)
- - `model_info.json` - Training details and metadata
- - `diffusion_model_final.pth` - Final training checkpoint (64MB)
- - `inference_example.py` - Ready-to-use inference script
-
- ## Training Details
- - **Epochs**: 20
- - **Batch Size**: 128
- - **Learning Rate**: 1e-4 (CosineAnnealingLR)
- - **Optimizer**: AdamW
- - **GPU**: NVIDIA RTX 3060 (0.43GB VRAM used)
- - **Loss Reduction**: 73% (from 0.1349 to 0.0363)
-
- ## Hardware Requirements
- - **Minimum**: 1GB VRAM for inference
- - **Recommended**: 2GB+ VRAM for training extensions
- - **CPU**: Works but slower
-
- ## Results
- The model generates colorful abstract patterns that capture CIFAR-10's color distributions.
- With more training epochs (50-100), it should produce more recognizable objects.
-
- ## Improvements
- To get better results:
- 1. **Train longer**: 50-100 epochs instead of 20
- 2. **Larger model**: Increase channels/layers
- 3. **Advanced sampling**: DDIM, DPM-Solver
- 4. **Better datasets**: CelebA, ImageNet
- 5. **Learning rate**: Experiment with schedules
-
- ## Model Architecture
- - **U-Net based** with ResNet blocks
- - **Time embedding** for diffusion timesteps
- - **Attention layers** at multiple resolutions
- - **Skip connections** for better gradient flow
-
- ## Citation
- ```bibtex
- @misc{cifar10-diffusion-2025,
-   title={CIFAR-10 Diffusion Model},
-   author={Your Name},
-   year={2025},
-   url={https://github.com/your-username/cifar10-diffusion}
- }
- ```
-
- ## License
- MIT License - Feel free to use and modify!
-
- ---
- **Created**: July 19, 2025
- **Training Time**: 14.54 minutes
- **GPU**: NVIDIA RTX 3060
- **Framework**: PyTorch