Upload folder using huggingface_hub
- README.md +111 -3
- config.json +29 -0
- images/sample0.jpg +0 -0
- images/sample1.jpg +0 -0
- images/sample2.jpg +0 -0
- images/sample3.jpg +0 -0
- images/sample4.jpg +0 -0
- model_index.json +13 -0
- scheduler/scheduler_config.json +18 -0
- unet/config.json +48 -0
- unet/diffusion_pytorch_model.safetensors +3 -0
README.md
CHANGED
@@ -1,3 +1,111 @@
# Flow Matching CIFAR-10 Model

A flow matching model for unconditional image generation, trained on the CIFAR-10 dataset. Sampling uses the `FlowMatchEulerDiscreteScheduler` from Diffusers, which integrates the learned flow with Euler steps.

## Model Details

- **Architecture**: `UNet2DModel` trained with a flow matching objective
- **Dataset**: CIFAR-10 (32x32 RGB images)
- **Scheduler**: `FlowMatchEulerDiscreteScheduler`
- **Training timesteps**: 1000 (`num_train_timesteps`)
- **Framework**: Diffusers 0.35.0.dev0

## Flow Matching Configuration

The model uses `FlowMatchEulerDiscreteScheduler` with the following key parameters:
- **Shift**: 1.0 (a static shift of 1.0 leaves the sigma schedule unchanged)
- **Time shift type**: exponential (applies only to dynamic shifting, which is disabled here: `use_dynamic_shifting` is `false`)
- **Base shift**: 0.5
- **Max shift**: 1.15
- **Base image sequence length**: 256
- **Max image sequence length**: 4096
- **Stochastic sampling**: Disabled, so sampling is deterministic given the initial noise

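To make the effect of the `shift` parameter concrete, here is a minimal sketch of the static shift rescaling as we understand it from the Diffusers scheduler, `sigma' = shift * sigma / (1 + (shift - 1) * sigma)`. The helper name `apply_static_shift`, the coarse sigma grid, and the alternative shift value of 3.0 are assumptions made for illustration; they are not part of this repository.

```python
def apply_static_shift(sigmas, shift):
    """Rescale sigmas in [0, 1] by a static shift factor."""
    return [shift * s / (1.0 + (shift - 1.0) * s) for s in sigmas]


# A coarse, hypothetical sigma grid from 1.0 (pure noise) down to 0.1.
sigmas = [1.0, 0.75, 0.5, 0.25, 0.1]

unshifted = apply_static_shift(sigmas, shift=1.0)  # the value in this model's config
shifted = apply_static_shift(sigmas, shift=3.0)    # hypothetical larger shift

print(unshifted)
print(shifted)
```

With `shift=1.0` the schedule comes back unchanged, which is why the shift-related settings above have no practical effect for this checkpoint. Larger shift values push the intermediate sigmas towards the noisy end of the schedule.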
## Usage

### Basic Generation

```python
from diffusers import DDPMPipeline

# Load the flow matching model
pipeline = DDPMPipeline.from_pretrained("FrankCCCCC/cfm-cifar10-32")

# Generate an image
image = pipeline().images[0]
image.save("generated_cifar10.png")
```

### Custom Inference Steps

```python
from diffusers import DDPMPipeline

pipeline = DDPMPipeline.from_pretrained("FrankCCCCC/cfm-cifar10-32")

# Generate with a custom number of inference steps.
# Pass the step count to the pipeline call: the pipeline calls
# scheduler.set_timesteps() itself, so setting it beforehand has no effect.
num_inference_steps: int = 1000  # pipeline default; lower values sample faster
image = pipeline(num_inference_steps=num_inference_steps).images[0]
image.save("fast_generated_cifar10.png")
```

### Batch Generation

```python
from diffusers import DDPMPipeline

pipeline = DDPMPipeline.from_pretrained("FrankCCCCC/cfm-cifar10-32")

# Generate multiple images at once
images = pipeline(batch_size=4).images
for i, image in enumerate(images):
    image.save(f"generated_cifar10_{i}.png")
```

## Flow Matching vs Standard Diffusion

This model implements flow matching, which offers several potential advantages over standard diffusion models:
- **Faster sampling**: Sampling solves an ODE, which can often be done in fewer steps than stochastic diffusion sampling
- **Better training stability**: Training is a simulation-free regression onto a target velocity field
- **Flexible scheduling**: The scheduler supports static and dynamic (exponential) time shifting of the sigma schedule

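The ODE view of sampling can be illustrated without the trained UNet. The following is a minimal sketch of Euler integration, `x <- x + (sigma_next - sigma) * v`, which mirrors the form of the update in `FlowMatchEulerDiscreteScheduler`. The names `euler_sample` and `toy_velocity`, the three-element "image", and the step count are hypothetical stand-ins; the toy velocity replaces the model's prediction.

```python
import random


def euler_sample(velocity_fn, x, sigmas):
    """Integrate dx/dsigma = v(x, sigma) with Euler steps along a decreasing sigma grid."""
    for sigma, sigma_next in zip(sigmas[:-1], sigmas[1:]):
        v = velocity_fn(x, sigma)
        x = [xi + (sigma_next - sigma) * vi for xi, vi in zip(x, v)]
    return x


random.seed(0)
data = [0.2, -0.5, 0.9]  # hypothetical stand-in for a flattened image
noise = [random.gauss(0.0, 1.0) for _ in data]


def toy_velocity(x, sigma):
    """Exact velocity of the straight path x_sigma = (1 - sigma) * data + sigma * noise."""
    return [n - d for n, d in zip(noise, data)]


num_steps = 10
sigmas = [1.0 - i / num_steps for i in range(num_steps + 1)]  # 1.0 -> 0.0
sample = euler_sample(toy_velocity, noise, sigmas)
print(sample)
```

Because the toy velocity is constant along a straight path, Euler integration recovers `data` for any step count. A learned velocity field is only approximately straight, so the number of inference steps still affects sample quality in practice.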
## Model Architecture

- **UNet**: Standard `UNet2DModel` for flow prediction (block channels 128/256/256/256, with one `AttnDownBlock2D` and one `AttnUpBlock2D`)
- **Scheduler**: `FlowMatchEulerDiscreteScheduler` (`shift: 1.0`, dynamic shifting disabled)
- **Output**: 32x32 RGB images matching the CIFAR-10 distribution

## Requirements

```bash
pip install diffusers torch torchvision
```

## Samples

Five generated samples are included in the repository as `images/sample0.jpg` through `images/sample4.jpg`.

## Citation

If you use this model, please cite the original flow matching and diffusion literature:

```bibtex
@inproceedings{DDPM,
  author = {Ho, Jonathan and Jain, Ajay and Abbeel, Pieter},
  booktitle = {Advances in Neural Information Processing Systems},
  title = {Denoising Diffusion Probabilistic Models},
  url = {https://proceedings.neurips.cc/paper_files/paper/2020/file/4c5bcfec8584af0d967f1ab10179ca4b-Paper.pdf},
  year = {2020}
}

@inproceedings{FM,
  title = {Flow Matching for Generative Modeling},
  author = {Yaron Lipman and Ricky T. Q. Chen and Heli Ben-Hamu and Maximilian Nickel and Matthew Le},
  booktitle = {The Eleventh International Conference on Learning Representations},
  year = {2023},
  url = {https://openreview.net/forum?id=PqvMRDCJT9t}
}
```
config.json
ADDED
@@ -0,0 +1,29 @@
{
  "beta_1": 0.9,
  "beta_2": 0.999,
  "epsilon": 1e-08,
  "lr_sched_num_warmup_steps": 45000,
  "lr_sched_lr_end": 1e-07,
  "lr_sched_power": 1.0,
  "ep_model_dir": "epochs",
  "output_dir": "fm_cifar10",
  "ckpt_dir": "ckpt",
  "data_ckpt_dir": "data.ckpt",
  "is_save_all_model_epochs": false,
  "args_key": "args",
  "default_key": "default",
  "final_key": "final",
  "config_file": "config.json",
  "project": "cfm-training",
  "run_name": "train_cfm",
  "model_id": "google/ddpm-cifar10-32",
  "batch_size": 256,
  "num_epochs": 1000,
  "lr": 0.0005,
  "weight_decay": 0.0,
  "num_train_timesteps": 1000,
  "num_inference_steps": 1000,
  "sigma_min": 0.0,
  "seed": 42,
  "device": "cuda:0"
}
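The `lr`, `lr_sched_num_warmup_steps`, `lr_sched_lr_end`, and `lr_sched_power` keys suggest a linear warmup followed by a polynomial decay, although the training script is not part of this commit, so that reading is an assumption. The sketch below shows what such a schedule would look like with the values above. The function name `lr_at` is hypothetical, and the total of 196,000 steps is a guess derived from 50,000 CIFAR-10 training images, `batch_size` 256, and `num_epochs` 1000.

```python
def lr_at(step, lr=5e-4, lr_end=1e-7, power=1.0,
          num_warmup_steps=45_000, num_training_steps=196_000):
    """Learning rate at `step`: linear warmup, then polynomial decay to lr_end.

    num_training_steps is an assumed value, not taken from the config.
    """
    if step < num_warmup_steps:
        return lr * step / max(1, num_warmup_steps)
    if step >= num_training_steps:
        return lr_end
    decay_steps = num_training_steps - num_warmup_steps
    pct_remaining = 1.0 - (step - num_warmup_steps) / decay_steps
    return (lr - lr_end) * pct_remaining ** power + lr_end


for step in (0, 22_500, 45_000, 120_500, 196_000):
    print(step, lr_at(step))
```

With `lr_sched_power` set to 1.0 the decay phase is linear, so under these assumptions the rate would rise to 0.0005 over the first 45,000 steps and then fall in a straight line towards 1e-07.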
images/sample0.jpg
ADDED
images/sample1.jpg
ADDED
images/sample2.jpg
ADDED
images/sample3.jpg
ADDED
images/sample4.jpg
ADDED
model_index.json
ADDED
@@ -0,0 +1,13 @@
{
  "_class_name": "DDPMPipeline",
  "_diffusers_version": "0.35.0.dev0",
  "_name_or_path": "/home/sc3379/workspace/research/cfm-cifar10-32",
  "scheduler": [
    "diffusers",
    "FlowMatchEulerDiscreteScheduler"
  ],
  "unet": [
    "diffusers",
    "UNet2DModel"
  ]
}
scheduler/scheduler_config.json
ADDED
@@ -0,0 +1,18 @@
{
  "_class_name": "FlowMatchEulerDiscreteScheduler",
  "_diffusers_version": "0.35.0.dev0",
  "base_image_seq_len": 256,
  "base_shift": 0.5,
  "invert_sigmas": false,
  "max_image_seq_len": 4096,
  "max_shift": 1.15,
  "num_train_timesteps": 1000,
  "shift": 1.0,
  "shift_terminal": null,
  "stochastic_sampling": false,
  "time_shift_type": "exponential",
  "use_beta_sigmas": false,
  "use_dynamic_shifting": false,
  "use_exponential_sigmas": false,
  "use_karras_sigmas": false
}
unet/config.json
ADDED
@@ -0,0 +1,48 @@
{
  "_class_name": "UNet2DModel",
  "_diffusers_version": "0.35.0.dev0",
  "_name_or_path": "/home/sc3379/workspace/research/cfm-cifar10-32/unet",
  "act_fn": "silu",
  "add_attention": true,
  "attention_head_dim": null,
  "attn_norm_num_groups": null,
  "block_out_channels": [
    128,
    256,
    256,
    256
  ],
  "center_input_sample": false,
  "class_embed_type": null,
  "down_block_types": [
    "DownBlock2D",
    "AttnDownBlock2D",
    "DownBlock2D",
    "DownBlock2D"
  ],
  "downsample_padding": 0,
  "downsample_type": "conv",
  "dropout": 0.0,
  "flip_sin_to_cos": false,
  "freq_shift": 1,
  "in_channels": 3,
  "layers_per_block": 2,
  "mid_block_scale_factor": 1,
  "mid_block_type": "UNetMidBlock2D",
  "norm_eps": 1e-06,
  "norm_num_groups": 32,
  "num_class_embeds": null,
  "num_train_timesteps": null,
  "out_channels": 3,
  "resnet_time_scale_shift": "default",
  "sample_size": 32,
  "time_embedding_dim": null,
  "time_embedding_type": "positional",
  "up_block_types": [
    "UpBlock2D",
    "UpBlock2D",
    "AttnUpBlock2D",
    "UpBlock2D"
  ],
  "upsample_type": "conv"
}
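To read this config at a glance, it can help to work out the spatial resolution at which each down block operates. The sketch below assumes, as is our understanding of `UNet2DModel`, that every down block except the last ends with a stride-2 downsample. The helper name `down_block_resolutions` is hypothetical.

```python
sample_size = 32
down_block_types = ["DownBlock2D", "AttnDownBlock2D", "DownBlock2D", "DownBlock2D"]
block_out_channels = [128, 256, 256, 256]


def down_block_resolutions(sample_size, down_block_types):
    """Return (block type, resolution at which its layers run) for each down block."""
    resolutions = []
    size = sample_size
    for i, block_type in enumerate(down_block_types):
        resolutions.append((block_type, size))
        is_final_block = i == len(down_block_types) - 1
        if not is_final_block:
            size //= 2  # assumed: stride-2 downsample after every block except the last
    return resolutions


resolutions = down_block_resolutions(sample_size, down_block_types)
for (block_type, size), channels in zip(resolutions, block_out_channels):
    print(f"{block_type}: {size}x{size}, {channels} channels")
```

Under that assumption the attention block in the down path runs on 16x16 feature maps, and the deepest block runs at 4x4.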
unet/diffusion_pytorch_model.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:97d25692dbd390a357e7375966ecd521418d7a9623a01037dc1aeef809142980
size 143020060
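The pointer above is a Git LFS stub, not the weights themselves, but its `size` field gives a rough idea of the model's scale. The following sketch parses the pointer and estimates the parameter count. It assumes the weights are stored as float32 and ignores the small safetensors header, so the result is an approximation; the helper name `parse_lfs_pointer` is hypothetical.

```python
pointer_text = """version https://git-lfs.github.com/spec/v1
oid sha256:97d25692dbd390a357e7375966ecd521418d7a9623a01037dc1aeef809142980
size 143020060
"""


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    return dict(line.split(" ", 1) for line in text.splitlines() if line)


pointer = parse_lfs_pointer(pointer_text)
size_bytes = int(pointer["size"])
approx_params = size_bytes // 4  # assumes 4 bytes per weight (float32)

print(f"{size_bytes / 1e6:.1f} MB, roughly {approx_params / 1e6:.1f}M parameters")
```

Under the float32 assumption this works out to a little under 36 million parameters.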