FrankCCCCC committed on
Commit
e43818d
·
verified ·
1 Parent(s): e92e109

Upload folder using huggingface_hub

README.md CHANGED
@@ -1,3 +1,111 @@
- ---
- license: mit
- ---
+ # Flow Matching CIFAR-10 Model
+
+ A flow matching model for unconditional image generation, trained on the CIFAR-10 dataset. The model is trained as a continuous normalizing flow and is loaded through `DDPMPipeline`, with `FlowMatchEulerDiscreteScheduler` performing the Euler sampling steps.
+
+ ## Model Details
+
+ - **Architecture**: UNet2DModel trained with a flow matching objective
+ - **Dataset**: CIFAR-10 (32x32 RGB images)
+ - **Scheduler**: FlowMatchEulerDiscreteScheduler
+ - **Training timesteps**: 1000 (`num_train_timesteps`)
+ - **Framework**: Diffusers 0.35.0.dev0
+
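+ ## Flow Matching Configuration
+
+ The model uses FlowMatchEulerDiscreteScheduler with the following key parameters:
+ - **Shift**: 1.0 (static shift; a value of 1.0 leaves the sigma schedule unchanged)
+ - **Dynamic shifting**: Disabled (`use_dynamic_shifting` is `false`)
+ - **Time shift type**: exponential (applies only when dynamic shifting is enabled)
+ - **Base shift**: 0.5
+ - **Max shift**: 1.15
+ - **Base image sequence length**: 256
+ - **Max image sequence length**: 4096
+ - **Stochastic sampling**: Disabled, so each sampling step is a deterministic Euler update
+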
+ ## Usage
+
+ ### Basic Generation
+
+ ```python
+ from diffusers import DDPMPipeline
+
+ # Load the flow matching model
+ pipeline = DDPMPipeline.from_pretrained("FrankCCCCC/cfm-cifar10-32")
+
+ # Generate an image
+ image = pipeline().images[0]
+ image.save("generated_cifar10.png")
+ ```
+
+ ### Custom Inference Steps
+
+ ```python
+ from diffusers import DDPMPipeline
+
+ pipeline = DDPMPipeline.from_pretrained("FrankCCCCC/cfm-cifar10-32")
+
+ # Pass the step count to the pipeline call. The pipeline sets the scheduler
+ # timesteps itself, so a manual scheduler.set_timesteps() would be overwritten.
+ num_inference_steps: int = 1000
+ image = pipeline(num_inference_steps=num_inference_steps).images[0]
+ image.save("fast_generated_cifar10.png")
+ ```
+
+ ### Batch Generation
+
+ ```python
+ from diffusers import DDPMPipeline
+
+ pipeline = DDPMPipeline.from_pretrained("FrankCCCCC/cfm-cifar10-32")
+
+ # Generate multiple images at once
+ images = pipeline(batch_size=4).images
+ for i, image in enumerate(images):
+     image.save(f"generated_cifar10_{i}.png")
+ ```
+
+ ## Flow Matching vs Standard Diffusion
+
+ This model implements flow matching, which the flow matching literature reports as having several advantages over standard diffusion models:
+ - **Faster sampling**: The learned ODE can often be solved in fewer steps
+ - **Better training stability**: Regressing a velocity field along simple probability paths tends to give smoother optimization
+ - **Flexible scheduling**: The scheduler supports static and dynamic time shifting of the sigma schedule; this checkpoint stores `shift` 1.0 with dynamic shifting disabled
+
+ ## Model Architecture
+
+ - **UNet**: Standard UNet2DModel for flow prediction, with four resolution levels (128, 256, 256, and 256 channels) and attention in the second level
+ - **Scheduler**: FlowMatchEulerDiscreteScheduler with `shift` 1.0 and dynamic shifting disabled
+ - **Output**: 32x32 RGB images intended to match the CIFAR-10 distribution
+
+ ## Requirements
+
+ ```bash
+ pip install diffusers torch torchvision
+ ```
+
+ ## Samples
+ 1. ![sample_1](https://huggingface.co/FrankCCCCC/cfm-cifar10-32/resolve/main/images/sample0.jpg)
+ 2. ![sample_2](https://huggingface.co/FrankCCCCC/cfm-cifar10-32/resolve/main/images/sample1.jpg)
+ 3. ![sample_3](https://huggingface.co/FrankCCCCC/cfm-cifar10-32/resolve/main/images/sample2.jpg)
+ 4. ![sample_4](https://huggingface.co/FrankCCCCC/cfm-cifar10-32/resolve/main/images/sample3.jpg)
+ 5. ![sample_5](https://huggingface.co/FrankCCCCC/cfm-cifar10-32/resolve/main/images/sample4.jpg)
+
+ ## Citation
+
+ If you use this model, please cite the original flow matching and diffusion literature:
+
+ ```bibtex
+ @inproceedings{DDPM,
+   author = {Ho, Jonathan and Jain, Ajay and Abbeel, Pieter},
+   booktitle = {Advances in Neural Information Processing Systems},
+   title = {Denoising Diffusion Probabilistic Models},
+   url = {https://proceedings.neurips.cc/paper_files/paper/2020/file/4c5bcfec8584af0d967f1ab10179ca4b-Paper.pdf},
+   year = {2020}
+ }
+
+ @inproceedings{FM,
+   title = {Flow Matching for Generative Modeling},
+   author = {Yaron Lipman and Ricky T. Q. Chen and Heli Ben-Hamu and Maximilian Nickel and Matthew Le},
+   booktitle = {The Eleventh International Conference on Learning Representations},
+   year = {2023},
+   url = {https://openreview.net/forum?id=PqvMRDCJT9t}
+ }
+ ```
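The README states that sampling runs through FlowMatchEulerDiscreteScheduler with stochastic sampling disabled. As a rough illustration of what a deterministic Euler update of that kind does, the sketch below integrates a toy one-dimensional flow from sigma = 1 (noise) to sigma = 0 (data). The names `linear_sigmas`, `point_mass_velocity`, and `euler_sample` are hypothetical, and the velocity field is a closed-form stand-in for the trained UNet, assuming the path `x = sigma * noise + (1 - sigma) * data`; it is not the repository's sampling code.

```python
from typing import Callable, List


def linear_sigmas(num_steps: int) -> List[float]:
    """Evenly spaced noise levels from 1.0 (pure noise) down to 0.0 (data)."""
    return [1.0 - i / num_steps for i in range(num_steps + 1)]


def point_mass_velocity(data: float) -> Callable[[float, float], float]:
    """Toy velocity field (ASSUMPTION: the target distribution is one point).

    On the path x = sigma * noise + (1 - sigma) * data the velocity
    dx/dsigma equals noise - data, which is (x - data) / sigma.
    """
    def velocity(x: float, sigma: float) -> float:
        return (x - data) / sigma

    return velocity


def euler_sample(
    velocity: Callable[[float, float], float],
    x_start: float,
    num_steps: int,
) -> float:
    """Integrate dx/dsigma = velocity(x, sigma) from sigma=1 to sigma=0."""
    sigmas = linear_sigmas(num_steps)
    x = x_start
    for sigma, sigma_next in zip(sigmas[:-1], sigmas[1:]):
        # One Euler step: move along the velocity by the (negative) sigma gap.
        x = x + (sigma_next - sigma) * velocity(x, sigma)
    return x


# Start from a "noise" value and flow towards the single data point 0.25.
sample = euler_sample(point_mass_velocity(0.25), x_start=-1.3, num_steps=10)
```

In this toy case the path is a straight line, so the number of steps does not change the result; a trained network only approximates the velocity, which is why the step count matters in practice.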
config.json ADDED
@@ -0,0 +1,29 @@
+ {
+   "beta_1": 0.9,
+   "beta_2": 0.999,
+   "epsilon": 1e-08,
+   "lr_sched_num_warmup_steps": 45000,
+   "lr_sched_lr_end": 1e-07,
+   "lr_sched_power": 1.0,
+   "ep_model_dir": "epochs",
+   "output_dir": "fm_cifar10",
+   "ckpt_dir": "ckpt",
+   "data_ckpt_dir": "data.ckpt",
+   "is_save_all_model_epochs": false,
+   "args_key": "args",
+   "default_key": "default",
+   "final_key": "final",
+   "config_file": "config.json",
+   "project": "cfm-training",
+   "run_name": "train_cfm",
+   "model_id": "google/ddpm-cifar10-32",
+   "batch_size": 256,
+   "num_epochs": 1000,
+   "lr": 0.0005,
+   "weight_decay": 0.0,
+   "num_train_timesteps": 1000,
+   "num_inference_steps": 1000,
+   "sigma_min": 0.0,
+   "seed": 42,
+   "device": "cuda:0"
+ }
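The keys `lr_sched_num_warmup_steps`, `lr_sched_lr_end`, and `lr_sched_power` in this training config suggest a linear warmup followed by polynomial decay, though the commit does not include the training script that would confirm it. The sketch below shows what such a schedule would look like with the stored values. The function name is hypothetical, and the total of 196,000 optimizer steps is an assumption derived from 50,000 CIFAR-10 training images, a batch size of 256 (196 steps per epoch), and 1000 epochs.

```python
def polynomial_decay_with_warmup(
    step: int,
    lr_init: float = 5e-4,               # "lr"
    lr_end: float = 1e-7,                # "lr_sched_lr_end"
    power: float = 1.0,                  # "lr_sched_power"
    num_warmup_steps: int = 45_000,      # "lr_sched_num_warmup_steps"
    num_training_steps: int = 196_000,   # ASSUMPTION: 196 steps/epoch * 1000 epochs
) -> float:
    """Learning rate at a given optimizer step (hypothetical reconstruction)."""
    if step < num_warmup_steps:
        # Linear warmup from 0 to lr_init.
        return lr_init * step / max(1, num_warmup_steps)
    if step >= num_training_steps:
        return lr_end
    # Polynomial decay from lr_init to lr_end; power=1.0 makes it linear.
    decay_steps = num_training_steps - num_warmup_steps
    remaining = 1.0 - (step - num_warmup_steps) / decay_steps
    return (lr_init - lr_end) * remaining ** power + lr_end


# Learning rate halfway through warmup and halfway through the decay phase.
lr_mid_warmup = polynomial_decay_with_warmup(22_500)
lr_mid_decay = polynomial_decay_with_warmup(120_500)
```

Under these assumptions, warmup covers a little under a quarter of training, and the rate then falls linearly to `lr_sched_lr_end`.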
images/sample0.jpg ADDED
images/sample1.jpg ADDED
images/sample2.jpg ADDED
images/sample3.jpg ADDED
images/sample4.jpg ADDED
model_index.json ADDED
@@ -0,0 +1,13 @@
+ {
+   "_class_name": "DDPMPipeline",
+   "_diffusers_version": "0.35.0.dev0",
+   "_name_or_path": "/home/sc3379/workspace/research/cfm-cifar10-32",
+   "scheduler": [
+     "diffusers",
+     "FlowMatchEulerDiscreteScheduler"
+   ],
+   "unet": [
+     "diffusers",
+     "UNet2DModel"
+   ]
+ }
scheduler/scheduler_config.json ADDED
@@ -0,0 +1,18 @@
+ {
+   "_class_name": "FlowMatchEulerDiscreteScheduler",
+   "_diffusers_version": "0.35.0.dev0",
+   "base_image_seq_len": 256,
+   "base_shift": 0.5,
+   "invert_sigmas": false,
+   "max_image_seq_len": 4096,
+   "max_shift": 1.15,
+   "num_train_timesteps": 1000,
+   "shift": 1.0,
+   "shift_terminal": null,
+   "stochastic_sampling": false,
+   "time_shift_type": "exponential",
+   "use_beta_sigmas": false,
+   "use_dynamic_shifting": false,
+   "use_exponential_sigmas": false,
+   "use_karras_sigmas": false
+ }
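To make the interaction between `shift`, `use_dynamic_shifting`, and `time_shift_type` in this scheduler config concrete, the sketch below writes out the two sigma transforms as plain functions. The function names are hypothetical and the formulas are an assumption about how the scheduler applies shifting, so treat this as an illustration and check the Diffusers source for the authoritative version.

```python
import math


def static_shift(sigma: float, shift: float) -> float:
    """Static shift, assumed to apply when use_dynamic_shifting is false."""
    return shift * sigma / (1.0 + (shift - 1.0) * sigma)


def exponential_time_shift(mu: float, sigma: float, power: float = 1.0) -> float:
    """Exponential time shift, assumed to apply only with dynamic shifting.

    `mu` is supplied by the caller at inference time; sigma must be in (0, 1].
    """
    return math.exp(mu) / (math.exp(mu) + (1.0 / sigma - 1.0) ** power)


# With the stored shift of 1.0 the static transform is the identity.
unchanged = [static_shift(s, shift=1.0) for s in (0.25, 0.5, 1.0)]

# A larger shift (hypothetical value 3.0) pushes sigmas towards the noisy end.
shifted_half = static_shift(0.5, shift=3.0)
```

If these formulas hold, the stored values of `base_shift`, `max_shift`, and `time_shift_type` do not influence sampling for this checkpoint, because dynamic shifting is switched off.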
unet/config.json ADDED
@@ -0,0 +1,48 @@
+ {
+   "_class_name": "UNet2DModel",
+   "_diffusers_version": "0.35.0.dev0",
+   "_name_or_path": "/home/sc3379/workspace/research/cfm-cifar10-32/unet",
+   "act_fn": "silu",
+   "add_attention": true,
+   "attention_head_dim": null,
+   "attn_norm_num_groups": null,
+   "block_out_channels": [
+     128,
+     256,
+     256,
+     256
+   ],
+   "center_input_sample": false,
+   "class_embed_type": null,
+   "down_block_types": [
+     "DownBlock2D",
+     "AttnDownBlock2D",
+     "DownBlock2D",
+     "DownBlock2D"
+   ],
+   "downsample_padding": 0,
+   "downsample_type": "conv",
+   "dropout": 0.0,
+   "flip_sin_to_cos": false,
+   "freq_shift": 1,
+   "in_channels": 3,
+   "layers_per_block": 2,
+   "mid_block_scale_factor": 1,
+   "mid_block_type": "UNetMidBlock2D",
+   "norm_eps": 1e-06,
+   "norm_num_groups": 32,
+   "num_class_embeds": null,
+   "num_train_timesteps": null,
+   "out_channels": 3,
+   "resnet_time_scale_shift": "default",
+   "sample_size": 32,
+   "time_embedding_dim": null,
+   "time_embedding_type": "positional",
+   "up_block_types": [
+     "UpBlock2D",
+     "UpBlock2D",
+     "AttnUpBlock2D",
+     "UpBlock2D"
+   ],
+   "upsample_type": "conv"
+ }
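The `down_block_types` list above determines the resolution at which self-attention runs. The sketch below walks the encoder and records the spatial size each block operates at. It assumes every down block except the last halves the resolution with a stride-2 downsample; the helper name `down_block_resolutions` is hypothetical.

```python
from typing import List, Tuple


def down_block_resolutions(
    sample_size: int, down_block_types: List[str]
) -> List[Tuple[str, int]]:
    """Return (block type, input resolution) for each encoder block.

    ASSUMPTION: each block except the final one ends with a stride-2
    downsample that halves the spatial size.
    """
    resolutions = []
    size = sample_size
    last_index = len(down_block_types) - 1
    for index, block_type in enumerate(down_block_types):
        resolutions.append((block_type, size))
        if index != last_index:
            size //= 2
    return resolutions


# Values copied from unet/config.json.
levels = down_block_resolutions(
    32, ["DownBlock2D", "AttnDownBlock2D", "DownBlock2D", "DownBlock2D"]
)
attention_sizes = [size for name, size in levels if name.startswith("Attn")]
```

Under that assumption the blocks operate at 32, 16, 8, and 4 pixels, with attention applied at 16x16.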
unet/diffusion_pytorch_model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:97d25692dbd390a357e7375966ecd521418d7a9623a01037dc1aeef809142980
+ size 143020060
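The Git LFS pointer above records only the hash and byte size of the weights file, but the size allows a rough parameter-count estimate. The sketch below parses the pointer and divides the size by four, assuming the weights are stored as float32 and ignoring the small safetensors header; the helper name `parse_lfs_pointer` is hypothetical.

```python
from typing import Dict


def parse_lfs_pointer(text: str) -> Dict[str, str]:
    """Split each 'key value' line of a Git LFS pointer into a dict entry."""
    fields: Dict[str, str] = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:97d25692dbd390a357e7375966ecd521418d7a9623a01037dc1aeef809142980
size 143020060
"""

pointer = parse_lfs_pointer(POINTER)
size_bytes = int(pointer["size"])

# ASSUMPTION: float32 weights (4 bytes each); the header is not subtracted.
approx_params = size_bytes // 4
```

With these assumptions the estimate comes to roughly 35.7 million parameters.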