Trained for 0 epochs and 6750 steps.

Trained with datasets ['text-embed-cache', 'grayscale-lensing-256', 'grayscale-lensing-512']
Learning rate 1e-06, batch size 8, and 1 gradient accumulation steps.
Used DDPM noise scheduler for training with epsilon prediction type and rescaled_betas_zero_snr=False
Using 'trailing' timestep spacing.
Base model: kwai-kolors/kolors-diffusers
VAE: madebyollin/sdxl-vae-fp16-fix

Files changed (15)

.gitattributes +2 -0
README.md +130 -0
assets/image_0_0.png +3 -0
assets/image_1_0.png +3 -0
model_index.json +26 -0
scheduler/scheduler_config.json +28 -0
tokenizer/special_tokens_map.json +1 -0
tokenizer/tokenizer.model +3 -0
tokenizer/tokenizer_config.json +21 -0
unet/config.json +73 -0
unet/diffusion_pytorch_model-00001-of-00002.safetensors +3 -0
unet/diffusion_pytorch_model-00002-of-00002.safetensors +3 -0
unet/diffusion_pytorch_model.safetensors.index.json +0 -0
vae/config.json +38 -0
vae/diffusion_pytorch_model.safetensors +3 -0

.gitattributes CHANGED

@@ -33,3 +33,5 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+assets/image_0_0.png filter=lfs diff=lfs merge=lfs -text
+assets/image_1_0.png filter=lfs diff=lfs merge=lfs -text

README.md ADDED

	@@ -0,0 +1,130 @@

+---
+license: apache-2.0
+base_model: "kwai-kolors/kolors-diffusers"
+tags:
+  - kolors
+  - kolors-diffusers
+  - text-to-image
+  - diffusers
+  - simpletuner
+  - safe-for-work
+  - full
+inference: true
+widget:
+- text: 'unconditional (blank prompt)'
+  parameters:
+    negative_prompt: 'blurry, cropped, ugly'
+  output:
+    url: ./assets/image_0_0.png
+- text: 'gravitational lensing effects on galaxy'
+  parameters:
+    negative_prompt: 'blurry, cropped, ugly'
+  output:
+    url: ./assets/image_1_0.png
+---
+# gravlens-grayscale
+This is a full rank finetune derived from [kwai-kolors/kolors-diffusers](https://huggingface.co/kwai-kolors/kolors-diffusers).
+The main validation prompt used during training was:
+```
+gravitational lensing effects on galaxy
+```
+## Validation settings
+- CFG: `5.0`
+- CFG Rescale: `0.0`
+- Steps: `20`
+- Sampler: `None`
+- Seed: `42`
+- Resolution: `512x512`
+Note: The validation settings are not necessarily the same as the [training settings](#training-settings).
+You can find some example images in the following gallery:
+<Gallery />
+The text encoder **was not** trained.
+You may reuse the base model text encoder for inference.
+## Training settings
+- Training epochs: 0
+- Training steps: 6750
+- Learning rate: 1e-06
+  - Learning rate schedule: constant
+  - Warmup steps: 675
+- Max grad norm: 2.0
+- Effective batch size: 8
+  - Micro-batch size: 8
+  - Gradient accumulation steps: 1
+  - Number of GPUs: 1
+- Gradient checkpointing: True
+- Prediction type: epsilon (extra parameters=['training_scheduler_timestep_spacing=trailing', 'inference_scheduler_timestep_spacing=trailing'])
+- Optimizer: optimi-lion
+- Trainable parameter precision: Pure BF16
+- Caption dropout probability: 10.0%
+## Datasets
+### grayscale-lensing-256
+- Repeats: 15
+- Total number of images: 3689
+- Total number of aspect buckets: 1
+- Resolution: 0.065536 megapixels
+- Cropped: False
+- Crop style: None
+- Crop aspect: None
+- Used for regularisation data: No
+### grayscale-lensing-512
+- Repeats: 15
+- Total number of images: 1801
+- Total number of aspect buckets: 1
+- Resolution: 0.262144 megapixels
+- Cropped: False
+- Crop style: None
+- Crop aspect: None
+- Used for regularisation data: No
+## Inference
+```python
+import torch
+from diffusers import DiffusionPipeline
+model_id = 'GazTrab/gravlens-grayscale'
+pipeline = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float32) # loads the weights in float32
+prompt = "gravitational lensing effects on galaxy"
+negative_prompt = 'blurry, cropped, ugly'
+pipeline.to('cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu') # the pipeline is already in its target precision level
+image = pipeline(
+    prompt=prompt,
+    negative_prompt=negative_prompt,
+    num_inference_steps=20,
+    generator=torch.Generator(device='cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu').manual_seed(42),
+    width=512,
+    height=512,
+    guidance_scale=5.0,
+    guidance_rescale=0.0,
+).images[0]
+image.save("output.png", format="PNG")
+```
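The step, batch, and dataset figures in the README can be cross-checked with a little arithmetic. The sketch below is illustrative only: the image side lengths (256 and 512) are inferred from the dataset names and megapixel values, and it assumes that `Repeats: 15` means each image is seen 1 + 15 times per epoch, which the README does not state. Under those assumptions, 6750 steps at batch size 8 cover less than one pass over the data, which would be consistent with the reported 0 epochs.

```python
# Values copied from the README above; the repeats convention is an ASSUMPTION.
steps = 6750
effective_batch = 8 * 1 * 1  # micro-batch x gradient accumulation x GPUs
warmup_steps = 675

datasets = {
    "grayscale-lensing-256": {"images": 3689, "side": 256, "repeats": 15},
    "grayscale-lensing-512": {"images": 1801, "side": 512, "repeats": 15},
}

# Total samples drawn over the whole run.
samples_seen = steps * effective_batch

# ASSUMPTION: repeats=N means N extra passes, so each image counts (1 + N) times.
samples_per_epoch = sum(d["images"] * (1 + d["repeats"]) for d in datasets.values())

# Square images: megapixels = side^2 / 1e6.
megapixels = {name: d["side"] ** 2 / 1e6 for name, d in datasets.items()}

epoch_fraction = samples_seen / samples_per_epoch
warmup_fraction = warmup_steps / steps

print(f"samples seen:      {samples_seen}")
print(f"samples per epoch: {samples_per_epoch}")
print(f"epoch fraction:    {epoch_fraction:.3f}")
print(f"warmup fraction:   {warmup_fraction:.2f}")
print(f"megapixels:        {megapixels}")
```

If the trainer counts repeats differently, only `samples_per_epoch` and `epoch_fraction` change; the megapixel and warmup figures do not depend on that assumption.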

assets/image_0_0.png ADDED

Git LFS Details

SHA256: 2bb77a46328c8b1f3516e8693c3ac9f7e7bd2cac03cfaf24a0999f3bfd2f7d36
Pointer size: 131 Bytes
Size of remote file: 757 kB

assets/image_1_0.png ADDED

Git LFS Details

SHA256: 83e894b97ccd1ecec7ca941909d51d2247ae5b5d9d15b380ee8b2c97f9913507
Pointer size: 132 Bytes
Size of remote file: 1.04 MB

model_index.json ADDED

	@@ -0,0 +1,26 @@

+{
+  "_class_name": "KolorsPipeline",
+  "_diffusers_version": "0.32.2",
+  "_name_or_path": "kwai-kolors/kolors-diffusers",
+  "force_zeros_for_empty_prompt": false,
+  "scheduler": [
+    "diffusers",
+    "EulerDiscreteScheduler"
+  ],
+  "text_encoder": [
+    null,
+    null
+  ],
+  "tokenizer": [
+    "kolors",
+    "ChatGLMTokenizer"
+  ],
+  "unet": [
+    "diffusers",
+    "UNet2DConditionModel"
+  ],
+  "vae": [
+    "diffusers",
+    "AutoencoderKL"
+  ]
+}
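The `text_encoder` entry in this index is `[null, null]`, which matches the README's note that the text encoder was not trained and can be taken from the base model. As a minimal sketch (the helper name `missing_components` is hypothetical, not part of any library), the following scans a `model_index.json` payload for components that have no library and class recorded:

```python
import json

# The component entries from model_index.json above, embedded for a self-contained example.
model_index = json.loads("""
{
  "_class_name": "KolorsPipeline",
  "_diffusers_version": "0.32.2",
  "_name_or_path": "kwai-kolors/kolors-diffusers",
  "force_zeros_for_empty_prompt": false,
  "scheduler": ["diffusers", "EulerDiscreteScheduler"],
  "text_encoder": [null, null],
  "tokenizer": ["kolors", "ChatGLMTokenizer"],
  "unet": ["diffusers", "UNet2DConditionModel"],
  "vae": ["diffusers", "AutoencoderKL"]
}
""")


def missing_components(index):
    """Return component names whose [library, class] pair is entirely null."""
    missing = []
    for name, value in index.items():
        if name.startswith("_"):
            continue  # metadata keys such as _class_name
        if isinstance(value, list) and all(v is None for v in value):
            missing.append(name)
    return missing


print(missing_components(model_index))
```

Any component reported here would presumably need to be supplied separately at load time, for example from the base model repository.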

scheduler/scheduler_config.json ADDED

	@@ -0,0 +1,28 @@

+{
+  "_class_name": "EulerDiscreteScheduler",
+  "_diffusers_version": "0.32.2",
+  "beta_end": 0.014,
+  "beta_schedule": "scaled_linear",
+  "beta_start": 0.00085,
+  "clip_sample": false,
+  "clip_sample_range": 1.0,
+  "dynamic_thresholding_ratio": 0.995,
+  "final_sigmas_type": "zero",
+  "interpolation_type": "linear",
+  "num_train_timesteps": 1100,
+  "prediction_type": "epsilon",
+  "rescale_betas_zero_snr": false,
+  "sample_max_value": 1.0,
+  "set_alpha_to_one": false,
+  "sigma_max": null,
+  "sigma_min": null,
+  "skip_prk_steps": true,
+  "steps_offset": 1,
+  "thresholding": false,
+  "timestep_spacing": "leading",
+  "timestep_type": "discrete",
+  "trained_betas": null,
+  "use_beta_sigmas": false,
+  "use_exponential_sigmas": false,
+  "use_karras_sigmas": false
+}
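Note that the commit message and README report `'trailing'` timestep spacing, while this saved config records `"timestep_spacing": "leading"`. To illustrate what that difference means for a 20-step run, here is a pure-Python sketch of the two spacing rules as they are commonly implemented in diffusers' Euler scheduler. The function names are hypothetical and the formulas are a best-effort reading of the library, not a copy of its code.

```python
def leading_timesteps(num_train, num_steps, steps_offset=0):
    """'leading': integer stride counted up from 0, then shifted by steps_offset."""
    step_ratio = num_train // num_steps
    return [i * step_ratio + steps_offset for i in reversed(range(num_steps))]


def trailing_timesteps(num_train, num_steps):
    """'trailing': stride counted down from num_train, so the final training timestep is included."""
    step_ratio = num_train / num_steps
    return [round(num_train - i * step_ratio) - 1 for i in range(num_steps)]


# Values from scheduler_config.json above and the README's validation settings.
NUM_TRAIN_TIMESTEPS = 1100
STEPS_OFFSET = 1
NUM_INFERENCE_STEPS = 20

leading = leading_timesteps(NUM_TRAIN_TIMESTEPS, NUM_INFERENCE_STEPS, STEPS_OFFSET)
trailing = trailing_timesteps(NUM_TRAIN_TIMESTEPS, NUM_INFERENCE_STEPS)

print("leading :", leading[0], "...", leading[-1])
print("trailing:", trailing[0], "...", trailing[-1])
```

If trailing spacing is wanted at inference, one option is to rebuild the scheduler from the saved config with an override, along the lines of `EulerDiscreteScheduler.from_config(pipeline.scheduler.config, timestep_spacing="trailing")`; whether that improves results for this checkpoint is not something the source establishes.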

tokenizer/special_tokens_map.json ADDED

	@@ -0,0 +1 @@

+{}

tokenizer/tokenizer.model ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:e7dc4c393423b76e4373e5157ddc34803a0189ba96b21ddbb40269d31468a6f2
+size 1018370
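The three added lines are a Git LFS pointer: the repository stores this small text stub, and the actual file is fetched by its SHA-256 digest. A minimal sketch of reading such a pointer follows; `parse_lfs_pointer` is a hypothetical helper, not part of Git LFS or any library.

```python
def parse_lfs_pointer(text):
    """Parse the key/value lines of a Git LFS pointer file into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    hash_algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "hash_algo": hash_algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file in bytes
    }


# Pointer content copied from the diff above (without the leading '+').
pointer_text = """\
version https://git-lfs.github.com/spec/v1
oid sha256:e7dc4c393423b76e4373e5157ddc34803a0189ba96b21ddbb40269d31468a6f2
size 1018370
"""

pointer = parse_lfs_pointer(pointer_text)
print(pointer["hash_algo"], pointer["size"])
```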

tokenizer/tokenizer_config.json ADDED

	@@ -0,0 +1,21 @@

+{
+  "added_tokens_decoder": {},
+  "auto_map": {
+    "AutoTokenizer": [
+      "kwai-kolors/kolors-diffusers--tokenization_chatglm.ChatGLMTokenizer",
+      null
+    ]
+  },
+  "clean_up_tokenization_spaces": false,
+  "do_lower_case": false,
+  "encode_special_tokens": false,
+  "eos_token": "</s>",
+  "extra_special_tokens": {},
+  "model_max_length": 1000000000000000019884624838656,
+  "pad_token": "<unk>",
+  "padding_side": "left",
+  "remove_space": false,
+  "tokenizer_class": "ChatGLMTokenizer",
+  "unk_token": "<unk>",
+  "use_fast": false
+}
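The odd-looking `model_max_length` is not a real sequence limit. It equals `int(1e30)`, which is commonly the placeholder written out when no maximum length has been configured for a tokenizer (stated here as background, not something this commit documents). A quick check:

```python
# Value copied from tokenizer_config.json above.
model_max_length = 1000000000000000019884624838656

# 1e30 is not exactly representable as a float, so converting it to int
# yields this long number rather than a 1 followed by 30 zeros.
is_sentinel = model_max_length == int(1e30)
print(is_sentinel)
```

In practice this suggests passing an explicit maximum length when tokenizing, rather than relying on the configured value.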

unet/config.json ADDED

	@@ -0,0 +1,73 @@

+{
+  "_class_name": "UNet2DConditionModel",
+  "_diffusers_version": "0.32.2",
+  "_name_or_path": "output/models/checkpoint-2700",
+  "act_fn": "silu",
+  "addition_embed_type": "text_time",
+  "addition_embed_type_num_heads": 64,
+  "addition_time_embed_dim": 256,
+  "attention_head_dim": [
+    5,
+    10,
+    20
+  ],
+  "attention_type": "default",
+  "block_out_channels": [
+    320,
+    640,
+    1280
+  ],
+  "center_input_sample": false,
+  "class_embed_type": null,
+  "class_embeddings_concat": false,
+  "conv_in_kernel": 3,
+  "conv_out_kernel": 3,
+  "cross_attention_dim": 2048,
+  "cross_attention_norm": null,
+  "down_block_types": [
+    "DownBlock2D",
+    "CrossAttnDownBlock2D",
+    "CrossAttnDownBlock2D"
+  ],
+  "downsample_padding": 1,
+  "dropout": 0.0,
+  "dual_cross_attention": false,
+  "encoder_hid_dim": 4096,
+  "encoder_hid_dim_type": "text_proj",
+  "flip_sin_to_cos": true,
+  "freq_shift": 0,
+  "in_channels": 4,
+  "layers_per_block": 2,
+  "mid_block_only_cross_attention": null,
+  "mid_block_scale_factor": 1,
+  "mid_block_type": "UNetMidBlock2DCrossAttn",
+  "norm_eps": 1e-05,
+  "norm_num_groups": 32,
+  "num_attention_heads": null,
+  "num_class_embeds": null,
+  "only_cross_attention": false,
+  "out_channels": 4,
+  "projection_class_embeddings_input_dim": 5632,
+  "resnet_out_scale_factor": 1.0,
+  "resnet_skip_time_act": false,
+  "resnet_time_scale_shift": "default",
+  "reverse_transformer_layers_per_block": null,
+  "sample_size": 128,
+  "time_cond_proj_dim": null,
+  "time_embedding_act_fn": null,
+  "time_embedding_dim": null,
+  "time_embedding_type": "positional",
+  "timestep_post_act": null,
+  "transformer_layers_per_block": [
+    1,
+    2,
+    10
+  ],
+  "up_block_types": [
+    "CrossAttnUpBlock2D",
+    "CrossAttnUpBlock2D",
+    "UpBlock2D"
+  ],
+  "upcast_attention": false,
+  "use_linear_projection": true
+}

unet/diffusion_pytorch_model-00001-of-00002.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:d422b485be0c2247962e148d7e88a713880c2261fe2d12ed2d60ed9cbc8130a7
+size 9983649912

unet/diffusion_pytorch_model-00002-of-00002.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:5fd7c26bbae870e0ecfb8201d29cad84fbd56760acc8465722da3a677e93c276
+size 334408336
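The two shard sizes give a rough idea of the UNet's size. The sketch below sums them and estimates a parameter count under two storage assumptions (4 bytes per parameter for float32, 2 for bfloat16). The stored dtype is not stated in this diff, and the safetensors header adds a small overhead, so these are approximations only.

```python
# Shard sizes in bytes, copied from the two LFS pointers above.
shard_sizes = [9_983_649_912, 334_408_336]

total_bytes = sum(shard_sizes)

# ASSUMPTION: weights dominate the file size; the dtype is unknown, so show both cases.
approx_params_fp32 = total_bytes / 4
approx_params_bf16 = total_bytes / 2

print(f"total size:          {total_bytes / 1e9:.2f} GB")
print(f"params if float32:   {approx_params_fp32 / 1e9:.2f} B")
print(f"params if bfloat16:  {approx_params_bf16 / 1e9:.2f} B")
```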

unet/diffusion_pytorch_model.safetensors.index.json ADDED

The diff for this file is too large to render. See raw diff

vae/config.json ADDED

	@@ -0,0 +1,38 @@

+{
+  "_class_name": "AutoencoderKL",
+  "_diffusers_version": "0.32.2",
+  "_name_or_path": "madebyollin/sdxl-vae-fp16-fix",
+  "act_fn": "silu",
+  "block_out_channels": [
+    128,
+    256,
+    512,
+    512
+  ],
+  "down_block_types": [
+    "DownEncoderBlock2D",
+    "DownEncoderBlock2D",
+    "DownEncoderBlock2D",
+    "DownEncoderBlock2D"
+  ],
+  "force_upcast": false,
+  "in_channels": 3,
+  "latent_channels": 4,
+  "latents_mean": null,
+  "latents_std": null,
+  "layers_per_block": 2,
+  "mid_block_add_attention": true,
+  "norm_num_groups": 32,
+  "out_channels": 3,
+  "sample_size": 512,
+  "scaling_factor": 0.13025,
+  "shift_factor": null,
+  "up_block_types": [
+    "UpDecoderBlock2D",
+    "UpDecoderBlock2D",
+    "UpDecoderBlock2D",
+    "UpDecoderBlock2D"
+  ],
+  "use_post_quant_conv": true,
+  "use_quant_conv": true
+}
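The VAE config determines how pixel resolution maps to the latent resolution the UNet works on. With four entries in `block_out_channels`, the usual diffusers convention gives a spatial downscale factor of 2^(4-1) = 8. A minimal sketch follows; `latent_shape` is a hypothetical helper written for illustration.

```python
def latent_shape(height, width, block_out_channels=(128, 256, 512, 512), latent_channels=4):
    """Return (channels, height, width) of the latent for a given pixel resolution."""
    # Each encoder block after the first halves the spatial resolution.
    factor = 2 ** (len(block_out_channels) - 1)
    if height % factor or width % factor:
        raise ValueError(f"height and width must be multiples of {factor}")
    return (latent_channels, height // factor, width // factor)


# The 512x512 validation resolution from the README.
print(latent_shape(512, 512))

# The UNet's sample_size of 128 corresponds to this pixel resolution.
unet_sample_size = 128
default_pixels = unet_sample_size * 2 ** (4 - 1)
print(default_pixels)
```

So the 512x512 validation images correspond to 64x64 latents, while the UNet's configured `sample_size` of 128 corresponds to 1024x1024 pixels.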

vae/diffusion_pytorch_model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:ba4c83179928f4f890c402a297d86a435ea020c844e323d350ca786c8a75c6c1
+size 334643268