What is this?
This is an initial version of the Stable Diffusion 1.5 base model, with its noise scheduler/prediction objective replaced with FlowMatchEulerDiscrete.
This model probably has a bunch of low-quality stuff in it; the base SD model may give better output in many regards. The reason this model exists is to let other people take advantage of FlowMatch for their own finetunes and other experiments.
For that reason, this is a FULL FP32 precision model. But the sample code below loads it as bf16.
Usage note
The original diffusers module for stable_diffusion has a hardcoded check that stops this from working. I have submitted a patch that was accepted, but as far as I know it has not been added to an official release yet, so diffusers 0.34.0 won't work with this model. To use it right now, you either need my tweaked code, imgsample-hacked.py, or you need to install the current git version of diffusers, e.g.:
```
pip install git+https://github.com/huggingface/diffusers
```
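To sanity-check what you ended up with, the git install should report a dev version newer than 0.34.0:

```python
import diffusers
print(diffusers.__version__)  # e.g. "0.35.0.dev0" from the git main branch
```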
You should then be able to use typical diffusers code. For example:

```python
from diffusers import DiffusionPipeline
import torch

MODEL = "opendiffusionai/sd-flow-alpha"
OUTDIR = "."  # where to save samples
STEPS = 30    # number of inference steps; adjust to taste

pipe = DiffusionPipeline.from_pretrained(
    MODEL, use_safetensors=True,
    safety_checker=None, requires_safety_checker=False,
    torch_dtype=torch.bfloat16,
)
pipe.enable_sequential_cpu_offload()

generator = torch.Generator("cpu").manual_seed(0)
prompt = "Some pretty photo of something"
images = pipe(prompt, num_inference_steps=STEPS, generator=generator).images

for i, image in enumerate(images):
    fname = f"{OUTDIR}/sample{i}.png"
    print(f"saving to {fname}")
    image.save(fname)
```
ComfyUI note
From the author:
"It works fine in comfy: just load the unet with the Load Diffusion Model node and hook it to a ModelSamplingSD3 node. For the clip/vae you can just use the ones from the SD1.5 checkpoint."
Making your own FlowMatch model
Doing the training itself did not take that long. Writing my own functional training code, and trying various pathways to find what works, took WEEKS.
That, and putting together a 40k clean ALL-SQUARE IMAGE DATASET.
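For reference, the core of a flow-matching training step looks roughly like this. (A minimal sketch, not my actual trainer: `latents`, `noise`, `text_emb`, and `unet` are assumed to come from the usual SD1.5 VAE / text-encoder / unet plumbing.)

```python
import torch
import torch.nn.functional as F

# Flow matching: pick a random time t in [0, 1] per sample...
t = torch.rand(latents.shape[0], device=latents.device)
t_ = t[:, None, None, None]

# ...linearly interpolate between clean latents and noise...
noisy = (1.0 - t_) * latents + t_ * noise
timesteps = t * 1000  # scale to the scheduler's 1000-step timestep range

# ...and train the unet to predict the constant velocity along that path.
pred = unet(noisy, timesteps, encoder_hidden_states=text_emb).sample
target = noise - latents
loss = F.mse_loss(pred.float(), target.float())
```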
If you wanted to recreate your own from scratch, here are the details from one of my runs. (This only takes a few hours to complete on a 4090.)
First, download the SD base model in diffusers format, and hand-edit the model_index.json and scheduler/scheduler_config.json files. (I was going to detail it here, but... just copy/look at the files in this repo. I linked them, after all!)
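If you don't want to hand-edit, here is a sketch of doing the same swap in code and re-saving the configs (paths are placeholders):

```python
# Sketch: swap the scheduler in code instead of hand-editing the JSON files.
from diffusers import DiffusionPipeline, FlowMatchEulerDiscreteScheduler

pipe = DiffusionPipeline.from_pretrained("path/to/sd15-base-diffusers")
pipe.scheduler = FlowMatchEulerDiscreteScheduler(num_train_timesteps=1000)
# Writes an updated model_index.json and scheduler/scheduler_config.json.
pipe.save_pretrained("path/to/sd15-flowmatch-base")
```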
(Batch size 40, accum=1 for all. A sketch of the selective block unfreezing is shown after this list.)
- time blocks only, 1e-5, 350 steps (results are very murky here; that's expected)
- up.0 and up.1, 1e-6, 75 steps
- mid, 1e-6, 60 steps
- up.2, 1e-6, 160 steps
- up.3, 1e-6, 120 steps
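Here's a minimal sketch (not my actual training code) of how that kind of selective unfreezing might look on a diffusers UNet2DConditionModel. Mapping "time blocks" to the time_embedding / time_emb_proj parameter names is an assumption here:

```python
import torch
from diffusers import UNet2DConditionModel

unet = UNet2DConditionModel.from_pretrained(
    "path/to/sd15-flowmatch-base", subfolder="unet")

def train_only(unet, substrings):
    # Freeze everything, then unfreeze params whose names match.
    for name, p in unet.named_parameters():
        p.requires_grad = any(s in name for s in substrings)

# Phase 1: time-conditioning params only (assumed mapping), lr 1e-5
train_only(unet, ["time_embedding", "time_emb_proj"])
opt = torch.optim.AdamW(
    (p for p in unet.parameters() if p.requires_grad), lr=1e-5)

# Later phases reuse the same pattern at lr 1e-6, e.g.:
# train_only(unet, ["up_blocks.0", "up_blocks.1"])
# train_only(unet, ["mid_block"])
# train_only(unet, ["up_blocks.2"])
# train_only(unet, ["up_blocks.3"])
```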
Sampling
During the first phase, sample maybe every 50 steps. After the first phase, you'll want to take samples every 10 steps. Make sure you use MULTIPLE sample prompts, ideally of different types: you should have at least one "single token" prompt, and then a few more complex ones. A rough sketch of a sampling hook is below.
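For example, something along these lines (the prompts and cadence here are just illustrative, not what I actually used):

```python
import os
import torch

SAMPLE_PROMPTS = [
    "cat",  # "single token" sanity check
    "a photo of a woman walking a dog in a park",
    "oil painting of a sailing ship in a storm",
]

def maybe_sample(pipe, step, every, outdir="samples"):
    if step % every != 0:
        return
    os.makedirs(outdir, exist_ok=True)
    # Fixed seed so samples from different steps are comparable.
    gen = torch.Generator("cpu").manual_seed(0)
    images = pipe(SAMPLE_PROMPTS, num_inference_steps=20, generator=gen).images
    for i, img in enumerate(images):
        img.save(f"{outdir}/step{step}_prompt{i}.png")
```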