Thatguy099 committed · verified
Commit 47ccaa4 · 1 Parent(s): aeec00e

Update DiffuseCraft.ipynb

Files changed (1)
  1. DiffuseCraft.ipynb +59 -62
DiffuseCraft.ipynb CHANGED
@@ -4,18 +4,13 @@
  "cell_type": "markdown",
  "metadata": {},
  "source": [
- "# DiffuseCraft: Text-to-Image Generation on T4 Colab\n",
+ "# DiffuseCraft: Text-to-Image Generation with Custom Model\n",
  "\n",
- "This script uses a custom Stable Diffusion model from Hugging Face for text-to-image generation, optimized for T4 GPU with low RAM usage.\n",
+ "This notebook uses a custom text-to-image model from Hugging Face to generate images from text prompts. It is optimized for use with a T4 GPU in Google Colab, with a focus on minimizing RAM usage.\n",
  "\n",
- "**Requirements**:\n",
- "- T4 GPU runtime in Colab\n",
- "- Hugging Face account and token (for gated models)\n",
+ "## Setup\n",
  "\n",
- "**Features**:\n",
- "- Uses `diffusers` library with FP16 precision\n",
- "- Enables model CPU offloading for low RAM\n",
- "- Supports custom prompts and negative prompts\n"
+ "Run the following cell to install the required libraries:"
  ]
  },
  {
@@ -24,10 +19,14 @@
  "metadata": {},
  "outputs": [],
  "source": [
- "# Install required libraries\n",
- "!pip install -q diffusers==0.21.4 transformers==4.33.0 accelerate==0.22.0\n",
- "!pip install -q torch==2.0.1 torchvision==0.15.2 --index-url https://download.pytorch.org/whl/cu118\n",
- "!pip install -q xformers==0.0.22\n"
+ "!pip install --no-cache-dir diffusers transformers torch"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Then, load the model by running the next cell. Make sure to replace `\"username/efficient-text-to-image\"` with the actual model ID from Hugging Face."
  ]
  },
  {
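Aside: the new setup cell unpins all package versions and drops the CUDA-specific torch wheel, xformers, and accelerate from the old install. Since the notebook still targets a Colab T4, a quick check that a GPU runtime is actually attached can save a failed model load later; a minimal sketch, not part of the commit:

```python
# Hypothetical sanity check (not in the commit): confirm a CUDA device is
# visible before loading the fp16 pipeline, which needs a GPU in practice.
import torch

assert torch.cuda.is_available(), "No CUDA device; select a GPU runtime in Colab."
print(torch.cuda.get_device_name(0))  # e.g. "Tesla T4" on a Colab T4 runtime
print(torch.__version__)
```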
@@ -36,15 +35,24 @@
  "metadata": {},
  "outputs": [],
  "source": [
- "# Import libraries\n",
  "import torch\n",
  "from diffusers import StableDiffusionPipeline\n",
- "from huggingface_hub import login\n",
- "import os\n",
  "\n",
- "# Set Hugging Face token (replace with your token)\n",
- "os.environ['HUGGINGFACE_TOKEN'] = 'your_hf_token_here'\n",
- "login(os.environ['HUGGINGFACE_TOKEN'])\n"
+ "model_id = \"username/efficient-text-to-image\" # Replace with actual model ID\n",
+ "pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)\n",
+ "pipe = pipe.to(\"cuda\")\n",
+ "pipe.enable_attention_slicing()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Generate Image\n",
+ "\n",
+ "Enter your text prompt in the `prompt` variable below. You can also adjust the `height`, `width`, and `num_inference_steps` to balance between image quality and resource usage. Smaller values will use less memory but may result in lower quality images.\n",
+ "\n",
+ "Run the cell to generate and display the image."
  ]
  },
  {
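This hunk also drops the Hugging Face token login that the old notebook performed, so the updated notebook can only download models that are not gated. If the custom model does require authentication, the removed step can be restored before `from_pretrained`; a minimal sketch with a placeholder token:

```python
# Hedged sketch: restore the auth step this commit removes. The token value is
# a placeholder; prefer Colab secrets or an env var over hard-coding a real one.
from huggingface_hub import login

login(token="your_hf_token_here")  # placeholder, as in the old notebook
```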
@@ -53,25 +61,24 @@
  "metadata": {},
  "outputs": [],
  "source": [
- "# Initialize the pipeline with optimizations\n",
- "model_id = 'runwayml/stable-diffusion-v1-5' # Replace with your custom HF model ID\n",
+ "prompt = \"A beautiful landscape with mountains and a river\"\n",
+ "height = 256\n",
+ "width = 256\n",
+ "num_inference_steps = 20\n",
  "\n",
- "pipe = StableDiffusionPipeline.from_pretrained(\n",
- " model_id,\n",
- " torch_dtype=torch.float16,\n",
- " use_auth_token=True\n",
- ")\n",
- "\n",
- "# Enable optimizations for T4\n",
- "pipe = pipe.to('cuda')\n",
- "pipe.enable_attention_slicing() # Reduces memory usage\n",
- "pipe.enable_model_cpu_offload() # Offloads model to CPU when not in use\n",
+ "with torch.inference_mode():\n",
+ " image = pipe(prompt, height=height, width=width, num_inference_steps=num_inference_steps).images[0]\n",
+ "from IPython.display import display\n",
+ "display(image)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Clean Up\n",
  "\n",
- "# Optional: Enable xformers for faster inference\n",
- "try:\n",
- " pipe.enable_xformers_memory_efficient_attention()\n",
- "except:\n",
- " print('xformers not supported, proceeding without it.')\n"
+ "After generating the image, you can run the following cell to clear the GPU memory, which can help if you plan to generate multiple images."
  ]
  },
  {
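Note that the deleted initialization cell also carried the T4-specific optimizations: model CPU offload and the optional xformers attention. The updated notebook keeps only `enable_attention_slicing()`. If VRAM is still tight, the dropped calls can be re-added; a sketch under the assumption that `accelerate` and `xformers` are installed (the new install cell no longer pulls them in explicitly):

```python
# Sketch: restore the optimizations removed by this commit. Note that
# enable_model_cpu_offload() manages device placement itself, so it is used
# *instead of* pipe.to("cuda"), not after it as in the old notebook.
pipe.enable_model_cpu_offload()  # requires the accelerate package

try:
    pipe.enable_xformers_memory_efficient_attention()  # optional memory saver
except Exception:
    print("xformers not available, proceeding without it.")
```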
@@ -80,37 +87,27 @@
  "metadata": {},
  "outputs": [],
  "source": [
- "# Define generation parameters\n",
- "prompt = 'A serene mountain landscape at sunset, vibrant colors, highly detailed'\n",
- "negative_prompt = 'blurry, low quality, artifacts, text, watermark'\n",
- "num_inference_steps = 30 # Lower steps for faster generation\n",
- "guidance_scale = 7.5\n",
- "\n",
- "# Generate image\n",
- "image = pipe(\n",
- " prompt,\n",
- " negative_prompt=negative_prompt,\n",
- " num_inference_steps=num_inference_steps,\n",
- " guidance_scale=guidance_scale,\n",
- " height=512,\n",
- " width=512\n",
- ").images[0]\n",
- "\n",
- "# Save and display image\n",
- "image.save('generated_image.png')\n",
- "image\n"
+ "import gc\n",
+ "gc.collect()\n",
+ "torch.cuda.empty_cache()"
  ]
  },
  {
  "cell_type": "markdown",
  "metadata": {},
  "source": [
- "## Notes\n",
- "- Replace `'your_hf_token_here'` with your Hugging Face token.\n",
- "- Replace `'runwayml/stable-diffusion-v1-5'` with your custom model ID from Hugging Face.\n",
- "- Adjust `prompt`, `negative_prompt`, `num_inference_steps`, and `guidance_scale` as needed.\n",
- "- The script uses FP16 and attention slicing to minimize RAM usage.\n",
- "- Model CPU offloading reduces VRAM requirements, ideal for T4 GPUs.\n"
+ "## Save Image\n",
+ "\n",
+ "If you want to save the generated image, run the following cell:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "image.save(\"generated_image.png\")"
  ]
  }
  ],
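The rewritten generation flow also loses the old notebook's `negative_prompt` and `guidance_scale` arguments. Both are still accepted by the pipeline call, so they can be reintroduced unchanged; a sketch using the old notebook's values together with the new cell's variables:

```python
# Sketch: re-add the generation controls removed by this commit,
# using the values from the old notebook.
negative_prompt = "blurry, low quality, artifacts, text, watermark"

with torch.inference_mode():
    image = pipe(
        prompt,
        negative_prompt=negative_prompt,  # steer away from these artifacts
        guidance_scale=7.5,               # classifier-free guidance strength
        height=height,
        width=width,
        num_inference_steps=num_inference_steps,
    ).images[0]
```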
@@ -130,7 +127,7 @@
  "name": "python",
  "nbconvert_exporter": "python",
  "pygments_lexer": "ipython3",
- "version": "3.8.10"
+ "version": "3.11.0"
  }
  },
  "nbformat": 4,