arxiv:2412.14422

Enhancing Diffusion Models for High-Quality Image Generation

Published on Dec 19, 2024

Authors:

Abstract

Advanced techniques enhance DDPMs and DDIMs to improve image generation efficiency and quality through methods like Classifier-Free Guidance and Latent Diffusion Models with VAE.

AI-generated summary

This report presents the comprehensive implementation, evaluation, and optimization of Denoising Diffusion Probabilistic Models (DDPMs) and Denoising Diffusion Implicit Models (DDIMs), which are state-of-the-art generative models. During inference, these models take random noise as input and iteratively generate high-quality images as output. The study focuses on enhancing their generative capabilities by incorporating advanced techniques such as Classifier-Free Guidance (CFG), Latent Diffusion Models with Variational Autoencoders (VAE), and alternative noise scheduling strategies. The motivation behind this work is the growing demand for efficient and scalable generative AI models that can produce realistic images across diverse datasets, addressing challenges in applications such as art creation, image synthesis, and data augmentation. Evaluations were conducted on datasets including CIFAR-10 and ImageNet-100, with a focus on improving inference speed, computational efficiency, and image quality metrics like Frechet Inception Distance (FID). Results demonstrate that DDIM + CFG achieves faster inference and superior image quality. Challenges with VAE and noise scheduling are also highlighted, suggesting opportunities for future optimization. This work lays the groundwork for developing scalable, efficient, and high-quality generative AI systems to benefit industries ranging from entertainment to robotics.

View arXiv page View PDF Add to collection

Community

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment

Upvote

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2412.14422 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2412.14422 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2412.14422 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.