SVQVAE (Scalable Vector Quantized Variational Autoencoder)

GitHub: https://github.com/Open-Model-Initiative/SVQVAE

A scalable Vector Quantized Variational Autoencoder (VQVAE) for high-resolution image generation and reconstruction. The model supports tiled processing, so large images can be encoded and decoded efficiently in pieces rather than all at once.

Model Description

SVQVAE is a scalable variant of the Vector Quantized Variational Autoencoder that can process high-resolution images through tiled encoding and decoding. The model uses a discrete codebook to compress images into a latent representation and can reconstruct them at multiple scales.
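The core discrete-codebook step can be sketched as a nearest-neighbor lookup: each continuous latent vector from the encoder is replaced by its closest entry in a learned codebook. This is a minimal illustration of vector quantization in general, not the repository's actual implementation; the function and array shapes here are assumptions for the example.

```python
import numpy as np

def vector_quantize(latents, codebook):
    """Map each latent vector to its nearest codebook entry (Euclidean).

    latents:  (N, D) array of continuous encoder outputs
    codebook: (K, D) array of learned discrete code vectors
    Returns the code indices and the quantized latents.
    """
    # Pairwise squared distances between every latent and every code vector
    dists = ((latents[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    indices = dists.argmin(axis=1)      # (N,) discrete code indices
    return indices, codebook[indices]   # quantized representation

# Toy example: 2-D latents, 3 codebook entries
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]])
latents = np.array([[0.1, -0.1], [1.9, 2.1]])
idx, quantized = vector_quantize(latents, codebook)
# idx → [0, 2]; each latent snaps to its nearest code vector
```

Storing only the integer indices (plus the shared codebook) is what makes the compressed representation compact.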

Key Features

  • Scalable Processing: Handles high-resolution images through tiled processing
  • Multi-scale Output: Can generate reconstructions at different scales
  • Vector Quantization: Uses a discrete codebook for efficient compression
  • Attention Mechanisms: Includes self-attention blocks for better feature learning
  • Flexible Architecture: Configurable encoder/decoder with customizable channel multipliers

Citation

If you use this code in your research, please cite Austin J. Bryant and the Open Model Initiative.

Acknowledgments

This implementation is based on the VQVAE architecture and includes improvements for scalable processing of high-resolution images.

Repository Links

  • GitHub Repository: Open-Model-Initiative/SVQVAE
  • Model Weights: Available in this Hugging Face repository
  • Documentation: See the GitHub repository for detailed documentation and examples

This model is licensed under the OpenMDW License Agreement (see LICENSE).
