---
license: apache-2.0
datasets:
- jackyhate/text-to-image-2M
language:
- en
- zh
base_model:
- ByteDance-Seed/BAGEL-7B-MoT
pipeline_tag: any-to-any
---

# BAGEL-ReAlign (Paper Coming Soon)

> A self-supervised training framework that aligns understanding and generation with modest compute, yielding substantial **zero-shot** gains in generation and editing capability.

This repository hosts the model weights for **BAGEL-ReAlign**. We fine-tuned BAGEL on six 80GB NVIDIA A800 GPUs for only 27 GPU hours. While the understanding capability remains unchanged, our ReAlign method brings a **zero-shot improvement** of +3.6 points on GenEval, +1.26 on DPGBench, +0.37 on ImgEdit, and +0.33 on GEdit.

For installation, usage instructions, and further documentation, please visit BAGEL's original [GitHub repository](https://github.com/bytedance-seed/BAGEL).

## 🧠 Method

Coming soon! Stay tuned~

## 📊 Benchmarks

### 1. Visual Understanding

Remains unchanged.

### 2. Text-to-Image Generation

All results are evaluated at 1024x1024 resolution.

| Model | GenEval ↑ | DPGBench ↑ | WISE ↑ |
| ----------------- | --------- | ---------- | -------- |
| **BAGEL** | 0.787 | 84.03 | 0.50 |
| **BAGEL-ReAlign** | **0.824** | **85.29** | **0.52** |

### 3. Image Editing

| Model | GEdit-Bench-EN (SC) ↑ | GEdit-Bench-EN (PQ) ↑ | GEdit-Bench-EN (O) ↑ | ImgEdit ↑ |
| ----------------- | --------------------- | --------------------- | -------------------- | --------- |
| **BAGEL** | 7.96 | 6.64 | 6.94 | 3.38 |
| **BAGEL-NHR** | 8.04 | 6.87 | 7.08 | 3.48 |
| **BAGEL-ReAlign** | **8.24** | 6.87 | **7.27** | **3.75** |
| **FLUX Kontext** | 6.95 | **7.30** | 6.27 | 3.59 |

![image/png](https://cdn-uploads.huggingface.co/production/uploads/64e99fc07e2ec711a7138262/lGur0scJWaCGkAwH2AHxy.png)

## License

BAGEL-ReAlign is licensed under the Apache 2.0 license.

## ✍️ Citation

Coming soon!
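
## 🔢 Checking the Reported Gains

The gains quoted in the introduction can be recomputed from the benchmark tables. The snippet below is a minimal sketch, not part of the BAGEL codebase: the dictionary layout and the variable names (`scores`, `deltas`) are illustrative assumptions, and the only inputs are the rounded values shown in the tables. With those rounded values the GenEval difference comes out at 0.037 on the table's scale, slightly above the +3.6 points quoted in the introduction; the gap is presumably due to rounding, which is an assumption here.

```python
from decimal import Decimal

# Rounded scores copied from the benchmark tables (baseline, ReAlign).
# GEdit refers to the GEdit-Bench-EN overall (O) column.
scores = {
    "GenEval": ("0.787", "0.824"),
    "DPGBench": ("84.03", "85.29"),
    "WISE": ("0.50", "0.52"),
    "GEdit": ("6.94", "7.27"),
    "ImgEdit": ("3.38", "3.75"),
}

# Decimal keeps the arithmetic exact, so the deltas match the table precision
# instead of picking up binary floating-point noise.
deltas = {
    name: Decimal(realign) - Decimal(baseline)
    for name, (baseline, realign) in scores.items()
}

for name, delta in deltas.items():
    print(f"{name}: +{delta}")
```

Each difference is expressed in the same unit as its table column, so GenEval and WISE are on their table scale while DPGBench, GEdit, and ImgEdit are in raw score points.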