---
license: apache-2.0
datasets:
- jackyhate/text-to-image-2M
language:
- en
- zh
base_model:
- ByteDance-Seed/BAGEL-7B-MoT
pipeline_tag: any-to-any
---
# BAGEL-ReAlign (Paper Coming Soon)

> A self-supervised training framework that aligns understanding and generation with modest compute, yielding substantial **zero-shot** gains in generation and editing capability.

This repository hosts the model weights for **BAGEL-ReAlign**. We fine-tuned BAGEL on six 80GB NVIDIA A800 GPUs for only 27 GPU hours. While the understanding capability remains unchanged, our ReAlign method brings **zero-shot improvements** of +3.6 on GenEval, +1.26 on DPGBench, +0.37 on ImgEdit, and +0.33 on GEdit-Bench-EN (overall).

For installation, usage instructions, and further documentation, please visit BAGEL's original [GitHub repository](https://github.com/bytedance-seed/BAGEL).
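
To run BAGEL's code with these weights, the checkpoint first has to be available locally. The following is a minimal sketch, assuming the `huggingface_hub` package is installed. The repository id is a placeholder (this card does not state it), and the `download_weights` helper and the `models/BAGEL-ReAlign` target directory are illustrative names, not part of BAGEL's codebase.

```python
from pathlib import Path

# Placeholder: this card does not state the repository id, so replace it with
# the "<namespace>/<name>" shown at the top of this Hugging Face page.
PLACEHOLDER_REPO_ID = "<namespace>/BAGEL-ReAlign"


def download_weights(repo_id: str = PLACEHOLDER_REPO_ID,
                     local_dir: str = "models/BAGEL-ReAlign") -> Path:
    """Download every file of the model repository into local_dir."""
    if "<" in repo_id or ">" in repo_id:
        raise ValueError("Replace the placeholder repo_id with this repository's real id.")
    # Imported lazily so this module can be imported without huggingface_hub.
    from huggingface_hub import snapshot_download

    return Path(snapshot_download(repo_id=repo_id, local_dir=local_dir))
```

Calling `download_weights("<real-namespace>/<real-name>")` returns the local folder, which can then be passed as the model path when following the inference instructions in BAGEL's repository.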

## 🧠 Method

Coming soon! Stay tuned~

## 📊 Benchmarks

### 1. Visual Understanding

Unchanged relative to BAGEL.

### 2. Text-to-Image Generation 

We evaluate at 1024x1024 resolution.

| Model        | GenEval ↑ | DPGBench ↑ | WISE ↑ |
| ------------ | --------- | --------- | --------- |
| **BAGEL**    | 0.787  | 84.03  | 0.50 |
| **BAGEL-ReAlign**    | **0.824**  | **85.29** | **0.52** |

### 3. Image Editing

| Model         | GEdit-Bench-EN (SC) ↑ | GEdit-Bench-EN (PQ) ↑ | GEdit-Bench-EN (O) ↑ | ImgEdit ↑ |
| ------------- | --------------------- | --------------------- | ------------------- | ------------------ |
| **BAGEL**     | 7.96 | 6.64 | 6.94 | 3.38 |
| **BAGEL-NHR** | 8.04 | 6.87  | 7.08 | 3.48 |
| **BAGEL-ReAlign** | **8.24**  | 6.87  | **7.27**  | **3.75** |
| **FLUX Kontext** | 6.95 | **7.30** | 6.27  | 3.59 |


![image/png](https://cdn-uploads.huggingface.co/production/uploads/64e99fc07e2ec711a7138262/lGur0scJWaCGkAwH2AHxy.png)
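
The headline gains quoted at the top of this card can be recomputed from the benchmark tables above. The snippet below is a small sketch that copies the BAGEL and BAGEL-ReAlign scores from those tables and reports absolute and relative differences. The `gains` helper and the dictionary names are illustrative only.

```python
# Scores copied from the benchmark tables in this model card.
BASELINE = {
    "GenEval": 0.787,
    "DPGBench": 84.03,
    "WISE": 0.50,
    "GEdit-Bench-EN (O)": 6.94,
    "ImgEdit": 3.38,
}
REALIGN = {
    "GenEval": 0.824,
    "DPGBench": 85.29,
    "WISE": 0.52,
    "GEdit-Bench-EN (O)": 7.27,
    "ImgEdit": 3.75,
}


def gains(baseline, tuned):
    """Return {benchmark: (absolute_gain, relative_gain_percent)}."""
    result = {}
    for name, base in baseline.items():
        delta = tuned[name] - base
        result[name] = (round(delta, 3), round(100.0 * delta / base, 2))
    return result


if __name__ == "__main__":
    for name, (delta, rel) in gains(BASELINE, REALIGN).items():
        print(f"{name:<20} {delta:+.3f}  ({rel:+.2f}%)")
```

GenEval and WISE are reported on a 0 to 1 scale in the table, so the GenEval difference comes out as 0.037 from the rounded values. The +3.6 quoted in the introduction is presumably on a 0 to 100 scale and computed from unrounded scores.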

## License

BAGEL-ReAlign is licensed under the Apache 2.0 license. 

## ✍️ Citation

Coming soon!