---
license: apache-2.0
language:
- en
base_model:
- Qwen/Qwen3-8B
pipeline_tag: image-text-to-text
tags:
- Bee-8B
- Fully-Open-MLLMs
datasets:
- Open-Bee/Honey-Data-15M
library_name: transformers
---
# Bee: A High-Quality Corpus and Full-Stack Suite to Unlock Advanced Fully Open MLLMs

[[🏠 Homepage](https://open-bee.github.io/)] [[📖 Arxiv Paper](https://arxiv.org/pdf/2510.13795)] [[🤗 Models & Datasets](https://huggingface.co/collections/Open-Bee/bee-8b-68ecbf10417810d90fbd9995)] [[💻 Code (coming soon)](https://github.com/Open-Bee)]

## Introduction

We introduce **Bee-8B**, a new state-of-the-art, fully open 8B Multimodal Large Language Model (MLLM) designed to close the performance gap with proprietary models by focusing on data quality.

Bee-8B is trained on our new **Honey-Data-15M** corpus, a high-quality supervised fine-tuning (SFT) dataset of approximately 15 million samples. This dataset was meticulously created with our transparent, adaptable, and open-source data curation pipeline, **HoneyPipe**, which systematically cleans noisy data and enriches it with a novel dual-level (short and long) Chain-of-Thought (CoT) strategy.

This dataset enables Bee-8B to achieve exceptional performance, particularly in complex reasoning, establishing a new standard for fully open MLLMs.

## Key Features

  - **High-Quality, Large-Scale Dataset:** We release **Honey-Data-15M**, a new 15M-sample SFT corpus. It has undergone extensive cleaning to remove widespread noise and has been enriched with dual-level CoT reasoning to enhance advanced problem-solving capabilities.
  - **Fully Open-Source Data Curation Suite:** We provide not just the data, but the entire methodology. **HoneyPipe** and its underlying framework **DataStudio** offer the community a transparent and reproducible pipeline, moving beyond static dataset releases.
  - **State-of-the-Art Open Model:** Our model, **Bee-8B**, achieves state-of-the-art performance among fully open MLLMs and is highly competitive with recent semi-open models like InternVL3.5-8B, demonstrating the power of high-quality data.

## News
  - **[2025.10.20]** 🚀 **vLLM Support is Here!** Bee-8B now supports high-performance inference with [vLLM](https://github.com/vllm-project/vllm), enabling faster and more efficient deployment for production use cases.

  - **[2025.10.13]** 🐝 **Bee-8B is Released!** Our model is now publicly available. You can download it from [Hugging Face](https://huggingface.co/collections/Open-Bee/bee-8b-68ecbf10417810d90fbd9995).

## Quickstart

> [!NOTE]
> Below, we provide simple examples to show how to use Bee-8B with 🤗 Transformers.
> You can dynamically control the model's response by selecting one of two modes: set `enable_thinking=True` for `thinking` mode, or `enable_thinking=False` for `non-thinking` mode. The default is `thinking` mode.


### Using 🤗 Transformers to Chat

```python
import requests
import torch
from PIL import Image
from transformers import AutoModel, AutoProcessor

model_path = "Open-Bee/Bee-8B-RL"

# Load model
model = AutoModel.from_pretrained(
    model_path,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
).to("cuda")

# Load processor
processor = AutoProcessor.from_pretrained(model_path, trust_remote_code=True)

# Define conversation messages
messages = [{
    "role": "user",
    "content": [
        {
            "type": "image",
            "image": "https://huggingface.co/Open-Bee/Bee-8B-RL/resolve/main/assets/logo.png",
        },
        {
            "type": "text",
            "text": "Based on this picture, write an advertising slogan about Bee-8B (a Fully Open Multimodal Large Language Model)."
        },
    ],
}]

# Apply chat template
text = processor.apply_chat_template(messages,
                                     tokenize=False,
                                     add_generation_prompt=True,
                                     enable_thinking=True)

# Load image
image_url = "https://huggingface.co/Open-Bee/Bee-8B-RL/resolve/main/assets/logo.png"
image = Image.open(requests.get(image_url, stream=True).raw)

# Process inputs
inputs = processor(images=image, text=text, return_tensors="pt").to("cuda")

# Generate output (do_sample=True so that the temperature setting takes effect)
generated_ids = model.generate(**inputs, max_new_tokens=16384, do_sample=True, temperature=0.6)
output_ids = generated_ids[0][len(inputs.input_ids[0]):]

# Decode output
output_text = processor.decode(output_ids, skip_special_tokens=True)

# Print result
print(output_text)
```
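
In `thinking` mode the decoded text may contain the model's reasoning followed by the final answer. The sketch below separates the two. It assumes Qwen3-style `<think>` and `</think>` markers, which this card does not confirm for Bee-8B; `split_thinking` is a hypothetical helper, not part of the model's API, so adjust the tags if the chat template uses different ones.

```python
def split_thinking(output_text, open_tag="<think>", close_tag="</think>"):
    """Split decoded text into (reasoning, answer).

    Assumption: the reasoning is terminated by `close_tag` (Qwen3-style).
    The opening tag is optional because some chat templates place it in
    the prompt rather than in the generated text.
    """
    head, sep, tail = output_text.partition(close_tag)
    if not sep:
        # No closing tag found: treat the whole text as the answer.
        return "", output_text.strip()
    reasoning = head.replace(open_tag, "", 1).strip()
    return reasoning, tail.strip()


if __name__ == "__main__":
    sample = "<think>The logo shows a bee.</think>\n\nBee-8B: open by design."
    reasoning, answer = split_thinking(sample)
    print("Reasoning:", reasoning)
    print("Answer:", answer)
```

Applied to the example above, this would be called as `split_thinking(output_text)` after decoding.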

### Using vLLM for High-Performance Inference

#### Install vLLM

> [!IMPORTANT]
> Bee-8B support will be officially available in vLLM **v0.11.1**. Until then, please install vLLM from source:

```bash
git clone https://github.com/vllm-project/vllm.git
cd vllm
VLLM_USE_PRECOMPILED=1 uv pip install --editable .
```

Once vLLM v0.11.1 is released, you will be able to install it directly via pip:
```bash
pip install "vllm>=0.11.1"
```


#### Offline Inference
```python
from transformers import AutoProcessor
from vllm import LLM, SamplingParams
from PIL import Image
import requests


def main():

    model_path = "Open-Bee/Bee-8B-RL"

    llm = LLM(
        model=model_path,
        limit_mm_per_prompt={"image": 5},
        trust_remote_code=True,
        tensor_parallel_size=1,
        gpu_memory_utilization=0.8,
    )

    sampling_params = SamplingParams(
        temperature=0.6,
        max_tokens=16384,
    )

    image_url = "https://huggingface.co/Open-Bee/Bee-8B-RL/resolve/main/assets/logo.png"
    image = Image.open(requests.get(image_url, stream=True).raw)

    messages = [
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "image": image
                },
                {
                    "type": "text",
                    "text": "Based on this picture, write an advertising slogan about Bee-8B (a Fully Open Multimodal Large Language Model)."
                },
            ],
        },
    ]

    processor = AutoProcessor.from_pretrained(model_path,
                                              trust_remote_code=True)
    prompt = processor.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True,
        enable_thinking=True,
    )

    mm_data = {"image": image}
    llm_inputs = {
        "prompt": prompt,
        "multi_modal_data": mm_data,
    }

    outputs = llm.generate([llm_inputs], sampling_params=sampling_params)
    generated_text = outputs[0].outputs[0].text

    print(generated_text)


if __name__ == '__main__':
    main()
```
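
The offline example sets `limit_mm_per_prompt={"image": 5}` but sends a single image. The following is a minimal sketch of how a multi-image request could be assembled. `build_multi_image_request` and `MAX_IMAGES` are hypothetical names introduced here, and whether Bee-8B's chat template handles several images in one turn as intended is an assumption that this card does not confirm.

```python
MAX_IMAGES = 5  # mirrors limit_mm_per_prompt={"image": 5} in the example above


def build_multi_image_request(images, question, max_images=MAX_IMAGES):
    """Return (messages, mm_data) for a single user turn with several images.

    `images` is a list of already-loaded images (e.g. PIL.Image objects).
    """
    if not images:
        raise ValueError("at least one image is required")
    if len(images) > max_images:
        raise ValueError(f"got {len(images)} images, limit is {max_images}")

    # One content entry per image, followed by the text question.
    content = [{"type": "image", "image": img} for img in images]
    content.append({"type": "text", "text": question})
    messages = [{"role": "user", "content": content}]

    # vLLM accepts a list under the "image" key for multi-image prompts.
    mm_data = {"image": list(images)}
    return messages, mm_data
```

The returned `messages` would go to `processor.apply_chat_template(...)` as in the example above, and `mm_data` would be passed as `"multi_modal_data"` in `llm_inputs`.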

#### Online Serving
- Start the server
```bash
vllm serve \
    Open-Bee/Bee-8B-RL \
    --served-model-name bee-8b-rl \
    --tensor-parallel-size 8 \
    --gpu-memory-utilization 0.8 \
    --host 0.0.0.0 \
    --port 8000 \
    --trust-remote-code
```

- Query the server with the OpenAI Python client
```python
from openai import OpenAI

# Set OpenAI's API key and API base to use vLLM's API server.
openai_api_key = "EMPTY"
openai_api_base = "http://localhost:8000/v1"

client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)

# image url
image_messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "image_url",
                "image_url": {
                    "url":
                    "https://huggingface.co/Open-Bee/Bee-8B-RL/resolve/main/assets/logo.png"
                },
            },
            {
                "type": "text",
                "text": "Based on this picture, write an advertising slogan about Bee-8B (a Fully Open Multimodal Large Language Model)."
            },
        ],
    },
]

chat_response = client.chat.completions.create(
    model="bee-8b-rl",
    messages=image_messages,
    max_tokens=16384,
    extra_body={
        "chat_template_kwargs": {
            "enable_thinking": True
        },
    },
)
print("Chat response:", chat_response.choices[0].message.content)
```
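
The request above references a remote image URL. For a local file, one common approach with OpenAI-compatible servers is to embed the image as a base64 data URL in the same `image_url` field. The sketch below illustrates this; `image_to_data_url` and `build_image_message` are hypothetical helper names, and the file path is whatever the caller supplies.

```python
import base64
import mimetypes
from pathlib import Path


def image_to_data_url(path):
    """Encode a local image file as a base64 data URL."""
    mime, _ = mimetypes.guess_type(path)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"cannot infer an image MIME type from {path!r}")
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def build_image_message(path, prompt):
    """Build a single user message with one local image and a text prompt."""
    return {
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": image_to_data_url(path)}},
            {"type": "text", "text": prompt},
        ],
    }
```

A list such as `[build_image_message("logo.png", "Describe this image.")]` can then be passed as `messages` to `client.chat.completions.create(...)` in place of `image_messages`.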

## Experimental Results

<figure align="center">
<img src="assets/results.png" alt="Benchmark results for Bee-8B and other MLLMs"/>
<figcaption>Evaluation of Bee-8B against other MLLMs. We distinguish between fully open (*) and semi-open (†) models. The <strong>top</strong> and <strong>second-best</strong> scores for each benchmark are highlighted.</figcaption>
</figure>

1.  **New State-of-the-Art:** Bee-8B establishes a new performance standard for fully open MLLMs, proving highly competitive with recent semi-open models across a wide array of benchmarks.
2.  **Excellence in Complex Reasoning:** Thanks to the CoT-enriched Honey-Data-15M, Bee-8B shows its most significant advancements in complex math and reasoning. It achieves top scores on challenging benchmarks like **MathVerse**, **LogicVista**, and **DynaMath**.
3.  **Superior Document and Chart Understanding:** The model demonstrates powerful capabilities in analyzing structured visual data, securing the top rank on the **CharXiv** benchmark for both descriptive and reasoning questions.

## Acknowledgements

Bee-8B builds on the architectures and codebases of the following projects: [R-4B](https://huggingface.co/YannQi/R-4B), [LLaVA-OneVision](https://github.com/LLaVA-VL/LLaVA-NeXT), [SigLIP2](https://huggingface.co/google/siglip2-so400m-patch14-384), and [Qwen3](https://github.com/QwenLM/Qwen3). It was evaluated using [VLMEvalKit](https://github.com/open-compass/VLMEvalKit). We sincerely thank these projects for their outstanding contributions to the open-source community.