EviOmni-nq_train-1.5B

Introduction

EviOmni is a rational evidence extraction model. Compared to vanilla evidence extraction models, EviOmni demonstrates superior performance, generalization, efficiency, and robustness.

Requirements

The code of EviOmni has been merged into the latest Hugging Face transformers, and we advise you to use the latest version of transformers.

With transformers<4.37.0, you will encounter the following error:

KeyError: 'qwen2'
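
A quick way to verify your environment before loading the model (this check is our addition, not part of the official quickstart):

import transformers
from packaging import version

# Qwen2 support landed in transformers 4.37.0; older versions raise KeyError: 'qwen2'.
assert version.parse(transformers.__version__) >= version.parse("4.37.0"), \
    f"transformers {transformers.__version__} is too old; upgrade to >= 4.37.0"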

Quickstart

from transformers import AutoModelForCausalLM, AutoTokenizer
from transformers import StoppingCriteria, StoppingCriteriaList
import re

class MultiTokenStoppingCriteria(StoppingCriteria):
    """Stops generation once the tail of the sequence matches a multi-token stop sequence."""

    def __init__(self, stop_ids):
        self.stop_ids = stop_ids
        self.stop_len = len(stop_ids)

    def __call__(self, input_ids, scores, **kwargs):
        # Compare the last stop_len generated tokens against the stop sequence.
        if input_ids.shape[1] >= self.stop_len:
            last_tokens = input_ids[0][-self.stop_len:].tolist()
            return last_tokens == self.stop_ids
        return False

model_name = "HIT-TMG/EviOmni-nq_train-1.5B"
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Load the prompt template shipped with the model repo and fill in the task inputs.
with open("eviomni_prompt", "r") as f:
    prompt = f.read()
question = "..."   # the question to answer
passages = "..."   # the retrieved passages to extract evidence from
instruction = prompt.format(question=question, passages=passages)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": instruction}
]

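# Stop generation once the closing </extract> tag (plus trailing newlines) has been emitted.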
stop_token = "</extract>\n\n"
stop_ids = tokenizer.encode(stop_token, add_special_tokens=False)

stopping_criteria = StoppingCriteriaList([
    MultiTokenStoppingCriteria(stop_ids)
])

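# Build the chat-formatted prompt string via the model's chat template.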
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

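# Generate until the stop sequence fires or max_new_tokens is reached.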
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512,
    stopping_criteria=stopping_criteria
)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0].strip()
match = re.search(r"<extract>(.*?)</extract>", response, re.DOTALL)
# Fall back to the raw response if the tags are missing.
evidence = match.group(1).strip() if match else response
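
The model wraps its extracted evidence in <extract>...</extract> tags, which the regex above recovers. In a RAG pipeline, this evidence would then replace the raw passages in the prompt of your downstream generator. Below is a minimal sketch of that hand-off; the reader prompt is our assumption, and we reuse EviOmni's tokenizer and model only for brevity (in practice you would call your own generator LLM):

reader_messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": f"Answer the question based on the evidence.\n\nEvidence: {evidence}\n\nQuestion: {question}"},
]
reader_text = tokenizer.apply_chat_template(reader_messages, tokenize=False, add_generation_prompt=True)
reader_inputs = tokenizer([reader_text], return_tensors="pt").to(model.device)
answer_ids = model.generate(**reader_inputs, max_new_tokens=128)
# Strip the prompt tokens, keeping only the newly generated answer.
answer = tokenizer.decode(answer_ids[0][reader_inputs.input_ids.shape[1]:], skip_special_tokens=True).strip()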

Performance

Main results. [Figure omitted: see the model card or the paper for the full results.]

Citation

If you find our work helpful, feel free to cite us:

@misc{EviOmni,
      title={Learning to Extract Rational Evidence via Reinforcement Learning for Retrieval-Augmented Generation}, 
      author={Xinping Zhao and Shouzheng Huang and Yan Zhong and Xinshuo Hu and Meishan Zhang and Baotian Hu and Min Zhang},
      year={2025},
      eprint={2507.15586},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2507.15586}, 
}