🚀 MePO: Lightweight Prompt Optimization Model

MePO is a lightweight, locally deployable prompt optimization model intended for research use.
It is built on the Qwen2.5-7B-Instruct base model and fine-tuned to enhance prompt effectiveness in low-resource LLM scenarios.

💻 Usage

Load the model and tokenizer using Hugging Face's transformers library:

from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "zixiaozhu/MePO"  # Hugging Face Hub repo id for this model

# load_in_8bit requires the bitsandbytes package; drop it to load in full precision
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto", load_in_8bit=True)
tokenizer = AutoTokenizer.from_pretrained(model_path, truncation_side='left', padding_side='left')

# Example prompt optimization: the S_P placeholder in the template is replaced with the raw ("silver") prompt
po_prompt_ins = "You are an expert of prompt optimization.\n\nSilver Prompt:\n'S_P'\n\n\nThe optional Silver Response was generated by an AI based on the Silver Prompt. Please help modify the Silver Prompt to Golden Prompt (in English) that can obtain a more correct response, in reference to the optional Golden Response. The Golden Prompt should be strictly faithful to any factual information in the Silver Prompt. Only give me the content of Golden Prompt in English, do not contain any other information (e.g., your response of the Golden Prompt, any postfix like 'Golden Prompt', etc.)."

raw_prompt = "who is the father of nlp?"
prompt = po_prompt_ins.replace("S_P", raw_prompt)

messages = [
    {"role": "system", "content": "You are Qwen, created by Alibaba Cloud. You are a helpful assistant."},
    {"role": "user", "content": prompt}
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=1024,
    do_sample=False
)

# Remove the original input to keep only the generated response
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

optimized_prompt = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
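
The text in optimized_prompt is the golden prompt, which you then send to whichever downstream model you actually want an answer from. Below is a minimal sketch of that second step, assuming a generic chat model; the target repo id is only a placeholder, so substitute the model you are evaluating.

# Minimal sketch: pass the optimized (golden) prompt to a downstream target model.
# The target repo id below is a placeholder assumption, not part of MePO itself.
target_path = "Qwen/Qwen2.5-7B-Instruct"
target_model = AutoModelForCausalLM.from_pretrained(target_path, device_map="auto")
target_tokenizer = AutoTokenizer.from_pretrained(target_path)

target_messages = [{"role": "user", "content": optimized_prompt}]
target_text = target_tokenizer.apply_chat_template(target_messages, tokenize=False, add_generation_prompt=True)
target_inputs = target_tokenizer([target_text], return_tensors="pt").to(target_model.device)
target_ids = target_model.generate(**target_inputs, max_new_tokens=512, do_sample=False)

# Strip the echoed input and keep only the generated answer
answer = target_tokenizer.batch_decode(
    [out[len(inp):] for inp, out in zip(target_inputs.input_ids, target_ids)],
    skip_special_tokens=True,
)[0]
print(answer)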

📦 Dataset & Code

The datasets used for training are available on Hugging Face Datasets (a loading sketch follows the list):

  • MePO
  • MePO_BPO – Optimized prompts based on the BPO dataset
  • MePO_Alpaca – Optimized prompts based on the Alpaca dataset
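
For a quick look at the training data, the datasets can be loaded with the datasets library. This is a minimal sketch that assumes the dataset repo ids live under the zixiaozhu namespace like the model; check each dataset card for the exact id, splits, and column names.

from datasets import load_dataset

# Assumed repo id (see the dataset card for the exact name, splits, and columns).
mepo = load_dataset("zixiaozhu/MePO")
print(mepo)              # available splits
print(mepo["train"][0])  # one example record, assuming a "train" split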

Full implementation and training scripts can be found on GitHub:
🔗 https://github.com/MidiyaZhu/MePO

📄 Citation

If you use this model, code, or dataset, please cite our paper:

@misc{zhu2025rethinkingpromptoptimizersprompt,
  title     = {Rethinking Prompt Optimizers: From Prompt Merits to Optimization},
  author    = {Zixiao Zhu and Hanzhang Zhou and Zijian Feng and Tianjiao Li and Chua Jia Jim Deryl and Mak Lee Onn and Gee Wah Ng and Kezhi Mao},
  year      = {2025},
  eprint    = {2505.09930},
  archivePrefix = {arXiv},
  primaryClass  = {cs.CL},
  url       = {https://arxiv.org/abs/2505.09930}
}