ulab-ai/personalized_router_model

This repository contains the trained PersonalizedRouter model weights, saved as .pth files.

In the project files, the suffix v1 refers to the Multi-cost-efficiency Simulation Strategy described in the paper, while v2 refers to the LLM-as-a-Judge Simulation Strategy.

best_model_v1.pth was trained on an interaction dataset generated from 10 LLMs, 240 queries, and 9 different performance and cost settings.

best_model_v2.pth was trained on an interaction dataset generated from 10 LLMs, 240 queries, and preferences from 9 different user groups.
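
The sketch below shows one way to pick and load a checkpoint by simulation strategy. It is a minimal example under stated assumptions, not the project's own loading code: the dictionary keys and the helper names checkpoint_path and load_checkpoint are hypothetical, and it assumes PyTorch is installed and the .pth files sit in a local directory. Whether each file stores a state dict or a full pickled model is not stated here, so the returned object should be inspected before use.

```python
from pathlib import Path

# File names come from this README; the dictionary keys are
# hypothetical labels chosen for this example.
CHECKPOINTS = {
    "multi_cost_efficiency": "best_model_v1.pth",  # v1 strategy
    "llm_as_a_judge": "best_model_v2.pth",         # v2 strategy
}


def checkpoint_path(strategy, root="."):
    """Return the path of the checkpoint for a simulation strategy."""
    return Path(root) / CHECKPOINTS[strategy]


def load_checkpoint(strategy, root="."):
    """Load the checkpoint onto the CPU and return whatever it contains."""
    import torch  # imported lazily so the path helper works without PyTorch

    # Assumption: the contents may be a state dict or a full model object.
    # Recent PyTorch versions load with weights_only=True by default, which
    # only succeeds for plain tensors and containers such as a state dict.
    return torch.load(checkpoint_path(strategy, root), map_location="cpu")
```

For example, load_checkpoint("llm_as_a_judge", root="path/to/download") would read best_model_v2.pth from that directory. If the result is a state dict, it still has to be passed to load_state_dict on a model instance built from the project's own model class.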