# Flames-scorer
This is the specified scorer for the Flames benchmark, a highly adversarial Chinese benchmark for evaluating the value alignment of LLMs. For more details, please refer to our paper and GitHub repo.
## Model Details
- Developed by: Shanghai AI Lab and Fudan NLP Group.
- Model type: We use InternLM-chat-7b as the backbone and build a separate classifier for each dimension on top of it. The scorer is then trained with a multi-task training approach.
- Language(s): Chinese
- Paper: FLAMES: Benchmarking Value Alignment of LLMs in Chinese
- Contact: For questions and comments about the model, please email [email protected].
## Usage
The environment can be set up with:

```bash
pip install -r requirements.txt
```
You can then evaluate your model with `infer.py`:

```bash
python infer.py --data_path YOUR_DATA_FILE.jsonl
```
The Flames-scorer can be loaded as follows:

```python
from tokenization_internlm import InternLMTokenizer
from modeling_internlm import InternLMForSequenceClassification

tokenizer = InternLMTokenizer.from_pretrained("CaasiHUANG/flames-scorer", trust_remote_code=True)
model = InternLMForSequenceClassification.from_pretrained("CaasiHUANG/flames-scorer", trust_remote_code=True)
```
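
For reference only, the sketch below shows one plausible way to score a single prompt/response pair with the loaded tokenizer and model. The input formatting and the structure of the outputs are assumptions here (they are not documented above), so `infer.py` should be treated as the authoritative entry point.

```python
import torch

# Placeholder texts for illustration.
prompt = "..."     # a Flames prompt (Chinese)
response = "..."   # the evaluated model's response to that prompt

# Assumption: prompt and response are concatenated into one input sequence;
# the actual formatting used by infer.py may differ.
inputs = tokenizer(prompt + "\n" + response, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# Assumption: the scorer exposes standard classification logits whose argmax
# is the predicted score; with separate per-dimension heads the real output
# structure may differ, so rely on infer.py for actual evaluation.
print(outputs.logits.argmax(dim=-1))
```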
Please note that:
- Each entry in `YOUR_DATA_FILE.jsonl` must include the fields `"dimension"`, `"prompt"`, and `"response"` (a minimal preparation sketch follows this list).
- The predicted score is stored in the `"predicted"` field, and the output is saved in the same directory as `YOUR_DATA_FILE.jsonl`.
- The accuracy of the Flames-scorer on out-of-distribution prompts (i.e., prompts not included in Flames-prompts) has not been evaluated, so its predictions on such data may not be reliable.
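
For illustration, here is a minimal sketch of preparing `YOUR_DATA_FILE.jsonl` in Python. All field values are placeholders; the set of valid `"dimension"` strings is defined by the Flames benchmark, and the responses should come from the model you want to evaluate.

```python
import json

# Hypothetical entries; the "dimension" value below is a placeholder and must
# match one of the Flames dimensions expected by infer.py.
entries = [
    {
        "dimension": "...",   # a Flames dimension name
        "prompt": "...",      # the Flames prompt given to your model
        "response": "...",    # your model's response to that prompt
    },
]

# Write one JSON object per line, keeping Chinese characters unescaped.
with open("YOUR_DATA_FILE.jsonl", "w", encoding="utf-8") as f:
    for entry in entries:
        f.write(json.dumps(entry, ensure_ascii=False) + "\n")
```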