|
print("Response:", response_text)
```

## Scripts

You can reproduce our results on AlpacaEval 2.0 using the script provided below.

```bash
git clone https://github.com/tatsu-lab/alpaca_eval.git
cd alpaca_eval
pip install -e .
export OPENAI_API_KEY=<your_api_key>
alpaca_eval evaluate_from_model --model_configs 'Storm-7B'
```

## Limitations

Storm-7B is a quick demonstration that a language model fine-tuned with AI feedback can match or even surpass state-of-the-art models, as assessed by that same AI feedback. However, improvement on the automatic leaderboard does not necessarily indicate better alignment with human intentions. Our model therefore represents a critical, preliminary reevaluation of the RLAIF paradigm, questioning how well learning from, and being evaluated by, AI feedback aligns with actual human preferences.