franken
committed on
Update README.md
README.md
CHANGED
@@ -8,6 +8,10 @@ tags: []
 
 <!-- Provide a quick summary of what the model is/does. -->
 
+## Introduction
+
+R1-AQA is based on `Qwen2-Audio-7B-Instruct` and applies the group relative policy optimization (GRPO) algorithm to the Audio Question Answering (AQA) task.
+For more details, please refer to our [GitHub](https://github.com/xiaomi/r1-aqa) and [Report]().
 
 
 ## Inference
@@ -25,7 +29,7 @@ def _get_audio(wav_path):
     return audio
 
 model_name = "mispeech/r1-aqa"
-audio_url = "test-mini-audios/3fe64f3d-282c-4bc8-a753-68f8f6c35652.wav"
+audio_url = "test-mini-audios/3fe64f3d-282c-4bc8-a753-68f8f6c35652.wav" # Copied from MMAU dataset
 
 processor = AutoProcessor.from_pretrained(model_name)
 model = Qwen2AudioForConditionalGeneration.from_pretrained(model_name, torch_dtype=torch.bfloat16, device_map="auto")
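The introduction added in this commit names GRPO without describing it. As a rough illustration only, and not the R1-AQA training code, the sketch below shows the group-relative advantage step commonly associated with GRPO: several answers are sampled for one question, each receives a scalar reward, and each reward is normalized against the mean and standard deviation of its own group. The function name `group_relative_advantages`, the `eps` parameter, the use of the population standard deviation, and the sample rewards are all assumptions made for this example.

```python
from statistics import fmean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its own group's mean and spread.

    Hypothetical helper for illustration; `eps` guards against division
    by zero when every sampled answer receives the same reward.
    """
    mean = fmean(rewards)
    std = pstdev(rewards)  # assumption: population standard deviation
    return [(r - mean) / (std + eps) for r in rewards]


# Made-up rewards for four sampled answers to one audio question
# (1.0 = correct choice, 0.0 = incorrect choice).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Answers that score above their group's average get a positive advantage and the rest a negative one, so no separate value model is needed to provide a baseline. If all answers in a group score the same, every advantage is zero and that group contributes no learning signal.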