Nellyw888/VeriReason-Qwen2.5-7b-RTLCoder-Verilog-GRPO-reasoning-tb

Reinforcement Learning

text-generation

text-generation-inference


Nellyw888 committed 19 days ago

Commit b6146ea · verified · 1 parent: 996d6c5

Update README.md

Files changed (1)

README.md +3 -2

README.md CHANGED

@@ -79,8 +79,9 @@ The GRPO (Generative Reinforcement Learning from Preference Optimization) traini
    ```
 ## Citation
-Please cite our paper if you use our model:
 @misc{wang2025verireasonreinforcementlearningtestbench,
       title={VeriReason: Reinforcement Learning with Testbench Feedback for Reasoning-Enhanced Verilog Generation},
       author={Yiting Wang and Guoheng Sun and Wanghao Ye and Gang Qu and Ang Li},
@@ -90,7 +91,7 @@ Please cite our paper if you use our model:
       primaryClass={cs.AI},
       url={https://arxiv.org/abs/2505.11849},
 }
 ## Acknowledgement
 This repo benefits from OpenR1 and LLamaFactory.

    ```
 ## Citation
+Please cite our paper if you use our model or dataset:
+```bibtex
 @misc{wang2025verireasonreinforcementlearningtestbench,
       title={VeriReason: Reinforcement Learning with Testbench Feedback for Reasoning-Enhanced Verilog Generation},
       author={Yiting Wang and Guoheng Sun and Wanghao Ye and Gang Qu and Ang Li},
       primaryClass={cs.AI},
       url={https://arxiv.org/abs/2505.11849},
 }
+```
 ## Acknowledgement
 This repo benefits from OpenR1 and LLamaFactory.