Update README.md
base_model:
- Qwen/Qwen2.5-Coder-7B-Instruct
---

# VeriReason-Qwen2.5-7b-RTLCoder-Verilog-GRPO-reasoning-tb

For implementation details, visit our GitHub repository: [VeriReason](https://github.com/NellyW8/VeriReason)

Check out our paper: [VeriReason: Reinforcement Learning with Testbench Feedback for Reasoning-Enhanced Verilog Generation](https://arxiv.org/abs/2505.11849)

## Update Log
2025.05.17: Initial release of VeriReason-Qwen2.5-7b-RTLCoder-Verilog-GRPO-reasoning-tb

## Project Description
This study introduces VeriReason, a novel approach utilizing reinforcement learning with testbench feedback to enhance the performance of pre-trained models for Verilog RTL code generation. VeriReason combines supervised fine-tuning with Guided Reward Proximal Optimization (GRPO) reinforcement learning, specifically tailored for RTL code generation. Using our curated high-quality training examples alongside a feedback-driven reward model, VeriReason achieves 83.1% functional correctness on the VerilogEval Machine benchmark, substantially outperforming both comparable-sized models and much larger commercial systems like GPT-4 Turbo.
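As a rough, hypothetical illustration of what "testbench feedback" means here, the sketch below scores a candidate RTL implementation by compiling and simulating it against a testbench and counting passing checks. It assumes Icarus Verilog (`iverilog`/`vvp`) is installed and that the testbench prints PASS/FAIL lines; the actual reward model used for GRPO training is defined in the paper and repository and may differ.

```python
# Illustrative sketch only: score generated Verilog against a testbench.
# Assumes Icarus Verilog (iverilog/vvp) is available and that the testbench
# reports "PASS"/"FAIL" lines. This is NOT the exact reward model from the
# paper, just the general idea of a testbench-feedback reward.
import os
import subprocess
import tempfile

def testbench_reward(generated_rtl: str, testbench: str) -> float:
    with tempfile.TemporaryDirectory() as tmp:
        dut = os.path.join(tmp, "dut.v")
        tb = os.path.join(tmp, "tb.v")
        sim = os.path.join(tmp, "sim.out")
        with open(dut, "w") as f:
            f.write(generated_rtl)
        with open(tb, "w") as f:
            f.write(testbench)

        # Syntax or elaboration failure -> zero reward.
        compile_proc = subprocess.run(
            ["iverilog", "-o", sim, dut, tb],
            capture_output=True, text=True
        )
        if compile_proc.returncode != 0:
            return 0.0

        # Run the simulation and count passing checks reported by the testbench.
        try:
            run_proc = subprocess.run(
                ["vvp", sim], capture_output=True, text=True, timeout=30
            )
        except subprocess.TimeoutExpired:
            return 0.0
        lines = run_proc.stdout.splitlines()
        passed = sum(1 for line in lines if "PASS" in line)
        total = sum(1 for line in lines if "PASS" in line or "FAIL" in line)
        return passed / total if total else 0.0
```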

You can use the model with the transformers library:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "Nellyw888/VeriReason-Qwen2.5-7b-RTLCoder-Verilog-GRPO-reasoning-tb"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)
model.eval()
```