Update README.md
base_model:
- Qwen/Qwen2.5-Coder-7B-Instruct
---

# VeriReason-Qwen2.5-7b-RTLCoder-Verilog-GRPO-reasoning-tb

For implementation details, visit our GitHub repository: [VeriReason](https://github.com/NellyW8/VeriReason)

Check out our paper: [VeriReason: Reinforcement Learning with Testbench Feedback for Reasoning-Enhanced Verilog Generation](https://arxiv.org/abs/2505.11849)

## Update Log
2025.05.17: Initial release of VeriReason-Qwen2.5-7b-RTLCoder-Verilog-GRPO-reasoning-tb

## Project Description
This study introduces VeriReason, a novel approach utilizing reinforcement learning with testbench feedback to enhance the performance of pre-trained models for Verilog RTL code generation. VeriReason combines supervised fine-tuning with Guided Reward Proximal Optimization (GRPO) reinforcement learning, specifically tailored for RTL code generation. Using our curated high-quality training examples alongside a feedback-driven reward model, VeriReason achieves 83.1% functional correctness on the VerilogEval Machine benchmark, substantially outperforming both comparable-sized models and much larger commercial systems like GPT-4 Turbo.
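As a rough, hypothetical illustration of what "testbench feedback" means here, the sketch below scores a candidate RTL implementation by compiling and simulating it against a testbench and counting passing checks. It assumes Icarus Verilog (`iverilog`/`vvp`) is installed and that the testbench prints PASS/FAIL lines; the actual reward model used for GRPO training is defined in the paper and repository and may differ.

```python
# Illustrative sketch only: score generated Verilog against a testbench.
# Assumes Icarus Verilog (iverilog/vvp) is available and that the testbench
# reports "PASS"/"FAIL" lines. This is NOT the exact reward model from the
# paper, just the general idea of a testbench-feedback reward.
import os
import subprocess
import tempfile

def testbench_reward(generated_rtl: str, testbench: str) -> float:
    with tempfile.TemporaryDirectory() as tmp:
        dut = os.path.join(tmp, "dut.v")
        tb = os.path.join(tmp, "tb.v")
        sim = os.path.join(tmp, "sim.out")
        with open(dut, "w") as f:
            f.write(generated_rtl)
        with open(tb, "w") as f:
            f.write(testbench)

        # Syntax or elaboration failure -> zero reward.
        compile_proc = subprocess.run(
            ["iverilog", "-o", sim, dut, tb],
            capture_output=True, text=True
        )
        if compile_proc.returncode != 0:
            return 0.0

        # Run the simulation and count passing checks reported by the testbench.
        try:
            run_proc = subprocess.run(
                ["vvp", sim], capture_output=True, text=True, timeout=30
            )
        except subprocess.TimeoutExpired:
            return 0.0
        lines = run_proc.stdout.splitlines()
        passed = sum(1 for line in lines if "PASS" in line)
        total = sum(1 for line in lines if "PASS" in line or "FAIL" in line)
        return passed / total if total else 0.0
```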

You can use the model with the transformers library:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "Nellyw888/VeriReason-Qwen2.5-7b-RTLCoder-Verilog-GRPO-reasoning-tb"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)
model.eval()
```