Update README.md
README.md
CHANGED
@@ -1,4 +1,4 @@
-<h1 align="center"> PyBench: Evaluate LLM Agent on Real World Tasks </h1>
+<h1 align="center"> PyBench: Evaluate LLM Agent on Real World Coding Tasks </h1>

 <p align="center">
 <a href="https://arxiv.org/abs/2407.16732">π Paper</a>
@@ -7,10 +7,11 @@
 •
 <a href="https://huggingface.co/Mercury7353/PyLlama3" >🤗 Model (PyLlama3)</a>
 •
-<a href=" https://github.com/Mercury7353/PyBench" > Code </a>
+<a href=" https://github.com/Mercury7353/PyBench" > πCode </a>
 •
 </p>

+This is the PyLlama3 model, fine-tuned for <a href=" https://github.com/Mercury7353/PyBench" > PyBench </a>.

 PyBench is a comprehensive benchmark evaluating LLM on real-world coding tasks including **chart analysis**, **text analysis**, **image/ audio editing**, **complex math** and **software/website development**.
 We collect files from Kaggle, arXiv, and other sources and automatically generate queries according to the type and content of each file.