Mercury7353 committed on
Commit
721a51e
·
verified ·
1 Parent(s): 046f135

Update README.md

Files changed (1)
  1. README.md +3 -2
README.md CHANGED
@@ -1,4 +1,4 @@
- <h1 align="center"> PyBench: Evaluate LLM Agent on Real World Tasks </h1>
 
  <p align="center">
  <a href="https://arxiv.org/abs/2407.16732">📃 Paper</a>
@@ -7,10 +7,11 @@
  •
  <a href="https://huggingface.co/Mercury7353/PyLlama3" >🤗 Model (PyLlama3)</a>
  •
- <a href=" https://github.com/Mercury7353/PyBench" > Code </a>
  •
  </p>
 
 
  PyBench is a comprehensive benchmark evaluating LLM on real-world coding tasks including **chart analysis**, **text analysis**, **image/ audio editing**, **complex math** and **software/website development**.
  We collect files from Kaggle, arXiv, and other sources and automatically generate queries according to the type and content of each file.
 
+ <h1 align="center"> PyBench: Evaluate LLM Agent on Real World Coding Tasks </h1>
 
  <p align="center">
  <a href="https://arxiv.org/abs/2407.16732">📃 Paper</a>
 
  •
  <a href="https://huggingface.co/Mercury7353/PyLlama3" >🤗 Model (PyLlama3)</a>
  •
+ <a href=" https://github.com/Mercury7353/PyBench" > 🚗Code </a>
  •
  </p>
 
+ This is the PyLlama3 model, fine-tuned for <a href=" https://github.com/Mercury7353/PyBench" > PyBench </a>.
 
  PyBench is a comprehensive benchmark evaluating LLM on real-world coding tasks including **chart analysis**, **text analysis**, **image/ audio editing**, **complex math** and **software/website development**.
  We collect files from Kaggle, arXiv, and other sources and automatically generate queries according to the type and content of each file.