---
license: llama3.1
datasets:
- RUCKBReasoning/TableLLM-SFT
language:
- en
base_model:
- meta-llama/Llama-3.1-8B-Instruct
tags:
- table
- QA
- Code
---

# TableLLM: Enabling Tabular Data Manipulation by LLMs in Real Office Usage Scenarios

| **[Paper](https://arxiv.org/abs/2403.19318)** | **[Training set](https://huggingface.co/datasets/RUCKBReasoning/TableLLM-SFT)** | **[Github](https://github.com/RUCKBReasoning/TableLLM)** | **[Homepage](https://tablellm.github.io/)** |

We present **TableLLM**, a large language model designed to handle tabular data manipulation tasks efficiently, whether the tables are embedded in spreadsheets or documents, meeting the demands of real office scenarios. TableLLM is fine-tuned from [Llama3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct).

TableLLM generates either a code solution or a direct text answer to handle tabular data manipulation tasks, depending on the scenario. Code generation is used for spreadsheet-embedded tabular data, which often involves insert, delete, update, query, merge, and plot operations on tables. Text generation is used for document-embedded tabular data, which typically involves querying short tables.
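
As a quick illustration, the snippet below shows one way to query the model with the standard `transformers` API. It is a minimal sketch: the model path is a placeholder, and the device placement and generation settings (e.g. `max_new_tokens`) are assumptions rather than values taken from the paper. The prompt string should follow one of the templates described in the Prompt Template section below.

```python
# Minimal inference sketch (assumptions: placeholder model path, default generation settings).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "path/to/TableLLM"  # placeholder: replace with this model's repository id or a local download

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto")

# Fill `prompt` using one of the templates in the Prompt Template section below.
prompt = "[INST]...[/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=512, do_sample=False)

# Decode only the newly generated tokens (skip the prompt part).
print(tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```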

## Evaluation Results
We evaluate the code solution generation ability of TableLLM on three benchmarks: WikiSQL, Spider, and a self-created table operation benchmark. The text answer generation ability is tested on three benchmarks: WikiTableQuestions (WikiTQ), TAT-QA, and FeTaQA. The evaluation results are shown below:

| Model                | WikiTQ | TAT-QA | FeTaQA | WikiSQL | Spider | Self-created | Average |
| :------------------- | :----: | :----: | :----: | :-----: | :----: | :----------: | :-----: |
| TaPEX                |  38.6  |   –    |   –    |  83.9   |  15.0  |      /       |  45.8   |
| TaPas                |  31.6  |   –    |   –    |  74.2   |  23.1  |      /       |  43.0   |
| TableLlama           |  24.0  |  22.3  |  20.5  |  43.7   |   –    |      /       |  23.4   |
| TableGPT2 (7B)       |  77.3  |  88.1  |  75.6  |  63.0   | 77.34  |    74.42     |  76.0   |
| Llama3.1 (8B)        |  71.9  |  74.3  |  83.4  |  40.6   |  18.8  |     43.2     |  55.3   |
| GPT3.5               |  58.5  |  72.1  |  71.2  |  81.7   |  67.4  |     77.1     |  69.8   |
| GPT4o                |**91.5**|**91.5**|**94.4**|<ins>84.0</ins>|  69.5  |<ins>77.8</ins>|<ins>84.8</ins>|
| CodeLlama (13B)      |  43.4  |  47.3  |  57.2  |  38.3   |  21.9  |     47.6     |  43.6   |
| Deepseek-Coder (33B) |   6.5  |  11.0  |   7.1  |  72.5   |  58.4  |     73.9     |  33.8   |
| StructGPT (GPT3.5)   |  52.5  |  27.5  |  11.8  |  67.8   |**84.8**|      /       |  43.1   |
| Binder (GPT3.5)      |  61.6  |  12.8  |   6.9  |  78.6   |  52.6  |      /       |  36.3   |
| DATER (GPT3.5)       |  53.4  |  28.5  |  18.3  |  58.2   |  26.5  |      /       |  33.0   |
| TableLLM-8B (Ours)   |<ins>89.1</ins>|<ins>89.5</ins>|<ins>93.4</ins>|**89.6**|<ins>81.1</ins>|<ins>77.8</ins>|**86.7**|

## Prompt Template
The prompts we used for generating code solutions and text answers are introduced below.

### Code Solution
The prompt template for the insert, delete, update, query, and plot operations on a single table.
```
[INST]Below are the first few lines of a CSV file. You need to write a Python program to solve the provided question.

Header and first few lines of CSV file:
{csv_data}

Question: {question}[/INST]
```

The prompt template for the merge operation on two tables.
```
[INST]Below are the first few lines of two CSV files. You need to write a Python program to solve the provided question.

Header and first few lines of CSV file 1:
{csv_data1}

Header and first few lines of CSV file 2:
{csv_data2}

Question: {question}[/INST]
```

The `csv_data` field is filled with the header and first few lines of the provided table file. Below is an example:
```
Sex,Length,Diameter,Height,Whole weight,Shucked weight,Viscera weight,Shell weight,Rings
M,0.455,0.365,0.095,0.514,0.2245,0.101,0.15,15
M,0.35,0.265,0.09,0.2255,0.0995,0.0485,0.07,7
F,0.53,0.42,0.135,0.677,0.2565,0.1415,0.21,9
M,0.44,0.365,0.125,0.516,0.2155,0.114,0.155,10
I,0.33,0.255,0.08,0.205,0.0895,0.0395,0.055,7
```
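
To make the substitution concrete, here is a small helper that fills the single-table template with the header and first few rows of a CSV file. It is a sketch only: the helper name, file name, and question are illustrative and not part of the released code.

```python
from itertools import islice

# The template string mirrors the single-table prompt shown above.
CODE_PROMPT_TEMPLATE = (
    "[INST]Below are the first few lines of a CSV file. "
    "You need to write a Python program to solve the provided question.\n"
    "\n"
    "Header and first few lines of CSV file:\n"
    "{csv_data}\n"
    "\n"
    "Question: {question}[/INST]"
)

def build_code_prompt(csv_path: str, question: str, n_lines: int = 6) -> str:
    # Keep the header plus the first few data rows, as in the example above.
    with open(csv_path, encoding="utf-8") as f:
        csv_data = "".join(islice(f, n_lines)).rstrip("\n")
    return CODE_PROMPT_TEMPLATE.format(csv_data=csv_data, question=question)

# Example call (placeholder file name and question):
# prompt = build_code_prompt("abalone.csv", "What is the average number of rings for male abalone?")
```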

### Text Answer
The prompt template for direct text answer generation on short tables.
````
[INST]Offer a thorough and accurate solution that directly addresses the Question outlined in the [Question].
### [Table Text]
{table_descriptions}

### [Table]
```
{table_in_csv}
```

### [Question]
{question}

### [Solution][INST/]
````
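
The text-answer template can be filled in the same way. The sketch below is illustrative only (the helper name and its arguments are not from the released code); the resulting prompt is passed to the model exactly as in the inference sketch above.

````python
# The template string mirrors the text-answer prompt shown above.
TEXT_PROMPT_TEMPLATE = (
    "[INST]Offer a thorough and accurate solution that directly addresses the Question "
    "outlined in the [Question].\n"
    "### [Table Text]\n"
    "{table_descriptions}\n"
    "\n"
    "### [Table]\n"
    "```\n"
    "{table_in_csv}\n"
    "```\n"
    "\n"
    "### [Question]\n"
    "{question}\n"
    "\n"
    "### [Solution][INST/]"
)

def build_text_prompt(table_descriptions: str, table_in_csv: str, question: str) -> str:
    # All arguments are supplied by the caller: a short description of the table,
    # the table itself in CSV form, and the question to be answered.
    return TEXT_PROMPT_TEMPLATE.format(
        table_descriptions=table_descriptions,
        table_in_csv=table_in_csv,
        question=question,
    )
````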

For more details about how to use TableLLM, please refer to our GitHub page: <https://github.com/TableLLM/TableLLM>