Safetensors
qwen3
TAMA-QWen3 / README.md
dnaihao's picture
added logo
5ffa7e0 verified
---
license: apache-2.0
---
<!doctype html>
<html>
<head>
<meta charset="UTF-8">
<div style="display: flex; align-items: center;">
<img src="https://huggingface.co/datasets/MichiganNLP/blog-images/resolve/main/tama.png" alt="TAMA Logo" width="80px" style="margin-right: 12px;">
<h1 style="margin: 0;">Better TAMA Models with Limited Data</h1>
</div>
<link href="https://fonts.googleapis.com/css?family=Crimson+Text:400,400italic,700,700italic|Roboto:400,700,700italic,400italic" rel="stylesheet">
<style>
body {
font-family: "Crimson Text", serif;
font-size: 16px;
color: #333;
background: #fff;
max-width: 960px;
margin: 0 auto;
padding: 60px 40px;
line-height: 1.6;
}
h1, h2, h3 {
font-family: Roboto, sans-serif;
font-weight: 700;
margin-top: 2em;
margin-bottom: 0.5em;
}
h1 {
font-size: 36px;
margin-top: 0;
}
h2 {
font-size: 28px;
}
h3 {
font-size: 22px;
}
a {
color: #336699;
text-decoration: none;
}
a:hover {
text-decoration: underline;
}
.table-container {
text-align: center;
}
.scroll-table {
overflow-x: auto;
display: inline-block;
max-width: 100%;
}
table {
border-collapse: collapse;
margin: 2em 0;
font-size: 14px;
}
th, td {
border: 1px solid #ccc;
padding: 8px;
text-align: left;
white-space: normal;
word-wrap: break-word;
word-break: break-word;
}
.nowrap td, .nowrap th {
white-space: nowrap;
word-wrap: normal;
word-break: normal;
}
.highlight {
font-weight: bold;
color: #1a7f37;
}
hr {
border: 0;
border-top: 1px solid #ccc;
margin: 3em 0;
}
.references p {
margin-bottom: 0.5em;
}
.toggle-button {
display: block;
margin: 1em auto;
padding: 0.5em 1em;
font-size: 16px;
cursor: pointer;
background: #f2f2f2;
border: 1px solid #ccc;
border-radius: 4px;
}
.extra-columns {
display: none;
}
</style>
</head>
<body>
<p>
In [<a href="#ref1">1</a>], we reveal that with limited instruction tuning data, we can achieve competitive performance on table tasks. This compact setup enables quick instruction tuning with advanced base models.
</p>
<p>
We present TAMA models built on Qwen 2.5 and Qwen 3. These models achieve strong results on the MMTU benchmark [<a href="#ref2">2</a>], outperforming recent table reasoning models [<a href="#ref3">3</a>] and competitive table LLMs like Table-GPT 2 [<a href="#ref4">4</a>], which is tuned on 2.36M datapoints.
</p>
<p>
Notably, TAMA-QWen3 achieves the best overall performance of <span class="highlight">33.9</span>, surpassing QWen-3-8B (32.9) and TableGPT-2 (30.0).
</p>
<div class="table-container">
<div class="scroll-table">
<table id="performance-table">
<thead>
<tr>
<th style='background-color:#F2F2F2;'>Models</th>
<th style='background-color:#F2F2F2;'>Paper Source</th>
<th style='background-color:#F2F2F2;'>Training Corpora Size</th>
<th style='background-color:#F2F2F2;'>Base Model</th>
<th style='background-color:#F2F2F2;'>Model Size</th>
<th style='background-color:#F2F2F2;'>Overall</th>
<th class="extra-columns" style='background-color:#F2F2F2;'>Table Understanding and QA</th>
<th class="extra-columns" style='background-color:#F2F2F2;'>Table Transformation and Manipulation</th>
<th class="extra-columns" style='background-color:#F2F2F2;'>Entity and Schema Matching</th>
<th class="extra-columns" style='background-color:#F2F2F2;'>SQL and Table Navigation</th>
<th class="extra-columns" style='background-color:#F2F2F2;'>Semantic Analysis and Relationships</th>
<th class="extra-columns" style='background-color:#F2F2F2;'>Cell and Column Annotation</th>
<th class="extra-columns" style='background-color:#F2F2F2;'>Error Detection</th>
<th class="extra-columns" style='background-color:#F2F2F2;'>Formula Prediction</th>
</tr>
</thead>
<tbody>
<!-- Fill in first few rows if needed for preview -->
</tr><tr id='temp:C:GJGd7a80489248c47eaae55f0e7a'><td id='temp:s:temp:C:GJGd7a80489248c47eaae55f0e7a;temp:C:GJG35c2ff53e6ca4ff6a432d6e10' style=''><a href="https://huggingface.co/osunlp/TableLlama">TableLlama</a>
<br/></td><td id='temp:s:temp:C:GJGd7a80489248c47eaae55f0e7a;temp:C:GJG8f95d72be7214550b9aebfb0a' style=''><a href="https://arxiv.org/abs/2311.09206">TableLlama: Towards Open Large Generalist Models for Tables</a>
<br/></td><td id='temp:s:temp:C:GJGd7a80489248c47eaae55f0e7a;temp:C:GJG16f50893a8014805a44a1007a' style=''>2M
<br/></td><td id='temp:s:temp:C:GJGd7a80489248c47eaae55f0e7a;temp:C:GJG774257f6b0044ad6b08086ba5' style=''><a href="https://huggingface.co/Yukang/Llama-2-7b-longlora-8k">Yukang/Llama-2-7b-longlora-8k</a>
<br/></td><td id='temp:s:temp:C:GJGd7a80489248c47eaae55f0e7a;temp:C:GJG814aa5d1c1c64eb2968e7a007' style=''>7B
<br/></td><td id='temp:s:temp:C:GJGd7a80489248c47eaae55f0e7a;temp:C:GJGae76c63851b648439cab3eb14' style=''>0.0
<br/></td><td id='temp:s:temp:C:GJGd7a80489248c47eaae55f0e7a;temp:C:GJG00e2e6d7000147df97cc77e28' style='text-align: right;vertical-align: bottom;' class="extra-columns">0
<br/></td><td id='temp:s:temp:C:GJGd7a80489248c47eaae55f0e7a;temp:C:GJG67a5d462ce30423c8fb870364' style='text-align: right;vertical-align: bottom;' class="extra-columns">0
<br/></td><td id='temp:s:temp:C:GJGd7a80489248c47eaae55f0e7a;temp:C:GJGbb5cc10122964720bc8da76c5' style='text-align: right;vertical-align: bottom;' class="extra-columns">0
<br/></td><td id='temp:s:temp:C:GJGd7a80489248c47eaae55f0e7a;temp:C:GJG0d0b379bcb6f462185bacedaa' style='text-align: right;vertical-align: bottom;' class="extra-columns">0
<br/></td><td id='temp:s:temp:C:GJGd7a80489248c47eaae55f0e7a;temp:C:GJG750f96b656904402b0685a9dc' style='text-align: right;vertical-align: bottom;' class="extra-columns">0
<br/></td><td id='temp:s:temp:C:GJGd7a80489248c47eaae55f0e7a;temp:C:GJG090381cf461f4ec1be545a3b6' style='text-align: right;vertical-align: bottom;' class="extra-columns">0
<br/></td><td id='temp:s:temp:C:GJGd7a80489248c47eaae55f0e7a;temp:C:GJGdbb0b3f228a24416873a958a9' style='text-align: right;vertical-align: bottom;' class="extra-columns">0
<br/></td><td id='temp:s:temp:C:GJGd7a80489248c47eaae55f0e7a;temp:C:GJG6886327be27f435c846f58863' style='text-align: right;vertical-align: bottom;' class="extra-columns">0
<br/></td></tr><tr id='temp:C:GJG2c70664766d9435895fb4b802'><td id='temp:s:temp:C:GJG2c70664766d9435895fb4b802;temp:C:GJG35c2ff53e6ca4ff6a432d6e10' style=''><a href="https://huggingface.co/RUCKBReasoning/TableLLM-7b">TableLLM</a>
<br/></td><td id='temp:s:temp:C:GJG2c70664766d9435895fb4b802;temp:C:GJG8f95d72be7214550b9aebfb0a' style=''><a href="https://arxiv.org/html/2403.19318v1">TableLLM: Enabling Tabular Data Manipulation by LLMs in Real Office Usage Scenarios</a>
<br/></td><td id='temp:s:temp:C:GJG2c70664766d9435895fb4b802;temp:C:GJG16f50893a8014805a44a1007a' style=''>309K
<br/></td><td id='temp:s:temp:C:GJG2c70664766d9435895fb4b802;temp:C:GJG774257f6b0044ad6b08086ba5' style=''><a href="https://huggingface.co/codellama/CodeLlama-7b-Instruct-hf">codellama/CodeLlama-7b-Instruct-hf</a>
<br/></td><td id='temp:s:temp:C:GJG2c70664766d9435895fb4b802;temp:C:GJG814aa5d1c1c64eb2968e7a007' style=''>7B
<br/></td><td id='temp:s:temp:C:GJG2c70664766d9435895fb4b802;temp:C:GJGae76c63851b648439cab3eb14' style=''>2.5
<br/></td><td id='temp:s:temp:C:GJG2c70664766d9435895fb4b802;temp:C:GJG00e2e6d7000147df97cc77e28' style='background-color:#f4fbf8;text-align: right;vertical-align: bottom;' class="extra-columns">2.9
<br/></td><td id='temp:s:temp:C:GJG2c70664766d9435895fb4b802;temp:C:GJG67a5d462ce30423c8fb870364' style='text-align: right;vertical-align: bottom;' class="extra-columns">0.1
<br/></td><td id='temp:s:temp:C:GJG2c70664766d9435895fb4b802;temp:C:GJGbb5cc10122964720bc8da76c5' style='background-color:#e0f3e9;text-align: right;vertical-align: bottom;' class="extra-columns">16.6
<br/></td><td id='temp:s:temp:C:GJG2c70664766d9435895fb4b802;temp:C:GJG0d0b379bcb6f462185bacedaa' style='background-color:#f3faf7;text-align: right;vertical-align: bottom;' class="extra-columns">2.3
<br/></td><td id='temp:s:temp:C:GJG2c70664766d9435895fb4b802;temp:C:GJG750f96b656904402b0685a9dc' style='background-color:#f9fdfb;text-align: right;vertical-align: bottom;' class="extra-columns">1.4
<br/></td><td id='temp:s:temp:C:GJG2c70664766d9435895fb4b802;temp:C:GJG090381cf461f4ec1be545a3b6' style='background-color:#f0f9f4;text-align: right;vertical-align: bottom;' class="extra-columns">2.6
<br/></td><td id='temp:s:temp:C:GJG2c70664766d9435895fb4b802;temp:C:GJGdbb0b3f228a24416873a958a9' style='text-align: right;vertical-align: bottom;' class="extra-columns">0
<br/></td><td id='temp:s:temp:C:GJG2c70664766d9435895fb4b802;temp:C:GJG6886327be27f435c846f58863' style='text-align: right;vertical-align: bottom;' class="extra-columns">0
<br/></td></tr><tr id='temp:C:GJG0e53c4ed694944198613f84c0'><td id='temp:s:temp:C:GJG0e53c4ed694944198613f84c0;temp:C:GJG35c2ff53e6ca4ff6a432d6e10' style=''><a href="https://huggingface.co/Multilingual-Multimodal-NLP/TableLLM-Llama3.1-8B">TableBenchLLM-Llama-3.1-8B</a>
<br/></td><td id='temp:s:temp:C:GJG0e53c4ed694944198613f84c0;temp:C:GJG8f95d72be7214550b9aebfb0a' style=''><a href="https://arxiv.org/abs/2408.09174">TableBench: A Comprehensive and Complex Benchmark for Table Question Answering</a>
<br/></td><td id='temp:s:temp:C:GJG0e53c4ed694944198613f84c0;temp:C:GJG16f50893a8014805a44a1007a' style=''>20K
<br/></td><td id='temp:s:temp:C:GJG0e53c4ed694944198613f84c0;temp:C:GJG774257f6b0044ad6b08086ba5' style=''><a href="https://huggingface.co/meta-llama/Llama-3.1-8B">meta-llama/Llama-3.1-8B</a>
<br/></td><td id='temp:s:temp:C:GJG0e53c4ed694944198613f84c0;temp:C:GJG814aa5d1c1c64eb2968e7a007' style=''>8B
<br/></td><td id='temp:s:temp:C:GJG0e53c4ed694944198613f84c0;temp:C:GJGae76c63851b648439cab3eb14' style=''>3.4
<br/></td><td id='temp:s:temp:C:GJG0e53c4ed694944198613f84c0;temp:C:GJG00e2e6d7000147df97cc77e28' style='text-align: right;vertical-align: bottom;' class="extra-columns">0.1
<br/></td><td id='temp:s:temp:C:GJG0e53c4ed694944198613f84c0;temp:C:GJG67a5d462ce30423c8fb870364' style='background-color:#ebf7f1;text-align: right;vertical-align: bottom;' class="extra-columns">3.1
<br/></td><td id='temp:s:temp:C:GJG0e53c4ed694944198613f84c0;temp:C:GJGbb5cc10122964720bc8da76c5' style='background-color:#d3ede0;text-align: right;vertical-align: bottom;' class="extra-columns">23.3
<br/></td><td id='temp:s:temp:C:GJG0e53c4ed694944198613f84c0;temp:C:GJG0d0b379bcb6f462185bacedaa' style='background-color:#fbfefc;text-align: right;vertical-align: bottom;' class="extra-columns">0.8
<br/></td><td id='temp:s:temp:C:GJG0e53c4ed694944198613f84c0;temp:C:GJG750f96b656904402b0685a9dc' style='background-color:#f6fcf9;text-align: right;vertical-align: bottom;' class="extra-columns">2.1
<br/></td><td id='temp:s:temp:C:GJG0e53c4ed694944198613f84c0;temp:C:GJG090381cf461f4ec1be545a3b6' style='text-align: right;vertical-align: bottom;' class="extra-columns">0
<br/></td><td id='temp:s:temp:C:GJG0e53c4ed694944198613f84c0;temp:C:GJGdbb0b3f228a24416873a958a9' style='text-align: right;vertical-align: bottom;' class="extra-columns">0
<br/></td><td id='temp:s:temp:C:GJG0e53c4ed694944198613f84c0;temp:C:GJG6886327be27f435c846f58863' style='text-align: right;vertical-align: bottom;' class="extra-columns">0
<br/></td></tr><tr id='temp:C:GJG9fd759f704714916869c72b96'><td id='temp:s:temp:C:GJG9fd759f704714916869c72b96;temp:C:GJG35c2ff53e6ca4ff6a432d6e10' style=''><a href="https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct">Llama-3.1-8B-Instruct</a>
<br/></td><td id='temp:s:temp:C:GJG9fd759f704714916869c72b96;temp:C:GJG8f95d72be7214550b9aebfb0a' style=''><a href="https://arxiv.org/abs/2407.21783">The Llama 3 Herd of Models</a>
<br/></td><td id='temp:s:temp:C:GJG9fd759f704714916869c72b96;temp:C:GJG16f50893a8014805a44a1007a' style=''>-
<br/></td><td id='temp:s:temp:C:GJG9fd759f704714916869c72b96;temp:C:GJG774257f6b0044ad6b08086ba5' style=''><a href="https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct">meta-llama/Llama-3.1-8B-Instruct</a>
<br/></td><td id='temp:s:temp:C:GJG9fd759f704714916869c72b96;temp:C:GJG814aa5d1c1c64eb2968e7a007' style=''>8B
<br/></td><td id='temp:s:temp:C:GJG9fd759f704714916869c72b96;temp:C:GJGae76c63851b648439cab3eb14' style=''>25.3
<br/></td><td id='temp:s:temp:C:GJG9fd759f704714916869c72b96;temp:C:GJG00e2e6d7000147df97cc77e28' style='background-color:#69c397;text-align: right;vertical-align: bottom;' class="extra-columns">38.1
<br/></td><td id='temp:s:temp:C:GJG9fd759f704714916869c72b96;temp:C:GJG67a5d462ce30423c8fb870364' style='background-color:#90d2b2;text-align: right;vertical-align: bottom;' class="extra-columns">17.1
<br/></td><td id='temp:s:temp:C:GJG9fd759f704714916869c72b96;temp:C:GJGbb5cc10122964720bc8da76c5' style='background-color:#7dcba5;text-align: right;vertical-align: bottom;' class="extra-columns">67.8
<br/></td><td id='temp:s:temp:C:GJG9fd759f704714916869c72b96;temp:C:GJG0d0b379bcb6f462185bacedaa' style='background-color:#72c69d;text-align: right;vertical-align: bottom;' class="extra-columns">26.2
<br/></td><td id='temp:s:temp:C:GJG9fd759f704714916869c72b96;temp:C:GJG750f96b656904402b0685a9dc' style='background-color:#87cfab;text-align: right;vertical-align: bottom;' class="extra-columns">28
<br/></td><td id='temp:s:temp:C:GJG9fd759f704714916869c72b96;temp:C:GJG090381cf461f4ec1be545a3b6' style='background-color:#68c296;text-align: right;vertical-align: bottom;' class="extra-columns">24.9
<br/></td><td id='temp:s:temp:C:GJG9fd759f704714916869c72b96;temp:C:GJGdbb0b3f228a24416873a958a9' style='background-color:#ccebdc;text-align: right;vertical-align: bottom;' class="extra-columns">4.5
<br/></td><td id='temp:s:temp:C:GJG9fd759f704714916869c72b96;temp:C:GJG6886327be27f435c846f58863' style='background-color:#e3f4ec;text-align: right;vertical-align: bottom;' class="extra-columns">0.1
<br/></td></tr><tr id='temp:C:GJGa47d78fca9d44923919996e67'><td id='temp:s:temp:C:GJGa47d78fca9d44923919996e67;temp:C:GJG35c2ff53e6ca4ff6a432d6e10' style=''><a href="https://huggingface.co/MichiganNLP/TAMA-vB">TAMA-vB</a>
<br/></td><td id='temp:s:temp:C:GJGa47d78fca9d44923919996e67;temp:C:GJG8f95d72be7214550b9aebfb0a' style=''><a href="https://arxiv.org/pdf/2501.14693">Rethinking Table Instruction Tuning</a>
<br/></td><td id='temp:s:temp:C:GJGa47d78fca9d44923919996e67;temp:C:GJG16f50893a8014805a44a1007a' style=''>2.6K
<br/></td><td id='temp:s:temp:C:GJGa47d78fca9d44923919996e67;temp:C:GJG774257f6b0044ad6b08086ba5' style=''><a href="https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct">meta-llama/Llama-3.1-8B-Instruct</a>
<br/></td><td id='temp:s:temp:C:GJGa47d78fca9d44923919996e67;temp:C:GJG814aa5d1c1c64eb2968e7a007' style=''>8B
<br/></td><td id='temp:s:temp:C:GJGa47d78fca9d44923919996e67;temp:C:GJGae76c63851b648439cab3eb14' style=''>21.1
<br/></td><td id='temp:s:temp:C:GJGa47d78fca9d44923919996e67;temp:C:GJG00e2e6d7000147df97cc77e28' style='background-color:#86ceab;text-align: right;vertical-align: bottom;' class="extra-columns">30.9
<br/></td><td id='temp:s:temp:C:GJGa47d78fca9d44923919996e67;temp:C:GJG67a5d462ce30423c8fb870364' style='background-color:#9cd7ba;text-align: right;vertical-align: bottom;' class="extra-columns">15.2
<br/></td><td id='temp:s:temp:C:GJGa47d78fca9d44923919996e67;temp:C:GJGbb5cc10122964720bc8da76c5' style='background-color:#72c69d;text-align: right;vertical-align: bottom;' class="extra-columns">73.7
<br/></td><td id='temp:s:temp:C:GJGa47d78fca9d44923919996e67;temp:C:GJG0d0b379bcb6f462185bacedaa' style='background-color:#b2e0ca;text-align: right;vertical-align: bottom;' class="extra-columns">14.3
<br/></td><td id='temp:s:temp:C:GJGa47d78fca9d44923919996e67;temp:C:GJG750f96b656904402b0685a9dc' style='background-color:#a5dbc1;text-align: right;vertical-align: bottom;' class="extra-columns">20.9
<br/></td><td id='temp:s:temp:C:GJGa47d78fca9d44923919996e67;temp:C:GJG090381cf461f4ec1be545a3b6' style='background-color:#8ed1b0;text-align: right;vertical-align: bottom;' class="extra-columns">18.7
<br/></td><td id='temp:s:temp:C:GJGa47d78fca9d44923919996e67;temp:C:GJGdbb0b3f228a24416873a958a9' style='background-color:#bde4d1;text-align: right;vertical-align: bottom;' class="extra-columns">5.9
<br/></td><td id='temp:s:temp:C:GJGa47d78fca9d44923919996e67;temp:C:GJG6886327be27f435c846f58863' style='background-color:#e3f4ec;text-align: right;vertical-align: bottom;' class="extra-columns">0.1
<br/></td></tr><tr id='temp:C:GJG49cecd182f02483f9e1a67399'><td id='temp:s:temp:C:GJG49cecd182f02483f9e1a67399;temp:C:GJG35c2ff53e6ca4ff6a432d6e10' style=''><a href="https://huggingface.co/MichiganNLP/TAMA-vA">TAMA-vA</a>
<br/></td><td id='temp:s:temp:C:GJG49cecd182f02483f9e1a67399;temp:C:GJG8f95d72be7214550b9aebfb0a' style=''><a href="https://arxiv.org/pdf/2501.14693">Rethinking Table Instruction Tuning</a>
<br/></td><td id='temp:s:temp:C:GJG49cecd182f02483f9e1a67399;temp:C:GJG16f50893a8014805a44a1007a' style=''>2.6K
<br/></td><td id='temp:s:temp:C:GJG49cecd182f02483f9e1a67399;temp:C:GJG774257f6b0044ad6b08086ba5' style=''><a href="https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct">meta-llama/Llama-3.1-8B-Instruct</a>
<br/></td><td id='temp:s:temp:C:GJG49cecd182f02483f9e1a67399;temp:C:GJG814aa5d1c1c64eb2968e7a007' style=''>8B
<br/></td><td id='temp:s:temp:C:GJG49cecd182f02483f9e1a67399;temp:C:GJGae76c63851b648439cab3eb14' style=''>16.9
<br/></td><td id='temp:s:temp:C:GJG49cecd182f02483f9e1a67399;temp:C:GJG00e2e6d7000147df97cc77e28' style='background-color:#a6dbc1;text-align: right;vertical-align: bottom;' class="extra-columns">22.6
<br/></td><td id='temp:s:temp:C:GJG49cecd182f02483f9e1a67399;temp:C:GJG67a5d462ce30423c8fb870364' style='background-color:#a7dcc2;text-align: right;vertical-align: bottom;' class="extra-columns">13.5
<br/></td><td id='temp:s:temp:C:GJG49cecd182f02483f9e1a67399;temp:C:GJGbb5cc10122964720bc8da76c5' style='background-color:#afdfc7;text-align: right;vertical-align: bottom;' class="extra-columns">41.8
<br/></td><td id='temp:s:temp:C:GJG49cecd182f02483f9e1a67399;temp:C:GJG0d0b379bcb6f462185bacedaa' style='background-color:#bbe4d0;text-align: right;vertical-align: bottom;' class="extra-columns">12.7
<br/></td><td id='temp:s:temp:C:GJG49cecd182f02483f9e1a67399;temp:C:GJG750f96b656904402b0685a9dc' style='background-color:#a4dbc0;text-align: right;vertical-align: bottom;' class="extra-columns">21.1
<br/></td><td id='temp:s:temp:C:GJG49cecd182f02483f9e1a67399;temp:C:GJG090381cf461f4ec1be545a3b6' style='background-color:#97d5b7;text-align: right;vertical-align: bottom;' class="extra-columns">17.1
<br/></td><td id='temp:s:temp:C:GJG49cecd182f02483f9e1a67399;temp:C:GJGdbb0b3f228a24416873a958a9' style='background-color:#b8e3ce;text-align: right;vertical-align: bottom;' class="extra-columns">6.3
<br/></td><td id='temp:s:temp:C:GJG49cecd182f02483f9e1a67399;temp:C:GJG6886327be27f435c846f58863' style='background-color:#e3f4ec;text-align: right;vertical-align: bottom;' class="extra-columns">0.1
<br/></td></tr><tr id='temp:C:GJGb7a26b478b8a4d66bce379049'><td id='temp:s:temp:C:GJGb7a26b478b8a4d66bce379049;temp:C:GJG35c2ff53e6ca4ff6a432d6e10' style=''><a href="https://huggingface.co/Qwen/Qwen2.5-7B-Instruct">Qwen2.5-7B</a>
<br/></td><td id='temp:s:temp:C:GJGb7a26b478b8a4d66bce379049;temp:C:GJG8f95d72be7214550b9aebfb0a' style=''><a href="https://arxiv.org/abs/2412.15115">Qwen2.5 Technical Report</a>
<br/></td><td id='temp:s:temp:C:GJGb7a26b478b8a4d66bce379049;temp:C:GJG16f50893a8014805a44a1007a' style=''>-
<br/></td><td id='temp:s:temp:C:GJGb7a26b478b8a4d66bce379049;temp:C:GJG774257f6b0044ad6b08086ba5' style=''><a href="https://huggingface.co/Qwen/Qwen2.5-7B-Instruct">Qwen/Qwen2.5-7B-Instruct</a>
<br/></td><td id='temp:s:temp:C:GJGb7a26b478b8a4d66bce379049;temp:C:GJG814aa5d1c1c64eb2968e7a007' style=''>7B
<br/></td><td id='temp:s:temp:C:GJGb7a26b478b8a4d66bce379049;temp:C:GJGae76c63851b648439cab3eb14' style=''>28.5
<br/></td><td id='temp:s:temp:C:GJGb7a26b478b8a4d66bce379049;temp:C:GJG00e2e6d7000147df97cc77e28' style='background-color:#68c296;text-align: right;vertical-align: bottom;' class="extra-columns">38.5
<br/></td><td id='temp:s:temp:C:GJGb7a26b478b8a4d66bce379049;temp:C:GJG67a5d462ce30423c8fb870364' style='background-color:#86ceab;text-align: right;vertical-align: bottom;' class="extra-columns">18.6
<br/></td><td id='temp:s:temp:C:GJGb7a26b478b8a4d66bce379049;temp:C:GJGbb5cc10122964720bc8da76c5' style='background-color:#57bb8a;text-align: right;vertical-align: bottom;' class="extra-columns">87.3
<br/></td><td id='temp:s:temp:C:GJGb7a26b478b8a4d66bce379049;temp:C:GJG0d0b379bcb6f462185bacedaa' style='background-color:#5bbd8d;text-align: right;vertical-align: bottom;' class="extra-columns">30.5
<br/></td><td id='temp:s:temp:C:GJGb7a26b478b8a4d66bce379049;temp:C:GJG750f96b656904402b0685a9dc' style='background-color:#74c79e;text-align: right;vertical-align: bottom;' class="extra-columns">32.3
<br/></td><td id='temp:s:temp:C:GJGb7a26b478b8a4d66bce379049;temp:C:GJG090381cf461f4ec1be545a3b6' style='background-color:#72c69d;text-align: right;vertical-align: bottom;' class="extra-columns">23.2
<br/></td><td id='temp:s:temp:C:GJGb7a26b478b8a4d66bce379049;temp:C:GJGdbb0b3f228a24416873a958a9' style='background-color:#a6dbc1;text-align: right;vertical-align: bottom;' class="extra-columns">7.9
<br/></td><td id='temp:s:temp:C:GJGb7a26b478b8a4d66bce379049;temp:C:GJG6886327be27f435c846f58863' style='background-color:#abddc5;text-align: right;vertical-align: bottom;' class="extra-columns">0.3
<br/></td></tr><tr id='temp:C:GJG601c6f2758b244f1b1c00defc'><td id='temp:s:temp:C:GJG601c6f2758b244f1b1c00defc;temp:C:GJG35c2ff53e6ca4ff6a432d6e10' style=''><a href="https://huggingface.co/tablegpt/TableGPT2-7B">TableGPT2-7B</a>
<br/></td><td id='temp:s:temp:C:GJG601c6f2758b244f1b1c00defc;temp:C:GJG8f95d72be7214550b9aebfb0a' style=''><a href="https://arxiv.org/abs/2411.02059">TableGPT2: A Large Multimodal Model with Tabular Data Integration</a>
<br/></td><td id='temp:s:temp:C:GJG601c6f2758b244f1b1c00defc;temp:C:GJG16f50893a8014805a44a1007a' style=''>2.36M
<br/></td><td id='temp:s:temp:C:GJG601c6f2758b244f1b1c00defc;temp:C:GJG774257f6b0044ad6b08086ba5' style=''><a href="https://huggingface.co/Qwen/Qwen2.5-7B-Instruct">Qwen/Qwen2.5-7B-Instruct</a>
<br/></td><td id='temp:s:temp:C:GJG601c6f2758b244f1b1c00defc;temp:C:GJG814aa5d1c1c64eb2968e7a007' style=''>7B
<br/></td><td id='temp:s:temp:C:GJG601c6f2758b244f1b1c00defc;temp:C:GJGae76c63851b648439cab3eb14' style=''>30.0
<br/></td><td id='temp:s:temp:C:GJG601c6f2758b244f1b1c00defc;temp:C:GJG00e2e6d7000147df97cc77e28' style='background-color:#57bb8a;text-align: right;vertical-align: bottom;' class="extra-columns">42.6
<br/></td><td id='temp:s:temp:C:GJG601c6f2758b244f1b1c00defc;temp:C:GJG67a5d462ce30423c8fb870364' style='background-color:#6ac397;text-align: right;vertical-align: bottom;' class="extra-columns">22.9
<br/></td><td id='temp:s:temp:C:GJG601c6f2758b244f1b1c00defc;temp:C:GJGbb5cc10122964720bc8da76c5' style='background-color:#5cbd8e;text-align: right;vertical-align: bottom;' class="extra-columns">84.9
<br/></td><td id='temp:s:temp:C:GJG601c6f2758b244f1b1c00defc;temp:C:GJG0d0b379bcb6f462185bacedaa' style='background-color:#57bb8a;text-align: right;vertical-align: bottom;' class="extra-columns">31.1
<br/></td><td id='temp:s:temp:C:GJG601c6f2758b244f1b1c00defc;temp:C:GJG750f96b656904402b0685a9dc' style='background-color:#74c79e;text-align: right;vertical-align: bottom;' class="extra-columns">32.4
<br/></td><td id='temp:s:temp:C:GJG601c6f2758b244f1b1c00defc;temp:C:GJG090381cf461f4ec1be545a3b6' style='background-color:#69c397;text-align: right;vertical-align: bottom;' class="extra-columns">24.7
<br/></td><td id='temp:s:temp:C:GJG601c6f2758b244f1b1c00defc;temp:C:GJGdbb0b3f228a24416873a958a9' style='background-color:#57bb8a;text-align: right;vertical-align: bottom;' class="extra-columns">14.8
<br/></td><td id='temp:s:temp:C:GJG601c6f2758b244f1b1c00defc;temp:C:GJG6886327be27f435c846f58863' style='background-color:#abddc5;text-align: right;vertical-align: bottom;' class="extra-columns">0.3
<br/></td></tr><tr id='temp:C:GJG2ec77b03a0cc4e77b475b33d2'><td id='temp:s:temp:C:GJG2ec77b03a0cc4e77b475b33d2;temp:C:GJG35c2ff53e6ca4ff6a432d6e10' style=''><a href="https://huggingface.co/Table-R1/Table-R1-Zero-8B">Table-R1-Zero-8B</a>
<br/></td><td id='temp:s:temp:C:GJG2ec77b03a0cc4e77b475b33d2;temp:C:GJG8f95d72be7214550b9aebfb0a' style=''><a href="https://arxiv.org/html/2505.23621v1">Table-R1: Inference-Time Scaling for Table Reasoning</a>
<br/></td><td id='temp:s:temp:C:GJG2ec77b03a0cc4e77b475b33d2;temp:C:GJG16f50893a8014805a44a1007a' style=''>48.6K
<br/></td><td id='temp:s:temp:C:GJG2ec77b03a0cc4e77b475b33d2;temp:C:GJG774257f6b0044ad6b08086ba5' style=''><a href="https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct">meta-llama/Llama-3.1-8B-Instruct</a>
<br/></td><td id='temp:s:temp:C:GJG2ec77b03a0cc4e77b475b33d2;temp:C:GJG814aa5d1c1c64eb2968e7a007' style=''>8B
<br/></td><td id='temp:s:temp:C:GJG2ec77b03a0cc4e77b475b33d2;temp:C:GJGae76c63851b648439cab3eb14' style=''>26.6
<br/></td><td id='temp:s:temp:C:GJG2ec77b03a0cc4e77b475b33d2;temp:C:GJG00e2e6d7000147df97cc77e28' style='background-color:#64c193;text-align: right;vertical-align: bottom;' class="extra-columns">39.4
<br/></td><td id='temp:s:temp:C:GJG2ec77b03a0cc4e77b475b33d2;temp:C:GJG67a5d462ce30423c8fb870364' style='background-color:#7fcca6;text-align: right;vertical-align: bottom;' class="extra-columns">19.6
<br/></td><td id='temp:s:temp:C:GJG2ec77b03a0cc4e77b475b33d2;temp:C:GJGbb5cc10122964720bc8da76c5' style='background-color:#80cca7;text-align: right;vertical-align: bottom;' class="extra-columns">66
<br/></td><td id='temp:s:temp:C:GJG2ec77b03a0cc4e77b475b33d2;temp:C:GJG0d0b379bcb6f462185bacedaa' style='background-color:#72c69d;text-align: right;vertical-align: bottom;' class="extra-columns">26.2
<br/></td><td id='temp:s:temp:C:GJG2ec77b03a0cc4e77b475b33d2;temp:C:GJG750f96b656904402b0685a9dc' style='background-color:#7ac9a2;text-align: right;vertical-align: bottom;' class="extra-columns">31
<br/></td><td id='temp:s:temp:C:GJG2ec77b03a0cc4e77b475b33d2;temp:C:GJG090381cf461f4ec1be545a3b6' style='background-color:#62c092;text-align: right;vertical-align: bottom;' class="extra-columns">25.9
<br/></td><td id='temp:s:temp:C:GJG2ec77b03a0cc4e77b475b33d2;temp:C:GJGdbb0b3f228a24416873a958a9' style='background-color:#d4eee1;text-align: right;vertical-align: bottom;' class="extra-columns">3.8
<br/></td><td id='temp:s:temp:C:GJG2ec77b03a0cc4e77b475b33d2;temp:C:GJG6886327be27f435c846f58863' style='background-color:#c7e9d8;text-align: right;vertical-align: bottom;' class="extra-columns">0.2
<br/></td></tr><tr id='temp:C:GJGfed3c3e942e044ad9ee5502f1'><td id='temp:s:temp:C:GJGfed3c3e942e044ad9ee5502f1;temp:C:GJG35c2ff53e6ca4ff6a432d6e10' style=''><a href="https://huggingface.co/MichiganNLP/TAMA-QWen2.5">TAMA-QWen2.5</a>
<br/></td><td id='temp:s:temp:C:GJGfed3c3e942e044ad9ee5502f1;temp:C:GJG8f95d72be7214550b9aebfb0a' style=''><a href="https://arxiv.org/pdf/2501.14693">Rethinking Table Instruction Tuning</a>
<br/></td><td id='temp:s:temp:C:GJGfed3c3e942e044ad9ee5502f1;temp:C:GJG16f50893a8014805a44a1007a' style=''>2.6K
<br/></td><td id='temp:s:temp:C:GJGfed3c3e942e044ad9ee5502f1;temp:C:GJG774257f6b0044ad6b08086ba5' style=''><a href="https://huggingface.co/Qwen/Qwen2.5-7B-Instruct">Qwen/Qwen2.5-7B-Instruct</a>
<br/></td><td id='temp:s:temp:C:GJGfed3c3e942e044ad9ee5502f1;temp:C:GJG814aa5d1c1c64eb2968e7a007' style=''>7B
<br/></td><td id='temp:s:temp:C:GJGfed3c3e942e044ad9ee5502f1;temp:C:GJGae76c63851b648439cab3eb14' style=''>27.6
<br/></td><td id='temp:s:temp:C:GJGfed3c3e942e044ad9ee5502f1;temp:C:GJG00e2e6d7000147df97cc77e28' style='background-color:#70c59b;text-align: right;vertical-align: bottom;' class="extra-columns">36.5
<br/></td><td id='temp:s:temp:C:GJGfed3c3e942e044ad9ee5502f1;temp:C:GJG67a5d462ce30423c8fb870364' style='background-color:#90d2b2;text-align: right;vertical-align: bottom;' class="extra-columns">17.1
<br/></td><td id='temp:s:temp:C:GJGfed3c3e942e044ad9ee5502f1;temp:C:GJGbb5cc10122964720bc8da76c5' style='background-color:#58bc8b;text-align: right;vertical-align: bottom;' class="extra-columns">86.8
<br/></td><td id='temp:s:temp:C:GJGfed3c3e942e044ad9ee5502f1;temp:C:GJG0d0b379bcb6f462185bacedaa' style='background-color:#5ebe8f;text-align: right;vertical-align: bottom;' class="extra-columns">29.9
<br/></td><td id='temp:s:temp:C:GJGfed3c3e942e044ad9ee5502f1;temp:C:GJG750f96b656904402b0685a9dc' style='background-color:#7acaa3;text-align: right;vertical-align: bottom;' class="extra-columns">30.8
<br/></td><td id='temp:s:temp:C:GJGfed3c3e942e044ad9ee5502f1;temp:C:GJG090381cf461f4ec1be545a3b6' style='background-color:#70c59b;text-align: right;vertical-align: bottom;' class="extra-columns">23.6
<br/></td><td id='temp:s:temp:C:GJGfed3c3e942e044ad9ee5502f1;temp:C:GJGdbb0b3f228a24416873a958a9' style='background-color:#a0d9bd;text-align: right;vertical-align: bottom;' class="extra-columns">8.4
<br/></td><td id='temp:s:temp:C:GJGfed3c3e942e044ad9ee5502f1;temp:C:GJG6886327be27f435c846f58863' style='background-color:#c7e9d8;text-align: right;vertical-align: bottom;' class="extra-columns">0.2
<br/></td></tr><tr id='temp:C:GJG0085c5b822744c81b434cc6cd'><td id='temp:s:temp:C:GJG0085c5b822744c81b434cc6cd;temp:C:GJG35c2ff53e6ca4ff6a432d6e10' style=''><a href="https://huggingface.co/Qwen/Qwen3-8B">QWen-3-8B</a>
<br/></td><td id='temp:s:temp:C:GJG0085c5b822744c81b434cc6cd;temp:C:GJG8f95d72be7214550b9aebfb0a' style=''><a href="https://arxiv.org/abs/2505.09388">Qwen3 Technical Report</a>
<br/></td><td id='temp:s:temp:C:GJG0085c5b822744c81b434cc6cd;temp:C:GJG16f50893a8014805a44a1007a' style=''>-
<br/></td><td id='temp:s:temp:C:GJG0085c5b822744c81b434cc6cd;temp:C:GJG774257f6b0044ad6b08086ba5' style=''><a href="https://huggingface.co/Qwen/Qwen3-8B">Qwen/Qwen3-8B</a>
<br/></td><td id='temp:s:temp:C:GJG0085c5b822744c81b434cc6cd;temp:C:GJG814aa5d1c1c64eb2968e7a007' style=''>8B
<br/></td><td id='temp:s:temp:C:GJG0085c5b822744c81b434cc6cd;temp:C:GJGae76c63851b648439cab3eb14' style=''>32.9
<br/></td><td id='temp:s:temp:C:GJG0085c5b822744c81b434cc6cd;temp:C:GJG00e2e6d7000147df97cc77e28' style='background-color:#6cc499;text-align: right;vertical-align: bottom;' class="extra-columns">37.4
<br/></td><td id='temp:s:temp:C:GJG0085c5b822744c81b434cc6cd;temp:C:GJG67a5d462ce30423c8fb870364' style='background-color:#57bb8a;text-align: right;vertical-align: bottom;' class="extra-columns">25.7
<br/></td><td id='temp:s:temp:C:GJG0085c5b822744c81b434cc6cd;temp:C:GJGbb5cc10122964720bc8da76c5' style='background-color:#5fbf90;text-align: right;vertical-align: bottom;' class="extra-columns">83.2
<br/></td><td id='temp:s:temp:C:GJG0085c5b822744c81b434cc6cd;temp:C:GJG0d0b379bcb6f462185bacedaa' style='background-color:#63c093;text-align: right;vertical-align: bottom;' class="extra-columns">28.9
<br/></td><td id='temp:s:temp:C:GJG0085c5b822744c81b434cc6cd;temp:C:GJG750f96b656904402b0685a9dc' style='background-color:#57bb8a;text-align: right;vertical-align: bottom;' class="extra-columns">38.9
<br/></td><td id='temp:s:temp:C:GJG0085c5b822744c81b434cc6cd;temp:C:GJG090381cf461f4ec1be545a3b6' style='background-color:#57bb8a;text-align: right;vertical-align: bottom;' class="extra-columns">27.6
<br/></td><td id='temp:s:temp:C:GJG0085c5b822744c81b434cc6cd;temp:C:GJGdbb0b3f228a24416873a958a9' style='background-color:#5ebe8f;text-align: right;vertical-align: bottom;' class="extra-columns">14.2
<br/></td><td id='temp:s:temp:C:GJG0085c5b822744c81b434cc6cd;temp:C:GJG6886327be27f435c846f58863' style='background-color:#57bb8a;text-align: right;vertical-align: bottom;' class="extra-columns">0.6
<br/></td></tr><tr id='temp:C:GJG3cb5f57e8e3748a9b13fe5018'><td id='temp:s:temp:C:GJG3cb5f57e8e3748a9b13fe5018;temp:C:GJG35c2ff53e6ca4ff6a432d6e10' style=''><a href="https://huggingface.co/MichiganNLP/TAMA-QWen3">TAMA-QWen3</a>
<br/></td><td id='temp:s:temp:C:GJG3cb5f57e8e3748a9b13fe5018;temp:C:GJG8f95d72be7214550b9aebfb0a' style=''><a href="https://arxiv.org/pdf/2501.14693">Rethinking Table Instruction Tuning</a>
<br/></td><td id='temp:s:temp:C:GJG3cb5f57e8e3748a9b13fe5018;temp:C:GJG16f50893a8014805a44a1007a' style=''>2.6K
<br/></td><td id='temp:s:temp:C:GJG3cb5f57e8e3748a9b13fe5018;temp:C:GJG774257f6b0044ad6b08086ba5' style=''><a href="https://huggingface.co/Qwen/Qwen3-8B">Qwen/Qwen3-8B</a>
<br/></td><td id='temp:s:temp:C:GJG3cb5f57e8e3748a9b13fe5018;temp:C:GJG814aa5d1c1c64eb2968e7a007' style=''>8B
<br/></td><td id='temp:s:temp:C:GJG3cb5f57e8e3748a9b13fe5018;temp:C:GJGae76c63851b648439cab3eb14' style='' class='bold'> <b>33.9</b>
<br/></td><td id='temp:s:temp:C:GJG3cb5f57e8e3748a9b13fe5018;temp:C:GJG00e2e6d7000147df97cc77e28' style='background-color:#69c397;text-align: right;vertical-align: bottom;' class="extra-columns">38.2
<br/></td><td id='temp:s:temp:C:GJG3cb5f57e8e3748a9b13fe5018;temp:C:GJG67a5d462ce30423c8fb870364' style='background-color:#5dbe8f;text-align: right;vertical-align: bottom;' class="extra-columns">24.8
<br/></td><td id='temp:s:temp:C:GJG3cb5f57e8e3748a9b13fe5018;temp:C:GJGbb5cc10122964720bc8da76c5' style='background-color:#60bf90;text-align: right;vertical-align: bottom;' class="extra-columns">83
<br/></td><td id='temp:s:temp:C:GJG3cb5f57e8e3748a9b13fe5018;temp:C:GJG0d0b379bcb6f462185bacedaa' style='background-color:#61bf91;text-align: right;vertical-align: bottom;' class="extra-columns">29.3
<br/></td><td id='temp:s:temp:C:GJG3cb5f57e8e3748a9b13fe5018;temp:C:GJG750f96b656904402b0685a9dc' style='background-color:#5abd8c;text-align: right;vertical-align: bottom;' class="extra-columns">38.3
<br/></td><td id='temp:s:temp:C:GJG3cb5f57e8e3748a9b13fe5018;temp:C:GJG090381cf461f4ec1be545a3b6' style='background-color:#5ebe8f;text-align: right;vertical-align: bottom;' class="extra-columns">26.6
<br/></td><td id='temp:s:temp:C:GJG3cb5f57e8e3748a9b13fe5018;temp:C:GJGdbb0b3f228a24416873a958a9' style='background-color:#62c092;text-align: right;vertical-align: bottom;' class="extra-columns">13.9
<br/></td><td id='temp:s:temp:C:GJG3cb5f57e8e3748a9b13fe5018;temp:C:GJG6886327be27f435c846f58863' style='background-color:#57bb8a;text-align: right;vertical-align: bottom;' class="extra-columns">0.6
<br/></td></tr>
</tbody>
</table>
</div>
</div>
<h3>Evaluation Details</h3>
<p>
We adopt the official <a href="https://github.com/MMTU-Benchmark/MMTU">MMTU evaluation script</a> to compute scores. For overall performance, we use the evaluation function described <a href="https://github.com/MMTU-Benchmark/MMTU?tab=readme-ov-file#step-3-evaluation">here</a>. Category scores are the arithmetic mean across datasets in that category.
For QWen 3 model and TAMA-QWen3, we turned off the thinking mode.
</p>
<hr>
<h2>References</h2>
<div class="references">
<p id="ref1">[1] <a href="https://arxiv.org/abs/2501.14693">Deng, Naihao, and Rada Mihalcea. "Rethinking Table Instruction Tuning." arXiv:2501.14693 (2025).</a></p>
<p id="ref2">[2] <a href="https://arxiv.org/abs/2506.05587">Xing, Junjie, et al. "MMTU: A Massive Multi-Task Table Understanding and Reasoning Benchmark." arXiv:2506.05587 (2025).</a></p>
<p id="ref3">[3] <a href="https://arxiv.org/abs/2505.23621">Yang, Zheyuan, et al. "Table-r1: Inference-time scaling for table reasoning." arXiv:2505.23621 (2025).</a></p>
<p id="ref4">[4] <a href="https://arxiv.org/abs/2411.02059">Su, Aofeng, et al. "TableGPT2: A Large Multimodal Model with Tabular Data Integration." arXiv:2411.02059 (2024).</a></p>
</div>
</body>
</html>