Update README.md
    	
README.md CHANGED
    
@@ -7,7 +7,7 @@ language:
 # ChartGemma: Visual Instruction-tuning for Chart Reasoning in the Wild
 
 
-Paper Link: 
+Paper Link: https://arxiv.org/abs/2407.04172
 
 The abstract of the paper states that: 
 > Given the ubiquity of charts as a data analysis, visualization, and decision-making tool across industries and sciences, there has been a growing interest in developing pre-trained foundation models as well as general purpose instruction-tuned models for chart understanding and reasoning. However, existing methods suffer crucial drawbacks across two critical axes affecting the performance of chart representation models: they are trained on data generated from underlying data tables of the charts, ignoring the visual trends and patterns in chart images, \emph{and} use weakly aligned vision-language backbone models for domain-specific training, limiting their generalizability when encountering charts in the wild. We address these important drawbacks and introduce ChartGemma, a novel chart understanding and reasoning model developed over PaliGemma. Rather than relying on underlying data tables, ChartGemma is trained on instruction-tuning data generated directly from chart images, thus capturing both high-level trends and low-level visual information from a diverse set of charts. Our simple approach achieves state-of-the-art results across $5$ benchmarks spanning chart summarization, question answering, and fact-checking, and our elaborate qualitative studies on real-world charts show that ChartGemma generates more realistic and factually correct summaries compared to its contemporaries.
@@ -62,5 +62,14 @@ If you have any questions about this work, please contact **[Ahmed Masry](https:
 Please cite our paper if you use our model in your research. 
 
 ```
+@misc{masry2024chartgemmavisualinstructiontuningchart,
+      title={ChartGemma: Visual Instruction-tuning for Chart Reasoning in the Wild}, 
+      author={Ahmed Masry and Megh Thakkar and Aayush Bajaj and Aaryaman Kartha and Enamul Hoque and Shafiq Joty},
+      year={2024},
+      eprint={2407.04172},
+      archivePrefix={arXiv},
+      primaryClass={cs.AI},
+      url={https://arxiv.org/abs/2407.04172}, 
+}
 
 ```
