---
title: thread-gpt
app_file: app.py
sdk: gradio
sdk_version: 4.4.1
---
# ThreadGPT

Struggling to keep up with the latest AI research papers? ThreadGPT is here to help. It seamlessly transforms complex academic papers into concise, easy-to-understand threads. Not only does it summarize the text, but it also embeds relevant figures, tables, and visuals from the papers directly in the threads. 🧵✨

*Gradio App UI*

*Examples of threads generated by ThreadGPT (@paper_threadoor)*
## 🛠️ Installation

### Clone the repo

```bash
git clone https://github.com/wiskojo/thread-gpt
```

### Install dependencies

```bash
# Install PyTorch, torchvision, and torchaudio
# Please refer to the official PyTorch website (https://pytorch.org) for the
# installation command that matches your system. Example:
pip install torch==2.0.0 torchvision==0.15.1

# Install all other dependencies
pip install -r requirements.txt
```

### Configure environment variables

Copy the `.env.template` file and fill in your `OPENAI_API_KEY`.

```bash
cp .env.template .env
```
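After copying the template, the `.env` file should contain at least your OpenAI key (the value below is a placeholder):

```
OPENAI_API_KEY=sk-...
```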
## 🚀 Getting Started
Before proceeding, please ensure that all the installation steps have been successfully completed.
### 🚨 Cost Warning

Please be aware that using GPT-4 with the Assistants API can incur significant costs. Monitor your usage and review OpenAI's pricing details before proceeding.
### Gradio

```bash
python app.py
```
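For orientation, a minimal Gradio wrapper around a thread-creation function might look like the sketch below. This is illustrative only: the actual UI lives in `app.py`, and `create_thread` here is a hypothetical placeholder name, not the repo's real function.

```python
import gradio as gr


def create_thread(pdf_source: str) -> str:
    # Hypothetical placeholder: the real app calls the thread-generation
    # pipeline (see thread.py) and returns the rendered thread.
    return f"Thread for {pdf_source} would appear here."


demo = gr.Interface(
    fn=create_thread,
    inputs=gr.Textbox(label="PDF URL or local path"),
    outputs=gr.Markdown(),
    title="ThreadGPT",
)

if __name__ == "__main__":
    demo.launch()
```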
### CLI
### 🧵 Create Thread

To create a thread, provide either a URL or a local path to a PDF file:

```bash
# For a URL
python thread.py <URL_TO_PDF>

# For a local file
python thread.py <LOCAL_PATH_TO_PDF>
```
By default, all outputs are written to `./data/<PDF_NAME>/`, with the following structure:

```
./data/<PDF_NAME>/
├── figures/
│   ├── <figure_1_name>.jpg
│   ├── <figure_2_name>.png
│   └── ...
├── <PDF_NAME>.pdf
├── results.json
├── thread.json
├── processed_thread.json
└── processed_thread.md
```

The final output for user consumption is located at `./data/<PDF_NAME>/processed_thread.md`. This file is formatted in Markdown and can be conveniently viewed using any Markdown editor.
#### All Contents

- `figures/`: All figures, tables, and visuals extracted from the paper.
- `<PDF_NAME>.pdf`: The original PDF file.
- `results.json`: The results of the layout parsing, including an index of all figures, their paths, and the captions that were passed to OpenAI.
- `thread.json`: The raw thread generated by OpenAI, before any post-processing.
- `processed_thread.json`: A post-processed version of `thread.json`. Post-processing includes steps such as removing source annotations and duplicate figures (see the loading sketch after this list).
- `processed_thread.md`: A Markdown version of `processed_thread.json`. This is the final output provided for user consumption.
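As a quick way to inspect the generated thread programmatically, you can load `processed_thread.json` with a few lines of Python. The field names below (`tweets`, `content`, `media`) are illustrative placeholders, not guaranteed keys; check your own output file for the actual schema.

```python
import json

# Path follows the output structure shown above; replace <PDF_NAME> accordingly.
with open("./data/<PDF_NAME>/processed_thread.json") as f:
    thread = json.load(f)

# NOTE: "tweets", "content", and "media" are assumed field names for illustration.
for i, post in enumerate(thread.get("tweets", []), start=1):
    print(f"--- Post {i} ---")
    print(post.get("content", ""))
    for figure_path in post.get("media", []):
        print(f"[figure] {figure_path}")
```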
### 📨 Share Thread

To share the thread on X/Twitter, you need to set up credentials in the `.env` file. This requires creating a developer account and filling in your `CONSUMER_KEY`, `CONSUMER_SECRET`, `ACCESS_KEY`, and `ACCESS_SECRET`. Then run this command on the created JSON file:

```bash
python tweet.py ./data/<PDF_NAME>/processed_thread.json
```
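The X/Twitter credentials referenced above live in the same `.env` file as your OpenAI key; the values below are placeholders:

```
CONSUMER_KEY=...
CONSUMER_SECRET=...
ACCESS_KEY=...
ACCESS_SECRET=...
```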
## 🔧 Customize Assistant

ThreadGPT uses OpenAI's Assistants API. To customize the assistant's behavior, modify `create_assistant.py`, which defines defaults for the prompt, name, tools, and model (`gpt-4-1106-preview`). You can adjust these parameters to your liking.
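For orientation, creating an assistant with the OpenAI Python SDK generally looks like the sketch below. The name, instructions, and tools shown are placeholders rather than ThreadGPT's actual defaults (those are defined in `create_assistant.py`), and the available tool types depend on your SDK and Assistants API version.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Placeholder name, instructions, tools, and model -- ThreadGPT's real
# defaults live in create_assistant.py.
assistant = client.beta.assistants.create(
    name="Paper Threadoor",
    instructions="Summarize research papers into concise, engaging threads.",
    tools=[{"type": "retrieval"}],  # tool types vary by Assistants API version
    model="gpt-4-1106-preview",
)
print(assistant.id)
```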