im2latex_model
This model is a VisionEncoderDecoderModel trained on a dataset for generating LaTeX formulas from images. This is part of a project that reproduces the following paper: https://arxiv.org/html/2408.04015v1. NOTE: In the paper, the model is finetuned on handwritten data after training. This is the model before finetuning.
Model Details
- Encoder: Swin Transformer
- Decoder: GPT-2
- Framework: PyTorch
Training Data
The data is taken from OleehyO/latex-formulas. The data was divided into 80:10:10 for train, val and test. The splits were made as follows:
dataset = load_dataset(OleehyO/latex-formulas, cleaned_formulas)
train_val_split = dataset["train"].train_test_split(test_size=0.2, seed=42)
train_ds = train_val_split["train"]
val_test_split = train_val_split["test"].train_test_split(test_size=0.5, seed=42)
val_ds = val_test_split["train"]
test_ds = val_test_split["test"]
Evaluation Metrics
The model was evaluated on a test set with the following results:
- Test Loss: TBA
- Test BLEU Score: ~0.7
Usage
You can use the model directly with the transformers
library:
from transformers import VisionEncoderDecoderModel, AutoTokenizer, AutoFeatureExtractor
import torch
from PIL import Image
# Load model, tokenizer, and feature extractor
model = VisionEncoderDecoderModel.from_pretrained("your-username/your-model-name")
tokenizer = AutoTokenizer.from_pretrained("your-username/your-model-name")
feature_extractor = AutoFeatureExtractor.from_pretrained("your-username/your-model-name")
# Prepare an image
image = Image.open("path/to/your/image.png")
pixel_values = feature_extractor(images=image, return_tensors="pt").pixel_values
# Generate LaTeX formula
generated_ids = model.generate(pixel_values)
generated_texts = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)
print("Generated LaTeX formula:", generated_texts[0])
Training Script
The training script for this model can be found in the following repository: GitHub
license: agpl-3.0
- Downloads last month
- 112
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social
visibility and check back later, or deploy to Inference Endpoints (dedicated)
instead.