Longformer-base-4096 fine-tuned on SQuAD v2

Longformer-base-4096 model fine-tuned on SQuAD v2 for the question answering (Q&A) downstream task.

Longformer-base-4096

Longformer is a transformer model for long documents.

longformer-base-4096 is a BERT-like model initialized from the RoBERTa checkpoint and pretrained with masked language modeling (MLM) on long documents. It supports sequences of up to 4,096 tokens.

Longformer combines sliding-window (local) attention with global attention. Global attention is configured by the user according to the task, which lets the model learn task-specific representations.
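To make the attention pattern concrete, here is a minimal pure-Python sketch of which token pairs may attend to each other. It is an illustration only, not the library's implementation: the function name, the tiny sequence length, and the window size are assumptions chosen for readability.

```python
def longformer_attention_pattern(seq_len, window, global_positions):
    """Build a seq_len x seq_len boolean matrix.

    Entry [i][j] is True when token i may attend to token j.
    `window` is the total local window size, so each token sees
    window // 2 neighbours on each side (hypothetical toy sizes here).
    """
    half = window // 2
    global_set = set(global_positions)
    pattern = []
    for i in range(seq_len):
        row = []
        for j in range(seq_len):
            # Local attention: tokens within the sliding window.
            local = abs(i - j) <= half
            # Global attention is symmetric: a global token attends to
            # every token, and every token attends to it.
            is_global = i in global_set or j in global_set
            row.append(local or is_global)
        pattern.append(row)
    return pattern


# Toy example: 8 tokens, window of 2, global attention on position 0.
pattern = longformer_attention_pattern(seq_len=8, window=2, global_positions=[0])
for row in pattern:
    print("".join("x" if allowed else "." for allowed in row))
```

With only local attention the cost grows linearly with sequence length for a fixed window, and the few global positions add full rows and columns on top of that band.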

Details of the downstream task (Q&A) - Dataset 📚 🧐 ❓

Dataset ID: squad_v2 from HuggingFace/Datasets

| Dataset | Split | # samples |
| --- | --- | --- |
| squad_v2 | train | 130319 |
| squad_v2 | valid | 11873 |

How to load it from datasets

!pip install datasets
from datasets import load_dataset
dataset = load_dataset('squad_v2')
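SQuAD v2 differs from v1.1 in that some questions have no answer in the context. The sketch below shows how that appears in a record, using two hand-written examples that follow the squad_v2 field layout (`id`, `title`, `context`, `question`, `answers`). The records and helper names are illustrative assumptions, not rows from the dataset.

```python
# Hand-written records in the squad_v2 schema (illustrative, not real rows).
examples = [
    {
        "id": "example-1",
        "title": "Hugging Face",
        "context": "Huggingface has democratized NLP.",
        "question": "What has Huggingface done?",
        "answers": {"text": ["democratized NLP"], "answer_start": [16]},
    },
    {
        "id": "example-2",
        "title": "Hugging Face",
        "context": "Huggingface has democratized NLP.",
        "question": "When was Huggingface founded?",
        # Unanswerable questions carry empty answer lists.
        "answers": {"text": [], "answer_start": []},
    },
]


def is_answerable(example):
    """A question is answerable when at least one answer text is present."""
    return len(example["answers"]["text"]) > 0


def answer_spans_match_context(example):
    """Check that each answer_start offset points at its answer text."""
    context = example["context"]
    answers = example["answers"]
    for text, start in zip(answers["text"], answers["answer_start"]):
        if context[start:start + len(text)] != text:
            return False
    return True


for ex in examples:
    print(ex["id"], is_answerable(ex), answer_spans_match_context(ex))
```

The same helpers should apply to records from `dataset["train"]` or `dataset["validation"]` after loading, since they only read the fields listed above.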

Learn more about this dataset and others in the Datasets Viewer.

Model fine-tuning 🏋️‍

The training script is a slightly modified version of this one

Model in Action 🚀

import torch
from transformers import AutoTokenizer, AutoModelForQuestionAnswering
ckpt = "mrm8488/longformer-base-4096-finetuned-squadv2"
tokenizer = AutoTokenizer.from_pretrained(ckpt)
model = AutoModelForQuestionAnswering.from_pretrained(ckpt)

text = "Huggingface has democratized NLP. Huge thanks to Huggingface for this."
question = "What has Huggingface done ?"
encoding = tokenizer(question, text, return_tensors="pt")
input_ids = encoding["input_ids"]

# default is local attention everywhere
# the forward method will automatically set global attention on question tokens
attention_mask = encoding["attention_mask"]

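# recent transformers versions return an output object, so read the logits by name
outputs = model(input_ids, attention_mask=attention_mask)
start_scores, end_scores = outputs.start_logits, outputs.end_logits
all_tokens = tokenizer.convert_ids_to_tokens(input_ids[0].tolist())

# take the most likely start and end positions and decode the tokens in between
answer_tokens = all_tokens[torch.argmax(start_scores) : torch.argmax(end_scores) + 1]
answer = tokenizer.decode(tokenizer.convert_tokens_to_ids(answer_tokens))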

# output => democratized NLP
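Taking the argmax of the start and end scores independently can occasionally produce an end position that comes before the start. One common remedy, sketched here in plain Python, is to search for the highest-scoring pair with start <= end. The function name, the `max_answer_len` parameter, and the toy scores are assumptions for illustration. This is not part of the original script.

```python
def best_span(start_scores, end_scores, max_answer_len=30):
    """Return (start, end, score) for the best span with start <= end.

    start_scores and end_scores are plain lists of floats, one per token.
    max_answer_len caps the span length in tokens (hypothetical default).
    """
    best_start, best_end = 0, 0
    best_score = float("-inf")
    for start, start_score in enumerate(start_scores):
        # Only consider ends at or after the start, within the length cap.
        last = min(start + max_answer_len, len(end_scores))
        for end in range(start, last):
            score = start_score + end_scores[end]
            if score > best_score:
                best_score = score
                best_start, best_end = start, end
    return best_start, best_end, best_score


# Toy scores where independent argmax would give start=2, end=0 (invalid).
toy_start = [0.1, 0.2, 5.0, 0.3]
toy_end = [4.0, 0.1, 0.2, 3.0]
start, end, score = best_span(toy_start, toy_end)
print(start, end, score)
```

With model outputs, the lists could be obtained with something like `start_scores[0].tolist()` and `end_scores[0].tolist()`.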

Usage with HF pipeline

from transformers import AutoTokenizer, AutoModelForQuestionAnswering, pipeline

ckpt = "mrm8488/longformer-base-4096-finetuned-squadv2"
tokenizer = AutoTokenizer.from_pretrained(ckpt)
model = AutoModelForQuestionAnswering.from_pretrained(ckpt)

qa = pipeline("question-answering", model=model, tokenizer=tokenizer)

text = "Huggingface has democratized NLP. Huge thanks to Huggingface for this."
question = "What has Huggingface done?"

qa({"question": question, "context": text})

If we ask about something that is not in the given context, the output for the no-answer case will be <s>
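Downstream code usually wants an explicit "no answer" value rather than the literal `<s>` string. A minimal sketch of such a wrapper follows. It assumes a prediction dict with `score`, `start`, `end`, and `answer` keys, as returned by the question-answering pipeline. The helper name and the sample values are made up for illustration.

```python
NO_ANSWER_TOKEN = "<s>"  # the no-answer output described above


def extract_answer(prediction):
    """Return the answer text, or None when the model predicts no answer."""
    answer = prediction.get("answer", "").strip()
    if answer in ("", NO_ANSWER_TOKEN):
        return None
    return answer


# Made-up predictions in the pipeline's output shape.
answered = {"score": 0.9, "start": 16, "end": 32, "answer": "democratized NLP"}
unanswered = {"score": 0.5, "start": 0, "end": 0, "answer": "<s>"}

print(extract_answer(answered))
print(extract_answer(unanswered))
```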

Created by Manuel Romero/@mrm8488 | LinkedIn

Made with ♥ in Spain
