---
library_name: transformers
tags:
- Summarization
- Longformer
- LED
- Fine-Tuned
- Abstractive
- Scientific
- seq2seq
- transformers
- english
- attention
- text-processing
- NLP
- beam-search
language:
- en
metrics:
- rouge
- precision
pipeline_tag: summarization
---

# Model Card for LED_Finetuned

This model is a fine-tuned version of the Longformer Encoder-Decoder (LED) "allenai/led-base-16384", tailored for summarizing scientific articles.

LED, originally designed for long-document tasks, uses a sparse attention mechanism to handle much longer contexts than standard transformer models.

This fine-tuned version extends those capabilities to summarize long texts efficiently, with high fidelity and relevance.

This model accepts inputs of up to 16,384 tokens in total, which is longer than most summarization models currently available.
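
Before generating, it can help to check how many tokens your document actually uses against that 16,384-token window. Below is a minimal sketch using the base tokenizer; the `document` string is a placeholder:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("allenai/led-base-16384")

document = "Your long text goes here..."  # placeholder input
# Truncate to the 16,384-token window and report the resulting length
encoding = tokenizer(document, truncation=True, max_length=16384, return_tensors="pt")
print(f"Input length: {encoding.input_ids.shape[1]} tokens")
```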

**Note:** The hosted inference API widget will not work for this model, because it relies on the tokenizer from "allenai/led-base-16384". Use the starter code in the "How to Get Started" section below instead.

## Uses

This model is intended for scenarios where understanding and condensing long texts is necessary. It is particularly useful for:

- Academic researchers needing summaries of lengthy papers.
- Professionals who require digests of extensive reports.
- Content creators looking for concise versions of long articles.

Please note: this model can generate an abstractive summary for any text, but to get the best results in a particular domain you should fine-tune it on a dataset from that domain, as sketched below.
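
If you do need domain adaptation, a minimal fine-tuning sketch using the `Seq2SeqTrainer` API might look like the following. This is an illustration, not the recipe used to train this model; the dataset column names ("document", "summary") and all hyperparameters are assumptions you should adapt:

```python
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

model = AutoModelForSeq2SeqLM.from_pretrained("yashrane2904/LED_Finetuned")
tokenizer = AutoTokenizer.from_pretrained("allenai/led-base-16384")

def preprocess(batch):
    # "document" and "summary" are hypothetical column names for your dataset
    inputs = tokenizer(batch["document"], truncation=True, max_length=16384)
    labels = tokenizer(text_target=batch["summary"], truncation=True, max_length=600)
    inputs["labels"] = labels["input_ids"]
    return inputs

args = Seq2SeqTrainingArguments(
    output_dir="led-domain-finetuned",
    per_device_train_batch_size=1,  # LED inputs are long; keep batches small
    gradient_accumulation_steps=8,
    learning_rate=3e-5,
    num_train_epochs=3,
)

# train_dataset is assumed to be a datasets.Dataset mapped through preprocess(), e.g.:
# train_dataset = raw_dataset.map(preprocess, batched=True)
# trainer = Seq2SeqTrainer(model=model, args=args, train_dataset=train_dataset)
# trainer.train()
```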

## Limitations

The main limitation you might face is that, to get the best results for a specialized domain, you will need to fine-tune the model on your own in-domain data (see the fine-tuning sketch in the Uses section).

## How to Get Started with the Model

Use the code below to get started with the model.

```python
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the fine-tuned model
model = AutoModelForSeq2SeqLM.from_pretrained("yashrane2904/LED_Finetuned").to(device)
# Since this is a fine-tuned version of led-base-16384, we use that model's tokenizer
tokenizer = AutoTokenizer.from_pretrained("allenai/led-base-16384")

LONG_ARTICLE = "Your long text goes here..."

# Tokenize the input article
input_ids = tokenizer(LONG_ARTICLE, return_tensors="pt").input_ids.to(device)

# LED needs a global attention mask; place global attention on the first token
global_attention_mask = torch.zeros_like(input_ids)
global_attention_mask[:, 0] = 1

# Generate a summary with beam search
sequences_tensor = model.generate(
    input_ids,
    global_attention_mask=global_attention_mask,
    num_beams=10,
    repetition_penalty=6.0,
    max_length=600,
    min_length=350,
)

# Decode the token IDs into text and print the summary
summary = tokenizer.batch_decode(sequences_tensor, skip_special_tokens=True)
print(summary)
```

Feel free to experiment with the hyperparameters of `generate` (for example `num_beams`, `repetition_penalty`, `max_length`, and `min_length`), or with other generation parameters, to see what works best for your texts.
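
For instance, reusing `model`, `tokenizer`, `input_ids`, and `global_attention_mask` from the example above, a lighter-weight configuration for quicker, shorter drafts might look like this (the values are illustrative, not tuned settings):

```python
# A faster decoding configuration for shorter draft summaries (illustrative values)
quick_ids = model.generate(
    input_ids,
    global_attention_mask=global_attention_mask,
    num_beams=4,
    repetition_penalty=2.0,
    max_length=256,
    min_length=100,
)
print(tokenizer.batch_decode(quick_ids, skip_special_tokens=True))
```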
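Since ROUGE is listed among this card's metrics, you can score generated summaries against reference summaries with the `evaluate` library. A minimal sketch, assuming the `evaluate` and `rouge_score` packages are installed; the prediction and reference strings are placeholders:

```python
import evaluate

rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=["the generated summary ..."],  # placeholder
    references=["the reference summary ..."],   # placeholder
)
print(scores)  # rouge1 / rouge2 / rougeL / rougeLsum F-scores
```
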
## Model Card Authors & Citation

```bibtex
@misc{yash_rane_2024,
  author    = { {Yash Rane} },
  title     = { LED_Finetuned (Revision f480282) },
  year      = 2024,
  url       = { https://huggingface.co/yashrane2904/LED_Finetuned },
  doi       = { 10.57967/hf/2074 },
  publisher = { Hugging Face }
}
```