---
library_name: transformers
tags:
- Summarization
- Longformer
- LED
- Fine-Tuned
- Abstractive
- Scientific
- seq2seq
- transformers
- english
- attention
- text-processing
- NLP
- beam-search
language:
- en
metrics:
- rouge
- precision
pipeline_tag: summarization
---

# Model Card for LED_Finetuned

This model is a fine-tuned version of the Longformer Encoder-Decoder (LED) "allenai/led-base-16384", tailored for summarizing scientific articles.

LED, originally designed for long-document tasks, uses a sparse attention mechanism to handle much longer contexts than standard transformer models.

This fine-tuned version extends those capabilities to summarize long texts efficiently, with high fidelity and relevance.

This model accepts inputs of up to 16,384 tokens in total, which is longer than most summarization models currently available.
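
Before generating, it can help to check how many tokens your document actually uses against that 16,384-token window. Below is a minimal sketch using the base tokenizer; the `document` string is a placeholder:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("allenai/led-base-16384")

document = "Your long text goes here..."  # placeholder input
# Truncate to the 16,384-token window and report the resulting length
encoding = tokenizer(document, truncation=True, max_length=16384, return_tensors="pt")
print(f"Input length: {encoding.input_ids.shape[1]} tokens")
```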

**Note:** The hosted inference API widget will not work for this model, because it relies on the tokenizer from "allenai/led-base-16384". Use the starter code in the "How to Get Started" section below instead.

## Uses

This model is intended for scenarios where understanding and condensing long texts is necessary. It is particularly useful for:

- Academic researchers needing summaries of lengthy papers.
- Professionals who require digests of extensive reports.
- Content creators looking for concise versions of long articles.

Please note: this model can generate an abstractive summary for any text, but to get the best results in a particular domain you should fine-tune it on a dataset from that domain, as sketched below.
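
If you do need domain adaptation, a minimal fine-tuning sketch using the `Seq2SeqTrainer` API might look like the following. This is an illustration, not the recipe used to train this model; the dataset column names ("document", "summary") and all hyperparameters are assumptions you should adapt:

```python
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

model = AutoModelForSeq2SeqLM.from_pretrained("yashrane2904/LED_Finetuned")
tokenizer = AutoTokenizer.from_pretrained("allenai/led-base-16384")

def preprocess(batch):
    # "document" and "summary" are hypothetical column names for your dataset
    inputs = tokenizer(batch["document"], truncation=True, max_length=16384)
    labels = tokenizer(text_target=batch["summary"], truncation=True, max_length=600)
    inputs["labels"] = labels["input_ids"]
    return inputs

args = Seq2SeqTrainingArguments(
    output_dir="led-domain-finetuned",
    per_device_train_batch_size=1,  # LED inputs are long; keep batches small
    gradient_accumulation_steps=8,
    learning_rate=3e-5,
    num_train_epochs=3,
)

# train_dataset is assumed to be a datasets.Dataset mapped through preprocess(), e.g.:
# train_dataset = raw_dataset.map(preprocess, batched=True)
# trainer = Seq2SeqTrainer(model=model, args=args, train_dataset=train_dataset)
# trainer.train()
```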

## Limitations

The main limitation you might face is that, to get the best results for a specialized domain, you will need to fine-tune the model on your own in-domain data (see the fine-tuning sketch in the Uses section).

## How to Get Started with the Model

Use the code below to get started with the model.

```python
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the fine-tuned model
model = AutoModelForSeq2SeqLM.from_pretrained("yashrane2904/LED_Finetuned").to(device)
# Since this is a fine-tuned version of led-base-16384, we use that model's tokenizer
tokenizer = AutoTokenizer.from_pretrained("allenai/led-base-16384")

LONG_ARTICLE = "Your long text goes here..."

# Tokenize the input article
input_ids = tokenizer(LONG_ARTICLE, return_tensors="pt").input_ids.to(device)

# LED needs a global attention mask; place global attention on the first token
global_attention_mask = torch.zeros_like(input_ids)
global_attention_mask[:, 0] = 1

# Generate a summary with beam search
sequences_tensor = model.generate(
    input_ids,
    global_attention_mask=global_attention_mask,
    num_beams=10,
    repetition_penalty=6.0,
    max_length=600,
    min_length=350,
)

# Decode the token IDs into text and print the summary
summary = tokenizer.batch_decode(sequences_tensor, skip_special_tokens=True)
print(summary)
```

Feel free to experiment with the hyperparameters of `generate` (for example `num_beams`, `repetition_penalty`, `max_length`, and `min_length`), or with other generation parameters, to see what works best for your texts.
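
For instance, reusing `model`, `tokenizer`, `input_ids`, and `global_attention_mask` from the example above, a lighter-weight configuration for quicker, shorter drafts might look like this (the values are illustrative, not tuned settings):

```python
# A faster decoding configuration for shorter draft summaries (illustrative values)
quick_ids = model.generate(
    input_ids,
    global_attention_mask=global_attention_mask,
    num_beams=4,
    repetition_penalty=2.0,
    max_length=256,
    min_length=100,
)
print(tokenizer.batch_decode(quick_ids, skip_special_tokens=True))
```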
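Since ROUGE is listed among this card's metrics, you can score generated summaries against reference summaries with the `evaluate` library. A minimal sketch, assuming the `evaluate` and `rouge_score` packages are installed; the prediction and reference strings are placeholders:

```python
import evaluate

rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=["the generated summary ..."],  # placeholder
    references=["the reference summary ..."],   # placeholder
)
print(scores)  # rouge1 / rouge2 / rougeL / rougeLsum F-scores
```
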
## Model Card Authors & Citation

```bibtex
@misc{yash_rane_2024,
  author    = { {Yash Rane} },
  title     = { LED_Finetuned (Revision f480282) },
  year      = 2024,
  url       = { https://huggingface.co/yashrane2904/LED_Finetuned },
  doi       = { 10.57967/hf/2074 },
  publisher = { Hugging Face }
}
```