Model Card for led-financial_summarization-genai15

This model card describes an LED model fine-tuned for financial text summarization, aimed at summarizing long financial documents such as reports and statements. It is based on the LED (Longformer Encoder-Decoder) architecture, which is designed to handle long documents efficiently through a combination of local and global attention mechanisms.

Model Details

Model Description

fahil2631/led-financial_summarization-genai15, also known as LED-FINAL-GENAI15, is a fine-tuned version of the pszemraj/led-large-book-summary model, adapted for financial summarization. It was developed by GenAI Group 15 (Fakhri, Amaan, Aisyah, Aditya, Jerry, Mewmew, Ridhi, Chinmay) at Warwick Business School (2024/2025).

The model was trained on the kritsadaK/EDGAR-CORPUS-Financial-Summarization dataset, which contains long-form financial texts such as 10-K filings from EDGAR (1993–2020). The training summaries were generated primarily by ChatGPT (about 70% of the dataset), which keeps their style and format consistent.

The model summarizes long financial documents (inputs of up to 8,000 tokens) while preserving essential content and coherence.

  • Developed by: GenAI Group 15 2024/2025, Warwick Business School
  • Fine-tuned from: pszemraj/led-large-book-summary
  • Task: Abstractive summarization (financial domain)
  • Language(s): English

Model Sources

  • Base model: pszemraj/led-large-book-summary
  • Training dataset: kritsadaK/EDGAR-CORPUS-Financial-Summarization

Intended Uses

This model is designed for tasks that require summarizing long financial documents. Use cases include:

  • Summarizing quarterly and annual financial reports
  • Generating executive summaries for financial filings

Users (both direct and downstream) should be made aware of the model's risks, biases, and limitations.

How to Get Started with the Model

You can start using the led-financial_summarization-genai15 model to summarize long financial documents with either a simple pipeline or a custom global attention mask setup for more control.

🔹 Basic Usage (via pipeline)

import torch
from transformers import pipeline

hf_name = 'fahil2631/led-financial_summarization-genai15'

summarizer = pipeline(
    "summarization",
    model=hf_name,
    tokenizer=hf_name,
    device=0 if torch.cuda.is_available() else -1,
)

wall_of_text = """Your long financial text goes here."""

result = summarizer(
    wall_of_text,
    min_length=16,
    max_length=256,
    no_repeat_ngram_size=3,
    encoder_no_repeat_ngram_size=3,
    repetition_penalty=2.5,
    num_beams=4,
    early_stopping=True,
)

print(result[0]["summary_text"])

🔹 With Global Attention Mask

import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

hf_name = 'fahil2631/led-financial_summarization-genai15'
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the tokenizer and model, moving the model to the target device
tokenizer = AutoTokenizer.from_pretrained(hf_name)
model_1 = AutoModelForSeq2SeqLM.from_pretrained(hf_name).to(device)

wall_of_text = """Your long financial text goes here."""

# Input tokenization
inputs = tokenizer(
    wall_of_text,
    return_tensors="pt",
    truncation=True,
    max_length=8000
)

# Global attention mask (LED uses local attention everywhere else)
global_attention_mask = torch.zeros(inputs["input_ids"].shape, dtype=torch.long)

# Set the first and last tokens to receive global attention
global_attention_mask[:, 0] = 1
global_attention_mask[:, -1] = 1

# Generate the summary
summary_ids_1 = model_1.generate(
    inputs["input_ids"].to(device),
    attention_mask=inputs["attention_mask"].to(device),
    global_attention_mask=global_attention_mask.to(device),
    max_length=256,
    min_length=16,
    num_beams=4,
    repetition_penalty=2.5,
    no_repeat_ngram_size=3,
    early_stopping=True
)

# Decode the summary
result_globalmask = tokenizer.decode(summary_ids_1[0], skip_special_tokens=True)
print(result_globalmask)

Training Details

Training Data

The model was trained on a filtered subset of the kritsadaK/EDGAR-CORPUS-Financial-Summarization dataset, which contains financial reports (primarily 10-K filings) submitted by public companies to the U.S. SEC between 1993 and 2020.

Each document is paired with an abstractive summary generated by large language models (ChatGPT or Claude). To ensure consistency and style alignment, only the ChatGPT-generated summaries (approximately 70% of the dataset) were retained for training. The dataset was split into train/validation/test sets using group-based splitting based on hashed document IDs to prevent content leakage.

  • Total samples used: 6,664 (ChatGPT only)
    • Train: 5,331
    • Validation: 666
    • Test: 667
  • Input fields: input (original financial document), summary (target text), model (summary generator)
  • Filtering criteria: model == "ChatGPT"

This preprocessing ensured more consistent summary formats and improved training convergence.
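
The exact preprocessing code is not included in this card, but a minimal sketch of the ChatGPT-only filter and a hash-based group split over document IDs could look like the following. The model, input, and summary fields follow the dataset columns listed above; doc_id, the split_bucket helper, and the split ratios are illustrative assumptions, not the group's actual script.

import hashlib

def split_bucket(doc_id: str, train: float = 0.8, val: float = 0.1) -> str:
    """Deterministically assign a document ID to a split via its hash."""
    # Hash the ID and map it to a fraction in [0, 1).
    digest = int(hashlib.md5(doc_id.encode("utf-8")).hexdigest(), 16)
    frac = (digest % 10_000) / 10_000
    if frac < train:
        return "train"
    if frac < train + val:
        return "validation"
    return "test"

# Keep only ChatGPT-generated summaries, then split by document ID so that
# every chunk of a given filing lands in the same split (no content leakage).
records = [
    {"doc_id": "doc-0001", "model": "ChatGPT", "input": "...", "summary": "..."},
    {"doc_id": "doc-0002", "model": "Claude",  "input": "...", "summary": "..."},
]
chatgpt_only = [r for r in records if r["model"] == "ChatGPT"]
splits = {"train": [], "validation": [], "test": []}
for r in chatgpt_only:
    splits[split_bucket(r["doc_id"])].append(r)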

Training Procedure

  • Fine-tuning dataset: EDGAR-CORPUS-Financial-Summarization
  • Training batch size: 1 (with gradient accumulation)
  • Training epochs: 3
  • Optimizer: AdamW with 8-bit precision
  • Learning rate: 3e-5
  • Evaluation: every 500 steps
  • Checkpoints saved: every 1000 steps
  • GPU: NVIDIA L4

Training Hyperparameters

  • Training regime: FP16 mixed precision
  • Batch size: 1 (with gradient accumulation steps = 2, effective batch size = 2)
  • Learning rate: 3e-5
  • Epochs: 3
  • Optimizer: AdamW (8-bit via bitsandbytes)
  • Evaluation steps: every 500 steps
  • Checkpointing: every 1000 steps
  • Max input length: 8000 tokens
  • Max target length: 256 tokens
  • Beam search: 4 beams
  • Repetition penalty: 2.5
  • No-repeat n-gram size: 3
  • Global attention mask: enabled on the first token
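
The training script itself is not reproduced in this card. As a rough sketch, the hyperparameters listed above map onto a Hugging Face Seq2SeqTrainingArguments configuration roughly as follows (argument names follow recent transformers releases; the output directory is a placeholder, and fp16 assumes a CUDA GPU such as the NVIDIA L4 used for training):

from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="led-financial_summarization-genai15",  # placeholder output path
    per_device_train_batch_size=1,
    gradient_accumulation_steps=2,      # effective batch size = 2
    learning_rate=3e-5,
    num_train_epochs=3,
    fp16=True,                          # FP16 mixed precision (requires a CUDA GPU)
    optim="adamw_bnb_8bit",             # 8-bit AdamW via bitsandbytes
    eval_strategy="steps",
    eval_steps=500,
    save_steps=1000,
    predict_with_generate=True,
    generation_max_length=256,
    generation_num_beams=4,
)

These arguments would then be passed to a Seq2SeqTrainer together with the tokenized train and validation splits described under Training Data.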

Speeds, Sizes, Times

  • GPU used: NVIDIA L4
  • Training runtime: ~2.5 hours per 1000 steps (7995 steps total)
  • Training throughput: ~1.68 samples/sec
  • Model size: ~460M parameters (F32 tensors)
  • Checkpoint size: ~1.84 GB (.safetensors)
  • Saved model size: ~1.84 GB

Evaluation

Metrics

The model was evaluated using standard ROUGE metrics:

  • ROUGE-1: Measures overlap of unigrams (individual words) between the system and reference summaries.
  • ROUGE-2: Measures overlap of bigrams (two consecutive words).
  • ROUGE-L: Measures the longest common subsequence between the system and reference summaries.
  • ROUGE-Lsum: A variation of ROUGE-L for multi-sentence summaries.
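
As an illustration, these metrics can be computed with the evaluate library's ROUGE wrapper (which requires the rouge_score package); the prediction and reference strings below are placeholders rather than actual test-set outputs.

import evaluate

rouge = evaluate.load("rouge")  # backed by the rouge_score package

predictions = ["The company reported higher revenue, driven by its cloud segment."]
references = ["Revenue increased year over year, led by growth in cloud services."]

scores = rouge.compute(
    predictions=predictions,
    references=references,
    use_stemmer=True,
)
# scores contains rouge1, rouge2, rougeL, and rougeLsum F-measures.
print({k: round(v, 4) for k, v in scores.items()})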

Evaluation Results

The following results were obtained on a set of 20 randomly selected samples from the test set:

Model                                  ROUGE-1    ROUGE-2    ROUGE-L    ROUGE-Lsum
led-financial_summarization-genai15    0.5121     0.2089     0.2987     0.4359
BART-financial-summarization           0.4574     0.1976     0.2728     0.3876
LED-large-book-summary                 0.3066     0.0470     0.1391     0.2128

Summary

led-financial_summarization-genai15 outperformed both the BART-based and base LED models across all ROUGE metrics. This demonstrates its effectiveness in capturing financial context from long documents and generating coherent and information-rich summaries.
