Model Card for led-financial_summarization-genai15

This model card describes an LED model fine-tuned for financial text summarization, aimed at summarizing long financial documents such as reports and statements. It is based on the LED (Longformer Encoder-Decoder) architecture, which is designed to handle long documents efficiently through a combination of local and global attention mechanisms.

Model Details

Model Description

fahil2631/led-financial_summarization-genai15, also known as LED-FINAL-GENAI15, is a fine-tuned version of the pszemraj/led-large-book-summary model, adapted for financial summarization. It was developed by GenAI Group 15 (Fakhri, Amaan, Aisyah, Aditya, Jerry, Mewmew, Ridhi, Chinmay) at Warwick Business School (2024/2025).

The model was trained on the kritsadaK/EDGAR-CORPUS-Financial-Summarization dataset, which contains long-form financial texts such as 10-K filings from EDGAR (1993–2020). The training summaries were generated primarily by ChatGPT (about 70% of the dataset), which keeps their style and format consistent.

The model summarizes long financial documents (inputs of up to 8,000 tokens) while preserving essential content and coherence.

  • Developed by: GenAI Group 15 2024/2025, Warwick Business School
  • Fine-tuned from: pszemraj/led-large-book-summary
  • Task: Abstractive summarization (financial domain)
  • Language(s): English

Model Sources

  • Base model: pszemraj/led-large-book-summary
  • Training dataset: kritsadaK/EDGAR-CORPUS-Financial-Summarization

Intended Uses

This model is designed for tasks that require summarizing long financial documents. Use cases include:

  • Summarizing quarterly and annual financial reports
  • Generating executive summaries for financial filings

Users (both direct and downstream) should be made aware of the model's risks, biases, and limitations.

How to Get Started with the Model

You can start using the led-financial_summarization-genai15 model to summarize long financial documents with either a simple pipeline or a custom global attention mask setup for more control.

🔹 Basic Usage (via pipeline)

import torch
from transformers import pipeline

hf_name = 'fahil2631/led-financial_summarization-genai15'

summarizer = pipeline(
    "summarization",
    model=hf_name,
    tokenizer=hf_name,
    device=0 if torch.cuda.is_available() else -1,
)

wall_of_text = """Your long financial text goes here."""

result = summarizer(
    wall_of_text,
    min_length=16,
    max_length=256,
    no_repeat_ngram_size=3,
    encoder_no_repeat_ngram_size=3,
    repetition_penalty=2.5,
    num_beams=4,
    early_stopping=True,
)

print(result[0]["summary_text"])

🔹 With Global Attention Mask

import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

hf_name = 'fahil2631/led-financial_summarization-genai15'
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the tokenizer and model, moving the model to the target device
tokenizer = AutoTokenizer.from_pretrained(hf_name)
model_1 = AutoModelForSeq2SeqLM.from_pretrained(hf_name).to(device)

wall_of_text = """Your long financial text goes here."""

# Input tokenization
inputs = tokenizer(
    wall_of_text,
    return_tensors="pt",
    truncation=True,
    max_length=8000
)

# Global attention mask (LED uses local attention everywhere else)
global_attention_mask = torch.zeros(inputs["input_ids"].shape, dtype=torch.long)

# Set the first and last tokens to receive global attention
global_attention_mask[:, 0] = 1
global_attention_mask[:, -1] = 1

# Generate the summary
summary_ids_1 = model_1.generate(
    inputs["input_ids"].to(device),
    attention_mask=inputs["attention_mask"].to(device),
    global_attention_mask=global_attention_mask.to(device),
    max_length=256,
    min_length=16,
    num_beams=4,
    repetition_penalty=2.5,
    no_repeat_ngram_size=3,
    early_stopping=True
)

# Decode the summary
result_globalmask = tokenizer.decode(summary_ids_1[0], skip_special_tokens=True)
print(result_globalmask)

Training Details

Training Data

The model was trained on a filtered subset of the kritsadaK/EDGAR-CORPUS-Financial-Summarization dataset, which contains financial reports (primarily 10-K filings) submitted by public companies to the U.S. SEC between 1993 and 2020.

Each document is paired with an abstractive summary generated by large language models (ChatGPT or Claude). To ensure consistency and style alignment, only the ChatGPT-generated summaries (approximately 70% of the dataset) were retained for training. The dataset was split into train/validation/test sets using group-based splitting based on hashed document IDs to prevent content leakage.

  • Total samples used: 6,664 (ChatGPT only)
    • Train: 5,331
    • Validation: 666
    • Test: 667
  • Input fields: input (original financial document), summary (target text), model (summary generator)
  • Filtering criteria: model == "ChatGPT"

This preprocessing ensured more consistent summary formats and improved training convergence.
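
The exact preprocessing code is not included in this card, but a minimal sketch of the ChatGPT-only filter and a hash-based group split over document IDs could look like the following. The model, input, and summary fields follow the dataset columns listed above; doc_id, the split_bucket helper, and the split ratios are illustrative assumptions, not the group's actual script.

import hashlib

def split_bucket(doc_id: str, train: float = 0.8, val: float = 0.1) -> str:
    """Deterministically assign a document ID to a split via its hash."""
    # Hash the ID and map it to a fraction in [0, 1).
    digest = int(hashlib.md5(doc_id.encode("utf-8")).hexdigest(), 16)
    frac = (digest % 10_000) / 10_000
    if frac < train:
        return "train"
    if frac < train + val:
        return "validation"
    return "test"

# Keep only ChatGPT-generated summaries, then split by document ID so that
# every chunk of a given filing lands in the same split (no content leakage).
records = [
    {"doc_id": "doc-0001", "model": "ChatGPT", "input": "...", "summary": "..."},
    {"doc_id": "doc-0002", "model": "Claude",  "input": "...", "summary": "..."},
]
chatgpt_only = [r for r in records if r["model"] == "ChatGPT"]
splits = {"train": [], "validation": [], "test": []}
for r in chatgpt_only:
    splits[split_bucket(r["doc_id"])].append(r)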

Training Procedure

  • Fine-tuning dataset: EDGAR-CORPUS-Financial-Summarization
  • Training batch size: 1 (with gradient accumulation)
  • Training epochs: 3
  • Optimizer: AdamW with 8-bit precision
  • Learning rate: 3e-5
  • Evaluation: every 500 steps
  • Checkpoints saved: every 1000 steps
  • GPU: NVIDIA L4

Training Hyperparameters

  • Training regime: FP16 mixed precision
  • Batch size: 1 (with gradient accumulation steps = 2, effective batch size = 2)
  • Learning rate: 3e-5
  • Epochs: 3
  • Optimizer: AdamW (8-bit via bitsandbytes)
  • Evaluation steps: every 500 steps
  • Checkpointing: every 1000 steps
  • Max input length: 8000 tokens
  • Max target length: 256 tokens
  • Beam search: 4 beams
  • Repetition penalty: 2.5
  • No-repeat n-gram size: 3
  • Global attention mask: enabled on the first token
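
The training script itself is not reproduced in this card. As a rough sketch, the hyperparameters listed above map onto a Hugging Face Seq2SeqTrainingArguments configuration roughly as follows (argument names follow recent transformers releases; the output directory is a placeholder, and fp16 assumes a CUDA GPU such as the NVIDIA L4 used for training):

from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="led-financial_summarization-genai15",  # placeholder output path
    per_device_train_batch_size=1,
    gradient_accumulation_steps=2,      # effective batch size = 2
    learning_rate=3e-5,
    num_train_epochs=3,
    fp16=True,                          # FP16 mixed precision (requires a CUDA GPU)
    optim="adamw_bnb_8bit",             # 8-bit AdamW via bitsandbytes
    eval_strategy="steps",
    eval_steps=500,
    save_steps=1000,
    predict_with_generate=True,
    generation_max_length=256,
    generation_num_beams=4,
)

These arguments would then be passed to a Seq2SeqTrainer together with the tokenized train and validation splits described under Training Data.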

Speeds, Sizes, Times

  • GPU used: NVIDIA L4
  • Training runtime: ~2.5 hours per 1000 steps (7995 steps total)
  • Training throughput: ~1.68 samples/sec
  • Model size: ~460M parameters (F32 tensors)
  • Checkpoint size: ~1.84 GB (.safetensors)
  • Saved model size: ~1.84 GB

Evaluation

Metrics

The model was evaluated using standard ROUGE metrics:

  • ROUGE-1: Measures overlap of unigrams (individual words) between the system and reference summaries.
  • ROUGE-2: Measures overlap of bigrams (two consecutive words).
  • ROUGE-L: Measures the longest common subsequence between the system and reference summaries.
  • ROUGE-Lsum: A variation of ROUGE-L for multi-sentence summaries.
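
As an illustration, these metrics can be computed with the evaluate library's ROUGE wrapper (which requires the rouge_score package); the prediction and reference strings below are placeholders rather than actual test-set outputs.

import evaluate

rouge = evaluate.load("rouge")  # backed by the rouge_score package

predictions = ["The company reported higher revenue, driven by its cloud segment."]
references = ["Revenue increased year over year, led by growth in cloud services."]

scores = rouge.compute(
    predictions=predictions,
    references=references,
    use_stemmer=True,
)
# scores contains rouge1, rouge2, rougeL, and rougeLsum F-measures.
print({k: round(v, 4) for k, v in scores.items()})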

Evaluation Results

The following results were obtained on a set of 20 randomly selected samples from the test set:

Model                                  ROUGE-1    ROUGE-2    ROUGE-L    ROUGE-Lsum
led-financial_summarization-genai15    0.5121     0.2089     0.2987     0.4359
BART-financial-summarization           0.4574     0.1976     0.2728     0.3876
LED-large-book-summary                 0.3066     0.0470     0.1391     0.2128

Summary

led-financial_summarization-genai15 outperformed both the BART-based and base LED models across all ROUGE metrics. This demonstrates its effectiveness in capturing financial context from long documents and generating coherent and information-rich summaries.
