ModernBERT-base-subjectivity-english

This model is a fine-tuned version of answerdotai/ModernBERT-base for Task 1 (Subjectivity Detection in News Articles) of the CheckThat! Lab at CLEF 2025.

The model was presented in the paper AI Wizards at CheckThat! 2025: Enhancing Transformer-Based Embeddings with Sentiment for Subjectivity Detection in News Articles.

The official code repository can be found at: https://github.com/MatteoFasulo/clef2025-checkthat

It achieves the following results on the evaluation set:

  • Loss: 1.0478
  • Macro F1: 0.7034
  • Macro P: 0.7058
  • Macro R: 0.7051
  • Subj F1: 0.6989
  • Subj P: 0.7395
  • Subj R: 0.6625
  • Accuracy: 0.7035

Model description

This model, ModernBERT-base-subjectivity-english, is a fine-tuned version of answerdotai/ModernBERT-base designed for subjectivity detection in news articles. It was developed as part of AI Wizards' participation in the CLEF 2025 CheckThat! Lab Task 1, which asks systems to classify sentences as subjective or objective. The core innovation of this model lies in enhancing transformer-based embeddings by combining sentence representations with sentiment scores derived from an auxiliary model. This approach was shown to significantly boost performance over standard fine-tuning, particularly the subjective F1 score. To address the class imbalance prevalent across languages, the model also employs decision threshold calibration optimized on the development set.
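
A minimal sketch of this sentiment-augmented architecture is shown below. It assumes the sentiment scores are class probabilities from an auxiliary sentiment model, concatenated with the encoder's pooled sentence representation before a linear classifier; the pooling choice, layer sizes, and score dimensionality are illustrative assumptions rather than the paper's exact implementation:

import torch
import torch.nn as nn
from transformers import AutoModel

class SentimentAugmentedClassifier(nn.Module):
    def __init__(self, encoder_name="answerdotai/ModernBERT-base",
                 num_sentiment_scores=3, num_labels=2):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        hidden = self.encoder.config.hidden_size
        # Classify the sentence embedding concatenated with the auxiliary
        # sentiment scores (e.g. negative/neutral/positive probabilities).
        self.classifier = nn.Linear(hidden + num_sentiment_scores, num_labels)

    def forward(self, input_ids, attention_mask, sentiment_scores):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]  # first-token ([CLS]-style) pooling
        features = torch.cat([cls, sentiment_scores], dim=-1)
        return self.classifier(features)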

Intended uses & limitations

This model is intended for classifying sentences in news articles as subjective (opinion-laden) or objective. This capability is crucial for applications such as combating misinformation, improving fact-checking pipelines, and supporting journalistic work. While this specific model is tailored for English, the broader study also evaluated monolingual settings (Arabic, German, Italian, Bulgarian) and zero-shot transfer settings (Greek, Polish, Romanian, Ukrainian). A key strength is the use of decision threshold calibration to mitigate class imbalance. However, note that the original submission suffered from a skewed class distribution in its data splits, which was later corrected; this underlines the importance of proper data splits and calibration for optimal performance.

Training and evaluation data

The ModernBERT-base-subjectivity-english model was fine-tuned on the English portion of the CheckThat! Lab Task 1: Subjectivity Detection in News Articles dataset provided for CLEF 2025. The training and development datasets included sentences in English (among other languages like Arabic, German, Italian, and Bulgarian). For final evaluation, the broader project also assessed generalization on unseen languages like Greek, Romanian, Polish, and Ukrainian. The training strategy involved augmenting transformer embeddings with sentiment signals and employing decision threshold calibration to improve performance and handle class imbalance.
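
To illustrate the calibration step, the sketch below picks the decision threshold that maximizes macro F1 on the development set; the variable names and threshold grid are assumptions for illustration, not the paper's exact procedure:

import numpy as np
from sklearn.metrics import f1_score

def calibrate_threshold(dev_probs, dev_labels):
    """Pick the subjective-class probability threshold that maximizes macro F1.

    dev_probs: model probabilities for the subjective class on the dev set
    dev_labels: gold labels (1 = subjective, 0 = objective)
    """
    thresholds = np.linspace(0.1, 0.9, 81)
    scores = [f1_score(dev_labels, (dev_probs >= t).astype(int), average="macro")
              for t in thresholds]
    return thresholds[int(np.argmax(scores))]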

How to use

You can use this model directly with the transformers library for text classification:

from transformers import pipeline

# Load the text classification pipeline
classifier = pipeline(
    "text-classification",
    model="MatteoFasulo/ModernBERT-base-subjectivity-english",
    tokenizer="answerdotai/ModernBERT-base",
)

# An objective, factual statement
text1 = "The company reported a 10% increase in profits in the last quarter."
result1 = classifier(text1)
print(f"Text: '{text1}' Classification: {result1}")

# A subjective, opinionated statement
text2 = "This product is absolutely amazing and everyone should try it!"
result2 = classifier(text2)
print(f"Text: '{text2}' Classification: {result2}")

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: linear
  • num_epochs: 6
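
For reference, these settings map onto the transformers TrainingArguments roughly as sketched below; the output directory and anything not listed above (warmup, weight decay, evaluation strategy) are illustrative assumptions:

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="ModernBERT-base-subjectivity-english",  # hypothetical path
    learning_rate=1e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    optim="adamw_torch",  # betas=(0.9, 0.999) and epsilon=1e-8 are the defaults
    lr_scheduler_type="linear",
    num_train_epochs=6,
)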

Training results

| Training Loss | Epoch | Step | Validation Loss | Macro F1 | Macro P | Macro R | Subj F1 | Subj P | Subj R | Accuracy |
|---------------|-------|------|-----------------|----------|---------|---------|---------|--------|--------|----------|
| No log        | 1.0   | 52   | 0.5800          | 0.6904   | 0.6932  | 0.6923  | 0.6843  | 0.7277 | 0.6458 | 0.6905   |
| No log        | 2.0   | 104  | 0.5345          | 0.7242   | 0.7250  | 0.7239  | 0.7403  | 0.7269 | 0.7542 | 0.7251   |
| No log        | 3.0   | 156  | 0.7359          | 0.6953   | 0.7078  | 0.7009  | 0.6729  | 0.7660 | 0.6000 | 0.6970   |
| No log        | 4.0   | 208  | 0.7670          | 0.7249   | 0.7248  | 0.7251  | 0.7326  | 0.7404 | 0.7250 | 0.7251   |
| No log        | 5.0   | 260  | 1.0715          | 0.7027   | 0.7102  | 0.7065  | 0.6879  | 0.7588 | 0.6292 | 0.7035   |
| No log        | 6.0   | 312  | 1.0478          | 0.7034   | 0.7058  | 0.7051  | 0.6989  | 0.7395 | 0.6625 | 0.7035   |

Framework versions

  • Transformers 4.49.0
  • PyTorch 2.5.1+cu121
  • Datasets 3.3.1
  • Tokenizers 0.21.0

Code

The official code and materials for this submission are available on GitHub: https://github.com/MatteoFasulo/clef2025-checkthat

Citation

If you find our work helpful or inspiring, please feel free to cite it:

@misc{fasulo2025aiwizardscheckthat2025,
      title={AI Wizards at CheckThat! 2025: Enhancing Transformer-Based Embeddings with Sentiment for Subjectivity Detection in News Articles}, 
      author={Matteo Fasulo and Luca Babboni and Luca Tedeschini},
      year={2025},
      eprint={2507.11764},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2507.11764}, 
}