mdeberta-v3-base-subjectivity-sentiment-italian

This model is a fine-tuned version of microsoft/mdeberta-v3-base for Task 1 (Subjectivity Detection) of the CLEF 2025 CheckThat! Lab. For the official code and materials, please refer to the GitHub repository.

It achieves the following results on the evaluation set:

  • Loss: 0.6602
  • Macro F1: 0.7437
  • Macro P: 0.7322
  • Macro R: 0.7690
  • Subj F1: 0.6437
  • Subj P: 0.5696
  • Subj R: 0.7401
  • Accuracy: 0.7826

Model description

This model, mdeberta-v3-base-subjectivity-sentiment-italian, is part of the AI Wizards' submission to the CLEF 2025 CheckThat! Lab Task 1: Subjectivity Detection in News Articles. Its goal is to classify sentences as subjective or objective. The key innovation is augmenting the transformer classifier, mDeBERTaV3-base, by concatenating sentiment scores from an auxiliary sentiment model with the pooled sentence representation. This sentiment-augmented architecture, combined with decision threshold calibration to address class imbalance, consistently boosted performance, especially the subjective F1 score.
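The exact calibration procedure used for the submission lives in the GitHub repository; the helper below is only a minimal sketch of the idea, assuming `subj_probs` holds the model's SUBJ-class probabilities on a development set and `labels` the binary gold labels (1 = SUBJ). The function name and threshold grid are illustrative, not from the original code.

import numpy as np
from sklearn.metrics import f1_score

def calibrate_threshold(subj_probs, labels, grid=None):
    # Sweep candidate thresholds over dev-set SUBJ probabilities and keep
    # the one that maximizes the subjective-class F1 (illustrative helper).
    if grid is None:
        grid = np.linspace(0.1, 0.9, 81)
    best_t, best_f1 = 0.5, -1.0
    for t in grid:
        preds = (np.asarray(subj_probs) >= t).astype(int)  # 1 = SUBJ
        f1 = f1_score(labels, preds, pos_label=1)
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1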

Intended uses & limitations

This model is intended for identifying whether a sentence in a news article is subjective (opinion-laden) or objective, which is useful for combating misinformation, improving fact-checking pipelines, and supporting journalists. It has been evaluated in monolingual (Italian, Arabic, German, English, Bulgarian), zero-shot (Greek, Polish, Romanian, Ukrainian), and multilingual settings. While the sentiment augmentation consistently improved performance, effectiveness may vary across languages and domains not covered in the training data. The model was trained with the hyperparameters listed below and uses decision threshold calibration to address class imbalance; both are critical to its reported performance.

Training and evaluation data

The model was fine-tuned on the Italian data of the CLEF 2025 CheckThat! Lab Task 1 (Subjectivity Detection), in which news sentences are labeled as subjective (SUBJ) or objective (OBJ). See the GitHub repository for the exact splits and preprocessing.
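For illustration only: earlier CheckThat! subjectivity editions distributed the data as tab-separated files of sentences with OBJ/SUBJ labels, so a loading sketch under that assumption might look as follows. The file paths and column names are placeholders, not taken from this repository.

import pandas as pd

# Hypothetical paths; TSV layout assumed from prior CheckThat! subjectivity tasks.
train_df = pd.read_csv("data/italian/train_it.tsv", sep="\t")
dev_df = pd.read_csv("data/italian/dev_it.tsv", sep="\t")

label2id = {"OBJ": 0, "SUBJ": 1}  # matches the model's label2id mapping
train_df["label_id"] = train_df["label"].map(label2id)
print(train_df[["sentence", "label"]].head())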

How to use

You can use this model with the transformers library. Because the classifier head expects three extra sentiment features, the model is loaded through a small custom class rather than AutoModelForSequenceClassification:

import torch
import torch.nn as nn
from transformers import DebertaV2Model, DebertaV2Config, AutoTokenizer, PreTrainedModel, pipeline
from transformers.models.deberta_v2.modeling_deberta_v2 import ContextPooler

sent_pipe = pipeline(
    "sentiment-analysis",
    model="cardiffnlp/twitter-xlm-roberta-base-sentiment",
    tokenizer="cardiffnlp/twitter-xlm-roberta-base-sentiment",
    top_k=None,  # return all 3 sentiment scores
)

class CustomModel(PreTrainedModel):
    """mDeBERTa-V3 encoder whose pooled sentence embedding is concatenated
    with three sentiment scores before the classification layer."""
    config_class = DebertaV2Config

    def __init__(self, config, sentiment_dim=3, num_labels=2, *args, **kwargs):
        super().__init__(config, *args, **kwargs)
        self.deberta = DebertaV2Model(config)
        self.pooler = ContextPooler(config)
        output_dim = self.pooler.output_dim
        self.dropout = nn.Dropout(0.1)
        # The head sees the pooled embedding plus the 3 sentiment features.
        self.classifier = nn.Linear(output_dim + sentiment_dim, num_labels)

    def forward(self, input_ids, positive, neutral, negative, token_type_ids=None, attention_mask=None, labels=None):
        outputs = self.deberta(input_ids=input_ids, attention_mask=attention_mask)
        encoder_layer = outputs[0]
        pooled_output = self.pooler(encoder_layer)
        # Stack the three sentiment scores into a (batch, 3) feature tensor.
        sentiment_features = torch.stack((positive, neutral, negative), dim=1).to(pooled_output.dtype)
        combined_features = torch.cat((pooled_output, sentiment_features), dim=1)
        logits = self.classifier(self.dropout(combined_features))
        return {'logits': logits}

model_name = "MatteoFasulo/mdeberta-v3-base-subjectivity-sentiment-italian"
tokenizer = AutoTokenizer.from_pretrained("microsoft/mdeberta-v3-base")
config = DebertaV2Config.from_pretrained(
    model_name, 
    num_labels=2, 
    id2label={0: 'OBJ', 1: 'SUBJ'}, 
    label2id={'OBJ': 0, 'SUBJ': 1},
    output_attentions=False, 
    output_hidden_states=False
)
model = CustomModel.from_pretrained(model_name, config=config, sentiment_dim=3, num_labels=2)

def classify_subjectivity(text: str):
    # get full sentiment distribution
    dist = sent_pipe(text)[0]
    pos = next(d["score"] for d in dist if d["label"] == "positive")
    neu = next(d["score"] for d in dist if d["label"] == "neutral")
    neg = next(d["score"] for d in dist if d["label"] == "negative")

    # tokenize the text
    inputs = tokenizer(text, padding=True, truncation=True, max_length=256, return_tensors='pt')

    # run the model, feeding in the three sentiment scores
    with torch.no_grad():
        outputs = model(
            input_ids=inputs["input_ids"],
            attention_mask=inputs["attention_mask"],
            positive=torch.tensor(pos).unsqueeze(0).float(),
            neutral=torch.tensor(neu).unsqueeze(0).float(),
            negative=torch.tensor(neg).unsqueeze(0).float()
        )

    # compute probabilities and pick the top label
    probs = torch.softmax(outputs['logits'][0], dim=-1)
    label = model.config.id2label[int(probs.argmax())]
    score = probs.max().item()

    return {"label": label, "score": score}

examples = [
    "Per quanto riguarda le motivazioni, è importante chiedersi se l’intervento è realmente mirato a risolvere un complesso, a modificare un aspetto del corpo con cui non si riesce a convivere serenamente, oppure se è frutto di una moda passeggera o dell’influenza tossica del web che spesso induce a volere cose di cui non si ha assolutamente bisogno.",
    "Un roulette di tensioni in cui alla fine spunta un match point per Sinner.",
]
for text in examples:
    result = classify_subjectivity(text)
    print(f"Text: {text}")
    print(f"→ Subjectivity: {result['label']} (score={result['score']:.2f})\n")

Training procedure

Training hyperparameters

The following hyperparameters were used during training (an equivalent TrainingArguments sketch follows the list):

  • learning_rate: 1e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 6
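
For reference, these settings correspond roughly to the TrainingArguments below. This is a sketch, not the original training script; output_dir and eval_strategy are illustrative placeholders.

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="mdeberta-v3-base-subjectivity-sentiment-italian",  # placeholder
    learning_rate=1e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=6,
    lr_scheduler_type="linear",
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    seed=42,
    eval_strategy="epoch",  # the results table reports per-epoch evaluation
)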

Training results

| Training Loss | Epoch | Step | Validation Loss | Macro F1 | Macro P | Macro R | Subj F1 | Subj P | Subj R | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:-------:|:-------:|:-------:|:------:|:------:|:--------:|
| No log        | 1.0   | 101  | 0.6392          | 0.7244   | 0.7284  | 0.7208  | 0.5913  | 0.6071 | 0.5763 | 0.7886   |
| No log        | 2.0   | 202  | 0.5375          | 0.6731   | 0.7018  | 0.7579  | 0.6064  | 0.4548 | 0.9096 | 0.6867   |
| No log        | 3.0   | 303  | 0.5731          | 0.7453   | 0.7373  | 0.7563  | 0.6349  | 0.5970 | 0.6780 | 0.7931   |
| No log        | 4.0   | 404  | 0.5788          | 0.7522   | 0.7405  | 0.7752  | 0.6534  | 0.5848 | 0.7401 | 0.7916   |
| 0.4395        | 5.0   | 505  | 0.6922          | 0.7491   | 0.7400  | 0.7628  | 0.6423  | 0.5971 | 0.6949 | 0.7946   |
| 0.4395        | 6.0   | 606  | 0.6602          | 0.7437   | 0.7322  | 0.7690  | 0.6437  | 0.5696 | 0.7401 | 0.7826   |

Framework versions

  • Transformers 4.49.0
  • PyTorch 2.5.1+cu121
  • Datasets 3.3.1
  • Tokenizers 0.21.0

Code

The official code and materials for this submission are available on GitHub: https://github.com/MatteoFasulo/clef2025-checkthat

Citation

If you find our work helpful or inspiring, please feel free to cite it:

@misc{fasulo2025aiwizardscheckthat2025,
      title={AI Wizards at CheckThat! 2025: Enhancing Transformer-Based Embeddings with Sentiment for Subjectivity Detection in News Articles}, 
      author={Matteo Fasulo and Luca Babboni and Luca Tedeschini},
      year={2025},
      eprint={2507.11764},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2507.11764}, 
}