---
tags:
- text
- stance
language:
- en
metrics:
- f1
- accuracy
pipeline_tag: text-classification
widget:
- text: user Bolsonaro is the president of Brazil. He speaks for all brazilians. Greta is a climate activist. Their opinions do create a balance that the world needs now
  example_title: example 1
- text: user The fact is that she still doesn’t change her ways and still stays non environmental friendly
  example_title: example 2
- text: user The criteria for these awards dont seem to be very high.
  example_title: example 3
model-index:
- name: StanceBERTa
  results:
  - task:
      type: text-classification
      name: Text Classification
    dataset:
      type: social media
      name: unpublished
    metrics:
    - type: f1
      value: 77.8
    - type: accuracy
      value: 78.5
---

# eevvgg/StanceBERTa

This model is a fine-tuned version of **distilroberta-base** that predicts three categories of stance (negative, positive, neutral) towards an entity mentioned in the text. It was fine-tuned on a larger and more balanced data sample than the previous version, [eevvgg/Stance-Tw](https://huggingface.co/eevvgg/Stance-Tw).

- **Developed by:** Ewelina Gajewska
- **Model type:** RoBERTa for stance classification
- **Language(s) (NLP):** English social media data from Twitter and Reddit
- **Finetuned from model:** [distilroberta-base](https://huggingface.co/distilroberta-base)

## Uses

```python
from transformers import pipeline

model_path = "eevvgg/StanceBERTa"
cls_task = pipeline(task="text-classification", model=model_path, tokenizer=model_path)  # add device=0 to run on GPU

sequence = [
    "user The fact is that she still doesn’t change her ways and still stays non environmental friendly",
    "user The criteria for these awards dont seem to be very high.",
]
result = cls_task(sequence)
```

The model is suited for stance classification in short texts. It was fine-tuned on a balanced corpus of 5.6k examples, part of which was semi-annotated. It is also a suitable starting point for further fine-tuning on hate/offensive language detection.

## Model Sources

- **Repository:** training procedure available in a [Colab notebook](https://colab.research.google.com/drive/1-C47Ei7vgYtcfLLBB_Vkm3nblE5zH-aL?usp=sharing)
- **Paper:** tba

## Training Details

### Preprocessing

User mentions and hyperlinks were normalized to "@user" and "http" tokens, respectively (a minimal sketch of this step is given at the end of this card).

### Training Hyperparameters

- trained for 3 epochs with a mini-batch size of 8
- loss: 0.509
- learning_rate: 5e-5; weight_decay: 1e-2

## Evaluation

### Results

Evaluated on a held-out 15% of the data:

- accuracy: 0.785
- macro avg:
  - f1: 0.778
  - precision: 0.779
  - recall: 0.778
- weighted avg:
  - f1: 0.786
  - precision: 0.786
  - recall: 0.785

## Citation

**BibTeX:** tba
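
## Preprocessing Sketch

The exact normalization code is not published with this card, so the following is a minimal sketch of the preprocessing step described under Training Details, assuming simple regex-based replacements; the patterns are an assumption and may differ from those used to build the training corpus. Inference inputs should be normalized the same way as the training data.

```python
import re

def normalize(text: str) -> str:
    """Replace hyperlinks and user mentions with placeholder tokens.

    Assumed regex patterns; the actual preprocessing code used for
    training is not published with this card.
    """
    text = re.sub(r"https?://\S+|www\.\S+", "http", text)  # hyperlinks -> "http"
    text = re.sub(r"@\w+", "@user", text)                  # mentions -> "@user"
    return text

print(normalize("@GretaThunberg check https://example.com"))  # -> "@user check http"
```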