Fixed minor details in README.md
README.md
CHANGED
@@ -12,13 +12,13 @@ This model is a fine-tuned checkpoint of [RoBERTa-large](https://huggingface.co/
 
 
 # Predictions on a data set
-If you want to predict sentiment for your own data, we provide an example script via [Google Colab](https://colab.research.google.com/notebooks/intro.ipynb). You can load your data to a Google Drive and run the script for free on a Colab GPU. Set-up takes only a few minutes. We suggest that you manually label a subset of your data to evaluate performance for your use case. For performance benchmark values across
+If you want to predict sentiment for your own data, we provide an example script via [Google Colab](https://colab.research.google.com/notebooks/intro.ipynb). You can load your data to a Google Drive and run the script for free on a Colab GPU. Set-up takes only a few minutes. We suggest that you manually label a subset of your data to evaluate performance for your use case. For performance benchmark values across various sentiment analysis contexts, please refer to our paper ([Heitmann et al. 2020](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3489963)).
 
 [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/chrsiebert/sentiment-roberta-large-english/blob/main/sentiment_roberta_prediction_example.ipynb)
 
 
 # Use in a Hugging Face pipeline
-The easiest way to use the model for single predictions is Hugging Face's [sentiment analysis pipeline](https://huggingface.co/transformers/quicktour.html#getting-started-on-a-task-with-a-pipeline), which only needs a couple lines of code as in the following example:
+The easiest way to use the model for single predictions is Hugging Face's [sentiment analysis pipeline](https://huggingface.co/transformers/quicktour.html#getting-started-on-a-task-with-a-pipeline), which only needs a couple lines of code as shown in the following example:
 ```
 from transformers import pipeline
 sentiment_analysis = pipeline("sentiment-analysis",model="siebert/sentiment-roberta-large-english")
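The hunk above cuts the README's code block off after the `pipeline(...)` call; the next hunk's header shows that the block continues with `print(sentiment_analysis("I love this!"))`. For reference, a minimal runnable version of the full snippet; the commented output is illustrative of the pipeline's usual return format, not taken from the README:

```
from transformers import pipeline

# Load the fine-tuned sentiment model into a ready-to-use pipeline
sentiment_analysis = pipeline("sentiment-analysis", model="siebert/sentiment-roberta-large-english")
print(sentiment_analysis("I love this!"))
# Example output (the exact score is illustrative):
# [{'label': 'POSITIVE', 'score': 0.99...}]
```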
@@ -29,11 +29,11 @@ print(sentiment_analysis("I love this!"))
 
 
 # Use for further fine-tuning
-The model can also be used as a starting point for further fine-tuning on your specific data. Please refer to Hugging Face's [documentation](https://huggingface.co/transformers/custom_datasets.html#fine-tuning-with-trainer) for further details and example code.
+The model can also be used as a starting point for further fine-tuning of RoBERTa on your specific data. Please refer to Hugging Face's [documentation](https://huggingface.co/transformers/custom_datasets.html#fine-tuning-with-trainer) for further details and example code.
 
 
 # Performance
-To evaluate the performance of our general-purpose sentiment analysis model, we set aside an evaluation set from each data set, which was not used for training. On average, our model outperforms a [DistilBERT-based model](https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english) (which is solely fine-tuned on the popular SST-2 data set) by more than 15 percentage points (78.1 vs. 93.2, see table below). As a robustness check, we evaluate the model in a leave-one-out manner (training on 14 data sets, evaluating on the one left out), which decreases model performance by only about 3 percentage points on average and underscores its generalizability. Model performance is given as evaluation set accuracy in percent.
+To evaluate the performance of our general-purpose sentiment analysis model, we set aside an evaluation set from each data set, which was not used for training. On average, our model outperforms a [DistilBERT-based model](https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english) (which is solely fine-tuned on the popular SST-2 data set) by more than 15 percentage points (78.1 vs. 93.2 percent, see table below). As a robustness check, we evaluate the model in a leave-one-out manner (training on 14 data sets, evaluating on the one left out), which decreases model performance by only about 3 percentage points on average and underscores its generalizability. Model performance is given as evaluation set accuracy in percent.
 
 |Dataset|DistilBERT SST-2|This model|
 |---|---|---|
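The fine-tuning section defers to Hugging Face's documentation for example code. A minimal sketch of what that could look like for this checkpoint, following the Trainer pattern in the linked custom-datasets guide; the `SentimentDataset` helper, the toy data, and the hyperparameters are illustrative assumptions, not the authors' setup:

```
import torch
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Hypothetical labeled data; replace with your own (1 = positive, 0 = negative)
texts = ["I love this!", "This is terrible."]
labels = [1, 0]

tokenizer = AutoTokenizer.from_pretrained("siebert/sentiment-roberta-large-english")
model = AutoModelForSequenceClassification.from_pretrained("siebert/sentiment-roberta-large-english")

class SentimentDataset(torch.utils.data.Dataset):
    """Wraps tokenized texts and labels so the Trainer can iterate over them."""
    def __init__(self, texts, labels):
        self.encodings = tokenizer(texts, truncation=True, padding=True)
        self.labels = labels
    def __getitem__(self, idx):
        item = {k: torch.tensor(v[idx]) for k, v in self.encodings.items()}
        item["labels"] = torch.tensor(self.labels[idx])
        return item
    def __len__(self):
        return len(self.labels)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=1,
                           per_device_train_batch_size=2),  # illustrative values
    train_dataset=SentimentDataset(texts, labels),
)
trainer.train()
```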