Fixed minor details in README.md
README.md
CHANGED
@@ -12,13 +12,13 @@ This model is a fine-tuned checkpoint of [RoBERTa-large](https://huggingface.co/
 
 
 # Predictions on a data set
-If you want to predict sentiment for your own data, we provide an example script via [Google Colab](https://colab.research.google.com/notebooks/intro.ipynb). You can load your data to a Google Drive and run the script for free on a Colab GPU. Set-up takes only a few minutes. We suggest that you manually label a subset of your data to evaluate performance for your use case. For performance benchmark values across
+If you want to predict sentiment for your own data, we provide an example script via [Google Colab](https://colab.research.google.com/notebooks/intro.ipynb). You can load your data to a Google Drive and run the script for free on a Colab GPU. Set-up takes only a few minutes. We suggest that you manually label a subset of your data to evaluate performance for your use case. For performance benchmark values across various sentiment analysis contexts, please refer to our paper ([Heitmann et al. 2020](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3489963)).
 
 [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/chrsiebert/sentiment-roberta-large-english/blob/main/sentiment_roberta_prediction_example.ipynb)
 
 
 # Use in a Hugging Face pipeline
-The easiest way to use the model for single predictions is Hugging Face's [sentiment analysis pipeline](https://huggingface.co/transformers/quicktour.html#getting-started-on-a-task-with-a-pipeline), which only needs a couple lines of code as in the following example:
+The easiest way to use the model for single predictions is Hugging Face's [sentiment analysis pipeline](https://huggingface.co/transformers/quicktour.html#getting-started-on-a-task-with-a-pipeline), which only needs a couple lines of code as shown in the following example:
 ```
 from transformers import pipeline
 sentiment_analysis = pipeline("sentiment-analysis",model="siebert/sentiment-roberta-large-english")
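The hunk above cuts the README's code block off after the `pipeline(...)` call; the next hunk's header shows that the block continues with `print(sentiment_analysis("I love this!"))`. For reference, a minimal runnable version of the full snippet; the commented output is illustrative of the pipeline's usual return format, not taken from the README:

```
from transformers import pipeline

# Load the fine-tuned sentiment model into a ready-to-use pipeline
sentiment_analysis = pipeline("sentiment-analysis", model="siebert/sentiment-roberta-large-english")
print(sentiment_analysis("I love this!"))
# Example output (the exact score is illustrative):
# [{'label': 'POSITIVE', 'score': 0.99...}]
```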
@@ -29,11 +29,11 @@ print(sentiment_analysis("I love this!"))
 
 
 # Use for further fine-tuning
-The model can also be used as a starting point for further fine-tuning on your specific data. Please refer to Hugging Face's [documentation](https://huggingface.co/transformers/custom_datasets.html#fine-tuning-with-trainer) for further details and example code.
+The model can also be used as a starting point for further fine-tuning of RoBERTa on your specific data. Please refer to Hugging Face's [documentation](https://huggingface.co/transformers/custom_datasets.html#fine-tuning-with-trainer) for further details and example code.
 
 
 # Performance
-To evaluate the performance of our general-purpose sentiment analysis model, we set aside an evaluation set from each data set, which was not used for training. On average, our model outperforms a [DistilBERT-based model](https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english) (which is solely fine-tuned on the popular SST-2 data set) by more than 15 percentage points (78.1 vs. 93.2, see table below). As a robustness check, we evaluate the model in a leave-one-out manner (training on 14 data sets, evaluating on the one left out), which decreases model performance by only about 3 percentage points on average and underscores its generalizability. Model performance is given as evaluation set accuracy in percent.
+To evaluate the performance of our general-purpose sentiment analysis model, we set aside an evaluation set from each data set, which was not used for training. On average, our model outperforms a [DistilBERT-based model](https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english) (which is solely fine-tuned on the popular SST-2 data set) by more than 15 percentage points (78.1 vs. 93.2 percent, see table below). As a robustness check, we evaluate the model in a leave-one-out manner (training on 14 data sets, evaluating on the one left out), which decreases model performance by only about 3 percentage points on average and underscores its generalizability. Model performance is given as evaluation set accuracy in percent.
 
 |Dataset|DistilBERT SST-2|This model|
 |---|---|---|
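The fine-tuning section defers to Hugging Face's documentation for example code. A minimal sketch of what that could look like for this checkpoint, following the Trainer pattern in the linked custom-datasets guide; the `SentimentDataset` helper, the toy data, and the hyperparameters are illustrative assumptions, not the authors' setup:

```
import torch
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Hypothetical labeled data; replace with your own (1 = positive, 0 = negative)
texts = ["I love this!", "This is terrible."]
labels = [1, 0]

tokenizer = AutoTokenizer.from_pretrained("siebert/sentiment-roberta-large-english")
model = AutoModelForSequenceClassification.from_pretrained("siebert/sentiment-roberta-large-english")

class SentimentDataset(torch.utils.data.Dataset):
    """Wraps tokenized texts and labels so the Trainer can iterate over them."""
    def __init__(self, texts, labels):
        self.encodings = tokenizer(texts, truncation=True, padding=True)
        self.labels = labels
    def __getitem__(self, idx):
        item = {k: torch.tensor(v[idx]) for k, v in self.encodings.items()}
        item["labels"] = torch.tensor(self.labels[idx])
        return item
    def __len__(self):
        return len(self.labels)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=1,
                           per_device_train_batch_size=2),  # illustrative values
    train_dataset=SentimentDataset(texts, labels),
)
trainer.train()
```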