Commit · 0043349
1 Parent(s): 87211ae
Update README.md
README.md
CHANGED
@@ -1,5 +1,5 @@
 # robertuito-base-uncased
-
+
 
 # RoBERTuito
 ## A pre-trained language model for social media text in Spanish
@@ -67,33 +67,7 @@ tokenizer.tokenize(preprocessed_text)
 
 We are working on integrating this preprocessing step into a Tokenizer within `transformers` library
 
-## Development
-
-### Installing
-
-We use `python==3.7` and `poetry` to manage dependencies.
-
-```bash
-pip install poetry
-poetry install
-```
-
-### Benchmarking
-
-To run benchmarks
 
-```bash
-python bin/run_benchmark.py <model_name> --times 5 --output_path <output_path>
-```
-
-Check [RUN_BENCHMARKS](RUN_BENCHMARKS.md) for all experiments
-
-### Smoke test
-Test the benchmark running
-
-```
-./smoke_test.sh
-```
 ## Citation
 
 If you use *RoBERTuito*, please cite our paper: