ihk committed on
Commit f312c3c · 1 Parent(s): e8a2b50

Update README.md

Files changed (1)
  1. README.md +44 -3
README.md CHANGED
@@ -7,10 +7,51 @@ license: mit
  language:
  - en
  widget:
- - text: "You must be proficient in [MASK]."
- - text: "Would you like to join a major manufacturing [MASK]?"
+ - text: Would you like to join a major [MASK] company?
+ tags:
+ - jobs
  ---
 
  _Nesta, the UK's innovation agency, has been scraping online job adverts since 2021 and building algorithms to extract and structure information as part of the [Open Jobs Observatory](https://www.nesta.org.uk/project/open-jobs-observatory/) project._
 
- _Although we are unable to share the raw data openly, we aim to open source **our models, algorithms and tools** so that anyone can use them for their own research and analysis._
+ _Although we are unable to share the raw data openly, we aim to open source **our models, algorithms and tools** so that anyone can use them for their own research and analysis._
+
+ This model is pre-trained from a `distilbert-base-uncased` checkpoint on 100k sentences from scraped online job postings as part of the Open Jobs Observatory.
+
+ 🖨️ Use
+ To use the model:
+
+ ```
+ from transformers import pipeline
+
+ model = pipeline('fill-mask', model='ihk/ojobert', tokenizer='ihk/ojobert')
+
+ ```
+
+ An example use is as follows:
+
+ text = "Would you like to join a major [MASK] company?"
+ model(text, top_k=3)
+
+ >> [{'score': 0.1886572688817978,
+ 'token': 13859,
+ 'token_str': 'pharmaceutical',
+ 'sequence': 'would you like to join a major pharmaceutical company?'},
+ {'score': 0.07436735928058624,
+ 'token': 5427,
+ 'token_str': 'insurance',
+ 'sequence': 'would you like to join a major insurance company?'},
+ {'score': 0.06400047987699509,
+ 'token': 2810,
+ 'token_str': 'construction',
+ 'sequence': 'would you like to join a major construction company?'}]
+
+ ⚖️ Training results
+ The fine-tuning metrics are as follows:
+
+ - eval_loss: 2.5871026515960693
+ - eval_runtime: 134.4452
+ - eval_samples_per_second: 14.281
+ - eval_steps_per_second: 0.223
+ - epoch: 3.0
+ - perplexity: 13.29
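
The perplexity listed in the training results appears to be the exponential of the evaluation loss, which is the usual convention when evaluating language models. The README does not state this explicitly, so the following is only a minimal sketch, assuming `eval_loss` is a mean cross-entropy measured in nats:

```python
import math

# eval_loss exactly as reported in the README's training results.
eval_loss = 2.5871026515960693

# Assumption (not stated in the README): eval_loss is the mean
# cross-entropy in nats, so perplexity is exp(eval_loss).
perplexity = math.exp(eval_loss)

print(f"perplexity: {perplexity:.2f}")
```

Rounded to two decimal places, this gives 13.29, which matches the value reported alongside `eval_loss`.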