Sentence Similarity · sentence-transformers · PyTorch · TensorBoard · Transformers · English · German · t5 · text-embedding · embeddings · information-retrieval · beir · text-classification · language-model · text-clustering · text-semantic-similarity · text-evaluation · prompt-retrieval · text-reranking · feature-extraction · text-generation-inference

Datasets: natural_questions · ms_marco · fever · hotpot_qa · mteb

	Add short description and example for skill retrieval task

README.md CHANGED

````diff
@@ -2523,8 +2523,26 @@ model-index:
     - type: max_f1
       value: 78.39889075384951
 ---
-
-
+# pascalhuerten/instructor-skillfit
+A finetuning of hkunlp/instructor-base, specialized in retrieving relevant skills for a given learning outcome.
+
+## Skill Retrieval
+You can use the **customized embeddings** for skill retrieval.
+```python
+import numpy as np
+from sklearn.metrics.pairwise import cosine_similarity
+query  = [['Represent the learning outcome for retrieval: ', 'WordPress installieren\nWebsite- oder Blogplanung\nPlugins und Widgets einfügen']]
+corpus = [['Represent the skill for retrieval: ', 'WordPress'],
+          ['Represent the skill for retrieval: ', 'Website-Wireframe erstellen'],
+          ['Represent the skill for retrieval: ', 'Software für Content-Management-Systeme nutzen']]
+query_embeddings = model.encode(query)
+corpus_embeddings = model.encode(corpus)
+similarities = cosine_similarity(query_embeddings, corpus_embeddings)
+retrieved_doc_id = np.argmax(similarities)
+print(retrieved_doc_id)
+```
+
+## hkunlp/instructor-base
 We introduce **Instructor**👨🏫, an instruction-finetuned text embedding model that can generate text embeddings tailored to any task (e.g., classification, retrieval, clustering, text evaluation, etc.) and domains (e.g., science, finance, etc.) ***by simply providing the task instruction, without any finetuning***. Instructor👨 achieves sota on 70 diverse embedding tasks!
 The model is easy to use with **our customized** `sentence-transformer` library. For more details, check out [our paper](https://arxiv.org/abs/2212.09741) and [project page](https://instructor-embedding.github.io/)!
 
````
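
The added README snippet assumes a `model` object has already been created. A minimal runnable sketch is shown below, assuming this finetune loads through the `InstructorEmbedding` package the same way its base model `hkunlp/instructor-base` does; the repository id passed to `INSTRUCTOR` and the instruction and skill strings are taken from the diff above.

```python
# pip install InstructorEmbedding sentence-transformers scikit-learn numpy
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity
from InstructorEmbedding import INSTRUCTOR

# Assumption: the finetuned checkpoint loads like the base Instructor model.
model = INSTRUCTOR('pascalhuerten/instructor-skillfit')

# Each input is an [instruction, text] pair; the instructions come from the README example.
query = [['Represent the learning outcome for retrieval: ',
          'WordPress installieren\nWebsite- oder Blogplanung\nPlugins und Widgets einfügen']]
corpus = [['Represent the skill for retrieval: ', 'WordPress'],
          ['Represent the skill for retrieval: ', 'Website-Wireframe erstellen'],
          ['Represent the skill for retrieval: ', 'Software für Content-Management-Systeme nutzen']]

query_embeddings = model.encode(query)    # shape (1, dim)
corpus_embeddings = model.encode(corpus)  # shape (3, dim)

# Rank the candidate skills by cosine similarity and keep the best match.
similarities = cosine_similarity(query_embeddings, corpus_embeddings)
best = int(np.argmax(similarities))
print(best, corpus[best][1])
```

Because `similarities` has shape (1, 3) here, `np.argmax` over the flattened matrix directly gives the index of the best-matching skill in `corpus`.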