Sentence Similarity
Safetensors
English
bert
SalmanFaroz committed
Commit b8e722d (verified)
1 Parent(s): f7f8fe1

Update README.md

Files changed (1)
  1. README.md +3 -31
README.md CHANGED
@@ -82,19 +82,17 @@ datasets:
  - SalmanFaroz/DisEmbed-Symptom-Disease-v1
  ---

- # bge-small-en
+ # DisEmbed-v1
+ DisEmbed-v1 is a disease-focused embedding model designed for the medical domain, trained on a synthetic dataset comprising disease descriptions, symptoms, and Q&A pairs. It outperforms general medical models on disease-specific tasks, particularly in distinguishing similar diseases, and excels at retrieval tasks and disease-context identification.

- This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5). It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

  ## Model Details

  ### Model Description
- - **Model Type:** Sentence Transformer
- - **Base model:** [BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5) <!-- at revision 5c38ec7c405ec4b44b94cc5a9bb96e735b38267a -->
+ - **Dataset:** [DisEmbed-Symptom-Disease-v1](https://huggingface.co/datasets/SalmanFaroz/DisEmbed-Symptom-Disease-v1)
  - **Maximum Sequence Length:** 512 tokens
  - **Output Dimensionality:** 384 dimensions
  - **Similarity Function:** Cosine Similarity
- <!-- - **Training Dataset:** Unknown -->
  - **Language:** en
  - **License:** mit

@@ -127,30 +125,4 @@ print(embeddings.shape)
  similarities = model.similarity(embeddings, embeddings)
  print(similarities.shape)
  # [3, 3]
- ```
-
-
- #### Sentence Transformers
- ```bibtex
- @inproceedings{reimers-2019-sentence-bert,
- title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
- author = "Reimers, Nils and Gurevych, Iryna",
- booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
- month = "11",
- year = "2019",
- publisher = "Association for Computational Linguistics",
- url = "https://arxiv.org/abs/1908.10084",
- }
- ```
-
- #### MultipleNegativesRankingLoss
- ```bibtex
- @misc{henderson2017efficient,
- title={Efficient Natural Language Response Suggestion for Smart Reply},
- author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
- year={2017},
- eprint={1705.00652},
- archivePrefix={arXiv},
- primaryClass={cs.CL}
- }
  ```
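For context, the usage snippet referenced in the diff above follows the standard sentence-transformers API: encode texts into 384-dimensional vectors and compare them with cosine similarity. A minimal sketch is below; the repository id and the example sentences are assumptions, not taken from the card.

```python
from sentence_transformers import SentenceTransformer

# NOTE: the repo id below is an assumption; replace it with the actual Hub id.
model = SentenceTransformer("SalmanFaroz/DisEmbed-v1")

sentences = [
    "Persistent dry cough, low-grade fever, and night sweats for three weeks.",
    "The patient reports chronic cough, mild fever, and unintentional weight loss.",
    "Swelling and sharp pain in the left knee after a fall.",
]

# Encode sentences into 384-dimensional dense vectors (per the model card).
embeddings = model.encode(sentences)
print(embeddings.shape)  # (3, 384)

# Pairwise similarity using the model's similarity function (cosine).
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)  # (3, 3)
```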
 