SalmanFaroz/DisEmbed-v1


SalmanFaroz committed on Dec 16, 2024

Commit b8e722d (verified) · 1 Parent(s): f7f8fe1

Update README.md

Files changed (1)

README.md +3 -31


@@ -82,19 +82,17 @@ datasets:
 - SalmanFaroz/DisEmbed-Symptom-Disease-v1
 ---
-# bge-small-en
-This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5). It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
 ## Model Details
 ### Model Description
-- **Model Type:** Sentence Transformer
-- **Base model:** [BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5) <!-- at revision 5c38ec7c405ec4b44b94cc5a9bb96e735b38267a -->
 - **Maximum Sequence Length:** 512 tokens
 - **Output Dimensionality:** 384 dimensions
 - **Similarity Function:** Cosine Similarity
-<!-- - **Training Dataset:** Unknown -->
 - **Language:** en
 - **License:** mit
@@ -127,30 +125,4 @@ print(embeddings.shape)
 similarities = model.similarity(embeddings, embeddings)
 print(similarities.shape)
 # [3, 3]
-```
-#### Sentence Transformers
-```bibtex
-@inproceedings{reimers-2019-sentence-bert,
-    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
-    author = "Reimers, Nils and Gurevych, Iryna",
-    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
-    month = "11",
-    year = "2019",
-    publisher = "Association for Computational Linguistics",
-    url = "https://arxiv.org/abs/1908.10084",
-}
-```
-#### MultipleNegativesRankingLoss
-```bibtex
-@misc{henderson2017efficient,
-    title={Efficient Natural Language Response Suggestion for Smart Reply},
-    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
-    year={2017},
-    eprint={1705.00652},
-    archivePrefix={arXiv},
-    primaryClass={cs.CL}
-}
 ```

 - SalmanFaroz/DisEmbed-Symptom-Disease-v1
 ---
+# DisEmbed-v1
+DisEmbed-v1 is a disease-focused embedding model designed for the medical domain, trained on a synthetic dataset comprising disease descriptions, symptoms, and Q&A pairs. It outperforms general medical models in disease-specific tasks, particularly in distinguishing similar diseases. DisEmbed excels in retrieval task and disease-context identification.
 ## Model Details
 ### Model Description
+- **Dataset : [DisEmbed-Symptom-Disease-v1](https://huggingface.co/datasets/SalmanFaroz/DisEmbed-Symptom-Disease-v1)**
 - **Maximum Sequence Length:** 512 tokens
 - **Output Dimensionality:** 384 dimensions
 - **Similarity Function:** Cosine Similarity
 - **Language:** en
 - **License:** mit
 similarities = model.similarity(embeddings, embeddings)
 print(similarities.shape)
 # [3, 3]
 ```
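The card lists cosine similarity as the similarity function, and the usage snippet's `model.similarity(embeddings, embeddings)` call reports a `[3, 3]` result for three inputs. As a rough illustration of what a pairwise cosine-similarity matrix contains, here is a minimal pure-Python sketch. The toy 4-dimensional vectors stand in for the 384-dimensional embeddings, and the helper names `cosine_similarity` and `similarity_matrix` are hypothetical, not part of the model or of sentence-transformers.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def similarity_matrix(vectors):
    """Pairwise cosine similarities: entry [i][j] compares vectors i and j."""
    return [[cosine_similarity(u, v) for v in vectors] for u in vectors]


# Toy stand-ins for three embeddings (assumed values, for illustration only).
toy_embeddings = [
    [1.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]

sims = similarity_matrix(toy_embeddings)
print(len(sims), len(sims[0]))  # three inputs give a 3 x 3 matrix
```

Each diagonal entry compares a vector with itself, and the matrix is symmetric, so only the off-diagonal upper triangle carries distinct information when ranking pairs.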