---
language: "en"
license: "apache-2.0"
tags:
- semantic-search
- research-papers
- arxiv
- sbert
model_name: "Fine-Tuned Semantic Search Model (Arxiv Papers)"
base_model: "sentence-transformers/all-MiniLM-L6-v2"
datasets:
- "arxiv_community/arxiv_dataset"
---

# arxiv-search

This model is a fine-tuned version of [`all-MiniLM-L6-v2`](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2), trained on **Arxiv research papers** to perform **semantic similarity search**.

## Model Details

- **Base Model:** `sentence-transformers/all-MiniLM-L6-v2`
- **Training Data:** Arxiv Research Papers (`title + abstract`)
- **Fine-Tuned Task:** Semantic Search
- **Use Case:** Find **similar research papers** based on a query
- **License:** Apache 2.0

## How to Use

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Talina06/arxiv-search")

query = "Neural networks in medicine"
query_embedding = model.encode(query)

# Use FAISS or cosine similarity to retrieve similar papers
```

## Training Details

- **Training Data:** 100k+ Arxiv research papers
- **Training Framework:** Sentence Transformers
- **Hyperparameters:**
  - Learning Rate: `2e-5`
  - Batch Size: `100`
  - Epochs: `10`
- **Hardware Used:** TPU & GPU

## Example Search Results

| **Query** | **Top Matching Paper Title** | **Similarity Score** |
|----------|------------------------------|----------------------|
| "Neural networks in healthcare" | "Deep Learning for Medical Diagnosis" | 0.89 |
| "Quantum cryptography" | "A Survey on Quantum-Safe Encryption" | 0.87 |
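
The usage snippet stops at "Use FAISS or cosine similarity to retrieve similar papers". A minimal sketch of the cosine-similarity route follows, assuming the scores in the table above are cosine similarities between the query embedding and each paper embedding. The helper names `cosine_similarity` and `rank_papers` and the toy three-dimensional vectors are illustrative assumptions, not part of this model's API. In real use the vectors would come from `model.encode(...)`.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def rank_papers(query_embedding, paper_embeddings, titles, top_k=5):
    """Return up to top_k (title, score) pairs, best match first."""
    scored = [
        (title, cosine_similarity(query_embedding, embedding))
        for title, embedding in zip(titles, paper_embeddings)
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy vectors stand in for real embeddings (hypothetical data).
toy_titles = ["Paper A", "Paper B", "Paper C"]
toy_embeddings = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 0.0]]
toy_query = [1.0, 0.0, 0.0]

results = rank_papers(toy_query, toy_embeddings, toy_titles, top_k=2)
for title, score in results:
    print(f"{title}: {score:.2f}")
```

This version uses only the standard library so it runs without extra dependencies. For a corpus of 100k+ papers, a vectorized approach such as `sentence_transformers.util.cos_sim` or a FAISS index would likely be a better fit.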