🧠 BGE-Large Embedding Model (Fine-Tuned with LoRA/QLoRA)

This is a fine-tuned version of the BAAI/bge-large-en model using LoRA/QLoRA adapters, developed as part of a Final Year Project (FYP) focused on high-quality sentence embeddings for semantic similarity tasks. The model has been adapted for lightweight deployment and improved domain-specific performance.

📌 Model Details

  • Base Model: BAAI/bge-large-en
  • Fine-Tuning Method: LoRA / QLoRA
  • Architecture: Transformer-based encoder
  • Language: English
  • Precision: float16 (fp16)
  • Use Case: Sentence embeddings, information retrieval, semantic similarity
  • License: Apache 2.0

Model Description

  • Developed by: HNM
  • Institution: University of Agriculture Faisalabad
  • Model type: Sentence Embedding Model
  • Language(s): English
  • License: Apache 2.0
  • Finetuned from: BAAI/bge-large-en

💼 Use Cases

✅ Intended Use

  • Sentence similarity search
  • Clustering and classification (using embeddings)
  • Dense retrieval and reranking
  • Custom NLP tasks in academic or commercial settings

❌ Out-of-Scope Use

  • Generative NLP tasks (e.g., summarization, translation)
  • Non-English datasets (unless further fine-tuned)

πŸ‹οΈ Training Details

  • Technique Used: QLoRA + PEFT (Parameter-Efficient Fine-Tuning); a configuration sketch follows after this list
  • Batch Size: 16
  • Epochs: 3
  • Optimizer: AdamW
  • Loss Function: Cosine Similarity Loss
  • Hardware: NVIDIA T4 (via Colab Pro)
  • Time: ~2.5 hours
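
A minimal configuration sketch of the QLoRA + PEFT setup described above, using the Hugging Face peft and bitsandbytes integrations. The LoRA rank, alpha, dropout, and target modules shown here are illustrative assumptions; this card does not specify them.

import torch
from transformers import AutoModel, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# Load the base encoder with 4-bit NF4 quantization (the QLoRA setup).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
base_model = AutoModel.from_pretrained("BAAI/bge-large-en", quantization_config=bnb_config)

# Attach LoRA adapters; r, lora_alpha, and target_modules are assumed values, not from this card.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["query", "value"],  # attention projections of the BERT-style encoder
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only the small adapter matrices are trainable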

🔄 Preprocessing

  • Sentence-pair dataset for similarity (custom domain-specific)
  • Tokenized using AutoTokenizer.from_pretrained("BAAI/bge-large-en"); a minimal example is shown below
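
As noted above, a minimal sketch of how a training pair might be tokenized; the sentence pair and max_length below are illustrative assumptions, not examples from the actual dataset.

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("BAAI/bge-large-en")

# Hypothetical domain-specific sentence pair with an associated gold similarity score.
sentence_a = "Wheat yield depends on soil nitrogen levels."
sentence_b = "Nitrogen availability in the soil affects wheat production."

# Each sentence is encoded separately; the cosine similarity loss then compares
# the two resulting sentence embeddings against the gold score.
batch = tokenizer([sentence_a, sentence_b], padding=True, truncation=True,
                  max_length=512, return_tensors="pt")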

📊 Evaluation

  • Metrics Used: Cosine similarity, MSE (Mean Squared Error)
  • Test Set: Hold-out portion of the custom dataset
  • Result: The fine-tuned model achieved higher relevance ranking than the base BGE-Large on domain-specific queries (a metric-computation sketch follows below)
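
A sketch of how the reported metrics could be computed on the hold-out pairs. The encoder outputs are stood in for by random tensors here; the gold scores and batch size are hypothetical.

import torch
import torch.nn.functional as F

def evaluate(emb_a, emb_b, gold_scores):
    # Predicted similarity: cosine similarity between the two embeddings of each pair.
    pred = F.cosine_similarity(emb_a, emb_b, dim=1)
    # Mean squared error between predicted similarities and gold labels.
    mse = F.mse_loss(pred, gold_scores)
    return pred, mse

# Random tensors stand in for real sentence embeddings (bge-large hidden size is 1024).
emb_a, emb_b = torch.randn(8, 1024), torch.randn(8, 1024)
gold = torch.rand(8)
pred, mse = evaluate(emb_a, emb_b, gold)
print(f"mean cosine similarity: {pred.mean():.3f}, MSE: {mse:.3f}")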

🔌 How to Use

from transformers import AutoTokenizer, AutoModel
import torch
import torch.nn.functional as F

# Load the fine-tuned tokenizer and encoder from the Hub.
tokenizer = AutoTokenizer.from_pretrained("hafsanaz0076/bge-large-lora-finetuned")
model = AutoModel.from_pretrained("hafsanaz0076/bge-large-lora-finetuned")
model.eval()

text = "This is a sample sentence."
inputs = tokenizer(text, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model(**inputs)
    # Mean-pool the token embeddings into a single sentence embedding.
    embeddings = outputs.last_hidden_state.mean(dim=1)
    # L2-normalize so dot products equal cosine similarity (recommended for BGE models).
    embeddings = F.normalize(embeddings, p=2, dim=1)
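
As a quick usage check, the same tokenizer and model can score two sentences against each other; the helper and example sentences below are illustrative, not part of the original card.

def embed(sentences):
    # Reuses the tokenizer and model loaded above.
    batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt").to(model.device)
    with torch.no_grad():
        out = model(**batch).last_hidden_state.mean(dim=1)
    return F.normalize(out, p=2, dim=1)

emb = embed(["How do I reset my password?", "Steps to recover account access"])
score = (emb[0] @ emb[1]).item()  # cosine similarity, since the embeddings are normalized
print(f"similarity: {score:.3f}")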

🌍 Environmental Impact

  • Hardware Used: NVIDIA T4 (Colab Pro)
  • Duration: ~2.5 hours
  • Compute Region: Global (Cloud)
  • Estimated Emissions: < 0.015 kg CO2eq

📖 Citation

@misc{hafsa2025bgeqlora,
  title={BGE-Large Fine-Tuned with QLoRA for Sentence Embeddings},
  author={Hafsa Naz and Team},
  year={2025},
  howpublished={\url{https://huggingface.co/hafsanaz0076/bge-large-lora-finetuned}},
  note={Final Year Project, University of Agriculture Faisalabad}
}

👩‍💻 Author & Contact

  • Name: Hafsa Naz
  • Email: [email protected]
  • Hugging Face: https://huggingface.co/hafsanaz0076

