🧠 BGE-Large Embedding Model (Fine-Tuned with LoRA/QLoRA)

This is a fine-tuned version of the BAAI/bge-large-en model using LoRA/QLoRA adapters, developed as part of a Final Year Project (FYP) focused on high-quality sentence embeddings for semantic similarity tasks. The model has been adapted for lightweight deployment and improved domain-specific performance.

📌 Model Details

  • Base Model: BAAI/bge-large-en
  • Fine-Tuning Method: LoRA / QLoRA
  • Architecture: Transformer-based encoder
  • Language: English
  • Precision: float16 (fp16)
  • Use Case: Sentence embeddings, information retrieval, semantic similarity
  • License: Apache 2.0

Model Description

  • Developed by: HNM
  • Institution: University of Agriculture Faisalabad
  • Model type: Sentence Embedding Model
  • Language(s): English
  • License: Apache 2.0
  • Finetuned from: BAAI/bge-large-en

💼 Use Cases

✅ Intended Use

  • Sentence similarity search
  • Clustering and classification (using embeddings)
  • Dense retrieval and reranking
  • Custom NLP tasks in academic or commercial settings

❌ Out-of-Scope Use

  • Generative NLP tasks (e.g., summarization, translation)
  • Non-English datasets (unless further fine-tuned)

πŸ‹οΈ Training Details

  • Technique Used: QLoRA + PEFT (Parameter-Efficient Fine-Tuning); a configuration sketch follows after this list
  • Batch Size: 16
  • Epochs: 3
  • Optimizer: AdamW
  • Loss Function: Cosine Similarity Loss
  • Hardware: NVIDIA T4 (via Colab Pro)
  • Time: ~2.5 hours
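
A minimal configuration sketch of the QLoRA + PEFT setup described above, using the Hugging Face peft and bitsandbytes integrations. The LoRA rank, alpha, dropout, and target modules shown here are illustrative assumptions; this card does not specify them.

import torch
from transformers import AutoModel, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# Load the base encoder with 4-bit NF4 quantization (the QLoRA setup).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
base_model = AutoModel.from_pretrained("BAAI/bge-large-en", quantization_config=bnb_config)

# Attach LoRA adapters; r, lora_alpha, and target_modules are assumed values, not from this card.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["query", "value"],  # attention projections of the BERT-style encoder
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only the small adapter matrices are trainable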

🔄 Preprocessing

  • Sentence-pair dataset for similarity (custom domain-specific)
  • Tokenized using AutoTokenizer.from_pretrained("BAAI/bge-large-en"); a minimal example is shown below
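
As noted above, a minimal sketch of how a training pair might be tokenized; the sentence pair and max_length below are illustrative assumptions, not examples from the actual dataset.

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("BAAI/bge-large-en")

# Hypothetical domain-specific sentence pair with an associated gold similarity score.
sentence_a = "Wheat yield depends on soil nitrogen levels."
sentence_b = "Nitrogen availability in the soil affects wheat production."

# Each sentence is encoded separately; the cosine similarity loss then compares
# the two resulting sentence embeddings against the gold score.
batch = tokenizer([sentence_a, sentence_b], padding=True, truncation=True,
                  max_length=512, return_tensors="pt")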

📊 Evaluation

  • Metrics Used: Cosine similarity, MSE (Mean Squared Error)
  • Test Set: Hold-out portion of the custom dataset
  • Result: The fine-tuned model achieved higher relevance ranking than the base BGE-Large on domain-specific queries (a metric-computation sketch follows below)
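
A sketch of how the reported metrics could be computed on the hold-out pairs. The encoder outputs are stood in for by random tensors here; the gold scores and batch size are hypothetical.

import torch
import torch.nn.functional as F

def evaluate(emb_a, emb_b, gold_scores):
    # Predicted similarity: cosine similarity between the two embeddings of each pair.
    pred = F.cosine_similarity(emb_a, emb_b, dim=1)
    # Mean squared error between predicted similarities and gold labels.
    mse = F.mse_loss(pred, gold_scores)
    return pred, mse

# Random tensors stand in for real sentence embeddings (bge-large hidden size is 1024).
emb_a, emb_b = torch.randn(8, 1024), torch.randn(8, 1024)
gold = torch.rand(8)
pred, mse = evaluate(emb_a, emb_b, gold)
print(f"mean cosine similarity: {pred.mean():.3f}, MSE: {mse:.3f}")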

🔌 How to Use

from transformers import AutoTokenizer, AutoModel
import torch
import torch.nn.functional as F

# Load the fine-tuned tokenizer and encoder from the Hub.
tokenizer = AutoTokenizer.from_pretrained("hafsanaz0076/bge-large-lora-finetuned")
model = AutoModel.from_pretrained("hafsanaz0076/bge-large-lora-finetuned")
model.eval()

text = "This is a sample sentence."
inputs = tokenizer(text, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model(**inputs)
    # Mean-pool the token embeddings into a single sentence embedding.
    embeddings = outputs.last_hidden_state.mean(dim=1)
    # L2-normalize so dot products equal cosine similarity (recommended for BGE models).
    embeddings = F.normalize(embeddings, p=2, dim=1)
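
As a quick usage check, the same tokenizer and model can score two sentences against each other; the helper and example sentences below are illustrative, not part of the original card.

def embed(sentences):
    # Reuses the tokenizer and model loaded above.
    batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt").to(model.device)
    with torch.no_grad():
        out = model(**batch).last_hidden_state.mean(dim=1)
    return F.normalize(out, p=2, dim=1)

emb = embed(["How do I reset my password?", "Steps to recover account access"])
score = (emb[0] @ emb[1]).item()  # cosine similarity, since the embeddings are normalized
print(f"similarity: {score:.3f}")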

🌍 Environmental Impact

  • Hardware Used: NVIDIA T4 (Colab Pro)
  • Duration: ~2.5 hours
  • Compute Region: Global (Cloud)
  • Estimated Emissions: < 0.015 kg CO2eq

📖 Citation

@misc{hafsa2025bgeqlora,
  title={BGE-Large Fine-Tuned with QLoRA for Sentence Embeddings},
  author={Hafsa Naz and Team},
  year={2025},
  howpublished={\url{https://huggingface.co/hafsanaz0076/bge-large-lora-finetuned}},
  note={Final Year Project, University of Agriculture Faisalabad}
}

👩‍💻 Author & Contact

  • Name: Hafsa Naz
  • Email: [email protected]
  • Hugging Face: https://huggingface.co/hafsanaz0076

