MS MARCO Dual Encoder Model

This repository contains a Dual Encoder (Two-Tower) model trained on Microsoft's MS MARCO dataset for information-retrieval tasks. Queries and documents are encoded by separate towers into a shared embedding space, and relevance is scored by cosine similarity.

Model Details

  • Architecture: Two-Tower (Dual Encoder)
  • Embedding Dimension: 128
  • Training Strategy: Triplet loss with margin 0.2
  • Vocabulary Size: 50,001
  • Dataset Size: 5,000
  • Parameters:
    • Query Tower: 16,512
    • Document Tower: 16,512
    • Total: 33,024
  • Training Device: cuda
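
The card states the model was trained with triplet loss at margin 0.2 but does not include the loss code. The following is a minimal sketch of a cosine-similarity triplet loss under that assumption; the function name and the use of cosine similarity (rather than Euclidean distance) are illustrative choices, not confirmed details of this repository.

```python
import torch
import torch.nn.functional as F

def triplet_loss(query_emb, pos_doc_emb, neg_doc_emb, margin=0.2):
    """Triplet loss on cosine similarity: the positive document must score
    at least `margin` higher against the query than the negative document,
    otherwise the shortfall is penalized."""
    pos_sim = F.cosine_similarity(query_emb, pos_doc_emb, dim=-1)
    neg_sim = F.cosine_similarity(query_emb, neg_doc_emb, dim=-1)
    # Hinge: zero loss once pos_sim exceeds neg_sim by the margin
    return torch.clamp(margin - pos_sim + neg_sim, min=0).mean()
```

With a perfectly aligned positive and an orthogonal negative the loss is zero; with positive and negative identical it equals the margin.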

Usage

import torch
from model import QryTower, DocTower

# Load the models
embedding_dim = 128
qry_model = QryTower(embedding_dim)
doc_model = DocTower(embedding_dim)

qry_model.load_state_dict(torch.load("qry_tower.pth", map_location="cpu"))
doc_model.load_state_dict(torch.load("doc_tower.pth", map_location="cpu"))
qry_model.eval()
doc_model.eval()

# Get embeddings for a preprocessed (tokenized and tensorized) query
# and document; preprocessing must match what was used at training time
with torch.no_grad():
    query_embedding = qry_model(preprocessed_query)
    document_embedding = doc_model(preprocessed_document)

# Calculate cosine similarity along the embedding dimension
similarity = torch.cosine_similarity(query_embedding, document_embedding, dim=-1)
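
Because the two towers are independent, document embeddings can be precomputed once and a query scored against the whole corpus in a single broadcasted similarity call. The sketch below uses random tensors as placeholders for the tower outputs; the corpus size of 10 is arbitrary.

```python
import torch

# Placeholder embeddings standing in for qry_model / doc_model outputs
embedding_dim = 128
query_embedding = torch.randn(1, embedding_dim)        # one query
document_embeddings = torch.randn(10, embedding_dim)   # 10 candidate documents

# Broadcast the single query against every document embedding
scores = torch.cosine_similarity(query_embedding, document_embeddings, dim=-1)

# Indices of documents sorted from best to worst match
ranking = torch.argsort(scores, descending=True)
```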

Training

This model was trained for 5 epochs with a batch size of 32 and a learning rate of 0.001.
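
Putting the stated hyperparameters together, a training step might look like the sketch below. The optimizer (Adam) and the cosine-distance triplet loss are assumptions; the linear layers and random batch are placeholders for the real towers and the preprocessed MS MARCO triplets.

```python
import torch
from torch import nn

# Stand-in towers; real code would use QryTower / DocTower from model.py
embedding_dim = 128
qry_tower = nn.Linear(embedding_dim, embedding_dim)
doc_tower = nn.Linear(embedding_dim, embedding_dim)

# Triplet loss with margin 0.2 over cosine distance (1 - cosine similarity)
loss_fn = nn.TripletMarginWithDistanceLoss(
    distance_function=lambda a, b: 1 - torch.cosine_similarity(a, b, dim=-1),
    margin=0.2,
)
optimizer = torch.optim.Adam(
    list(qry_tower.parameters()) + list(doc_tower.parameters()), lr=0.001
)

for epoch in range(5):
    # One random batch of 32 per epoch, standing in for the real data loader
    queries = torch.randn(32, embedding_dim)
    pos_docs = torch.randn(32, embedding_dim)
    neg_docs = torch.randn(32, embedding_dim)

    anchor = qry_tower(queries)       # query embeddings
    positive = doc_tower(pos_docs)    # relevant-document embeddings
    negative = doc_tower(neg_docs)    # irrelevant-document embeddings

    loss = loss_fn(anchor, positive, negative)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```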

License

MIT

Dataset used to train Kogero/ms-marco-dual-encoder: MS MARCO