language:
  - de
library_name: sentence-transformers
license: apache-2.0
tags:
  - sentence-transformers
  - text-embeddings
  - german
  - retrieval
  - lora
  - embedding-model
pipeline_tag: sentence-similarity
widget:
  - source_sentence: Was ist die Hauptstadt von Deutschland?
    sentences:
      - Berlin ist die Hauptstadt von Deutschland.
      - München ist eine Stadt in Bayern.
      - Paris ist die Hauptstadt von Frankreich.
datasets:
  - deepset/germanquad
metrics:
  - cosine_accuracy
  - mrr
  - recall
model-index:
  - name: SmolLM3-3B-German-Embed
    results:
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          type: deepset/germanquad
          name: GermanQuAD
          config: default
          split: test
          revision: main
        metrics:
          - type: mrr
            value: 0.9166
            name: Mean Reciprocal Rank
          - type: recall_at_1
            value: 0.872
            name: Recall@1
          - type: recall_at_5
            value: 0.98
            name: Recall@5
          - type: recall_at_10
            value: 0.986
            name: Recall@10

Experimental SmolLM3 3B German Embedding Model

This is an experimental German text embedding model based on SmolLM3 3B, optimized for retrieval tasks using LoRA (Low-Rank Adaptation) fine-tuning. The model has been specifically trained to excel at German information retrieval and semantic similarity tasks.

Model Details

  • Base Model: SmolLM3 3B (HuggingFaceTB/SmolLM3-3B)
  • Language: German (de)
  • Model Type: Sentence Transformers
  • Embedding Dimension: 2048
  • Max Sequence Length: 512
  • Pooling Strategy: Mean pooling
  • Fine-tuning Method: LoRA (Low-Rank Adaptation)
  • Training Data: German retrieval datasets
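
These settings can be verified directly after loading the model; a minimal check using standard sentence-transformers accessors:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer('mayflowergmbh/smollm3-3b-german-embed')
print(model.get_sentence_embedding_dimension())  # expected: 2048
print(model.max_seq_length)                      # expected: 512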

Key Features

🚀 Retrieval-Optimized: Specifically fine-tuned for information retrieval tasks
🇩🇪 German-Focused: Optimized for German language understanding
⚡ High Performance: MRR 0.917 and Recall@10 0.986 on the GermanQuAD test split
📏 Standard Format: Compatible with sentence-transformers library

Usage

Installation

pip install sentence-transformers scikit-learn

Basic Usage

from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity

# Load the model
model = SentenceTransformer('mayflowergmbh/smollm3-3b-german-embed')

# Encode sentences
sentences = [
    "Was ist die Hauptstadt von Deutschland?",
    "Berlin ist die Hauptstadt von Deutschland.",
    "München ist eine große Stadt in Bayern."
]

embeddings = model.encode(sentences)
print(f"Embeddings shape: {embeddings.shape}")

# Compute pairwise cosine similarities
similarities = cosine_similarity(embeddings)
print(similarities)
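
With sentence-transformers v3 or later, the library's built-in similarity method avoids the scikit-learn dependency (a short sketch, assuming a v3+ install):

# Cosine similarity is the model's default similarity function
similarities = model.similarity(embeddings, embeddings)
print(similarities)  # 3x3 tensor of pairwise cosine similarities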

Information Retrieval Example

import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('mayflowergmbh/smollm3-3b-german-embed')

# Query and documents
query = "Was ist die Hauptstadt von Deutschland?"
documents = [
    "Berlin ist die Hauptstadt und größte Stadt Deutschlands.",
    "München ist die Hauptstadt des Freistaates Bayern.",
    "Hamburg ist eine Hansestadt im Norden Deutschlands.",
    "Köln ist eine Großstadt in Nordrhein-Westfalen."
]

# Encode query and documents (normalized so the dot product equals cosine similarity)
query_embedding = model.encode([query], normalize_embeddings=True)
doc_embeddings = model.encode(documents, normalize_embeddings=True)

# Compute cosine similarities
similarities = np.dot(query_embedding, doc_embeddings.T)[0]

# Rank documents by relevance
ranked_indices = np.argsort(similarities)[::-1]

print("Query:", query)
print("\nRanked Results:")
for i, idx in enumerate(ranked_indices):
    print(f"{i+1}. {documents[idx]} (Score: {similarities[idx]:.3f})")

Technical Details

Architecture

The model uses the LLM2Vec approach to convert the decoder-only SmolLM3 model into an effective encoder:

  1. Bidirectional Attention: Modified attention mechanism for better context understanding
  2. Mean Pooling: Masked mean over token embeddings, excluding padding via the attention mask
  3. LoRA Fine-tuning: Parameter-efficient adaptation targeting Q and V projection layers
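
Masked mean pooling is the step that turns per-token hidden states into a single sentence vector. A minimal PyTorch sketch of the idea (illustrative, not the exact training code):

import torch

def mean_pool(last_hidden_state: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    # last_hidden_state: (batch, seq_len, hidden); attention_mask: (batch, seq_len)
    mask = attention_mask.unsqueeze(-1).float()     # (batch, seq_len, 1)
    summed = (last_hidden_state * mask).sum(dim=1)  # sum of non-padding token vectors
    counts = mask.sum(dim=1).clamp(min=1e-9)        # number of real tokens per sequence
    return summed / counts                          # (batch, hidden) sentence embeddings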

Training Process

  1. Base Model: Started with SmolLM3 3B converted to LLM2Vec format
  2. LoRA Configuration:
    • Rank (r): 8
    • Alpha: 16
    • Target modules: q_proj, v_proj
    • Dropout: 0.05
  3. Training Data: German retrieval datasets with contrastive learning
  4. Optimization: Hard negative mining for improved discrimination
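
As a sketch, the listed LoRA hyperparameters map onto a standard peft configuration, and contrastive training with in-batch and hard negatives is typically expressed through a ranking loss such as sentence-transformers' MultipleNegativesRankingLoss. This illustrates the setup, not the exact training script:

from peft import LoraConfig
from sentence_transformers import SentenceTransformer, losses

# LoRA setup matching the hyperparameters listed above
lora_config = LoraConfig(
    r=8,                                  # rank
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],  # query and value projections only
    lora_dropout=0.05,
    task_type="FEATURE_EXTRACTION",
)

# Contrastive objective over (query, positive passage) pairs; other in-batch
# passages and mined hard negatives serve as negatives. In training, the model
# here would be the LLM2Vec-converted base model rather than the final checkpoint.
model = SentenceTransformer('mayflowergmbh/smollm3-3b-german-embed')
train_loss = losses.MultipleNegativesRankingLoss(model)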

Model Card Metadata

  • Developed by: mayflowergmbh
  • Model type: Sentence Transformer
  • Language(s): German (de)
  • License: Apache 2.0
  • Base model: HuggingFaceTB/SmolLM3-3B
  • Training approach: LoRA fine-tuning
  • Primary use: Information retrieval, semantic similarity

Limitations and Bias

  • Language Scope: Optimized specifically for German; performance on other languages not evaluated
  • Domain: Best performance on factual/informational content similar to training data
  • Sequence Length: Maximum 512 tokens; longer texts will be truncated
  • Computational Requirements: Requires ~6GB GPU memory for inference
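
If GPU memory is tight, loading the weights in half precision roughly halves the footprint. A sketch using model_kwargs, the standard sentence-transformers pass-through to transformers:

import torch
from sentence_transformers import SentenceTransformer

# Half-precision weights cut inference memory roughly in half
model = SentenceTransformer(
    'mayflowergmbh/smollm3-3b-german-embed',
    model_kwargs={"torch_dtype": torch.float16},
)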

Citation

If you use this model in your research, please cite:

@misc{smollm3-german-embed-retrieval,
  title={SmolLM3 3B German Embedding Model (Retrieval-Optimized)},
  author={mayflowergmbh},
  year={2025},
  howpublished={\url{https://huggingface.co/mayflowergmbh/smollm3-3b-german-embed}},
  note={Retrieval-optimized German embedding model using LoRA fine-tuning}
}

Contact

For questions or issues, please open an issue on the model repository or contact the author.
