SetFit with ibm-granite/granite-embedding-107m-multilingual

This is a SetFit model for text classification. It uses ibm-granite/granite-embedding-107m-multilingual as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
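
The first step relies on turning a small labeled set into sentence pairs: two texts with the same label form a positive pair (target similarity 1.0), and two texts with different labels form a negative pair (target 0.0). A minimal sketch of that pairing idea follows, using only the standard library. The helper name `make_pairs` and the toy data are hypothetical, and SetFit's own pair sampler is more elaborate than this exhaustive enumeration.

```python
from itertools import combinations

def make_pairs(texts, labels):
    """Return (text_a, text_b, target) triples for contrastive fine-tuning.

    target is 1.0 when both texts share a label and 0.0 otherwise.
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(zip(texts, labels), 2):
        pairs.append((text_a, text_b, 1.0 if label_a == label_b else 0.0))
    return pairs

# Toy data (hypothetical), reusing label names from this card.
texts = [
    "Summarize this report.",
    "Give me a short summary of the article.",
    "Write a function that reverses a string.",
    "Implement binary search.",
]
labels = ["summarization", "summarization", "coding", "coding"]

pairs = make_pairs(texts, labels)
positives = [p for p in pairs if p[2] == 1.0]
negatives = [p for p in pairs if p[2] == 0.0]
print(len(pairs), len(positives), len(negatives))
```

Even four labeled texts yield six training pairs, which is why this approach can work with few examples per class.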

Model Details

Model Description

Model Sources

Model Labels

Label Examples
summarization
  • 'Resuma um texto acadêmico sobre psicologia do comportamento.'
  • 'Summarize the timeline and outcomes of a historical event based on multiple eyewitness accounts.'
  • 'Extract and summarize the key lessons learned from multiple post-project reviews.'
general_knowledge
  • 'Qual é a importância da agricultura para a economia brasileira?'
  • 'Quais são os principais países membros da Organização dos Países Exportadores de Petróleo (OPEP)?'
  • 'What is the mechanism by which vaccines provide immunity?'
roleplay
  • 'Personifique um chef pâtissier criando uma sobremesa para um júri exigente.'
  • 'You are a software tester devising scenarios to uncover bugs in a complex system.'
  • 'Simule uma reunião de conselho editorial decidindo o rumo de uma grande publicação.'
creativity
  • 'Write a thriller in which the protagonist communicates only through artwork.'
  • 'Imagine um poema narrativo sobre a relação entre o sertão e a poesia de uma geração esquecida.'
  • 'Write a story from the perspective of a shadow that gains independence.'
complex_reasoning
  • 'Analise as implicações do uso de drones autônomos para entregas em áreas urbanas densas.'
  • 'Proponha um sistema para avaliação automatizada e justa de currículos em processos seletivos corporativos.'
  • 'Proponha um modelo para prever o crescimento urbano sustentável considerando variáveis ambientais e sociais.'
coding
  • 'Implemente uma função para decompor números inteiros em fatores primos eficientemente para valores grandes.'
  • 'Create an integration that consumes streaming data from an external message broker and processes events in real-time with backpressure management.'
  • 'Escreva um algoritmo para encontrar os pontos de articulação (cut vertices) em um grafo não direcionado.'
basic_reasoning
  • 'Se um carro consome 12 litros de gasolina para 100 km, quantos litros usará para 150 km?'
  • 'If a ladder leans against a wall forming a 60-degree angle and the ladder length is 10 feet, how high does it reach on the wall?'
  • 'Quantos centímetros tem 1 metro?'
tool
  • 'Fetch comprehensive user reviews and ratings for a mobile app across platforms.'
  • 'Analyze sentiment of a tweet and classify it as positive, neutral, or negative.'
  • 'Retrieve country-wise COVID-19 vaccination rates from an authoritative source.'

Evaluation

Metrics

Label Accuracy
all 0.9967

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("cnmoro/prompt-router")
# Run inference
preds = model("Get the stock price history of Tesla for the last month.")
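
Since the labels describe prompt categories, one plausible use of the predictions is to dispatch each prompt to a category-specific handler. The sketch below shows that pattern with a stand-in classifier so that it runs without downloading the model. `route`, `fake_classify`, and the handlers are hypothetical names; in practice `classify` would wrap a call to the loaded `model` and convert its output to a plain label string.

```python
from typing import Callable, Dict

def route(prompt: str,
          classify: Callable[[str], str],
          handlers: Dict[str, Callable[[str], str]],
          fallback: Callable[[str], str]) -> str:
    """Classify the prompt and pass it to the handler registered for its label."""
    label = classify(prompt)
    handler = handlers.get(label, fallback)
    return handler(prompt)

# Stand-in for the SetFit model (hypothetical keyword rule, for illustration only).
def fake_classify(prompt: str) -> str:
    if prompt.lower().startswith(("get", "fetch", "retrieve")):
        return "tool"
    return "general_knowledge"

handlers = {
    "tool": lambda p: f"[tool call] {p}",
    "coding": lambda p: f"[code model] {p}",
}
fallback = lambda p: f"[default model] {p}"

result = route("Get the stock price history of Tesla for the last month.",
               fake_classify, handlers, fallback)
print(result)
```

Keeping the classifier behind a plain callable makes it easy to swap the stand-in for the real model or to test the routing logic in isolation.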

Training Details

Training Set Metrics

Training set Min Median Max
Word count 5 13.6792 38

Label Training Sample Count
summarization 160
tool 144
general_knowledge 154
roleplay 145
complex_reasoning 130
creativity 164
coding 152
basic_reasoning 148
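
Statistics of this kind can be reproduced for any labeled dataset with a few lines of standard-library Python. The sketch below assumes that words are separated by whitespace, which may differ from how the figures above were produced; `word_count_stats` and `label_counts` are hypothetical helper names, and the three texts are taken from the label examples in this card.

```python
from collections import Counter
from statistics import median

def word_count_stats(texts):
    """Return (min, median, max) of whitespace-separated word counts."""
    counts = [len(text.split()) for text in texts]
    return min(counts), median(counts), max(counts)

def label_counts(labels):
    """Return the number of samples per label."""
    return Counter(labels)

texts = [
    "Quantos centímetros tem 1 metro?",
    "What is the mechanism by which vaccines provide immunity?",
    "Write a story from the perspective of a shadow that gains independence.",
]
labels = ["basic_reasoning", "general_knowledge", "creativity"]

stats = word_count_stats(texts)
per_label = label_counts(labels)
print(stats)
print(per_label)
```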

Training Hyperparameters

  • batch_size: (8, 8)
  • num_epochs: (1, 16)
  • max_steps: 2400
  • sampling_strategy: oversampling
  • body_learning_rate: (2e-05, 1e-05)
  • head_learning_rate: 0.01
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • l2_weight: 0.01
  • seed: 42
  • evaluation_strategy: steps
  • eval_max_steps: -1
  • load_best_model_at_end: True
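
The `loss: CosineSimilarityLoss` entry refers to the Sentence Transformers loss that compares the cosine similarity of two sentence embeddings with a target score using mean squared error. A minimal pure-Python sketch of that computation is shown here. The vectors are made-up toy embeddings, and the real loss operates on batched tensors produced by the embedding model.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def cosine_similarity_loss(pairs):
    """Mean squared error between cos(u, v) and the target of each (u, v, target)."""
    errors = [(cosine_similarity(u, v) - target) ** 2 for u, v, target in pairs]
    return sum(errors) / len(errors)

batch = [
    ([1.0, 0.0], [1.0, 0.0], 1.0),  # identical vectors, same label: error 0
    ([1.0, 0.0], [0.0, 1.0], 0.0),  # orthogonal vectors, different labels: error 0
    ([1.0, 0.0], [1.0, 0.0], 0.0),  # identical vectors, different labels: error 1
]
loss = cosine_similarity_loss(batch)
print(loss)
```

Minimizing this quantity pulls same-label texts together and pushes different-label texts apart in embedding space.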

Training Results

Epoch Step Training Loss Validation Loss
0.0004 1 0.1954 -
0.0208 50 0.2125 -
0.0417 100 0.2131 -
0.0625 150 0.2072 -
0.0833 200 0.2029 0.1902
0.1042 250 0.1925 -
0.125 300 0.1764 -
0.1458 350 0.1512 -
0.1667 400 0.1229 0.1072
0.1875 450 0.1015 -
0.2083 500 0.0862 -
0.2292 550 0.065 -
0.25 600 0.0505 0.0504
0.2708 650 0.0532 -
0.2917 700 0.0427 -
0.3125 750 0.0378 -
0.3333 800 0.0357 0.0322
0.3542 850 0.0286 -
0.375 900 0.0381 -
0.3958 950 0.0333 -
0.4167 1000 0.0307 0.0235
0.4375 1050 0.0245 -
0.4583 1100 0.0245 -
0.4792 1150 0.0217 -
0.5 1200 0.0193 0.0168
0.5208 1250 0.0167 -
0.5417 1300 0.0158 -
0.5625 1350 0.02 -
0.5833 1400 0.0167 0.0120
0.6042 1450 0.0176 -
0.625 1500 0.0159 -
0.6458 1550 0.0141 -
0.6667 1600 0.0131 0.0094
0.6875 1650 0.0097 -
0.7083 1700 0.0109 -
0.7292 1750 0.0126 -
0.75 1800 0.0115 0.0079
0.7708 1850 0.0122 -
0.7917 1900 0.0104 -
0.8125 1950 0.0111 -
0.8333 2000 0.011 0.0071
0.8542 2050 0.0095 -
0.875 2100 0.009 -
0.8958 2150 0.0107 -
0.9167 2200 0.0099 0.0067
0.9375 2250 0.0084 -
0.9583 2300 0.0086 -
0.9792 2350 0.0089 -
1.0 2400 0.0098 0.0066
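
With `load_best_model_at_end: True`, the checkpoint kept at the end of training is the one that scored best during evaluation. Assuming the selection metric is the validation loss (lower is better), the choice can be sketched as below. `best_step` is a hypothetical helper, and the rows are the evaluated steps copied from the table above.

```python
# (step, validation_loss) rows copied from the table; steps without an evaluation are omitted.
eval_rows = [
    (200, 0.1902), (400, 0.1072), (600, 0.0504), (800, 0.0322),
    (1000, 0.0235), (1200, 0.0168), (1400, 0.0120), (1600, 0.0094),
    (1800, 0.0079), (2000, 0.0071), (2200, 0.0067), (2400, 0.0066),
]

def best_step(rows):
    """Return the (step, loss) row with the lowest validation loss."""
    return min(rows, key=lambda row: row[1])

step, loss = best_step(eval_rows)
print(step, loss)
```

Under that assumption the final step is also the best one, since the validation loss decreases at every evaluation.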

Framework Versions

  • Python: 3.11.11
  • SetFit: 1.2.0.dev0
  • Sentence Transformers: 4.0.2
  • Transformers: 4.51.3
  • PyTorch: 2.6.0+cu124
  • Datasets: 3.5.0
  • Tokenizers: 0.21.1

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}

Safetensors

  • Model size: 107M params
  • Tensor type: F32