SetFit with ibm-granite/granite-embedding-107m-multilingual

This is a SetFit model for text classification. It uses ibm-granite/granite-embedding-107m-multilingual as the Sentence Transformer embedding model, with a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
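The contrastive step can be sketched in miniature. Below is a pure-Python illustration (not the library's actual implementation) of how step 1 turns a handful of labeled prompts into contrastive training pairs: texts sharing a label become positive pairs, texts with different labels become negative ones. The real SetFit library additionally applies a sampling strategy, such as the oversampling used in the training run below.

```python
from itertools import combinations

def generate_contrastive_pairs(examples):
    """Build (text_a, text_b, target_similarity) pairs from few-shot data.

    Texts sharing a label become positive pairs (target 1.0); texts with
    different labels become negative pairs (target 0.0). The Sentence
    Transformer body is then fine-tuned so that embeddings of positive
    pairs move closer together (e.g. with CosineSimilarityLoss).
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(examples, 2):
        pairs.append((text_a, text_b, 1.0 if label_a == label_b else 0.0))
    return pairs

# Hypothetical few-shot examples, for illustration only.
examples = [
    ("Summarize this article.", "summarization"),
    ("Condense the report into three bullet points.", "summarization"),
    ("Write a story about a talking cat.", "creativity"),
]
pairs = generate_contrastive_pairs(examples)
# 3 texts -> 3 pairs: one positive (the two summarization prompts), two negative
```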

Model Details

Model Description

Model Sources

Model Labels

Label Examples
summarization
  • 'Summarize an academic text on behavioral psychology.'
  • 'Summarize the timeline and outcomes of a historical event based on multiple eyewitness accounts.'
  • 'Extract and summarize the key lessons learned from multiple post-project reviews.'
general_knowledge
  • 'What is the importance of agriculture to the Brazilian economy?'
  • 'Which are the main member countries of the Organization of the Petroleum Exporting Countries (OPEC)?'
  • 'What is the mechanism by which vaccines provide immunity?'
roleplay
  • 'Play the role of a pastry chef creating a dessert for a demanding panel of judges.'
  • 'You are a software tester devising scenarios to uncover bugs in a complex system.'
  • 'Simulate an editorial board meeting deciding the direction of a major publication.'
creativity
  • 'Write a thriller in which the protagonist communicates only through artwork.'
  • 'Imagine a narrative poem about the relationship between the sertão and the poetry of a forgotten generation.'
  • 'Write a story from the perspective of a shadow that gains independence.'
complex_reasoning
  • 'Analyze the implications of using autonomous drones for deliveries in dense urban areas.'
  • 'Propose a system for automated, fair evaluation of résumés in corporate hiring processes.'
  • 'Propose a model to predict sustainable urban growth that accounts for environmental and social variables.'
coding
  • 'Implement a function to decompose integers into prime factors efficiently for large values.'
  • 'Create an integration that consumes streaming data from an external message broker and processes events in real-time with backpressure management.'
  • 'Write an algorithm to find the articulation points (cut vertices) in an undirected graph.'
basic_reasoning
  • 'If a car consumes 12 liters of gasoline per 100 km, how many liters will it use for 150 km?'
  • 'If a ladder leans against a wall forming a 60-degree angle and the ladder length is 10 feet, how high does it reach on the wall?'
  • 'How many centimeters are in 1 meter?'
tool
  • 'Fetch comprehensive user reviews and ratings for a mobile app across platforms.'
  • 'Analyze sentiment of a tweet and classify it as positive, neutral, or negative.'
  • 'Retrieve country-wise COVID-19 vaccination rates from an authoritative source.'

Evaluation

Metrics

Label Accuracy
all 0.9967

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("cnmoro/prompt-router")
# Run inference
preds = model("Get the stock price history of Tesla for the last month.")
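For intuition, inference runs in two stages: the fine-tuned Sentence Transformer embeds the prompt, then the LogisticRegression head classifies the embedding. The sketch below illustrates only the second stage with scikit-learn; the random toy vectors, cluster centers, dimensions, and labels are invented stand-ins for real sentence embeddings.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy stand-ins for sentence embeddings: two synthetic clusters, one per
# label. The real model produces one dense vector per prompt via the
# fine-tuned Sentence Transformer body.
rng = np.random.default_rng(42)
emb_tool = rng.normal(loc=1.0, size=(8, 4))     # pretend "tool" prompts
emb_coding = rng.normal(loc=-1.0, size=(8, 4))  # pretend "coding" prompts

X = np.vstack([emb_tool, emb_coding])
y = ["tool"] * 8 + ["coding"] * 8

# The SetFit classification head is an ordinary LogisticRegression
# fitted on embeddings of the few-shot labeled examples.
head = LogisticRegression().fit(X, y)

# A new prompt is embedded, then routed by the head; here the query is a
# point at the "tool" cluster center.
query = np.full((1, 4), 1.0)
print(head.predict(query))  # -> ['tool']
```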

Training Details

Training Set Metrics

Training set Min Median Max
Word count 5 13.6792 38
Label Training Sample Count
summarization 160
tool 144
general_knowledge 154
roleplay 145
complex_reasoning 130
creativity 164
coding 152
basic_reasoning 148

Training Hyperparameters

  • batch_size: (8, 8)
  • num_epochs: (1, 16)
  • max_steps: 2400
  • sampling_strategy: oversampling
  • body_learning_rate: (2e-05, 1e-05)
  • head_learning_rate: 0.01
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • l2_weight: 0.01
  • seed: 42
  • evaluation_strategy: steps
  • eval_max_steps: -1
  • load_best_model_at_end: True
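These values map directly onto SetFit's TrainingArguments. As a rough, hedged sketch of the configuration (the `train_dataset` and `eval_dataset` variables are placeholders, not the actual training data):

```python
from setfit import SetFitModel, Trainer, TrainingArguments
from sentence_transformers.losses import CosineSimilarityLoss

args = TrainingArguments(
    batch_size=(8, 8),                  # (embedding phase, head phase)
    num_epochs=(1, 16),
    max_steps=2400,
    sampling_strategy="oversampling",
    body_learning_rate=(2e-05, 1e-05),
    head_learning_rate=0.01,
    loss=CosineSimilarityLoss,
    warmup_proportion=0.1,
    l2_weight=0.01,
    seed=42,
    evaluation_strategy="steps",
    load_best_model_at_end=True,
)

model = SetFitModel.from_pretrained(
    "ibm-granite/granite-embedding-107m-multilingual"
)
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,  # placeholder: few-shot labeled split
    eval_dataset=eval_dataset,    # placeholder: validation split
)
trainer.train()
```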

Training Results

Epoch Step Training Loss Validation Loss
0.0004 1 0.1954 -
0.0208 50 0.2125 -
0.0417 100 0.2131 -
0.0625 150 0.2072 -
0.0833 200 0.2029 0.1902
0.1042 250 0.1925 -
0.125 300 0.1764 -
0.1458 350 0.1512 -
0.1667 400 0.1229 0.1072
0.1875 450 0.1015 -
0.2083 500 0.0862 -
0.2292 550 0.065 -
0.25 600 0.0505 0.0504
0.2708 650 0.0532 -
0.2917 700 0.0427 -
0.3125 750 0.0378 -
0.3333 800 0.0357 0.0322
0.3542 850 0.0286 -
0.375 900 0.0381 -
0.3958 950 0.0333 -
0.4167 1000 0.0307 0.0235
0.4375 1050 0.0245 -
0.4583 1100 0.0245 -
0.4792 1150 0.0217 -
0.5 1200 0.0193 0.0168
0.5208 1250 0.0167 -
0.5417 1300 0.0158 -
0.5625 1350 0.02 -
0.5833 1400 0.0167 0.0120
0.6042 1450 0.0176 -
0.625 1500 0.0159 -
0.6458 1550 0.0141 -
0.6667 1600 0.0131 0.0094
0.6875 1650 0.0097 -
0.7083 1700 0.0109 -
0.7292 1750 0.0126 -
0.75 1800 0.0115 0.0079
0.7708 1850 0.0122 -
0.7917 1900 0.0104 -
0.8125 1950 0.0111 -
0.8333 2000 0.011 0.0071
0.8542 2050 0.0095 -
0.875 2100 0.009 -
0.8958 2150 0.0107 -
0.9167 2200 0.0099 0.0067
0.9375 2250 0.0084 -
0.9583 2300 0.0086 -
0.9792 2350 0.0089 -
1.0 2400 0.0098 0.0066

Framework Versions

  • Python: 3.11.11
  • SetFit: 1.2.0.dev0
  • Sentence Transformers: 4.0.2
  • Transformers: 4.51.3
  • PyTorch: 2.6.0+cu124
  • Datasets: 3.5.0
  • Tokenizers: 0.21.1

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
Model size: 107M params (F32, Safetensors)