SetFit v1.0.0 Migration Guide
To update your code to work with v1.0.0, the following changes must be made:
General Migration Guide
- `keep_body_frozen` from `SetFitModel.unfreeze` has been deprecated. Simply pass `"head"`, `"body"`, or no arguments to unfreeze both.
- `SupConLoss` has been moved from `setfit.modeling` to `setfit.losses`. If you were importing it using `from setfit.modeling import SupConLoss`, import it using `from setfit import SupConLoss` instead.
- `use_auth_token` has been renamed to `token` in `SetFitModel.from_pretrained()`. `use_auth_token` will keep working until the next major version, but with a warning.
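Concretely, these changes look as follows. This is a minimal sketch: the model ID and token value are placeholders, and `unfreeze()` is only meaningful for models with a differentiable head.

```python
from setfit import SetFitModel, SupConLoss  # previously: from setfit.modeling import SupConLoss

# `token` replaces the deprecated `use_auth_token` argument
model = SetFitModel.from_pretrained("BAAI/bge-small-en-v1.5", token="hf_...")

# Unfreeze the full model, or only one part of it
model.unfreeze()        # previously: model.unfreeze(keep_body_frozen=False)
model.unfreeze("head")  # unfreeze only the classification head
```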
Training Migration Guide
- Replace all uses of `SetFitTrainer` with `Trainer`, and all uses of `DistillationSetFitTrainer` with `DistillationTrainer`.
- Remove `num_iterations`, `num_epochs`, `learning_rate`, `batch_size`, `seed`, `use_amp`, `warmup_proportion`, `distance_metric`, `margin`, `samples_per_label` and `loss_class` from the `Trainer` initialization, and move them to a `TrainingArguments` initialization instead. This instance should then be passed to the trainer via the `args` argument; see the example after this list. `num_iterations` has been deprecated; the number of training steps should now be controlled via `num_epochs`, `max_steps` or the `EarlyStoppingCallback`.
- `learning_rate` has been split up into `body_learning_rate` and `head_learning_rate`.
- `loss_class` has been renamed to `loss`.
 
- Stop providing training arguments like `num_epochs` directly to `Trainer.train()`: pass a `TrainingArguments` instance via the `args` argument instead.
- Refactor the multiple `trainer.train()`, `trainer.freeze()` and `trainer.unfreeze()` calls that were previously necessary to train the differentiable head into just one `trainer.train()` call, by setting `batch_size` and `num_epochs` on the `TrainingArguments` dataclass with tuples. The first value in each tuple is used for training the embeddings, and the second for training the classifier.
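Putting these changes together, a minimal migration sketch might look like this. The model ID, toy dataset and hyperparameter values are placeholders:

```python
from datasets import Dataset
from setfit import SetFitModel, Trainer, TrainingArguments

train_dataset = Dataset.from_dict({
    "text": ["great movie", "terrible film", "loved it", "awful acting"],
    "label": [1, 0, 1, 0],
})

model = SetFitModel.from_pretrained(
    "BAAI/bge-small-en-v1.5",
    use_differentiable_head=True,
    head_params={"out_features": 2},
)

# These used to be passed to SetFitTrainer directly; they now live in TrainingArguments.
args = TrainingArguments(
    batch_size=(32, 16),              # (embedding training, classifier training)
    num_epochs=(1, 16),               # (embedding training, classifier training)
    body_learning_rate=(2e-5, 1e-5),  # replaces part of the old single learning_rate
    head_learning_rate=1e-2,          # replaces the rest of it
    seed=42,
)

trainer = Trainer(model=model, args=args, train_dataset=train_dataset)
trainer.train()  # one call replaces the old train/freeze/unfreeze sequence
```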
Hard deprecations
- `SetFitBaseModel`, `SKLearnWrapper` and `SetFitPipeline` have been removed. These can no longer be used starting from v1.0.0.
v1.0.0 Changelog
This list contains new functionality that can be used starting from v1.0.0.
- `SetFitModel.from_pretrained()` now accepts new arguments:
  - `device`: Specifies the device on which to load the SetFit model.
  - `labels`: Specifies labels corresponding to the training labels - useful if the training labels are integers ranging from `0` to `num_classes - 1`. These are automatically applied when calling `SetFitModel.predict()`.
  - `model_card_data`: Provides a `SetFitModelCardData` instance storing data such as model language, license, dataset name, etc., to be used in the automatically generated model cards.
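For example (a sketch; the model ID and metadata values are illustrative):

```python
from setfit import SetFitModel, SetFitModelCardData

model = SetFitModel.from_pretrained(
    "BAAI/bge-small-en-v1.5",
    device="cuda:0",                  # or "cpu", or a torch.device
    labels=["negative", "positive"],  # maps integer labels 0/1 to strings
    model_card_data=SetFitModelCardData(
        language="en",
        license="apache-2.0",
        dataset_name="sst2",
    ),
)
```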
 
- Certain SetFit configuration options, such as the new `labels` argument from `SetFitModel.from_pretrained()`, now get saved in `config_setfit.json` files when a model is saved. This allows `labels` to be automatically fetched when a model is loaded.
- `SetFitModel.predict()` now accepts new arguments:
  - `batch_size` (defaults to `32`): The batch size to use when encoding the sentences to embeddings. Higher often means faster processing but higher memory usage.
  - `use_labels` (defaults to `True`): Whether to use `SetFitModel.labels` to convert integer predictions to string labels. Not used if the training labels are already strings.
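For example, reusing the `model` loaded above with `labels=["negative", "positive"]`:

```python
preds = model.predict(
    ["I loved this movie!", "A complete waste of time."],
    batch_size=64,    # larger batches encode faster, at the cost of memory
    use_labels=True,  # return "positive"/"negative" rather than 1/0
)
print(preds)  # e.g. ["positive", "negative"]
```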
 
- `SetFitModel.encode()` has been introduced to convert input sentences to embeddings using the `SentenceTransformer` body.
- `SetFitModel.device` has been introduced to determine the device of the model.
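A brief sketch of both, reusing the same `model` (the printed values depend on the model and hardware):

```python
embeddings = model.encode(["An example sentence."])
print(embeddings.shape)  # e.g. torch.Size([1, 384]) for a small model
print(model.device)      # e.g. "cuda:0"
```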
- `AbsaTrainer` and `AbsaModel` have been introduced for applying SetFit to Aspect Based Sentiment Analysis.
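As a hedged sketch, an ABSA model pairs an aspect extractor with a polarity classifier; the model IDs below point to publicly shared example models, and the exact output format may differ:

```python
from setfit import AbsaModel

model = AbsaModel.from_pretrained(
    "tomaarsen/setfit-absa-bge-small-en-v1.5-restaurants-aspect",
    "tomaarsen/setfit-absa-bge-small-en-v1.5-restaurants-polarity",
)
preds = model.predict("The food was delicious, but the service was slow.")
# e.g. [{"span": "food", "polarity": "positive"}, {"span": "service", "polarity": "negative"}]
```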
- `Trainer` now supports a `callbacks` argument: a list of `transformers` `TrainerCallback` instances. By default, all installed callbacks integrated with `transformers` are supported, including `TensorBoardCallback` and `WandbCallback` to log training logs to TensorBoard and W&B, respectively.
- The `Trainer` will now print `embedding_loss` in the terminal, as well as `eval_embedding_loss` if `eval_strategy` is set to `"epoch"` or `"steps"` in `TrainingArguments`.
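For instance, a custom callback can be attached as follows (a minimal sketch; `model`, `args` and `train_dataset` are assumed from the training example above):

```python
from transformers import TrainerCallback

class LogPrinterCallback(TrainerCallback):
    """Print each log dictionary produced during training, e.g. embedding_loss."""

    def on_log(self, args, state, control, logs=None, **kwargs):
        print(f"step {state.global_step}: {logs}")

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    callbacks=[LogPrinterCallback()],
)
```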
 
- `Trainer.evaluate()` now works with string labels.
- An updated contrastive pair sampler increases the variety of training pairs. 
- `TrainingArguments` supports various new arguments (a combined example follows this list):
  - `output_dir`: The output directory where the model predictions and checkpoints will be written.
  - `max_steps`: If set to a positive number, the total number of training steps to perform. Overrides `num_epochs`. Training may stop before reaching the set number of steps if all data is exhausted.
  - `sampling_strategy`: The sampling strategy of how to draw pairs in training. Possible values are:
    - `"oversampling"`: Draws an even number of positive/negative sentence pairs until every sentence pair has been drawn.
    - `"undersampling"`: Draws the minimum number of positive/negative sentence pairs until every sentence pair in the minority class has been drawn.
    - `"unique"`: Draws every sentence pair combination (likely resulting in an unbalanced number of positive/negative sentence pairs).

    The default is `"oversampling"`, ensuring all sentence pairs are drawn at least once. Alternatively, setting `num_iterations` will override this argument and determine the number of generated sentence pairs.
  - `report_to`: The list of integrations to report the results and logs to. Supported platforms are `"azure_ml"`, `"comet_ml"`, `"mlflow"`, `"neptune"`, `"tensorboard"`, `"clearml"` and `"wandb"`. Use `"all"` to report to all installed integrations, or `"none"` for no integrations.
  - `run_name`: A descriptor for the run. Typically used for wandb and mlflow logging.
  - `logging_strategy`: The logging strategy to adopt during training. Possible values are:
    - `"no"`: No logging is done during training.
    - `"epoch"`: Logging is done at the end of each epoch.
    - `"steps"`: Logging is done every `logging_steps`.
  - `logging_first_step`: Whether to log and evaluate the first `global_step` or not.
  - `logging_steps`: Number of update steps between two logs if `logging_strategy="steps"`.
  - `eval_strategy`: The evaluation strategy to adopt during training. Possible values are:
    - `"no"`: No evaluation is done during training.
    - `"steps"`: Evaluation is done (and logged) every `eval_steps`.
    - `"epoch"`: Evaluation is done at the end of each epoch.
  - `eval_steps`: Number of update steps between two evaluations if `eval_strategy="steps"`. Defaults to the same value as `logging_steps` if not set.
  - `eval_delay`: Number of epochs or steps to wait before the first evaluation can be performed, depending on the `eval_strategy`.
  - `eval_max_steps`: If set to a positive number, the total number of evaluation steps to perform. Evaluation may stop before reaching the set number of steps if all data is exhausted.
  - `save_strategy`: The checkpoint save strategy to adopt during training. Possible values are:
    - `"no"`: No saving is done during training.
    - `"epoch"`: Saving is done at the end of each epoch.
    - `"steps"`: Saving is done every `save_steps`.
  - `save_steps`: Number of update steps between two checkpoint saves if `save_strategy="steps"`.
  - `save_total_limit`: If a value is passed, will limit the total number of checkpoints, deleting the older checkpoints in `output_dir`. Note that the best model is always preserved if `eval_strategy` is not `"no"`.
  - `load_best_model_at_end`: Whether or not to load the best model found during training at the end of training. When set to `True`, `save_strategy` needs to be the same as `eval_strategy`, and if it is `"steps"`, `save_steps` must be a round multiple of `eval_steps`.
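Taken together, a fuller configuration might look like this (the values are illustrative):

```python
from setfit import TrainingArguments

args = TrainingArguments(
    output_dir="checkpoints",
    num_epochs=3,
    sampling_strategy="oversampling",
    logging_strategy="steps",
    logging_steps=50,
    eval_strategy="steps",
    eval_steps=50,
    save_strategy="steps",
    save_steps=50,
    save_total_limit=2,
    load_best_model_at_end=True,  # requires save_strategy == eval_strategy
    report_to="none",
)
```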
 
- Pushing SetFit or SetFitABSA models to the Hub with `SetFitModel.push_to_hub()` or `AbsaModel.push_to_hub()` now results in a detailed model card. As an example, see this SetFitModel or this SetFitABSA polarity model.