bart-base-samsum
This model was trained using Amazon SageMaker and the new Hugging Face Deep Learning container.
You can find the notebook here and the accompanying blog post here.
For more information look at:
- 🤗 Transformers Documentation: Amazon SageMaker
- Example Notebooks
- Amazon SageMaker documentation for Hugging Face
- Python SDK SageMaker documentation for Hugging Face
- Deep Learning Container
Hyperparameters
{
"dataset_name": "samsum",
"do_eval": true,
"do_train": true,
"fp16": true,
"learning_rate": 5e-05,
"model_name_or_path": "facebook/bart-base",
"num_train_epochs": 3,
"output_dir": "/opt/ml/model",
"per_device_eval_batch_size": 8,
"per_device_train_batch_size": 8,
"seed": 7
}
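As a rough illustration of what these settings imply, the sketch below estimates the number of optimizer steps from the per-device batch size and epoch count above, together with the 14732 training samples reported in the train results. The function name `optimizer_steps` and the `num_devices` parameter are hypothetical; the number of GPUs used for this run is not stated in this card, so both a single-device and an eight-device case are shown purely as assumptions. Gradient accumulation is assumed to be disabled.

```python
import math


def optimizer_steps(num_samples, per_device_batch_size, num_epochs, num_devices=1):
    """Estimate total optimizer steps, assuming no gradient accumulation.

    num_devices is an assumption; the card does not say how many GPUs were used.
    """
    samples_per_step = per_device_batch_size * num_devices
    # The last batch of an epoch may be partial, so round up.
    steps_per_epoch = math.ceil(num_samples / samples_per_step)
    return steps_per_epoch * num_epochs


# Values taken from the hyperparameters and train results in this card.
single_gpu = optimizer_steps(14732, per_device_batch_size=8, num_epochs=3)
# Hypothetical eight-GPU setup, for comparison only.
eight_gpu = optimizer_steps(14732, per_device_batch_size=8, num_epochs=3, num_devices=8)
print(single_gpu, eight_gpu)
```

The step count matters mainly for learning-rate scheduling, since a linear decay schedule is spread over the total number of steps.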
Train results
key | value |
---|---|
epoch | 3 |
init_mem_cpu_alloc_delta | 180190 |
init_mem_cpu_peaked_delta | 18282 |
init_mem_gpu_alloc_delta | 558658048 |
init_mem_gpu_peaked_delta | 0 |
train_mem_cpu_alloc_delta | 6658519 |
train_mem_cpu_peaked_delta | 642937 |
train_mem_gpu_alloc_delta | 2267624448 |
train_mem_gpu_peaked_delta | 10355728896 |
train_runtime | 98.4931 |
train_samples | 14732 |
train_samples_per_second | 3.533 |
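The `*_mem_*` values are easier to read in larger units. The sketch below assumes they are raw byte counts, which is how the 🤗 Transformers `Trainer` memory tracker reports them as far as I know, and converts the GPU figures from the table above to whole MiB. The helper name `bytes_to_mib` is made up for this example.

```python
def bytes_to_mib(num_bytes: int) -> int:
    """Convert a byte count to whole mebibytes (1 MiB = 2**20 bytes), rounding down."""
    return num_bytes >> 20


# GPU memory deltas copied from the train results table (assumed to be bytes).
train_gpu_metrics = {
    "init_mem_gpu_alloc_delta": 558658048,
    "train_mem_gpu_alloc_delta": 2267624448,
    "train_mem_gpu_peaked_delta": 10355728896,
}

for key, value in train_gpu_metrics.items():
    print(f"{key}: {bytes_to_mib(value)} MiB")
```

Under that assumption, the peak GPU delta during training is a little under 10 GiB.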
Eval results
key | value |
---|---|
epoch | 3 |
eval_loss | 1.5356481075286865 |
eval_mem_cpu_alloc_delta | 659047 |
eval_mem_cpu_peaked_delta | 18254 |
eval_mem_gpu_alloc_delta | 0 |
eval_mem_gpu_peaked_delta | 300285440 |
eval_runtime | 0.3116 |
eval_samples | 818 |
eval_samples_per_second | 2625.337 |
Usage
from transformers import pipeline
summarizer = pipeline("summarization", model="philschmid/bart-base-samsum")
conversation = '''Jeff: Can I train a 🤗 Transformers model on Amazon SageMaker?
Philipp: Sure you can use the new Hugging Face Deep Learning Container.
Jeff: ok.
Jeff: and how can I get started?
Jeff: where can I find documentation?
Philipp: ok, ok you can find everything here. https://huggingface.co/blog/the-partnership-amazon-sagemaker-and-hugging-face
'''
summarizer(conversation)
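The summarization pipeline returns a list with one dictionary per input, each holding the generated text under the `summary_text` key. A minimal sketch for pulling out the plain strings follows; the helper `extract_summaries` is hypothetical, and the sample list is a hand-written placeholder with the same shape as the pipeline output, not real output from this model.

```python
from typing import Dict, List


def extract_summaries(pipeline_output: List[Dict[str, str]]) -> List[str]:
    """Return the summary strings from a summarization pipeline result."""
    return [item["summary_text"].strip() for item in pipeline_output]


# Placeholder with the same structure as the pipeline's return value.
# The text is illustrative only and was not generated by the model.
sample_output = [{"summary_text": " <generated summary goes here> "}]

for summary in extract_summaries(sample_output):
    print(summary)
```

With the real pipeline, the same helper could be applied directly to the value returned by the pipeline call.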