---
library_name: transformers
tags:
- summarization
- dialogue
---

# Model Card for phi-2-dialogsum

<!-- Provide a quick summary of what the model is/does. -->

This model is designed for **dialogue summarization**. It takes multi-turn conversations as input and produces concise summaries.

## Model Details

### Model Description

This is the model card for **phi-2-dialogsum**, a dialogue summarization model built with 🤗 Transformers. It uses the Phi-2 model as its backbone, fine-tuned for summarizing dialogues.

- **Developed by:** Aygün Varol & Malik Sami
- **Model type:** Generative Language Model
- **Language(s) (NLP):** English
- **License:** MIT
- **Finetuned from model:** [Phi-2](https://huggingface.co/microsoft/phi-2)

### Model Sources

- **Repository (GitHub):** [AygunVarol/phi-2-dialogsum](https://github.com/AygunVarol/phi-2-dialogsum)

## Uses

### Direct Use

This model can be used directly for **dialogue summarization** tasks. For example, given a multi-turn conversation, the model will produce a succinct summary capturing the key information and context.
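In practice, a conversation often starts out as structured turns rather than a single string. The sketch below shows one way to flatten a list of (speaker, utterance) pairs into the plain-text `Speaker: utterance` format used in the quickstart snippet; the helper name `format_dialogue` and the exact input format the model expects are illustrative assumptions, not part of the released API.

```python
def format_dialogue(turns):
    """Join (speaker, utterance) pairs into one newline-separated string."""
    return "\n".join(f"{speaker}: {utterance}" for speaker, utterance in turns)

turns = [
    ("Speaker1", "Hi, how are you doing today?"),
    ("Speaker2", "I'm good, thanks! Just finished my coffee."),
]
print(format_dialogue(turns))
# → Speaker1: Hi, how are you doing today?
#   Speaker2: I'm good, thanks! Just finished my coffee.
```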
## How to Get Started with the Model

Below is a quick code snippet to load and run inference with this model:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "Aygun/phi-2-dialogsum"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# Phi-2 is a causal (decoder-only) model, so it is loaded with
# AutoModelForCausalLM rather than AutoModelForSeq2SeqLM.
model = AutoModelForCausalLM.from_pretrained(model_name)

dialogue = """Speaker1: Hi, how are you doing today?
Speaker2: I'm good, thanks! Just finished my coffee.
Speaker1: That's nice. Did you sleep well last night?
Speaker2: Actually, I slept quite late watching a new show on Netflix."""

# Frame the dialogue as a summarization prompt for the causal model.
prompt = f"Summarize the following conversation.\n\n{dialogue}\n\nSummary:"
inputs = tokenizer(prompt, max_length=512, truncation=True, return_tensors="pt")

summary_ids = model.generate(**inputs, max_new_tokens=60, num_beams=4, early_stopping=True)
# Decode only the newly generated tokens, skipping the echoed prompt.
summary = tokenizer.decode(summary_ids[0, inputs["input_ids"].shape[1]:], skip_special_tokens=True)

print("Summary:", summary)
```

## Training Details

The model was fine-tuned on the [DialogSum](https://huggingface.co/datasets/neil-code/dialogsum-test) dataset.

## Evaluation

ROUGE scores:

| Model          | rouge1  | rouge2  | rougeL  | rougeLsum |
|----------------|---------|---------|---------|-----------|
| Original model | 0.2991  | 0.1087  | 0.2119  | 0.2234    |
| PEFT model     | 0.3133  | 0.1070  | 0.2323  | 0.2595    |

## Improvement of PEFT model over original model (absolute percentage points)

- rouge1: +1.42
- rouge2: -0.17
- rougeL: +2.04
- rougeLsum: +3.61
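These deltas follow directly from the raw scores reported above; a minimal sketch of the arithmetic:

```python
# Raw ROUGE scores copied from the Evaluation section.
original = {'rouge1': 0.2990526195120211, 'rouge2': 0.10874019046839419,
            'rougeL': 0.21186900909813286, 'rougeLsum': 0.22342464591439556}
peft = {'rouge1': 0.3132817683433486, 'rouge2': 0.1070363134080079,
        'rougeL': 0.23226760188839027, 'rougeLsum': 0.25947902747914586}

# Absolute improvement in percentage points: (peft - original) * 100.
improvement = {k: round((peft[k] - original[k]) * 100, 2) for k in original}
print(improvement)
# → {'rouge1': 1.42, 'rouge2': -0.17, 'rougeL': 2.04, 'rougeLsum': 3.61}
```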