---
language:
- en
base_model:
- google/flan-t5-base
pipeline_tag: summarization
library_name: transformers
license: mit
model-index:
- name: flan-t5-titlegen-springer
  results:
  - task:
      type: abstractive-summarization
    metrics:
    - type: ROUGE_1
      value: 0.6852
    - type: ROUGE_2
      value: 0.5385
    - type: ROUGE_L
      value: 0.6411
    - type: ROUGE_Lsum
      value: 0.6411
    - type: Precision
      value: 0.9383
    - type: Recall
      value: 0.9222
    - type: F1
      value: 0.9300
---
|
## Model Details

### Model Description
|
This model is a fine-tuned version of **google/flan-t5-base**, adapted for abstractive summarization of scientific abstracts into concise titles. It was trained on a dataset curated from *Springer* journal publications, filtered to include only machine learning-related research. By leveraging the instruction-following capabilities of FLAN-T5, the model generates precise and contextually relevant titles.

Further fine-tuning on a broader dataset spanning multiple disciplines could improve its generalization and accuracy in other research domains.
|
- **Developed by:** [tiam4tt](https://huggingface.co/tiam4tt), [HTThuanHcmus](https://huggingface.co/HTThuanHcmus)
- **Model type:** Sequence-to-sequence language model (T5 architecture)
- **Language(s) (NLP):** English
- **License:** MIT
- **Finetuned from model:** [google/flan-t5-base](https://huggingface.co/google/flan-t5-base)

### Project Repo

- **Repository:** [tiam4tt/TextMining_TitleGeneration](https://github.com/tiam4tt/TextMining_TitleGeneration.git)
|
### Usage

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Load the fine-tuned model and its tokenizer
model = AutoModelForSeq2SeqLM.from_pretrained("tiam4tt/flan-t5-titlegen-springer")
tokenizer = AutoTokenizer.from_pretrained("tiam4tt/flan-t5-titlegen-springer")

abstract = "Transfer learning has become a crucial technique in deep learning, enabling models to leverage knowledge from pre-trained networks for improved performance on new tasks. In this study, we propose an optimized fine-tuning strategy for convolutional neural networks (CNNs), reducing training time while maintaining high accuracy. Experiments on CIFAR-10 show a 15% improvement in efficiency compared to standard fine-tuning methods, demonstrating the effectiveness of our approach."

# Tokenize the abstract and generate a title of up to 32 new tokens
inputs = tokenizer(abstract, return_tensors="pt", padding=True, truncation=True)
outputs = model.generate(**inputs, max_new_tokens=32)

title = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(title)
```
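The snippet above decodes greedily. As an optional variation that is not part of the original example, beam search sometimes yields slightly more fluent titles:

```python
# Optional: beam search decoding; num_beams and early_stopping are illustrative choices
outputs = model.generate(**inputs, max_new_tokens=32, num_beams=4, early_stopping=True)
title = tokenizer.decode(outputs[0], skip_special_tokens=True)
```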
|
### Direct Use and Downstream Use

The model is intended for generating concise titles directly from machine learning research abstracts. For downstream use, it can be further fine-tuned on a more diverse dataset covering fields beyond machine learning to generate precise and contextually relevant titles in those domains (a sketch of such fine-tuning follows below).
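As an illustration only, and not the authors' training pipeline, a further fine-tuning run on a new domain could look roughly like the sketch below using the `transformers` Seq2Seq trainer. The CSV file name, the `abstract`/`title` column names, and all hyperparameters are assumptions made for the example.

```python
# Hedged sketch of downstream fine-tuning; the file name, column names ("abstract",
# "title") and hyperparameters are illustrative assumptions, not the original setup.
from datasets import load_dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

model_name = "tiam4tt/flan-t5-titlegen-springer"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Hypothetical CSV of abstract/title pairs from another research domain
dataset = load_dataset("csv", data_files="my_domain_abstracts.csv")

def preprocess(batch):
    # Abstracts are the inputs; titles become the labels
    model_inputs = tokenizer(batch["abstract"], max_length=512, truncation=True)
    labels = tokenizer(text_target=batch["title"], max_length=32, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = dataset.map(
    preprocess, batched=True, remove_columns=dataset["train"].column_names
)

training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-titlegen-multidomain",
    learning_rate=3e-4,
    per_device_train_batch_size=8,
    num_train_epochs=3,
    predict_with_generate=True,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,
    train_dataset=tokenized["train"],
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```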
|
### Out-of-Scope Use

Using this model for unrelated tasks, such as summarizing news articles, generating creative writing, or processing highly technical abstracts from unrelated fields (e.g., chemistry, law, or medicine), may result in inaccurate or low-quality outputs. Fine-tuning on broader datasets would be required for effective generalization beyond machine learning literature.
|
### Training Data

The model was trained on [springer-journal-final](https://www.kaggle.com/datasets/tiamatt/springer-journal-final), a Kaggle dataset of machine learning-related abstracts and titles curated from Springer journal publications.
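To inspect the raw data, the Kaggle dataset can typically be fetched with the `kagglehub` client; this is only a sketch, and the file layout inside the dataset is not documented in this card.

```python
import kagglehub

# Download the Kaggle dataset locally; returns the path to the downloaded files
path = kagglehub.dataset_download("tiamatt/springer-journal-final")
print("Dataset files at:", path)
```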