Move evaluation to front
README.md
CHANGED
@@ -22,14 +22,23 @@ pipeline_tag: summarization
 ## **Model Details**
 This is a **LoRA fine-tuned adapter** built on [**meta-llama/Llama-3.2-1B-Instruct**](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct). It is designed for scientific paper summarization tasks and leverages **Low-Rank Adaptation (LoRA)** to enhance model performance efficiently while maintaining a low computational overhead.
 
-
+
+## **Performance Comparison**
+| Model | ROUGE-1 | ROUGE-2 | ROUGE-3 | ROUGE-L |
+|---------------------------|----------|----------|----------|----------|
+| **Llama-3.2-1B-Instruct** | 36.69 | 7.47 | 1.95 | 19.36 |
+| **Llama-PaperSummarization-LoRA** | **41.56** | **11.31** | **2.67** | **21.86** |
+
+The model was evaluated on a **6K-sample test set** using **ROUGE scores** with the following settings:
+- **Decoding Strategy**: Beam search (beam size = 4)
+
 
 ## **Dataset**
 The model was fine-tuned on the [**armanc/scientific_papers**](https://huggingface.co/datasets/armanc/scientific_papers) dataset. Below are the details of the dataset splits:
 - **Training Set**: 20K samples
 - **Validation Set**: 6K samples
+- **Test Set**: 6K samples
 
----
 
 ## **LoRA Configuration**
 - **Trainable Parameters**: 850K (~7% of base model parameters)

@@ -46,19 +55,6 @@ The model was fine-tuned on the [**armanc/scientific_papers**](https://huggingfa
 - **Training Duration**: 28 hours
 - **Training Scripts**: [gabe-zhang/paper2summary](https://github.com/gabe-zhang/paper2summary)
 
----
-
-## **Evaluation**
-The model was evaluated on a **6K-sample test set** using **ROUGE scores** with the following settings:
-- **Decoding Strategy**: Beam search (beam size = 4)
-
-### **Performance Comparison**
-| Model | ROUGE-1 | ROUGE-2 | ROUGE-3 | ROUGE-L |
-|---------------------------|----------|----------|----------|----------|
-| **Llama-3.2-1B-Instruct** | 36.69 | 7.47 | 1.95 | 19.36 |
-| **Llama-PaperSummarization-LoRA** | **41.56** | **11.31** | **2.67** | **21.86** |
-
----
 
 ## **License**
 This repository contains a **LoRA fine-tuned adapter** derived from the Llama 3.2 model.
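For the **Model Details** section above, a minimal sketch of how an adapter like this is typically loaded on top of its base model with Transformers and PEFT follows; `adapter_id` is a placeholder rather than a repo id taken from this card.

```python
# Minimal sketch: attach a LoRA adapter to the frozen Llama-3.2-1B-Instruct base model.
# "adapter_id" is a placeholder; substitute the Hub id or local path of this adapter.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-3.2-1B-Instruct"
adapter_id = "<this-adapter-repo-or-path>"  # placeholder, not stated in the card

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)

# PeftModel.from_pretrained loads the LoRA weights and wraps the base model for inference.
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()
```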
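The **Dataset** section lists the split sizes but not which configuration of the corpus was used or how the 20K/6K/6K samples were drawn. The sketch below loads the dataset with the `datasets` library; the `arxiv` config, the fixed seed, and the shuffled subsampling are assumptions, not details from the card.

```python
# Sketch: load armanc/scientific_papers and draw subsets of the sizes quoted in the card.
# The "arxiv" config and the subsampling strategy are assumptions.
from datasets import load_dataset

# trust_remote_code is needed on datasets versions that still run dataset loading scripts.
ds = load_dataset("armanc/scientific_papers", "arxiv", trust_remote_code=True)

train = ds["train"].shuffle(seed=42).select(range(20_000))            # 20K training samples
validation = ds["validation"].shuffle(seed=42).select(range(6_000))   # 6K validation samples
test = ds["test"].shuffle(seed=42).select(range(6_000))               # 6K test samples

print(train[0]["abstract"][:200])  # each record carries an "article" and its "abstract"
```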
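The **LoRA Configuration** section reports roughly 850K trainable parameters (on the order of 0.07% of the 1.24B base parameters), but the rank, scaling, and target modules are not visible in this view. The values below are illustrative assumptions; with `r=8` on the `q_proj`/`v_proj` projections of Llama-3.2-1B they land close to the quoted 850K.

```python
# Sketch of a LoRA setup of comparable size; r, lora_alpha, lora_dropout, and
# target_modules are assumed values, not the configuration used for this adapter.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, TaskType, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B-Instruct")

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                                  # assumed rank
    lora_alpha=16,                        # assumed scaling factor
    lora_dropout=0.05,                    # assumed dropout
    target_modules=["q_proj", "v_proj"],  # assumed target projections
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # prints trainable vs. total parameter counts
```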
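Finally, the evaluation described in the card (ROUGE-1/2/3/L on a 6K-sample test set, beam search with beam size 4) can be reproduced along the following lines. The prompt wording, generation length, and single-example scoring shown here are assumptions; the actual evaluation code lives in the linked gabe-zhang/paper2summary repository.

```python
# Sketch: summarize with beam search (num_beams=4) and score with ROUGE-1/2/3/L.
# Prompt format and max_new_tokens are assumptions, not settings taken from the card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from rouge_score import rouge_scorer

base_id = "meta-llama/Llama-3.2-1B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
# For the adapter's numbers, wrap the base model with PeftModel as in the loading sketch above.

def summarize(article: str, max_new_tokens: int = 256) -> str:
    messages = [{"role": "user", "content": f"Summarize the following paper:\n\n{article}"}]
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    )
    with torch.no_grad():
        output = model.generate(
            input_ids, num_beams=4, do_sample=False, max_new_tokens=max_new_tokens
        )
    return tokenizer.decode(output[0, input_ids.shape[1]:], skip_special_tokens=True)

# rouge_score supports arbitrary ROUGE-N, so ROUGE-3 can be reported alongside 1/2/L.
scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rouge3", "rougeL"], use_stemmer=True)
scores = scorer.score(target="reference abstract ...", prediction=summarize("paper text ..."))
print({name: round(s.fmeasure * 100, 2) for name, s in scores.items()})
```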