---
tags:
- summarization
- text-generation
- gemma3

base_model:
  - google/gemma-3-270m-it

library_name: transformers
pipeline_tag: text-generation
license: gemma
datasets:
- EdinburghNLP/xsum
language:
- en
---


# Gemma-3 270M Fine-tuned (XSum)

This is a fine-tuned version of
**[google/gemma-3-270m-it](https://huggingface.co/google/gemma-3-270m-it)**
trained on the **XSum** dataset.
The model was fine-tuned efficiently with **Unsloth**, and the
**LoRA adapters have been merged into the model weights**.

------------------------------------------------------------------------

## Model Details

-   **Base model:** `google/gemma-3-270m-it`
-   **Architecture:** Gemma-3, 270M parameters
-   **Training framework:**
    [Unsloth](https://github.com/unslothai/unsloth)
-   **Task:** Abstractive summarization
-   **Dataset:**
    [XSum](https://huggingface.co/datasets/EdinburghNLP/xsum)
-   **Adapter merge:** Yes (LoRA weights merged into the final model; see
    the sketch below)
-   **Precision:** Full precision (no 4-bit/8-bit quantization used)
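
The merge step itself is not shown in this card. A minimal sketch of how such a merge can be done with the PEFT library (the adapter path and output directory are hypothetical; Unsloth's own merged-save helpers are an equally valid route):

``` python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the instruction-tuned base model in full precision
base = AutoModelForCausalLM.from_pretrained("google/gemma-3-270m-it")

# Attach the trained LoRA adapter (hypothetical local path)
model = PeftModel.from_pretrained(base, "path/to/xsum-lora-adapter")

# Fold the LoRA weights into the base weights and save a standalone checkpoint
merged = model.merge_and_unload()
merged.save_pretrained("gemma3-270m-xsum-merged")
AutoTokenizer.from_pretrained("google/gemma-3-270m-it").save_pretrained("gemma3-270m-xsum-merged")
```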

------------------------------------------------------------------------


## Training Configuration

The model was fine-tuned starting from **`unsloth/gemma-3-270m-it`**
using **LoRA** adapters with the **Unsloth framework**.
The LoRA adapters were later merged into the base model weights.

-   **Base model:** `unsloth/gemma-3-270m-it`
-   **Sequence length:** 2048
-   **Quantization:** not used (no 4-bit or 8-bit)
-   **Full finetuning:** disabled (LoRA fine-tuning only)

### LoRA Setup

-   **Rank (r):** 128
-   **Target modules:** `q_proj`, `k_proj`, `v_proj`, `o_proj`,
    `gate_proj`, `up_proj`, `down_proj`
-   **LoRA alpha:** 128
-   **LoRA dropout:** 0 (see the configuration sketch below)
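
A minimal sketch of how these values map onto Unsloth's API (hedged: argument names follow current Unsloth documentation and may differ between versions):

``` python
from unsloth import FastLanguageModel

# Load the instruction-tuned base model in full precision (no 4-bit/8-bit)
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/gemma-3-270m-it",
    max_seq_length=2048,
    load_in_4bit=False,
)

# Attach LoRA adapters with the rank, alpha, dropout, and target modules listed above
model = FastLanguageModel.get_peft_model(
    model,
    r=128,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_alpha=128,
    lora_dropout=0,
    bias="none",
)
```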

### Training Details

-   **Dataset:**
    [XSum](https://huggingface.co/datasets/EdinburghNLP/xsum)
-   **Batch size per device:** 128
-   **Gradient accumulation steps:** 1
-   **Warmup steps:** 5
-   **Training epochs:** 1
-   **Learning rate:** 5e-5 (linear schedule; see the trainer sketch below)
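
A hedged sketch of the corresponding trainer setup with TRL's `SFTTrainer`, reusing `model` and `tokenizer` from the sketch above. The prompt template for XSum `document`/`summary` pairs is an assumption (the exact format used during training is not documented here), and argument names may vary across TRL versions:

``` python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

def to_text(example):
    # Hypothetical prompt format for turning an XSum pair into training text
    return {"text": f"Summarize the following article:\n\n{example['document']}\n\nSummary: {example['summary']}"}

dataset = load_dataset("EdinburghNLP/xsum", split="train").map(to_text)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,  # `processing_class=tokenizer` in newer TRL releases
    train_dataset=dataset,
    args=SFTConfig(
        dataset_text_field="text",
        per_device_train_batch_size=128,
        gradient_accumulation_steps=1,
        warmup_steps=5,
        num_train_epochs=1,
        learning_rate=5e-5,
        lr_scheduler_type="linear",
        output_dir="outputs",
    ),
)
trainer.train()
```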

------------------------------------------------------------------------

## Intended Use

-   **Primary use case:** Abstractive summarization of long-form text
    (news-style)
-   **Not suitable for:** Factual Q&A, reasoning, coding, or tasks
    requiring large-context models
-   **Limitations:** Small model size (270M) means limited reasoning
    ability compared to larger Gemma models

------------------------------------------------------------------------

## Example Usage

``` python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the merged model and tokenizer from the Hub
model_name = "ShahzebKhoso/Gemma3_270M_FineTuned_XSUM"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Tokenize the text to summarize
text = "The UK government announced new measures to support renewable energy."
inputs = tokenizer(text, return_tensors="pt")

# Generate and decode the summary
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
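
Because the starting checkpoint is the instruction-tuned variant, wrapping the input in the chat template may track the training format more closely. A hedged variant of the example above (the summarization instruction is an assumption; the exact prompt used during fine-tuning is not documented in this card):

``` python
# Reuses `model`, `tokenizer`, and `text` from the example above
messages = [
    {"role": "user", "content": "Summarize the following article:\n\n" + text},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(input_ids, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```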

------------------------------------------------------------------------

## License

-   Base model license: [Gemma
    License](https://ai.google.dev/gemma/terms)
-   Dataset license: [XSum (CC BY-NC-SA
    4.0)](https://huggingface.co/datasets/EdinburghNLP/xsum)

------------------------------------------------------------------------

## Acknowledgements

-   [Unsloth](https://github.com/unslothai/unsloth) for efficient
    fine-tuning
-   [Google DeepMind](https://deepmind.google/) for Gemma-3
-   [EdinburghNLP](https://huggingface.co/EdinburghNLP) for the XSum dataset