---
tags:
- summarization
- text-generation
- gemma3
base_model:
- google/gemma-3-270m-it
library_name: transformers
pipeline_tag: text-generation
license: gemma
datasets:
- EdinburghNLP/xsum
language:
- en
---
# Gemma-3 270M Fine-tuned (XSum)
This is a fine-tuned version of
**[google/gemma-3-270m-it](https://huggingface.co/google/gemma-3-270m-it)**
trained on the **XSum** dataset.
The model was fine-tuned efficiently with **Unsloth**, and the
**LoRA adapters have been merged into the model weights**.
------------------------------------------------------------------------
## Model Details
- **Base model:** `google/gemma-3-270m-it`
- **Architecture:** Gemma-3, 270M parameters
- **Training framework:**
[Unsloth](https://github.com/unslothai/unsloth)
- **Task:** Abstractive summarization
- **Dataset:**
[XSum](https://huggingface.co/datasets/EdinburghNLP/xsum)
- **Adapter merge:** Yes (LoRA weights merged into final model)
- **Precision:** Full precision (no 4-bit/8-bit quantization used)
------------------------------------------------------------------------
## Training Configuration
The model was fine-tuned starting from **`unsloth/gemma-3-270m-it`**
using **LoRA** adapters with the **Unsloth framework**.
The LoRA adapters were later merged into the base model weights.
- **Base model:** `unsloth/gemma-3-270m-it`
- **Sequence length:** 2048
- **Quantization:** not used (no 4-bit or 8-bit)
- **Full finetuning:** disabled (LoRA fine-tuning only)
### LoRA Setup
- **Rank (r):** 128
- **Target modules:** `q_proj`, `k_proj`, `v_proj`, `o_proj`,
  `gate_proj`, `up_proj`, `down_proj`
- **LoRA alpha:** 128
- **LoRA dropout:** 0
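
For reference, the setup above maps onto Unsloth's API roughly as follows. This is a minimal sketch assuming the standard `FastLanguageModel.from_pretrained` / `get_peft_model` workflow; argument names and defaults in the actual training script may have differed.

```python
from unsloth import FastLanguageModel

# Load the instruction-tuned base in full precision (no 4-bit/8-bit quantization).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/gemma-3-270m-it",
    max_seq_length=2048,
    load_in_4bit=False,
)

# Attach LoRA adapters with the settings listed above.
model = FastLanguageModel.get_peft_model(
    model,
    r=128,
    lora_alpha=128,
    lora_dropout=0,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
)
```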
### Training Details
- **Dataset:**
  [XSum](https://huggingface.co/datasets/EdinburghNLP/xsum)
- **Batch size per device:** 128
- **Gradient accumulation steps:** 1
- **Warmup steps:** 5
- **Training epochs:** 1
- **Learning rate:** 5e-5 (linear schedule)
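
Continuing from the Unsloth sketch above, these hyperparameters correspond roughly to the following TRL `SFTTrainer` configuration. This is a hedged reconstruction, not the original script: the prompt format used to turn XSum's `document`/`summary` fields into training text is an assumption, and exact argument names may vary with the trl version.

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

def to_text(example):
    # Assumed prompt format; the actual prompt used during training is not documented.
    return {"text": f"Summarize: {example['document']}\nSummary: {example['summary']}"}

dataset = load_dataset("EdinburghNLP/xsum", split="train").map(to_text)

trainer = SFTTrainer(
    model=model,                      # LoRA-wrapped model from the sketch above
    processing_class=tokenizer,
    train_dataset=dataset,
    args=SFTConfig(
        dataset_text_field="text",
        per_device_train_batch_size=128,
        gradient_accumulation_steps=1,
        warmup_steps=5,
        num_train_epochs=1,
        learning_rate=5e-5,
        lr_scheduler_type="linear",
        output_dir="outputs",
    ),
)
trainer.train()

# Merge the LoRA adapters into the base weights (PEFT helper) before saving/uploading.
model = model.merge_and_unload()
```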
------------------------------------------------------------------------
## Intended Use
- **Primary use case:** Abstractive summarization of long-form text
(news-style)
- **Not suitable for:** Factual Q&A, reasoning, coding, or tasks
requiring large-context models
- **Limitations:** Small model size (270M) means limited reasoning
ability compared to larger Gemma models
------------------------------------------------------------------------
## Example Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "ShahzebKhoso/Gemma3_270M_FineTuned_XSUM"

# Load the merged (full-precision) model and its tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Summarize a short news-style snippet
text = "The UK government announced new measures to support renewable energy."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
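
Since the base checkpoint is the instruction-tuned variant, wrapping the article in the chat template may give better summaries. The instruction wording below is illustrative, not necessarily the prompt used during fine-tuning:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "ShahzebKhoso/Gemma3_270M_FineTuned_XSUM"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

article = "The UK government announced new measures to support renewable energy."

# Chat-formatted prompt; the instruction text is an illustrative choice.
messages = [
    {"role": "user", "content": f"Summarize the following article in one sentence:\n\n{article}"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(input_ids, max_new_tokens=100)
# Decode only the newly generated tokens
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```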
------------------------------------------------------------------------
## License
- Base model license: [Gemma License](https://ai.google.dev/gemma/terms)
- Dataset license: [XSum (CC BY-NC-SA 4.0)](https://huggingface.co/datasets/EdinburghNLP/xsum)
------------------------------------------------------------------------
## Acknowledgements
- [Unsloth](https://github.com/unslothai/unsloth) for efficient fine-tuning
- [Google DeepMind](https://deepmind.google/) for Gemma-3
- [EdinburghNLP](https://huggingface.co/EdinburghNLP) for the XSum dataset