---
tags:
- summarization
- text-generation
- gemma3
base_model:
- google/gemma-3-270m-it
library_name: transformers
pipeline_tag: text-generation
license: gemma
datasets:
- EdinburghNLP/xsum
language:
- en
---
# Gemma-3 270M Fine-tuned (XSum)
This is a fine-tuned version of
**[google/gemma-3-270m-it](https://huggingface.co/google/gemma-3-270m-it)**
trained on the **XSum** dataset.
The model was fine-tuned efficiently with **Unsloth**, and the
**LoRA adapters have been merged into the model weights**.
------------------------------------------------------------------------
## Model Details
- **Base model:** `google/gemma-3-270m-it`
- **Architecture:** Gemma-3, 270M parameters
- **Training framework:**
[Unsloth](https://github.com/unslothai/unsloth)
- **Task:** Abstractive summarization
- **Dataset:**
[XSum](https://huggingface.co/datasets/EdinburghNLP/xsum)
- **Adapter merge:** Yes (LoRA weights merged into final model)
- **Precision:** Full precision (no 4-bit/8-bit quantization used)
------------------------------------------------------------------------
## Training Configuration
The model was fine-tuned starting from **`unsloth/gemma-3-270m-it`**
using **LoRA** adapters with the **Unsloth framework**.
The LoRA adapters were later merged into the base model weights.
- **Base model:** `unsloth/gemma-3-270m-it`
- **Sequence length:** 2048
- **Quantization:** not used (no 4-bit or 8-bit)
- **Full finetuning:** disabled (LoRA fine-tuning only)
### LoRA Setup
- **Rank (r):** 128
- **Target modules:** `q_proj`, `k_proj`, `v_proj`, `o_proj`,
  `gate_proj`, `up_proj`, `down_proj`
- **LoRA alpha:** 128
- **LoRA dropout:** 0
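
For reference, the setup above maps onto Unsloth's API roughly as follows. This is a minimal sketch assuming the standard `FastLanguageModel.from_pretrained` / `get_peft_model` workflow; argument names and defaults in the actual training script may have differed.

```python
from unsloth import FastLanguageModel

# Load the instruction-tuned base in full precision (no 4-bit/8-bit quantization).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/gemma-3-270m-it",
    max_seq_length=2048,
    load_in_4bit=False,
)

# Attach LoRA adapters with the settings listed above.
model = FastLanguageModel.get_peft_model(
    model,
    r=128,
    lora_alpha=128,
    lora_dropout=0,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
)
```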
### Training Details
- **Dataset:**
  [XSum](https://huggingface.co/datasets/EdinburghNLP/xsum)
- **Batch size per device:** 128
- **Gradient accumulation steps:** 1
- **Warmup steps:** 5
- **Training epochs:** 1
- **Learning rate:** 5e-5 (linear schedule)
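
Continuing from the Unsloth sketch above, these hyperparameters correspond roughly to the following TRL `SFTTrainer` configuration. This is a hedged reconstruction, not the original script: the prompt format used to turn XSum's `document`/`summary` fields into training text is an assumption, and exact argument names may vary with the trl version.

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

def to_text(example):
    # Assumed prompt format; the actual prompt used during training is not documented.
    return {"text": f"Summarize: {example['document']}\nSummary: {example['summary']}"}

dataset = load_dataset("EdinburghNLP/xsum", split="train").map(to_text)

trainer = SFTTrainer(
    model=model,                      # LoRA-wrapped model from the sketch above
    processing_class=tokenizer,
    train_dataset=dataset,
    args=SFTConfig(
        dataset_text_field="text",
        per_device_train_batch_size=128,
        gradient_accumulation_steps=1,
        warmup_steps=5,
        num_train_epochs=1,
        learning_rate=5e-5,
        lr_scheduler_type="linear",
        output_dir="outputs",
    ),
)
trainer.train()

# Merge the LoRA adapters into the base weights (PEFT helper) before saving/uploading.
model = model.merge_and_unload()
```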
------------------------------------------------------------------------
## Intended Use
- **Primary use case:** Abstractive summarization of long-form text
(news-style)
- **Not suitable for:** Factual Q&A, reasoning, coding, or tasks
requiring large-context models
- **Limitations:** Small model size (270M) means limited reasoning
ability compared to larger Gemma models
------------------------------------------------------------------------
## Example Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "ShahzebKhoso/Gemma3_270M_FineTuned_XSUM"

# Load the merged (full-precision) model and its tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Summarize a short news-style snippet
text = "The UK government announced new measures to support renewable energy."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
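
Since the base checkpoint is the instruction-tuned variant, wrapping the article in the chat template may give better summaries. The instruction wording below is illustrative, not necessarily the prompt used during fine-tuning:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "ShahzebKhoso/Gemma3_270M_FineTuned_XSUM"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

article = "The UK government announced new measures to support renewable energy."

# Chat-formatted prompt; the instruction text is an illustrative choice.
messages = [
    {"role": "user", "content": f"Summarize the following article in one sentence:\n\n{article}"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(input_ids, max_new_tokens=100)
# Decode only the newly generated tokens
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```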
------------------------------------------------------------------------
## License
- Base model license: [Gemma License](https://ai.google.dev/gemma/terms)
- Dataset license: [XSum (CC BY-NC-SA 4.0)](https://huggingface.co/datasets/EdinburghNLP/xsum)
------------------------------------------------------------------------
## Acknowledgements
- [Unsloth](https://github.com/unslothai/unsloth) for efficient fine-tuning
- [Google DeepMind](https://deepmind.google/) for Gemma-3
- [EdinburghNLP](https://huggingface.co/EdinburghNLP) for the XSum dataset