---
library_name: transformers
license: mit
language:
- en
base_model:
- meta-llama/Llama-3.2-1B
pipeline_tag: text-generation
tags:
- prompts
- lora
- fine-tuning
- stable-diffusion
- sdxl
datasets:
- KavinduHansaka/prompt-gen-10k-flux-sdxl

model-index:
- name: Llama-3.2-1B-ImageGen
  results:
  - task:
      type: text-generation
      name: Prompt Generation (Dev)
    dataset:
      type: json
      name: prompt_gen_refined_dev500
    metrics:
    - name: eval_loss
      type: loss
      value: 0.7515
      verified: false
    - name: perplexity
      type: perplexity
      value: 2.12
      verified: false
    - name: avg_target_words
      type: length
      value: 106.8
      verified: false

  - task:
      type: text-generation
      name: Prompt Generation (Test)
    dataset:
      type: json
      name: prompt_gen_refined_test500
    metrics:
    - name: eval_loss
      type: loss
      value: 0.7483
      verified: false
    - name: perplexity
      type: perplexity
      value: 2.11
      verified: false
    - name: avg_target_words
      type: length
      value: 106.7
      verified: false
---

# Llama-3.2-1B — Image Prompt Generation (LoRA Merged)

This repository provides a **LoRA-finetuned & merged** version of `meta-llama/Llama-3.2-1B`, specialized for **image prompt generation**.  
It is designed to create **cinematic, detailed, and structured prompts** for text-to-image models such as **Stable Diffusion XL** and **Flux**.

> **Note:** This is a prompt-generation model, not an instruction/chat model. It is trained to produce concise, creative prompts suitable for diffusion-based image synthesis.

---

## Model Details

- **Maintainer:** [KavinduHansaka](https://huggingface.co/KavinduHansaka)  
- **Base model:** [meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B)  
- **Model type:** Decoder-only causal LM (1B parameters)  
- **Languages:** English (prompt tags, stylistic descriptors)  
- **License:** MIT  
- **Finetuned with:** LoRA adapters, then merged  
- **Training dataset:** [prompt-gen-10k-flux-sdxl](https://huggingface.co/datasets/KavinduHansaka/prompt-gen-10k-flux-sdxl)  

### Model Sources
- **Merged model repo:** https://huggingface.co/KavinduHansaka/Llama-3.2-1B-ImageGen  
- **LoRA adapter repo:** https://huggingface.co/KavinduHansaka/Llama-3.2-1B-ImageGen-LoRA 
- **Training dataset:** https://huggingface.co/datasets/KavinduHansaka/prompt-gen-10k-flux-sdxl  

---

## What’s Included

- `config.json`, `generation_config.json`  
- Merged model weights (`model.safetensors`)  
- Tokenizer files (`tokenizer.json`, `tokenizer_config.json`, `special_tokens_map.json`)  

---

## Uses

### Direct Use
- Generate **stylized, cinematic, or structured prompts** for image synthesis models (Stable Diffusion, Flux, SDXL).  

### Downstream Use
- As a **base for further LoRA finetuning** on style-specific datasets.  
- As a **prompt generator inside T2I pipelines**.  

### Out-of-Scope Use
- General-purpose chat.  
- Safety-critical applications.  

---

## How to Get Started

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

REPO_ID = "KavinduHansaka/Llama-3.2-1B-ImageGen"

tok = AutoTokenizer.from_pretrained(REPO_ID)
model = AutoModelForCausalLM.from_pretrained(
    REPO_ID, device_map="auto", torch_dtype=torch.bfloat16
)

prompt = "Create a cinematic noir macro photo with film grain, 1:1 ratio, sharp focus."
inputs = tok(prompt, return_tensors="pt").to(model.device)

out = model.generate(**inputs, max_new_tokens=120, do_sample=True, temperature=0.5, top_p=0.9)
print(tok.decode(out[0], skip_special_tokens=True))
```

---

## Training Details

- **Training data:** [prompt-gen-10k-flux-sdxl](https://huggingface.co/datasets/KavinduHansaka/prompt-gen-10k-flux-sdxl)  
- **Training method:** LoRA with PEFT, adapters merged into base model.  
- **Precision:** bfloat16/float16 during training.  

---

## Technical Specifications

- **Architecture:** LLaMA 3.2 (1B parameters)  
- **Hardware:** NVIDIA GPU ≥6 GB VRAM  
- **Dependencies:** `transformers`, `peft`, `accelerate`, `torch`, `sentencepiece`  

---

## Citation

```
@misc{llama3.2-1b,
  title  = {LLaMA 3.2 (1B)},
  author = {Meta AI},
  year   = {2024},
  url    = {https://huggingface.co/meta-llama/Llama-3.2-1B}
}

@misc{llama3.2-1b-promptgen,
  title  = {Llama-3.2-1B Image Prompt Generator (LoRA Merged)},
  author = {Kavindu Hansaka Jayasinghe},
  year   = {2025},
  url    = {https://huggingface.co/KavinduHansaka/Llama-3.2-1B-ImageGen}
}
```