---
library_name: transformers
license: mit
language:
- en
base_model:
- meta-llama/Llama-3.2-1B
pipeline_tag: text-generation
tags:
- prompts
- lora
- fine-tuning
- stable-diffusion
- sdxl
datasets:
- KavinduHansaka/prompt-gen-10k-flux-sdxl
model-index:
- name: Llama-3.2-1B-ImageGen
results:
- task:
type: text-generation
name: Prompt Generation (Dev)
dataset:
type: json
name: prompt_gen_refined_dev500
metrics:
- name: eval_loss
type: loss
value: 0.7515
verified: false
- name: perplexity
type: perplexity
value: 2.12
verified: false
- name: avg_target_words
type: length
value: 106.8
verified: false
- task:
type: text-generation
name: Prompt Generation (Test)
dataset:
type: json
name: prompt_gen_refined_test500
metrics:
- name: eval_loss
type: loss
value: 0.7483
verified: false
- name: perplexity
type: perplexity
value: 2.11
verified: false
- name: avg_target_words
type: length
value: 106.7
verified: false
---
# Llama-3.2-1B — Image Prompt Generation (LoRA Merged)
This repository provides a **LoRA-finetuned & merged** version of `meta-llama/Llama-3.2-1B`, specialized for **image prompt generation**.
It is designed to create **cinematic, detailed, and structured prompts** for text-to-image models such as **Stable Diffusion XL** and **Flux**.
> **Note:** This is a prompt-generation model, not an instruction/chat model. It is trained to produce concise, creative prompts suitable for diffusion-based image synthesis.
---
## Model Details
- **Maintainer:** [KavinduHansaka](https://huggingface.co/KavinduHansaka)
- **Base model:** [meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B)
- **Model type:** Decoder-only causal LM (1B parameters)
- **Languages:** English (prompt tags, stylistic descriptors)
- **License:** MIT
- **Finetuned with:** LoRA adapters, then merged
- **Training dataset:** [prompt-gen-10k-flux-sdxl](https://huggingface.co/datasets/KavinduHansaka/prompt-gen-10k-flux-sdxl)
### Model Sources
- **Merged model repo:** https://huggingface.co/KavinduHansaka/Llama-3.2-1B-ImageGen
- **LoRA adapter repo:** https://huggingface.co/KavinduHansaka/Llama-3.2-1B-ImageGen-LoRA
- **Training dataset:** https://huggingface.co/datasets/KavinduHansaka/prompt-gen-10k-flux-sdxl
---
## What’s Included
- `config.json`, `generation_config.json`
- Merged model weights (`model.safetensors`)
- Tokenizer files (`tokenizer.json`, `tokenizer_config.json`, `special_tokens_map.json`)
---
## Uses
### Direct Use
- Generate **stylized, cinematic, or structured prompts** for image synthesis models (Stable Diffusion, Flux, SDXL).
### Downstream Use
- As a **base for further LoRA finetuning** on style-specific datasets.
- As a **prompt generator inside T2I pipelines**.
### Out-of-Scope Use
- General-purpose chat.
- Safety-critical applications.
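When wiring the model into a T2I pipeline, it can help to normalize its output before handing it to the image model: SDXL's CLIP text encoders truncate input at 77 tokens, so overly long generations get silently clipped. A minimal sketch (the `clean_prompt` helper and the word-count cap are illustrative assumptions, not part of this repo):

```python
import re

def clean_prompt(text: str, max_words: int = 75) -> str:
    """Normalize a generated prompt before passing it to a T2I pipeline.

    SDXL's CLIP text encoders truncate at 77 tokens, so a rough word cap
    keeps the tail of a long prompt from being silently dropped.
    """
    # Collapse newlines and repeated whitespace the LM may emit.
    text = re.sub(r"\s+", " ", text).strip()
    return " ".join(text.split(" ")[:max_words])

raw = "cinematic noir macro photo,\n  film grain, sharp focus  "
print(clean_prompt(raw))  # → "cinematic noir macro photo, film grain, sharp focus"
```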
---
## How to Get Started
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

REPO_ID = "KavinduHansaka/Llama-3.2-1B-ImageGen"

tok = AutoTokenizer.from_pretrained(REPO_ID)
model = AutoModelForCausalLM.from_pretrained(
    REPO_ID,
    device_map="auto",           # place weights on GPU when available
    torch_dtype=torch.bfloat16,  # use torch.float16 on GPUs without bf16 support
)

prompt = "Create a cinematic noir macro photo with film grain, 1:1 ratio, sharp focus."
inputs = tok(prompt, return_tensors="pt").to(model.device)
out = model.generate(
    **inputs,
    max_new_tokens=120,
    do_sample=True,     # sample for varied prompts; set False for greedy decoding
    temperature=0.5,
    top_p=0.9,
)
print(tok.decode(out[0], skip_special_tokens=True))
```
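Note that `model.generate` returns the prompt tokens followed by the newly generated tokens, so decoding `out[0]` echoes the input prompt. To keep only the completion, slice at the prompt length (the helper and the token ids below are illustrative, not real vocabulary entries):

```python
def completion_ids(prompt_ids, output_ids):
    """Drop the echoed prompt prefix from a generate() output sequence."""
    return output_ids[len(prompt_ids):]

# Illustrative token ids only:
prompt = [101, 2054, 2003]
output = [101, 2054, 2003, 7592, 2088]
print(completion_ids(prompt, output))  # → [7592, 2088]
```

In the snippet above, the equivalent is `tok.decode(out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)`.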
---
## Training Details
- **Training data:** [prompt-gen-10k-flux-sdxl](https://huggingface.co/datasets/KavinduHansaka/prompt-gen-10k-flux-sdxl)
- **Training method:** LoRA with PEFT, adapters merged into base model.
- **Precision:** bfloat16/float16 during training.
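The perplexity figures reported in the metadata follow directly from the evaluation losses, since perplexity is the exponential of the cross-entropy loss. A quick check:

```python
import math

# perplexity = exp(cross-entropy loss)
dev_loss, test_loss = 0.7515, 0.7483
print(round(math.exp(dev_loss), 2))   # → 2.12 (dev)
print(round(math.exp(test_loss), 2))  # → 2.11 (test)
```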
---
## Technical Specifications
- **Architecture:** LLaMA 3.2 (1B parameters)
- **Hardware:** NVIDIA GPU ≥6 GB VRAM
- **Dependencies:** `transformers`, `peft`, `accelerate`, `torch`, `sentencepiece`
---
## Citation
```bibtex
@misc{llama3.2-1b,
  title  = {LLaMA 3.2 (1B)},
  author = {Meta AI},
  year   = {2024},
  url    = {https://huggingface.co/meta-llama/Llama-3.2-1B}
}

@misc{llama3.2-1b-promptgen,
  title  = {Llama-3.2-1B Image Prompt Generator (LoRA Merged)},
  author = {Kavindu Hansaka Jayasinghe},
  year   = {2025},
  url    = {https://huggingface.co/KavinduHansaka/Llama-3.2-1B-ImageGen}
}
```