---
library_name: transformers
license: mit
language:
- en
base_model:
- meta-llama/Llama-3.2-1B
pipeline_tag: text-generation
tags:
- prompts
- lora
- fine-tuning
- stable-diffusion
- sdxl
datasets:
- KavinduHansaka/prompt-gen-10k-flux-sdxl
model-index:
- name: Llama-3.2-1B-ImageGen
  results:
  - task:
      type: text-generation
      name: Prompt Generation (Dev)
    dataset:
      type: json
      name: prompt_gen_refined_dev500
    metrics:
    - name: eval_loss
      type: loss
      value: 0.7515
      verified: false
    - name: perplexity
      type: perplexity
      value: 2.12
      verified: false
    - name: avg_target_words
      type: length
      value: 106.8
      verified: false
  - task:
      type: text-generation
      name: Prompt Generation (Test)
    dataset:
      type: json
      name: prompt_gen_refined_test500
    metrics:
    - name: eval_loss
      type: loss
      value: 0.7483
      verified: false
    - name: perplexity
      type: perplexity
      value: 2.11
      verified: false
    - name: avg_target_words
      type: length
      value: 106.7
      verified: false
---
# Llama-3.2-1B — Image Prompt Generation (LoRA Merged)

This repository provides a LoRA-finetuned and merged version of `meta-llama/Llama-3.2-1B`, specialized for image prompt generation. It is designed to produce cinematic, detailed, and structured prompts for text-to-image models such as Stable Diffusion XL and Flux.

**Note:** This is a prompt-generation model, not an instruction/chat model. It is trained to produce concise, creative prompts suitable for diffusion-based image synthesis.
## Model Details
- Maintainer: KavinduHansaka
- Base model: meta-llama/Llama-3.2-1B
- Model type: Decoder-only causal LM (1B parameters)
- Languages: English (prompt tags, stylistic descriptors)
- License: MIT
- Finetuned with: LoRA adapters, then merged
- Training dataset: prompt-gen-10k-flux-sdxl
## Model Sources
- Merged model repo: https://huggingface.co/KavinduHansaka/Llama-3.2-1B-ImageGen
- LoRA adapter repo: https://huggingface.co/KavinduHansaka/Llama-3.2-1B-ImageGen-LoRA
- Training dataset: https://huggingface.co/datasets/KavinduHansaka/prompt-gen-10k-flux-sdxl
## What's Included

- Config files (`config.json`, `generation_config.json`)
- Merged model weights (`model.safetensors`)
- Tokenizer files (`tokenizer.json`, `tokenizer_config.json`, `special_tokens_map.json`)
## Uses

### Direct Use

- Generate stylized, cinematic, or structured prompts for image synthesis models (Stable Diffusion, Flux, SDXL).

### Downstream Use

- As a base for further LoRA finetuning on style-specific datasets (a minimal sketch follows below).
- As a prompt generator inside text-to-image (T2I) pipelines.
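A minimal sketch of attaching a new LoRA adapter to this merged checkpoint for further style-specific finetuning with `peft`; the rank, alpha, and `target_modules` values below are illustrative assumptions, not the settings used to train this model.

```python
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

REPO_ID = "KavinduHansaka/Llama-3.2-1B-ImageGen"

model = AutoModelForCausalLM.from_pretrained(REPO_ID, torch_dtype=torch.bfloat16)

# Illustrative LoRA hyperparameters; tune them for your own style dataset.
lora_cfg = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

peft_model = get_peft_model(model, lora_cfg)
peft_model.print_trainable_parameters()
# Train with transformers.Trainer (or trl's SFTTrainer) on your prompt dataset,
# then save the adapter or merge it back into the weights.
```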
### Out-of-Scope Use
- General-purpose chat.
- Safety-critical applications.
## How to Get Started

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

REPO_ID = "KavinduHansaka/Llama-3.2-1B-ImageGen"

tok = AutoTokenizer.from_pretrained(REPO_ID)
model = AutoModelForCausalLM.from_pretrained(
    REPO_ID, device_map="auto", torch_dtype=torch.bfloat16
)

# Short seed instruction; the model expands it into a detailed image prompt.
prompt = "Create a cinematic noir macro photo with film grain, 1:1 ratio, sharp focus."
inputs = tok(prompt, return_tensors="pt").to(model.device)

out = model.generate(**inputs, max_new_tokens=120, do_sample=True, temperature=0.5, top_p=0.9)
print(tok.decode(out[0], skip_special_tokens=True))
```
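Continuing from the snippet above, a minimal sketch of using the generated text inside a T2I pipeline; it assumes the `diffusers` package and the public `stabilityai/stable-diffusion-xl-base-1.0` checkpoint, neither of which ships with this repository.

```python
# Minimal sketch (assumes `diffusers` is installed and a CUDA GPU is available).
import torch
from diffusers import StableDiffusionXLPipeline

# Reuse `tok` and `out` from the generation snippet above.
generated_prompt = tok.decode(out[0], skip_special_tokens=True)

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

image = pipe(prompt=generated_prompt, num_inference_steps=30).images[0]
image.save("generated.png")
```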
## Training Details
- Training data: prompt-gen-10k-flux-sdxl
- Training method: LoRA with PEFT, adapters merged into base model.
- Precision: bfloat16/float16 during training.
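For reference, a minimal sketch of the merge step as typically done with `peft` (`merge_and_unload`); the adapter repo ID matches the one listed under Model Sources, but this is not the exact script used to produce the published weights.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE_ID = "meta-llama/Llama-3.2-1B"
ADAPTER_ID = "KavinduHansaka/Llama-3.2-1B-ImageGen-LoRA"

# Load the base model, attach the trained LoRA adapter, then fold it into the weights.
base = AutoModelForCausalLM.from_pretrained(BASE_ID, torch_dtype=torch.bfloat16)
merged = PeftModel.from_pretrained(base, ADAPTER_ID).merge_and_unload()

# Save a standalone merged checkpoint (the form this repository distributes).
merged.save_pretrained("Llama-3.2-1B-ImageGen")
AutoTokenizer.from_pretrained(BASE_ID).save_pretrained("Llama-3.2-1B-ImageGen")
```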
## Technical Specifications
- Architecture: LLaMA 3.2 (1B parameters)
- Hardware: NVIDIA GPU ≥6 GB VRAM
- Dependencies: `transformers`, `peft`, `accelerate`, `torch`, `sentencepiece`
## Citation

```bibtex
@misc{llama3.2-1b,
  title  = {LLaMA 3.2 (1B)},
  author = {Meta AI},
  year   = {2024},
  url    = {https://huggingface.co/meta-llama/Llama-3.2-1B}
}

@misc{llama3.2-1b-promptgen,
  title  = {Llama-3.2-1B Image Prompt Generator (LoRA Merged)},
  author = {Kavindu Hansaka Jayasinghe},
  year   = {2025},
  url    = {https://huggingface.co/KavinduHansaka/Llama-3.2-1B-ImageGen}
}
```