---
library_name: transformers
license: mit
language:
- en
base_model:
- meta-llama/Llama-3.2-1B
pipeline_tag: text-generation
tags:
- prompts
- lora
- fine-tuning
- stable-diffusion
- sdxl
datasets:
- KavinduHansaka/prompt-gen-10k-flux-sdxl
model-index:
- name: Llama-3.2-1B-ImageGen
  results:
  - task:
      type: text-generation
      name: Prompt Generation (Dev)
    dataset:
      type: json
      name: prompt_gen_refined_dev500
    metrics:
    - name: eval_loss
      type: loss
      value: 0.7515
      verified: false
    - name: perplexity
      type: perplexity
      value: 2.12
      verified: false
    - name: avg_target_words
      type: length
      value: 106.8
      verified: false
  - task:
      type: text-generation
      name: Prompt Generation (Test)
    dataset:
      type: json
      name: prompt_gen_refined_test500
    metrics:
    - name: eval_loss
      type: loss
      value: 0.7483
      verified: false
    - name: perplexity
      type: perplexity
      value: 2.11
      verified: false
    - name: avg_target_words
      type: length
      value: 106.7
      verified: false
---
# Llama-3.2-1B — Image Prompt Generation (LoRA Merged)

This repository provides a LoRA-finetuned and merged version of `meta-llama/Llama-3.2-1B`, specialized for image prompt generation. It is designed to produce cinematic, detailed, and structured prompts for text-to-image models such as Stable Diffusion XL and Flux.

**Note:** This is a prompt-generation model, not an instruction/chat model. It is trained to produce concise, creative prompts suitable for diffusion-based image synthesis.
## Model Details
- Maintainer: KavinduHansaka
- Base model: meta-llama/Llama-3.2-1B
- Model type: Decoder-only causal LM (1B parameters)
- Languages: English (prompt tags, stylistic descriptors)
- License: MIT
- Finetuned with: LoRA adapters, then merged
- Training dataset: prompt-gen-10k-flux-sdxl
## Model Sources
- Merged model repo: https://huggingface.co/KavinduHansaka/Llama-3.2-1B-ImageGen
- LoRA adapter repo: https://huggingface.co/KavinduHansaka/Llama-3.2-1B-ImageGen-LoRA
- Training dataset: https://huggingface.co/datasets/KavinduHansaka/prompt-gen-10k-flux-sdxl
## What's Included

- Config files (`config.json`, `generation_config.json`)
- Merged model weights (`model.safetensors`)
- Tokenizer files (`tokenizer.json`, `tokenizer_config.json`, `special_tokens_map.json`)
## Uses

### Direct Use

- Generate stylized, cinematic, or structured prompts for image synthesis models (Stable Diffusion, Flux, SDXL).

### Downstream Use

- As a base for further LoRA finetuning on style-specific datasets (a minimal sketch follows below).
- As a prompt generator inside text-to-image (T2I) pipelines.
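A minimal sketch of attaching a new LoRA adapter to this merged checkpoint for further style-specific finetuning with `peft`; the rank, alpha, and `target_modules` values below are illustrative assumptions, not the settings used to train this model.

```python
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

REPO_ID = "KavinduHansaka/Llama-3.2-1B-ImageGen"

model = AutoModelForCausalLM.from_pretrained(REPO_ID, torch_dtype=torch.bfloat16)

# Illustrative LoRA hyperparameters; tune them for your own style dataset.
lora_cfg = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

peft_model = get_peft_model(model, lora_cfg)
peft_model.print_trainable_parameters()
# Train with transformers.Trainer (or trl's SFTTrainer) on your prompt dataset,
# then save the adapter or merge it back into the weights.
```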
### Out-of-Scope Use
- General-purpose chat.
- Safety-critical applications.
## How to Get Started

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

REPO_ID = "KavinduHansaka/Llama-3.2-1B-ImageGen"

tok = AutoTokenizer.from_pretrained(REPO_ID)
model = AutoModelForCausalLM.from_pretrained(
    REPO_ID, device_map="auto", torch_dtype=torch.bfloat16
)

# Short seed instruction; the model expands it into a detailed image prompt.
prompt = "Create a cinematic noir macro photo with film grain, 1:1 ratio, sharp focus."
inputs = tok(prompt, return_tensors="pt").to(model.device)

out = model.generate(**inputs, max_new_tokens=120, do_sample=True, temperature=0.5, top_p=0.9)
print(tok.decode(out[0], skip_special_tokens=True))
```
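Continuing from the snippet above, a minimal sketch of using the generated text inside a T2I pipeline; it assumes the `diffusers` package and the public `stabilityai/stable-diffusion-xl-base-1.0` checkpoint, neither of which ships with this repository.

```python
# Minimal sketch (assumes `diffusers` is installed and a CUDA GPU is available).
import torch
from diffusers import StableDiffusionXLPipeline

# Reuse `tok` and `out` from the generation snippet above.
generated_prompt = tok.decode(out[0], skip_special_tokens=True)

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

image = pipe(prompt=generated_prompt, num_inference_steps=30).images[0]
image.save("generated.png")
```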
## Training Details
- Training data: prompt-gen-10k-flux-sdxl
- Training method: LoRA with PEFT, adapters merged into base model.
- Precision: bfloat16/float16 during training.
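For reference, a minimal sketch of the merge step as typically done with `peft` (`merge_and_unload`); the adapter repo ID matches the one listed under Model Sources, but this is not the exact script used to produce the published weights.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE_ID = "meta-llama/Llama-3.2-1B"
ADAPTER_ID = "KavinduHansaka/Llama-3.2-1B-ImageGen-LoRA"

# Load the base model, attach the trained LoRA adapter, then fold it into the weights.
base = AutoModelForCausalLM.from_pretrained(BASE_ID, torch_dtype=torch.bfloat16)
merged = PeftModel.from_pretrained(base, ADAPTER_ID).merge_and_unload()

# Save a standalone merged checkpoint (the form this repository distributes).
merged.save_pretrained("Llama-3.2-1B-ImageGen")
AutoTokenizer.from_pretrained(BASE_ID).save_pretrained("Llama-3.2-1B-ImageGen")
```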
## Technical Specifications
- Architecture: LLaMA 3.2 (1B parameters)
- Hardware: NVIDIA GPU ≥6 GB VRAM
- Dependencies: `transformers`, `peft`, `accelerate`, `torch`, `sentencepiece`
## Citation

```bibtex
@misc{llama3.2-1b,
  title  = {LLaMA 3.2 (1B)},
  author = {Meta AI},
  year   = {2024},
  url    = {https://huggingface.co/meta-llama/Llama-3.2-1B}
}

@misc{llama3.2-1b-promptgen,
  title  = {Llama-3.2-1B Image Prompt Generator (LoRA Merged)},
  author = {Kavindu Hansaka Jayasinghe},
  year   = {2025},
  url    = {https://huggingface.co/KavinduHansaka/Llama-3.2-1B-ImageGen}
}
```