EXL2 quants of SD-Prompter-1.5B-V0.1 (measurement.json is in the main branch).


Check the repository revisions for the individual quants.
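As a rough sketch of how the revisions could be inspected and fetched, the snippet below uses the `huggingface_hub` client. The helper names `list_quant_branches` and `download_quant` are hypothetical, the repository ID is this model's, and the actual branch names are not listed here, so they have to be discovered at run time rather than assumed.

```python
# Sketch only: requires `pip install huggingface_hub` and network access.
REPO_ID = "Delta-Vector/SD-Prompter-1.5B-V0.1-EXL2"


def list_quant_branches(repo_id: str = REPO_ID) -> list[str]:
    """Return every branch name except 'main' (which holds measurement.json)."""
    from huggingface_hub import list_repo_refs  # imported lazily on first use

    refs = list_repo_refs(repo_id)
    return [branch.name for branch in refs.branches if branch.name != "main"]


def download_quant(revision: str, repo_id: str = REPO_ID) -> str:
    """Download one revision and return the local directory it was saved to."""
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, revision=revision)
```

A typical use would be to call `list_quant_branches()` first and pass one of the returned names to `download_quant()`.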


This is the first in a line of models dedicated to creating Stable Diffusion prompts from a description of a character's appearance. It has been fine-tuned on top of NewEden/Qwen-1.5B-Claude.

Prompting

The model has been tuned with the Alpaca format. A typical input looks like this:

### Instruction:
Create a prompt for Stable Diffusion based on the information below.
### Input:
Rae has short dark brown hair and brown eyes. She is commonly seen wearing her Royal Academy uniform, which consists of a red jacket with gold lines, a white ruffled necktie, a red bow tie with an attached blue gem, and a long black skirt with white lines. Along with her uniform, she wears black leggings and brown shoes.
### Response:
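
The template above can be assembled programmatically. The following is a minimal sketch: the function name `build_prompt` is hypothetical, and the layout (no blank lines between sections, generation starting right after `### Response:`) is copied from the example rather than from the training code.

```python
DEFAULT_INSTRUCTION = (
    "Create a prompt for Stable Diffusion based on the information below."
)


def build_prompt(character_description: str,
                 instruction: str = DEFAULT_INSTRUCTION) -> str:
    """Format a character description in the Alpaca layout shown above."""
    return (
        "### Instruction:\n"
        f"{instruction}\n"
        "### Input:\n"
        f"{character_description.strip()}\n"
        "### Response:\n"  # the model's output continues from here
    )


if __name__ == "__main__":
    print(build_prompt("Rae has short dark brown hair and brown eyes."))
```

The returned string is what would be passed to the inference backend as the raw prompt.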

System Prompting

I would highly recommend using the following system prompt for this model.

Create a prompt for Stable Diffusion based on the information below.

Axolotl Config

See Axolotl Trainer config
base_model: NewEden/Qwen-1.5B-Claude
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer

trust_remote_code: true

load_in_8bit: false
load_in_4bit: false
strict: false

datasets:
  - path: civit-slop-combined.jsonl 
    type: alpaca
    conversation: mpt-30b-instruct

chat_template: alpaca

dataset_prepared_path:
val_set_size: 0.05
output_dir: ./outputs/sd-prompter
sequence_len: 2048
sample_packing: true
eval_sample_packing: false
pad_to_sequence_len: true

adapter:
lora_model_dir:
lora_r:
lora_alpha:
lora_dropout:
lora_target_linear: true
lora_fan_in_fan_out:

wandb_project: SDprompt-qwen
wandb_entity:
wandb_watch:
wandb_name: qwen1.5b-2
wandb_log_model:

gradient_accumulation_steps: 64
micro_batch_size: 2
num_epochs: 3
optimizer: adamw_torch
lr_scheduler: cosine
learning_rate: 0.00002

train_on_inputs: false
group_by_length: false
bf16: auto
fp16:
tf32: true

gradient_checkpointing: true
gradient_checkpointing_kwargs:
  use_reentrant: false
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true

warmup_ratio: 0.05
evals_per_epoch: 4
saves_per_epoch: 1
debug:
#deepspeed: deepspeed_configs/zero2.json
#deepspeed: /training/axolotl/axolotl/deepspeed_configs/zero2.json
weight_decay: 0.0
#fsdp:
#fsdp_config:
#  fsdp_limit_all_gathers: true
#  fsdp_sync_module_states: true
#  fsdp_offload_params: true
#  fsdp_use_orig_params: false
#  fsdp_cpu_ram_efficient_loading: true
#  fsdp_auto_wrap_policy: TRANSFORMER_BASED_WRAP
#  fsdp_transformer_layer_cls_to_wrap: Qwen2DecoderLayer
#  fsdp_state_dict_type: FULL_STATE_DICT
special_tokens:

Credits

Thank you to Kubernetes Bad

Training

The training was done for 2 epochs. I used 2 x RTX 6000 GPUs, graciously provided by Kubernetes Bad, for the full-parameter fine-tuning of the model.
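
For a sense of scale, the effective batch size implied by the config can be worked out as below. This is a back-of-the-envelope sketch that assumes plain data parallelism across both GPUs (the DeepSpeed and FSDP lines in the config are commented out); the token count is an upper bound, since packed sequences are padded to `sequence_len`.

```python
# Values taken from the Axolotl config above.
micro_batch_size = 2
gradient_accumulation_steps = 64
sequence_len = 2048

# Assumption: each of the 2 GPUs processes its own micro-batches.
num_gpus = 2

sequences_per_step = micro_batch_size * gradient_accumulation_steps * num_gpus
tokens_per_step = sequences_per_step * sequence_len

print(sequences_per_step)  # 256 packed sequences per optimizer step
print(tokens_per_step)     # 524288 tokens per optimizer step, at most
```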
