
These are EXL2 quants for Aura-8B. The measurement file is in the main branch; check the repository's revisions for the different BPW options.
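
As a rough sketch of how one could pull down a single bits-per-weight revision with huggingface_hub (the revision name below is a placeholder; substitute one of the branches actually listed under this repository's revisions):

# Hypothetical sketch: download one BPW branch of the EXL2 quant.
# "6.0bpw" is a placeholder revision name; check the repo's branch list.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="NewEden/Aura-8B-EXL2",
    revision="6.0bpw",
)
print(local_dir)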


Aura-8B


Introduction

Aura-8B is a state-of-the-art dedicated roleplaying model designed to fulfill your every desire.

This finetune has seen several hundred million tokens of instruction and roleplaying data. Kahneman-Tversky Optimization (KTO) was then applied as a low-rank adapter to give the model a unique output style.

Developed by Aura Industries, with contributions from Anthracite Org.

Model Details

  • Model Name: Aura-8B
  • Base Model: arcee-ai/Llama-3.1-SuperNova-Lite
  • Model Type: Chat Completions
  • Prompt Format: Llama 3 (see the example after this list)
  • License: Apache-2.0
  • Language: English
  • Max Context: 8,192+ tokens
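
Since the prompt format is Llama 3, the simplest way to build prompts is through the tokenizer's chat template. A minimal sketch, assuming the tokenizer from jeiku/Aura-8B (the hub_model_id used in the training config below); the messages are purely illustrative:

# Minimal sketch: format a conversation with the Llama 3 chat template.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("jeiku/Aura-8B")
messages = [
    {"role": "system", "content": "You are Aura, a roleplaying partner."},
    {"role": "user", "content": "Set the scene for our adventure."},
]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)  # <|begin_of_text|><|start_header_id|>system<|end_header_id|>...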

License

This model is licensed under the Apache 2.0 License.

Quantizations

Static GGUF

Imatrix GGUF

EXL2
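
For the EXL2 weights specifically, loading follows exllamav2's basic inference example. A minimal sketch, assuming a locally downloaded BPW revision; class names may differ slightly across exllamav2 versions:

# Minimal sketch based on exllamav2's basic example (not an official snippet).
from exllamav2 import ExLlamaV2, ExLlamaV2Cache, ExLlamaV2Config, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

config = ExLlamaV2Config()
config.model_dir = "./Aura-8B-EXL2"   # path to the downloaded revision
config.prepare()

model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)
model.load_autosplit(cache)
tokenizer = ExLlamaV2Tokenizer(config)

generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)
settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.8

print(generator.generate_simple("Once upon a time", settings, 128))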

Open LLM Leaderboard Evaluation Results

| Metric              | Value |
|---------------------|------:|
| Avg.                | 27.34 |
| IFEval (0-shot)     | 72.05 |
| BBH (3-shot)        | 30.98 |
| MATH Lvl 5 (4-shot) | 15.03 |
| GPQA (0-shot)       |  4.81 |
| MuSR (0-shot)       |  9.22 |
| MMLU-PRO (5-shot)   | 31.93 |
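
The Avg. row is the unweighted mean of the six benchmark scores:

# Sanity check: Avg. = mean of the six benchmark scores above.
scores = [72.05, 30.98, 15.03, 4.81, 9.22, 31.93]
print(round(sum(scores) / len(scores), 2))  # 27.34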

Training Configuration

Click here for Axolotl configs

SFT

base_model: arcee-ai/Llama-3.1-SuperNova-Lite
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer

load_in_8bit: false
load_in_4bit: false
strict: false

datasets:
  - path: FourOhFour/RP_Phase
    type: chat_template
    chat_template: llama3
    roles_to_train: ["gpt"]
    field_messages: conversations
    message_field_role: from
    message_field_content: value
    train_on_eos: turn
  - path: Nitral-AI/Cybersecurity-ShareGPT
    type: chat_template
    chat_template: llama3
    roles_to_train: ["gpt"]
    field_messages: conversations
    message_field_role: from
    message_field_content: value
    train_on_eos: turn
  - path: Nitral-AI/Medical_Instruct-ShareGPT
    type: chat_template
    chat_template: llama3
    roles_to_train: ["gpt"]
    field_messages: conversations
    message_field_role: from
    message_field_content: value
    train_on_eos: turn
  - path: Nitral-AI/Olympiad_Math-ShareGPT
    type: chat_template
    chat_template: llama3
    roles_to_train: ["gpt"]
    field_messages: conversations
    message_field_role: from
    message_field_content: value
    train_on_eos: turn
  - path: NewEden/Claude-Instruct-5k
    type: chat_template
    chat_template: llama3
    roles_to_train: ["gpt"]
    field_messages: conversations
    message_field_role: from
    message_field_content: value
    train_on_eos: turn
  - path: lodrick-the-lafted/kalo-opus-instruct-3k-filtered
    type: chat_template
    chat_template: llama3
    roles_to_train: ["gpt"]
    field_messages: conversations
    message_field_role: from
    message_field_content: value
    train_on_eos: turn
  - path: Nitral-AI/Creative_Writing-ShareGPT
    type: chat_template
    chat_template: llama3
    roles_to_train: ["gpt"]
    field_messages: conversations
    message_field_role: from
    message_field_content: value
    train_on_eos: turn
  - path: jeiku/Writing
    type: completion
    field: text

shuffle_merged_datasets: true
dataset_prepared_path:
val_set_size: 0.01
output_dir: ./output/out

hub_model_id: jeiku/Aura-8B
hub_strategy: "all_checkpoints"
push_dataset_to_hub:
hf_use_auth_token: true

sequence_len: 8192
sample_packing: true
eval_sample_packing: false
pad_to_sequence_len:

wandb_project: Aura-8B
wandb_entity:
wandb_watch:
wandb_name: Aura-8B
wandb_log_model:

gradient_accumulation_steps: 16
micro_batch_size: 2
num_epochs: 2
optimizer: paged_adamw_8bit
lr_scheduler: cosine
learning_rate: 1e-5

train_on_inputs: false
group_by_length: false
bf16: auto
fp16:
tf32: false

gradient_checkpointing: true
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true

warmup_ratio: 0.1
evals_per_epoch: 2
eval_table_size:
eval_max_new_tokens:
saves_per_epoch: 1
debug:
deepspeed: 
weight_decay: 0.05
fsdp:
fsdp_config:
special_tokens:
  pad_token: <|finetune_right_pad_id|>
  eos_token: <|eot_id|>
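
The chat_template datasets above are ShareGPT-style: each row carries a conversations list whose turns use from/value fields, and only the "gpt" turns (plus their closing EOS, per train_on_eos: turn) contribute to the loss. A hypothetical record shaped to match those field settings, with purely illustrative content:

# Hypothetical ShareGPT-style record matching field_messages /
# message_field_role / message_field_content in the SFT config above.
example_record = {
    "conversations": [
        {"from": "system", "value": "You are a helpful roleplaying partner."},
        {"from": "human", "value": "Describe the tavern we just entered."},
        {"from": "gpt", "value": "Lantern light pools on scarred oak tables..."},
    ]
}
# Effective batch size per device: micro_batch_size (2) x
# gradient_accumulation_steps (16) = 32 packed sequences per optimizer step.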

KTO

base_model: jeiku/Aura-8B
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer

load_in_8bit: false
load_in_4bit: false
strict: false

hub_model_id: jeiku/aurakto
hub_strategy: "all_checkpoints"
push_dataset_to_hub:
hf_use_auth_token: true

chat_template: llama3

rl: kto
rl_beta: 0.2
kto_desirable_weight: 0.2

datasets:
  - path: anthracite-core/full-opus-chosen-hermes-rejected-kto-v1
    type: llama3.argilla

shuffle_merged_datasets: true
val_set_size: 0.0
output_dir: ./outputs/out

adapter: lora
lora_model_dir:

lora_r: 32
lora_alpha: 64
lora_dropout: 0.05
lora_target_linear: true
lora_fan_in_fan_out:

sequence_len: 8192
sample_packing: false
eval_sample_packing: false
pad_to_sequence_len: false

wandb_project: Aura-8B
wandb_entity:
wandb_watch:
wandb_name: Aura-8B
wandb_log_model:

gradient_accumulation_steps: 16
micro_batch_size: 2
num_epochs: 2
max_steps: 500

optimizer: adamw_8bit
lr_scheduler: cosine
learning_rate: 0.0001
weight_decay: 0.05

train_on_inputs: false
group_by_length: false
bf16: auto
fp16:
tf32: true

gradient_checkpointing: true
gradient_checkpointing_kwargs:
  use_reentrant: true
remove_unused_columns: false
early_stopping_patience:
resume_from_checkpoint: 
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true

warmup_steps: 10
evals_per_epoch: 2
eval_table_size:
eval_max_new_tokens: 
saves_per_epoch: 1

debug:
deepspeed: 
fsdp:
fsdp_config:

special_tokens:
  pad_token: <|finetune_right_pad_id|>
  eos_token: <|eot_id|>
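
Because the KTO stage uses adapter: lora (r=32, alpha=64) rather than a full-weight update, the resulting adapter would typically be merged back into the SFT model to produce the final full-weight checkpoint before quantization. A minimal sketch with peft, assuming the adapter was pushed to the hub_model_id jeiku/aurakto given in the config above (not the authors' exact procedure):

# Minimal sketch: merge the KTO LoRA adapter into the SFT model and save
# the result so it can be quantized (GGUF / EXL2) afterwards.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained("jeiku/Aura-8B", torch_dtype=torch.bfloat16)
merged = PeftModel.from_pretrained(base, "jeiku/aurakto").merge_and_unload()

tokenizer = AutoTokenizer.from_pretrained("jeiku/Aura-8B")
merged.save_pretrained("./Aura-8B-merged")
tokenizer.save_pretrained("./Aura-8B-merged")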
