---
license: llama3
---

Based on Meta-Llama-3-8B-Instruct and governed by the Meta Llama 3 License agreement:
https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct

This is the v0.2 version, trained on an improved Dolphin-based dataset, but using only 150K samples for testing instead of the full 850K. It doesn't seem to work that well yet, so I will need to add the rest of the dataset.

We are happy for anyone to try it out and give some feedback.
You can also try this model on our API at https://www.awanllm.com/ (it might take a while until it's available there).

Training:
- 4096 sequence length, while the base model has an 8192 sequence length. From testing, it still handles the full 8192 context just fine.
- Trained on a modified and improved version of Cognitive Computations Eric Hartford's Dolphin dataset. https://huggingface.co/datasets/cognitivecomputations/dolphin
- Training took around 1 day on 2x RTX3090 on our own machine, using 4-bit loading and QLoRA (rank 64, alpha 128), resulting in ~2% trainable weights.

The goal for this model is to be less censored and great at general tasks, like the previous Dolphin-based models by Eric Hartford.
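As a rough sanity check of the ~2% trainable-weights figure in the training notes above, the sketch below counts the adapter weights that rank-64 LoRA adds. It assumes the published Llama 3 8B shapes (32 layers, hidden size 4096, MLP size 14336, 8 KV heads of dimension 128, vocabulary 128256) and assumes adapters sit on the seven linear projections of every decoder layer but not on the output head. None of these numbers are stated in this card, and the percentage printed by a training run on a 4-bit model may be computed differently.

```python
# Back-of-the-envelope LoRA parameter count.
# ASSUMPTION: the dimensions below are the published Llama 3 8B shapes;
# they are not taken from this model card.
HIDDEN = 4096         # model (embedding) dimension
INTERMEDIATE = 14336  # MLP inner dimension
KV_DIM = 1024         # 8 key/value heads x 128 head dimension
LAYERS = 32
VOCAB = 128256
RANK = 64             # lora_r from the training setup

# (in_features, out_features) of each linear projection in one decoder layer.
# ASSUMPTION: adapters are attached to all seven, and not to the output head.
PROJECTIONS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params_per_layer(rank: int) -> int:
    """A LoRA pair A (rank x in) and B (out x rank) adds rank * (in + out) weights."""
    return sum(rank * (fan_in + fan_out) for fan_in, fan_out in PROJECTIONS.values())


def base_params() -> int:
    """Frozen base weights: decoder layers, input embeddings, output head, final norm."""
    per_layer = sum(fan_in * fan_out for fan_in, fan_out in PROJECTIONS.values())
    per_layer += 2 * HIDDEN  # two RMSNorm weight vectors per layer
    return LAYERS * per_layer + 2 * VOCAB * HIDDEN + HIDDEN


trainable = LAYERS * lora_params_per_layer(RANK)
total = base_params() + trainable

print(f"trainable adapter weights: {trainable:,}")
print(f"share of all weights: {trainable / total:.2%}")
```

Under these assumptions the adapters come to roughly 168M weights against about 8B frozen ones, which is consistent with the ~2% quoted above.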
Instruct format:
```
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

{{ system_prompt }}<|eot_id|><|start_header_id|>user<|end_header_id|>

{{ user_message_1 }}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

{{ model_answer_1 }}<|eot_id|><|start_header_id|>user<|end_header_id|>

{{ user_message_2 }}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

```

Quants:

[Built with Axolotl](https://github.com/OpenAccess-AI-Collective/axolotl)

Axolotl Config:
```
base_model: /home/owen/models/Meta-Llama-3-8B-Instruct
model_type: LlamaForCausalLM
tokenizer_type: AutoTokenizer

train_on_inputs: false
group_by_length: false
load_in_8bit: false
load_in_4bit: true
strict: false
sequence_len: 4096
bf16: true
fp16: false
tf32: false
flash_attention: true

# Data
datasets:
  - path: /home/owen/datasets/cleaned-dolphin201-sharegpt2-uuid-improved.jsonl
    type:
      field_instruction: input
      field_output: output
      format: "<|start_header_id|>user<|end_header_id|>\n\n{instruction}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
      no_input_format: "<|start_header_id|>user<|end_header_id|>\n\n{instruction}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"

warmup_steps: 10
dataset_prepared_path: ./last_run_prepared

# Iterations
num_epochs: 1
saves_per_epoch: 4

# Evaluation
val_set_size: 0.01
eval_table_size:
eval_table_max_new_tokens:
eval_sample_packing: false
evals_per_epoch: 4

# LoRA
output_dir: ./qlora-out
adapter: qlora
lora_model_dir:
lora_r: 64
lora_alpha: 128
lora_dropout: 0.05
lora_target_linear: true
lora_fan_in_fan_out:
lora_target_modules:
save_safetensors: true

# Sampling
sample_packing: true
pad_to_sequence_len: true

# Batching
gradient_accumulation_steps: 32
micro_batch_size: 2
gradient_checkpointing: true
gradient_checkpointing_kwargs:
  use_reentrant: true

# wandb
wandb_mode: # "offline" to save run metadata locally and not sync to the server, "disabled" to turn off wandb
wandb_project: llama-3-8b-instruct-dolphin-q
wandb_entity: # A wandb Team name if using a Team
wandb_watch:
wandb_name: 64-128-4096-1ep-v0.2
wandb_run_id: # Set the ID of your wandb run
wandb_log_model: # "checkpoint" to log model to wandb Artifacts every `save_steps` or "end" to log only at the end of training

# Optimizer
optimizer: paged_adamw_8bit
lr_scheduler: cosine
learning_rate: 0.0002

# Misc
early_stopping_patience:
resume_from_checkpoint:
logging_steps: 1
debug:
deepspeed: /home/owen/axolotl/deepspeed_configs/zero3_bf16.json
weight_decay: 0.1
special_tokens:
  pad_token: <|end_of_text|>
```
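For anyone calling the model without a chat template, the instruct format shown earlier can be assembled by hand. The sketch below is one way to do it in plain Python. The helper name `build_llama3_prompt` and the `role`/`content` dictionary layout are hypothetical choices for illustration, and the blank line after each header follows the `\n\n` used in the `format` string of the Axolotl config.

```python
from typing import Dict, List


def build_llama3_prompt(system_prompt: str, turns: List[Dict[str, str]]) -> str:
    """Assemble a Llama 3 style prompt string.

    `turns` is a list of {"role": "user" | "assistant", "content": "..."} dicts
    (a hypothetical layout chosen for this example). The returned string ends
    with an open assistant header so the model writes the next reply.
    """

    def block(role: str, content: str) -> str:
        # One message: role header, blank line, content, end-of-turn token.
        return f"<|start_header_id|>{role}<|end_header_id|>\n\n{content}<|eot_id|>"

    prompt = "<|begin_of_text|>" + block("system", system_prompt)
    for turn in turns:
        prompt += block(turn["role"], turn["content"])
    # Leave the assistant header open for generation.
    prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return prompt


if __name__ == "__main__":
    print(
        build_llama3_prompt(
            "You are a helpful assistant.",
            [{"role": "user", "content": "Summarize QLoRA in one sentence."}],
        )
    )
```

If the tokenizer or inference server already prepends the beginning-of-text token, the leading `<|begin_of_text|>` should probably be dropped so it does not appear twice.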