---
datasets:
- mlabonne/alpagasus
language:
- en
pipeline_tag: text-generation
tags:
- llama
- alpaca
- alpagasus
---

# 🦙🕊️ Alpagasus-2-7b

📝 [Paper](https://arxiv.org/abs/2307.08701) | 📄 [Blog](https://lichang-chen.github.io/AlpaGasus/) | 💻 [Code](https://github.com/gpt4life/alpagasus/tree/main) | 🤗 [Model](https://huggingface.co/gpt4life/alpagasus-7b) (unofficial)

This is a `Llama-2-7b-hf` model fine-tuned using QLoRA (4-bit precision) on the [`mlabonne/alpagasus`](https://huggingface.co/datasets/mlabonne/alpagasus) dataset, a high-quality subset (9k samples) of the Alpaca dataset (52k samples).

## 🔧 Training

![](https://i.imgur.com/ebwyRbo.png)

It was trained on an RTX 3090 using [🐜🔧 TinyTuner](https://github.com/mlabonne/tinytuner) with the following parameters:

```yaml
# Dataset
dataset_name: mlabonne/alpagasus
prompt_template: alpaca
max_seq_length: 512
val_set_size: 0.01

# Loading
load_in_8bit: false
load_in_4bit: true
bf16: true
fp16: false
tf32: true

# Lora
adapter: qlora
lora_model_dir:
lora_r: 8
lora_alpha: 16
lora_dropout: 0.1
lora_target_modules:
- q_proj
- v_proj
lora_fan_in_fan_out:

# Training
learning_rate: 0.00002
micro_batch_size: 24
gradient_accumulation_steps: 1
num_epochs: 3
lr_scheduler_type: cosine
optim: paged_adamw_32bit
group_by_length: true
warmup_ratio: 0.03
eval_steps: 0.01
save_strategy: epoch
logging_steps: 1
weight_decay: 0
max_grad_norm:
max_steps: -1
gradient_checkpointing: true

# QLoRA
bnb_4bit_compute_dtype: float16
bnb_4bit_quant_type: nf4
bnb_4bit_use_double_quant: false
```

## 💻 Usage

```python
# pip install transformers accelerate
import torch
import transformers
from transformers import AutoTokenizer

model = "mlabonne/alpagasus-2-7b"
prompt = "What is a large language model?"

tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

sequences = pipeline(
    f"### Instruction: {prompt}",
    do_sample=True,
    top_k=10,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
    max_length=200,
)
for seq in sequences:
    print(f"Result: {seq['generated_text']}")
```
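Since the model was fine-tuned with `prompt_template: alpaca` (see the training config above), wrapping the instruction in the full Alpaca template may produce better completions than the bare `### Instruction:` prefix used in the snippet. Here is a minimal sketch, reusing the `pipeline`, `tokenizer`, and `prompt` objects defined above; the exact template string TinyTuner applies is an assumption:

```python
# Standard Alpaca prompt format; assumed to match TinyTuner's `alpaca` template.
def alpaca_prompt(instruction: str) -> str:
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        "### Response:\n"
    )

sequences = pipeline(
    alpaca_prompt(prompt),
    do_sample=True,
    top_k=10,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
    max_length=200,
)
```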
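The 9k-sample training subset itself can be inspected directly with 🤗 `datasets` (assuming the default `train` split):

```python
from datasets import load_dataset

# Load the filtered 9k-sample subset used for fine-tuning.
dataset = load_dataset("mlabonne/alpagasus", split="train")
print(dataset)     # number of rows and column names
print(dataset[0])  # first sample
```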
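For reference, the training YAML above maps roughly onto standard 🤗 `peft` and `bitsandbytes` objects. The sketch below is an illustration of that configuration, not TinyTuner's actual code:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 quantization, matching the QLoRA section of the config.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=False,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# LoRA adapter on the query/value projections, matching the Lora section.
peft_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.1,
    target_modules=["q_proj", "v_proj"],
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, peft_config)
```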