---
base_model: meta-llama/Llama-2-7b-hf
library_name: peft
license: llama2
datasets:
- timdettmers/openassistant-guanaco
language:
- en
- th
- zh
metrics:
- accuracy
pipeline_tag: text-generation
---

## Model Details

### Model Description

This model is a PEFT (LoRA) adapter for [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf), fine-tuned for backward text generation (instruction backtranslation): given an output text such as an assistant response, it generates a plausible input prompt.

- **Developed by:** Jixin Yang @ HKUST
- **Model type:** PEFT (LoRA) fine-tuned LLaMA-2 7B for backward text generation
- **Finetuned from model:** meta-llama/Llama-2-7b-hf

## Uses

This model is designed for backward text generation: given an output text (e.g., an assistant response), it generates the corresponding input prompt.

## How to Get Started with the Model

Use the code below to get started with the model.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "jasperyeoh2/llama2-7b-backward-model"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

input_text = "Output text to reverse"
inputs = tokenizer(input_text, return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
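
This repository hosts a PEFT (LoRA) adapter trained on top of `meta-llama/Llama-2-7b-hf`, so you can also load the base model explicitly and attach the adapter with `peft`. The snippet below is a minimal sketch that mirrors the 4-bit NF4 setup used for training; the compute dtype and tokenizer source are assumptions, adjust them to your environment.

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

base_id = "meta-llama/Llama-2-7b-hf"
adapter_id = "jasperyeoh2/llama2-7b-backward-model"

# 4-bit NF4 quantization, matching the configuration used during fine-tuning
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute dtype is an assumption
)

# Load the tokenizer from the base model in case the adapter repo does not ship one
tokenizer = AutoTokenizer.from_pretrained(base_id)

# Load the quantized base model, then attach the LoRA adapter
base_model = AutoModelForCausalLM.from_pretrained(
    base_id,
    quantization_config=bnb_config,
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, adapter_id)

inputs = tokenizer("Output text to reverse", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```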

## Training Details

### Training Data

- Dataset: [OpenAssistant-Guanaco](https://huggingface.co/datasets/timdettmers/openassistant-guanaco)
- Number of examples used: ~3,200
- Task: Instruction Backtranslation (Answer → Prompt)
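
The exact prompt template used to build the reversed pairs is not documented in this card. As a rough illustration only, the sketch below shows one way to turn the dataset's `text` field (which interleaves `### Human:` and `### Assistant:` turns) into answer → prompt examples; the pairing logic and the `### Answer:`/`### Question:` template are assumptions, not the format actually used for this model.

```python
from datasets import load_dataset

ds = load_dataset("timdettmers/openassistant-guanaco", split="train")

def to_backward_example(row):
    # Split the first human/assistant exchange out of the conversation text,
    # then reverse it: the assistant answer becomes the input and the human
    # prompt becomes the generation target. The template below is illustrative.
    human_part, _, rest = row["text"].partition("### Assistant:")
    prompt = human_part.replace("### Human:", "").strip()
    answer = rest.split("### Human:")[0].strip()
    return {"text": f"### Answer: {answer}\n### Question: {prompt}"}

backward_ds = ds.map(to_backward_example)
print(backward_ds[0]["text"][:200])
```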

### Training Procedure

#### Training Hyperparameters

- Method: PEFT with LoRA (Low-Rank Adaptation)
- Quantization: 4-bit (NF4)
- LoRA config:
  - `r`: 8
  - `alpha`: 16
  - `target_modules`: ["q_proj", "v_proj"]
  - `dropout`: 0.05
- Max sequence length: 512 tokens
- Epochs: 10
- Batch size: 2
- Gradient accumulation steps: 8
- Effective batch size: 16
- Learning rate: 2e-5
- Scheduler: linear with warmup
- Optimizer: AdamW
- Early stopping: enabled (patience=2)
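
The hyperparameters above correspond roughly to the configuration sketched below using `peft` and `transformers`. The output directory, warmup ratio, and compute dtype are illustrative assumptions; they are not documented in this card.

```python
import torch
from peft import LoraConfig
from transformers import BitsAndBytesConfig, EarlyStoppingCallback, TrainingArguments

# 4-bit NF4 quantization for the frozen base model
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,  # assumption: not documented
)

# LoRA adapter configuration as listed above
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="llama2-7b-backward",    # illustrative path
    num_train_epochs=10,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,      # effective batch size of 16
    learning_rate=2e-5,
    lr_scheduler_type="linear",
    warmup_ratio=0.03,                  # assumption: warmup size not documented
    optim="adamw_torch",
    evaluation_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,        # required for early stopping
    metric_for_best_model="eval_loss",
)

# Early stopping with a patience of 2 evaluation rounds
early_stopping = EarlyStoppingCallback(early_stopping_patience=2)
```

These objects would typically be passed to a `transformers.Trainer` (or a comparable SFT trainer) together with the quantized base model; the exact trainer used for this run is not documented here.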

#### Metrics

Training and evaluation loss were tracked during fine-tuning; full logs and curves are available on [Weights & Biases](https://wandb.ai/jyang577-hong-kong-university-of-science-and-technology/huggingface?nw=nwuserjyang577).

### Results

- Final eval loss: ~1.436
- Final train loss: ~1.4
- Training completed in ~8 epochs (early stopping triggered before the 10-epoch limit)

### Compute Infrastructure

#### Hardware

- GPU: 1× NVIDIA A800 (80GB)
- CUDA Version: 12.1

#### Software

- OS: Ubuntu 20.04
- Python: 3.10
- Transformers: 4.38.2
- PEFT: 0.15.1
- Accelerate: 0.28.0
- BitsAndBytes: 0.41.2

### Framework versions

- PEFT 0.15.1