Omid-sar
/

fine-tuning-llama2-7b-code-generation-ludwig

Model card Files Files and versions

Omid-sar commited on Sep 29, 2023

Commit

ed25397

·

1 Parent(s): 7987642

Update

Files changed (1) hide show

README.md +66 -0

README.md CHANGED Viewed

@@ -33,3 +33,69 @@ The following `bitsandbytes` quantization config was used during training:
 - PEFT 0.5.0
 - PEFT 0.5.0

 - PEFT 0.5.0
 - PEFT 0.5.0
+- # Fine-Tuning Llama2-7b (Ludwig) for Code Generation
+## Goal
+The goal of this project is to fine-tune the Llama2-7b language model for code generation using the QLORA method. The model will take natural language as input, and should return code as output. We're first going to iterate on a base Llama-2-7b model with prompting, and finally instruction-fine-tune the model.
+As an example, if we prompt the model with this instruction:
+```
+Instruction: Create an array of length 5 which contains all even numbers between 1 and 10.
+```
+We want the model to produce exactly this response:
+```
+Response: array = [2, 4, 6, 8, 10]
+```
+## QLORA Method for Fine-Tuning
+The QLORA method for fine-tuning large language models (LLMs) is a parameter-efficient approach that uses 4-bit quantization to reduce the memory and computational requirements of fine-tuning. QLORA is implemented in the PEFT library, which is built on top of the Hugging Face Transformers library.
+## Ludwig Data Format
+Ludwig requires data to be in a specific format. The main components of the data format are:
+- `input_features`: Defines the input features of the model. Each feature must have a `name` and `type`.
+- `output_features`: Defines the output features of the model. Similar to input features, each output feature must have a `name` and `type`.
+Here is an example of a simple Ludwig config:
+```yaml
+input_features:
+  - name: instruction
+    type: text
+output_features:
+  - name: output
+    type: text
+```
+This config tells Ludwig to use the column called `instruction` in the dataset as an input feature and the `output` column as an output feature.
+## Prerequisites
+- Python 3.x
+- Ludwig
+- GPU (recommended for faster training)
+## Setup and Installation
+1. Clone the repository:
+   ```sh
+   git clone https://github.com/omid-sar/Llama2-7B-Fine-Tuning--Google-Colab-.git
+   ```
+2. Open the notebook `Fine_Tuning_Llama2_7b(_Ludwig).ipynb` in Google Colab or a local Jupyter environment.
+3. Install the required libraries:
+   ```sh
+   pip install ludwig
+   ```
+4. Follow the instructions in the notebook to download any additional datasets or models.
+## Usage
+1. Run the cells in the notebook sequentially, following the instructions and comments provided.
+2. Modify the model configuration, training parameters, or input data as needed to suit your use case.