---
language:
- km
- en
library_name: unsloth
license: llama3
base_model: unsloth/llama-3-8b-bnb-4bit
tags:
- khmer
- cambodian
- llama-3
- fine-tuned
- unsloth
- lora
- text-generation
datasets:
- metythorn/khmer-corpus
model-index:
- name: llama-3-8b-bnb-4bit-khmer
  results: []
---

# Llama-3-8B Continued Pretraining on Khmer Corpus

This model is a continued-pretraining version of [unsloth/llama-3-8b-bnb-4bit](https://huggingface.co/unsloth/llama-3-8b-bnb-4bit) trained on the [metythorn/khmer-corpus](https://huggingface.co/datasets/metythorn/khmer-corpus) dataset.

## Model Description

This is a Llama-3-8B model that has been continually pretrained with the Unsloth framework to improve performance on Khmer text generation tasks. The model uses LoRA (Low-Rank Adaptation) with 4-bit quantization for memory-efficient training.

## Training Details

### Training Data

- **Dataset**: [metythorn/khmer-corpus](https://huggingface.co/datasets/metythorn/khmer-corpus)
- **Language**: Primarily Khmer with some English
- **Dataset Split**: Training split

### Training Configuration

- **Base Model**: unsloth/llama-3-8b-bnb-4bit
- **Training Framework**: Unsloth with LoRA
- **Quantization**: 4-bit (bnb-4bit)
- **Max Sequence Length**: 2048
- **LoRA Rank (r)**: 128
- **LoRA Alpha**: 32
- **LoRA Dropout**: 0
- **Target Modules**: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj, embed_tokens, lm_head
- **Use RSLoRA**: True
- **Gradient Checkpointing**: unsloth

### Training Hyperparameters

- **Epochs**: 1
- **Batch Size**: 2 (per device)
- **Gradient Accumulation Steps**: 8
- **Learning Rate**: 5e-5
- **Embedding Learning Rate**: 5e-6
- **Warmup Ratio**: 0.1
- **Optimizer**: adamw_8bit
- **LR Scheduler**: cosine
- **Weight Decay**: 0.0
- **Seed**: 3407

A sketch of how these settings map to Unsloth's API is shown in the training code sketch after the usage examples below.

## Usage

### Using with Unsloth

```python
from unsloth import FastLanguageModel
import torch

# Load the model and tokenizer
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="metythorn/llama-3-8b-bnb-4bit",
    max_seq_length=2048,
    dtype=None,  # None for auto detection
    load_in_4bit=True,
)

# Enable inference mode
FastLanguageModel.for_inference(model)

# Generate text
inputs = tokenizer("Your prompt in Khmer or English", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.7, do_sample=True)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```

### Using with Transformers

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load model and tokenizer
model_name = "metythorn/llama-3-8b-bnb-4bit"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto"
)

# Generate text
prompt = "Your prompt here"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,
        temperature=0.7,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id
    )
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
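### Training Code Sketch

The hyperparameters listed under Training Details correspond to Unsloth's continued-pretraining workflow. The following is a minimal sketch of how that configuration could be expressed with `UnslothTrainer`; it is not the exact training script used for this model. In particular, `dataset_text_field="text"` and `output_dir="outputs"` are assumptions, and the target modules follow Unsloth's standard continued-pretraining recipe (`embed_tokens` and `lm_head` trained alongside the attention and MLP projections).

```python
from unsloth import FastLanguageModel, UnslothTrainer, UnslothTrainingArguments, is_bfloat16_supported
from datasets import load_dataset

max_seq_length = 2048

# Load the 4-bit base model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/llama-3-8b-bnb-4bit",
    max_seq_length=max_seq_length,
    dtype=None,
    load_in_4bit=True,
)

# Attach LoRA adapters with the settings listed under Training Configuration
model = FastLanguageModel.get_peft_model(
    model,
    r=128,
    lora_alpha=32,
    lora_dropout=0,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
        "embed_tokens", "lm_head",  # also train embeddings for the new domain
    ],
    use_rslora=True,
    use_gradient_checkpointing="unsloth",
    random_state=3407,
)

# Training split of the Khmer corpus
dataset = load_dataset("metythorn/khmer-corpus", split="train")

# Hyperparameters from this card; dataset_text_field and output_dir are
# assumptions, not values reported in the card
trainer = UnslothTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=max_seq_length,
    args=UnslothTrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=8,
        num_train_epochs=1,
        learning_rate=5e-5,
        embedding_learning_rate=5e-6,  # lower LR for embed_tokens / lm_head
        warmup_ratio=0.1,
        optim="adamw_8bit",
        lr_scheduler_type="cosine",
        weight_decay=0.0,
        seed=3407,
        fp16=not is_bfloat16_supported(),
        bf16=is_bfloat16_supported(),
        output_dir="outputs",
    ),
)

trainer.train()
```

Because `embed_tokens` and `lm_head` are included in the target modules, `embedding_learning_rate` applies a lower learning rate to those layers than the LoRA adapters receive, which is why two learning rates are listed above.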
## Model Performance

This model was continually pretrained to understand and generate Khmer text more effectively than the base Llama-3-8B model. The training focused on:

- Improved Khmer language understanding
- Better text generation in Khmer
- Maintained multilingual capabilities
- Efficient inference with 4-bit quantization

## Limitations and Biases

- The model's performance is limited by the quality and size of the training dataset
- It may exhibit biases present in the training data
- Performance may vary across Khmer dialects and specialized domains
- 4-bit quantization may slightly reduce quality compared to full precision

## Technical Specifications

- **Model Size**: ~4.5 GB (4-bit quantized)
- **Architecture**: Llama-3-8B with LoRA adapters
- **Precision**: 4-bit quantized base weights with LoRA adapters in higher precision
- **Memory Requirements**: ~6-8 GB VRAM for inference
- **Framework**: Compatible with Transformers and Unsloth

## Citation

If you use this model in your research, please cite:

```bibtex
@misc{llama3-8b-khmer-2024,
  title={Llama-3-8B Fine-tuned on Khmer Corpus},
  author={metythorn},
  year={2024},
  publisher={Hugging Face},
  url={https://huggingface.co/metythorn/llama-3-8b-bnb-4bit}
}
```

## Acknowledgments

- Meta AI for the Llama-3 model
- The Unsloth team for the efficient fine-tuning framework
- The Khmer corpus dataset contributors

## License

This model is released under the same license as the base Llama-3 model. Please refer to the [Llama-3 license](https://huggingface.co/meta-llama/Meta-Llama-3-8B/blob/main/LICENSE) for details.