Evolution Learning Network (ELN) with QLoRA and Genetic Algorithms for LLMs

Overview

This project implements an Evolution Learning Network (ELN) that fine-tunes transformer-based language models such as Llama by combining Quantized Low-Rank Adaptation (QLoRA) with Genetic Algorithms (GA). A population of models is evolved over multiple generations to improve performance (fitness) and specialization while maintaining diversity.

Key Features

  • Efficient model fine-tuning using QLoRA.
  • Evolutionary strategies, including random mutations and fitness-based selection.
  • Hardware-efficient training with 4-bit quantization.
  • Comprehensive experiment tracking with WandB.
  • Diversity maintenance through LoRA weight fingerprinting.
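The card does not define how fingerprints or the diversity score are computed. One plausible reading, shown here as a minimal sketch: each adapter is summarized by a short vector of per-matrix statistics (its fingerprint), and diversity is the mean pairwise Euclidean distance between fingerprints. The function names and the choice of statistics are assumptions for illustration, and plain Python lists stand in for LoRA weight tensors.

```python
import itertools
import math
import statistics


def fingerprint(adapter):
    """Summarize an adapter (dict: name -> flat list of floats) as a vector.

    Hypothetical scheme: mean and population std of each weight matrix,
    in sorted key order so fingerprints are comparable across models.
    """
    vec = []
    for name in sorted(adapter):
        values = adapter[name]
        vec.append(statistics.fmean(values))
        vec.append(statistics.pstdev(values))
    return vec


def population_diversity(adapters):
    """Mean pairwise Euclidean distance between adapter fingerprints."""
    prints = [fingerprint(a) for a in adapters]
    pairs = list(itertools.combinations(prints, 2))
    if not pairs:
        return 0.0  # a population of one has no pairwise spread
    return sum(math.dist(p, q) for p, q in pairs) / len(pairs)
```

Under this definition, a population whose adapters have drifted only slightly from a shared starting point would score close to zero.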

Model Details

Base Model

  • Name: meta-llama/Llama-3.2-1B (can be replaced with any Hugging Face model).
  • Architecture: Transformer-based causal language model.

Quantization Configuration

  • Quantization Type: 4-bit using bitsandbytes (bnb_4bit).
  • Parameters:
    • Compute Type: torch.float16
    • Quantization Type: "nf4" (4-bit NormalFloat).
    • Double Quantization (also called nested quantization): Enabled.
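The settings above map onto the `BitsAndBytesConfig` class in Transformers roughly as follows. This is a configuration sketch, not the project's actual code; the variable name `bnb_config` is an assumption.

```python
import torch
from transformers import BitsAndBytesConfig

# 4-bit NF4 quantization with fp16 compute and double (nested) quantization.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
)
```

The resulting object is passed as `quantization_config=bnb_config` to `AutoModelForCausalLM.from_pretrained`. Loading in 4-bit requires the bitsandbytes package and a supported GPU.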

LoRA (Low-Rank Adaptation)

  • Dimensions (r): 8
  • Alpha (Scaling): 16
  • Target Modules: Query and Value projections (q_proj, v_proj).
  • Dropout: 0.05
  • Task Type: Causal Language Modeling (CAUSAL_LM).
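Expressed with PEFT's `LoraConfig`, the values above would look roughly like this. It is a configuration sketch; the variable name `lora_config` is an assumption.

```python
from peft import LoraConfig, TaskType

# Rank-8 adapters on the attention query and value projections.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type=TaskType.CAUSAL_LM,
)
```

With standard LoRA scaling, the adapter update is multiplied by lora_alpha / r, which is 16 / 8 = 2 here.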

Training Strategy

  • Optimizer: paged_adamw_8bit for memory-efficient updates.
  • Precision: Mixed precision (fp16) for faster training.

Hyperparameters

General Parameters

  • Generations: 10
  • Population Size: 4
  • Dataset Size: 2000 samples per split (adjustable for larger datasets).
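The card does not say which selection scheme is used. As a sketch of how a loop with these settings might be organized, the example below assumes truncation selection with elitism: rank by fitness, keep the top half unchanged, and refill the population with mutated copies of survivors. `evaluate` and `mutate` are hypothetical callables; in the real project they would correspond to training and scoring an adapter and to perturbing its LoRA weights. A toy problem with float individuals stands in for models.

```python
import random


def evolve(population, evaluate, mutate, generations=10, rng=None):
    """Run a simple generational loop and return (best_individual, history).

    Assumed scheme (not confirmed by the card): rank by fitness, keep the
    top half unchanged, and refill the population with mutated survivors.
    """
    if rng is None:
        rng = random.Random()
    history = []
    for _ in range(generations):
        ranked = sorted(population, key=evaluate, reverse=True)
        history.append(evaluate(ranked[0]))  # best fitness this generation
        survivors = ranked[: max(1, len(ranked) // 2)]
        children = [
            mutate(rng.choice(survivors), rng)
            for _ in range(len(population) - len(survivors))
        ]
        population = survivors + children
    best = max(population, key=evaluate)
    return best, history


# Toy usage: individuals are floats and fitness peaks at x = 1.0.
rng = random.Random(0)
start = [rng.uniform(-1.0, 1.0) for _ in range(4)]  # population size 4
best, history = evolve(
    start,
    evaluate=lambda x: -(x - 1.0) ** 2,
    mutate=lambda x, r: x + r.gauss(0.0, 0.02),
    generations=10,
    rng=rng,
)
```

Because survivors carry over unchanged, the best fitness in `history` never decreases from one generation to the next. With real models, fitness values would be cached instead of recomputed, since each evaluation is expensive.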

Training

  • Batch Size: 8
  • Gradient Accumulation: 16 steps.
  • Learning Rate: 2e-4
  • Epochs per Model: 2
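To make the interaction between batch size and gradient accumulation concrete, the sketch below collects the training values from this card under the parameter names used by `transformers.TrainingArguments` and computes the effective batch size. Whether the project uses `Trainer`, whether the batch size is per device, and the single-GPU setting are all assumptions.

```python
# Values from the card, keyed by transformers.TrainingArguments parameter names.
training_kwargs = {
    "per_device_train_batch_size": 8,
    "gradient_accumulation_steps": 16,
    "learning_rate": 2e-4,
    "num_train_epochs": 2,
    "optim": "paged_adamw_8bit",
    "fp16": True,
}

num_devices = 1  # assumption: a single GPU

# Samples contributing to each optimizer update.
effective_batch_size = (
    training_kwargs["per_device_train_batch_size"]
    * training_kwargs["gradient_accumulation_steps"]
    * num_devices
)
print(effective_batch_size)  # 128
```

With 2000 training samples, an effective batch of 128 gives roughly 16 optimizer updates per epoch for each model.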

Mutations

  • Mutation Rate: 10% (probability per parameter).
  • Mutation Scale: Noise added with a standard deviation of 0.02.
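A minimal sketch of a mutation operator with these two settings, assuming zero-mean Gaussian noise applied independently to each parameter. A flat list of floats stands in for a LoRA weight tensor, and the function name `mutate` is hypothetical.

```python
import random


def mutate(weights, rate=0.10, scale=0.02, rng=None):
    """Return a mutated copy of a flat list of weights.

    Each weight is perturbed independently with probability `rate` by
    Gaussian noise with mean 0 and standard deviation `scale`.
    """
    if rng is None:
        rng = random.Random()
    return [
        w + rng.gauss(0.0, scale) if rng.random() < rate else w
        for w in weights
    ]
```

The input list is left untouched, so a parent can produce several distinct children.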

Dataset Details

Source

  • Name: WikiText (configuration wikitext-2-raw-v1; wikitext-103-raw-v1 is the larger variant).
  • Splits:
    • train → Model training.
    • validation → General task evaluation.
    • test → Specific task evaluation.
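The 2000-samples-per-split limit could be applied with a small helper like the one below. Whether the project takes the first rows or a random sample is not stated, so taking the first rows is an assumption, and `take_first` is a hypothetical name.

```python
def take_first(split, n=2000):
    """Return the first `n` rows of a split (fewer if the split is smaller).

    `split` is expected to behave like a Hugging Face `datasets.Dataset`:
    it supports `len()` and `.select(indices)`.
    """
    return split.select(range(min(n, len(split))))


# Usage with the Hugging Face `datasets` library (downloads data on first run):
#
#   from datasets import load_dataset
#   raw = load_dataset("wikitext", "wikitext-2-raw-v1")
#   subsets = {name: take_first(raw[name]) for name in ("train", "validation", "test")}
```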

Tokenization

  • Tokenizer: Hugging Face AutoTokenizer.
  • Max Token Length: 128 tokens.
  • Padding: Fixed to "max_length".
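These tokenization settings can be sketched as a batch function suitable for `Dataset.map`. Truncation to the maximum length is an assumption (the card only mentions padding), and `tokenize_batch` is a hypothetical name.

```python
def tokenize_batch(tokenizer, batch, max_length=128):
    """Tokenize a batch of raw text to fixed-length sequences.

    `tokenizer` is any callable with the Hugging Face tokenizer call
    signature; `batch` is a dict with a "text" list, as in WikiText.
    """
    return tokenizer(
        batch["text"],
        padding="max_length",  # pad every sequence to max_length
        truncation=True,       # assumption: longer sequences are cut off
        max_length=max_length,
    )


# Usage (requires `transformers` and access to the base model):
#
#   from transformers import AutoTokenizer
#   tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B")
#   tokenizer.pad_token = tokenizer.eos_token  # needed if no pad token is defined
#   tokenized = dataset.map(lambda b: tokenize_batch(tokenizer, b), batched=True)
```

Llama tokenizers typically ship without a pad token, so padding to a fixed length usually requires assigning one first, as in the usage comment.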

Results

Summary

  • Total Generations: 10
  • Best Fitness Achieved: 0.4772
  • Final Population Diversity: 0.0011

Evolution History (Highlights)

Generation   Best Fitness   Avg Fitness   Diversity   Best Specialization
1            0.4096         0.4023        0.00097     0.9967
5            0.4727         0.4722        0.00099     0.9968
10           0.4772         0.4768        0.00106     0.9972

Hardware & Framework

Hardware

  • Multi-GPU support via torch.nn.parallel.DistributedDataParallel or Hugging Face Accelerate (Accelerator).
  • Logs GPU/CPU usage with psutil and torch.cuda.

Frameworks & Libraries

  • Transformers: Hugging Face model and tokenizer handling.
  • Datasets: Data loading and processing.
  • WandB: Experiment tracking and visualization.
  • BitsAndBytes: 4-bit quantization.
  • PEFT: LoRA-based fine-tuning.

Future Work

  • Explore larger population sizes and more generations for enhanced diversity.
  • Experiment with other datasets to generalize findings.
  • Integrate additional mutation strategies for broader exploration.

Citation

Remaining


Code to run locally

from peft import PeftModel
from transformers import AutoModelForCausalLM

# Load the base model, then attach the evolved LoRA adapter from the Hub.
base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B")
model = PeftModel.from_pretrained(base_model, "diabolic6045/ELN-AOC-CAIN")

Framework versions

  • PEFT 0.14.0