---
base_model: google/gemma-2b
library_name: peft
license: bsl-1.0
tags:
  - code
datasets:
  - b-mc2/sql-create-context
language:
  - en
pipeline_tag: text2text-generation
---

# Model Card for text-to-sql-fm

## Model Details

### Model Description

This model is a PEFT adapter trained on top of google/gemma-2b loaded in 8-bit precision, fine-tuned on question-and-answer pairs for text-to-SQL tasks using the LoRA PEFT method; a sketch of this setup appears below. It serves as a foundation model for further development in Text-to-SQL Retrieval-Augmented Generation (RAG) applications.

- **Developed by:** Lei-bw
- **Model type:** Causal Language Model
- **Language(s) (NLP):** English
- **License:** bsl-1.0
- **Finetuned from model:** google/gemma-2b
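
The description above implies a training setup along these lines: the base model is loaded in 8-bit and wrapped with a LoRA adapter before fine-tuning. The sketch below is illustrative only; the LoRA rank, alpha, and target modules are assumptions, not values published with this card.

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# Load the base model in 8-bit, as described above.
bnb_config = BitsAndBytesConfig(load_in_8bit=True)
base_model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2b",
    quantization_config=bnb_config,
    device_map="auto",
)

# Wrap the base model with a LoRA adapter. The rank, alpha, and target
# modules here are assumptions for illustration; the actual values used
# for this model are not documented in the card.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()
```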

## How to Get Started with the Model

```python
from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the PEFT configuration
config = PeftConfig.from_pretrained("Lei-bw/text-to-sql-fm")

# Load the base model
base_model = AutoModelForCausalLM.from_pretrained("google/gemma-2b")

# Load the fine-tuned adapter on top of the base model
model = PeftModel.from_pretrained(base_model, "Lei-bw/text-to-sql-fm")

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b")

# Example usage
text = "What is the average salary of employees in the sales department?"
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)

print(generated_text)
```
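
Because the training data pairs each question with a CREATE TABLE statement, including the table schema in the prompt generally helps the model produce grounded SQL. The exact prompt template used during fine-tuning is not documented here, so the format below is an assumption.

```python
# Hypothetical prompt that supplies the table schema as context; the actual
# training prompt template is not documented in this card.
schema = "CREATE TABLE employees (name TEXT, salary INTEGER, department TEXT)"
question = "What is the average salary of employees in the sales department?"
prompt = f"Context: {schema}\nQuestion: {question}\nSQL:"

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```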

## Training Details

### Training Data

The model was trained on the b-mc2/sql-create-context dataset, which pairs natural-language questions and their CREATE TABLE schema context with the corresponding SQL queries.
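
A quick way to inspect the data, assuming the datasets library is installed:

```python
from datasets import load_dataset

# Each record contains a natural-language question, a CREATE TABLE context,
# and the target SQL query.
dataset = load_dataset("b-mc2/sql-create-context")
print(dataset["train"][0])
```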

### Training Hyperparameters

- **Training regime:** bf16 mixed precision
- **Batch size:** 16
- **Gradient accumulation steps:** 4
- **Warmup steps:** 50
- **Number of epochs:** 2
- **Learning rate:** 2e-4
- **Weight decay:** 0.01
- **Optimizer:** AdamW
- **Learning rate scheduler:** Linear
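
For reference, these values might map onto transformers.TrainingArguments roughly as follows. The output directory, optimizer string, and any arguments not listed above are assumptions; the actual training script is not published with this card.

```python
from transformers import TrainingArguments

# Illustrative mapping of the listed hyperparameters; assumed values are noted.
training_args = TrainingArguments(
    output_dir="text-to-sql-fm",   # assumed output path
    per_device_train_batch_size=16,
    gradient_accumulation_steps=4,
    warmup_steps=50,
    num_train_epochs=2,
    learning_rate=2e-4,
    weight_decay=0.01,
    optim="adamw_torch",           # AdamW optimizer
    lr_scheduler_type="linear",
    bf16=True,                     # bf16 mixed precision
)
```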

### Hardware

- **Hardware Type:** NVIDIA A100
- **GPU RAM:** 40 GB

### Framework versions

- PEFT 0.12.0