---
license: mit
base_model: Deepseek-R1
tags:
- text-generation
- sql
- lora
- unsloth
- Deepseek
---
# SQLNova - LoRA Fine-Tuned Deepseek 8B for Text-to-SQL Generation
**SQLNova** is a lightweight LoRA adapter fine-tuned with Unsloth on top of the DeepSeek R1 8B distilled Llama model. It converts natural language instructions into valid SQL queries with minimal compute overhead, which makes it well suited for integration into data-driven applications or chat interfaces.
The model was trained on over **100,000 natural language-to-SQL pairs** spanning diverse domains, including Education, Technical, Healthcare, and more.
---
## Model Dependencies
- **Python version**: `3.10`
- **Libraries**: `unsloth` (install with `pip install unsloth`)
## Model Highlights
- **Base model**: `Deepseek R1 8B Distilled Llama`
- **Tokenizer**: Compatible with `Deepseek R1 8B Distilled Llama`
- **Fine-tuned for**: Text-to-SQL conversion
- **Accuracy**: > 85%
- **Language**: Fine-tuned on English natural-language sentences
- **Format**: `safetensors`
### General Information
- **Model type:** Text Generation
- **Language:** English
- **License:** MIT
- **Base model:** DeepSeek R1 distilled on Llama3 8B
### Model Repository
- **Hugging Face Model Card:** [https://huggingface.co/mervp/SQLNova](https://huggingface.co/mervp/SQLNova)
---
## 💡 Intended Uses
### Applications
- Generating SQL queries from natural language prompts
- Powering AI assistants for databases
- Enhancing SQL query builders or no-code data tools
- Automating analytics workflows
---
## Limitations
**SQLNova** is a reasoning model and performs well in many real-world scenarios, but it has some limitations:
- In rare cases it may produce **invalid SQL**, especially for unusual or malformed inputs.
- Assumes a **generic SQL dialect**, resembling MySQL/PostgreSQL syntax.
### Recommendations for Use
- Always **validate generated SQL** before executing in production.
- Include **schema context** in prompts to improve accuracy.
- Use with **human-in-the-loop** review for critical applications.
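One way to act on the first recommendation is a dry run against a throwaway database. The sketch below is only an illustration and is not part of SQLNova: `validate_sql` is a hypothetical helper, and it assumes that the schema and the generated query are both accepted by SQLite, which differs in places from MySQL and PostgreSQL. It loads the schema into an in-memory database and asks SQLite to compile the query with `EXPLAIN`, so syntax errors and unknown tables or columns surface without touching real data.

```python
import sqlite3


def validate_sql(schema_sql: str, query: str) -> tuple[bool, str]:
    """Hypothetical helper: check that `query` compiles against `schema_sql`.

    Returns (True, "") when SQLite accepts the query, otherwise
    (False, error message). Assumes SQLite-compatible SQL.
    """
    conn = sqlite3.connect(":memory:")  # throwaway database, nothing is persisted
    try:
        conn.executescript(schema_sql)    # recreate the tables from the schema context
        conn.execute(f"EXPLAIN {query}")  # compile the query instead of running it
        return True, ""
    except (sqlite3.Error, sqlite3.Warning) as exc:
        # sqlite3.Warning covers multi-statement input on older Python versions
        return False, str(exc)
    finally:
        conn.close()


if __name__ == "__main__":
    schema = """
    CREATE TABLE socially_responsible_lending (
        customer_id INT,
        name VARCHAR(50),
        account_balance DECIMAL(10, 2)
    );
    """
    generated = "SELECT name FROM socially_responsible_lending WHERE account_balance > 6000"
    ok, error = validate_sql(schema, generated)
    print("valid" if ok else f"invalid: {error}")
```

A check like this only catches queries that fail to compile. It does not confirm that the query answers the user's question, so human review remains useful for critical applications.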
Thanks for visiting and downloading this model!
If this model helped you, please consider leaving a like. Your support helps this model reach more developers and encourages further improvements.
---
## How to Use the Model
```python
from unsloth import FastLanguageModel

# Load the SQLNova adapter together with its tokenizer.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="mervp/SQLNova",
    max_seq_length=2048,
    dtype=None,  # None lets Unsloth choose the dtype automatically
)

# Prompt template with slots for the question, the schema context,
# and the SQL query (left empty at inference time).
prompt = """ You are an text to SQL query translator.
Users will ask you questions in English
and you will generate a SQL query based on their question
SQL has to be simple, The schema context has been provided to you.
### User Question:
{}
### Sql Context:
{}
### Sql Query:
{}
"""

question = "List the names of customers who have an account balance greater than 6000."
schema = """
CREATE TABLE socially_responsible_lending (
    customer_id INT,
    name VARCHAR(50),
    account_balance DECIMAL(10, 2)
);
INSERT INTO socially_responsible_lending VALUES
    (1, 'james Chad', 5000),
    (2, 'Jane Rajesh', 7000),
    (3, 'Alia Kapoor', 6000),
    (4, 'Fatima Patil', 8000);
"""

# Tokenize the filled-in prompt and move the tensors to the GPU.
inputs = tokenizer(
    [prompt.format(question, schema, "")],
    return_tensors="pt",
    padding=True,
    truncation=True,
).to("cuda")

# Sample with a low temperature to keep the generated SQL focused.
output = model.generate(
    **inputs,
    max_new_tokens=256,
    temperature=0.2,
    top_p=0.9,
    top_k=50,
    do_sample=True,
)

decoded_output = tokenizer.decode(output[0], skip_special_tokens=True)

# Keep only the text after the last "### Sql Query:" marker, if present.
if "### Sql Query:" in decoded_output:
    sql_query = decoded_output.split("### Sql Query:")[-1].strip()
else:
    sql_query = decoded_output.strip()

print(sql_query)
```