Upload README_HF.md with huggingface_hub

README_HF.md CHANGED (+111 -56)

@@ -10,97 +10,152 @@ tags:
-# MySQL Query Generator - From Scratch
-
-## Model
-
-- **Architecture**: Custom from-scratch implementation
-- **Training**: No pre-trained weights used
-- **Language**: English (Natural Language to SQL)
-- **License**: Apache 2.0
-
-|-----------|-------|
-| Layers | 8 |
-| Attention Heads | 8 |
-| Hidden Size | 512 |
-| Feed Forward Size | 2048 |
-| Max Sequence Length | 512 |
-| Dropout | 0.1 |
-| Total Parameters | 29,789,184 |
-| Model Size | 113.6 MB |
-
-- **Training Time**: 12 minutes
-- **Hardware**: RTX 5080 16GB
-- **Optimizer**: AdamW
-- **Scheduler**: CosineAnnealingLR
-- **Epochs**: 8
-
-- **Final Training Loss**: 0.3178
-- **Final Perplexity**: 1.42
-- **Convergence**: Excellent
-- **Overfitting**: None detected
-
-- Synthetic SQL queries
-- Spider dataset
-- WikiSQL dataset
-
-## Files
-
-- `complete_model_package.pt`: Full model package with all components
-- `model_info.json`: Detailed model specifications
-- `training_metrics.json`: Training performance data
-- `SQLModel.ipynb`: Complete training notebook
-
-howpublished={\url{https://huggingface.co/karthik-2905/nl2sql-pretrained}}
-
-## License
  - gpt
  - from-scratch
  - pytorch
  - nl2sql
  - natural-language-to-sql
  - query-generation
library_name: transformers
pipeline_tag: text-generation
widget:
  - text: "Show me all customers from New York"
    example_title: "Customer Query"
  - text: "Find total sales for each product"
    example_title: "Aggregate Query"
  - text: "List employees with salary greater than 50000"
    example_title: "Conditional Query"
---

# MySQL Query Generator - From Scratch

A GPT-style transformer model trained completely from scratch for natural language to MySQL query generation. This model demonstrates that a capable, task-specific language model can be built without relying on pre-trained weights, achieving strong performance with a compact architecture.

## Model Overview

This model converts natural language descriptions into syntactically correct MySQL queries. It was trained entirely from scratch using a custom transformer architecture tailored to SQL generation.

### Key Features

- **Built from Scratch**: No pre-trained weights; pure end-to-end training
- **Lightweight**: Compact 29.8M parameters for efficient deployment
- **High Performance**: Excellent convergence with no detected overfitting
- **MySQL Optimized**: Specifically tuned for MySQL syntax and patterns
- **Production Ready**: Robust performance across diverse query types

## Architecture

| Component | Specification |
|-----------|---------------|
| **Model Type** | GPT-style Transformer (decoder-only) |
| **Layers** | 8 |
| **Attention Heads** | 8 |
| **Hidden Dimensions** | 512 |
| **Feed Forward Size** | 2048 |
| **Max Sequence Length** | 512 tokens |
| **Dropout Rate** | 0.1 |
| **Total Parameters** | 29,789,184 |
| **Model Size** | 113.6 MB |
| **Vocabulary Size** | 4,206 tokens |
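The figures in this table are internally consistent. Assuming a standard GPT-style layout (learned positional embeddings, biased attention and feed-forward projections, two LayerNorms per block plus a final LayerNorm, and an untied bias-free output head), the listed dimensions reproduce the stated parameter count exactly; note that this layout is inferred from the numbers, not taken from the released code:

```python
# Parameter-count sanity check for the table above. The per-layer layout is
# an assumption (standard GPT-style block), inferred to match the stated total.
vocab, d_model, n_layers, d_ffn, max_seq = 4206, 512, 8, 2048, 512

tok_emb = vocab * d_model                 # token embedding table
pos_emb = max_seq * d_model               # learned positional embeddings
attn = 4 * (d_model * d_model + d_model)  # Q, K, V, out projections (+bias)
ffn = (d_model * d_ffn + d_ffn) + (d_ffn * d_model + d_model)  # two linears (+bias)
norms = 2 * 2 * d_model                   # two LayerNorms (gain + bias)
block = attn + ffn + norms

final_norm = 2 * d_model                  # final LayerNorm
lm_head = vocab * d_model                 # untied output head, no bias

total = tok_emb + pos_emb + n_layers * block + final_norm + lm_head
print(total)                              # -> 29789184
print(round(total * 4 / 1024**2, 1))      # fp32 bytes -> 113.6 (MB)
```

At fp32 (4 bytes per parameter), the same total also reproduces the 113.6 MB model size.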

## Performance Metrics

| Metric | Value |
|--------|-------|
| **Validation Loss** | 0.3485 |
| **Training Loss** | 0.3178 |
| **Perplexity** | 1.42 |
| **Convergence** | Excellent |
| **Overfitting** | None detected |
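As a quick consistency check, perplexity is the exponential of the cross-entropy loss; the reported 1.42 corresponds to the validation loss of 0.3485 (the training loss of 0.3178 would give roughly 1.37), so it was presumably computed on the validation set:

```python
import math

# Perplexity = exp(cross-entropy loss). The reported 1.42 matches the
# validation loss, suggesting it was measured on the validation set.
val_loss = 0.3485
perplexity = math.exp(val_loss)
print(round(perplexity, 2))  # -> 1.42
```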

## Training Configuration

- **Framework**: PyTorch
- **Optimizer**: AdamW with weight decay
- **Learning Rate Scheduler**: CosineAnnealingLR
- **Training Epochs**: 8
- **Training Examples**: 24,293 high-quality samples
- **Hardware**: NVIDIA RTX 5080 16GB
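The optimizer and scheduler pairing above can be sketched in PyTorch as follows; the learning rate, weight decay, and `T_max` values are illustrative assumptions, not the actual training hyperparameters:

```python
import torch

# Illustrative setup only: lr, weight_decay, and T_max are assumed values,
# not the hyperparameters used to train this model.
model = torch.nn.Linear(512, 512)  # stand-in for the real transformer
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4, weight_decay=0.01)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=8)

for epoch in range(8):
    # ... one epoch of training steps would run here ...
    optimizer.step()   # placeholder step so the scheduler sees progress
    scheduler.step()   # cosine-decay the learning rate once per epoch
```

Stepping the scheduler once per epoch with `T_max=8` anneals the learning rate from its initial value down to (near) zero over the 8 training epochs.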

## Dataset

The model was trained on a carefully curated dataset of **24,293 high-quality examples** sourced from:

- **Synthetic SQL Queries**: Custom-generated queries covering diverse MySQL patterns
- **Spider Dataset**: Complex multi-table queries with natural language descriptions
- **WikiSQL Dataset**: Real-world table-question pairs adapted for MySQL

All queries were optimized for MySQL syntax and best practices, ensuring production-ready output.

## Usage

This model converts natural language descriptions into syntactically correct MySQL queries. It is well suited for:

- Database query assistants
- Business intelligence tools
- Educational SQL learning platforms
- Automated report generation

### Example Queries

```python
# Basic selection
"Show me all customers from New York"
# -> SELECT * FROM customers WHERE city = 'New York';

# Aggregation
"Find total sales for each product"
# -> SELECT product_name, SUM(sales) FROM sales_table GROUP BY product_name;

# Conditional filtering
"List employees with salary greater than 50000"
# -> SELECT * FROM employees WHERE salary > 50000;
```
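A minimal loading sketch is shown below. It assumes the checkpoint can be fetched with `huggingface_hub` and deserialized with `torch.load`; the internal structure of `complete_model_package.pt` (and any model class it references) is not documented here, so treat the unpacking as hypothetical:

```python
import torch
from huggingface_hub import hf_hub_download

def load_package(repo_id: str = "karthik-2905/nl2sql-pretrained"):
    """Download and deserialize the full model package.

    Hypothetical sketch: fetching with hf_hub_download and loading with
    torch.load are standard, but the package's contents are assumptions,
    not documented fields of complete_model_package.pt.
    """
    path = hf_hub_download(repo_id=repo_id,
                           filename="complete_model_package.pt")
    return torch.load(path, map_location="cpu")

# Call load_package() when needed; the checkpoint is roughly 114 MB.
```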
## Model Files

| File | Description |
|------|-------------|
| `best_pretrained_model.pt` | Optimized model checkpoint for inference |
| `complete_model_package.pt` | Full model package with all components |
| `model_info.json` | Detailed model specifications and metadata |
| `training_metrics.json` | Comprehensive training performance data |
| `SQLModel.ipynb` | Complete training and evaluation notebook |

## Technical Details

### Model Capabilities

- **Multi-table Joins**: Handles complex relationships between tables
- **Aggregation Functions**: SUM, COUNT, AVG, MIN, MAX operations
- **Conditional Logic**: WHERE clauses with AND/OR operators
- **Sorting & Grouping**: ORDER BY and GROUP BY operations
- **Subqueries**: Nested query generation for complex requirements

### Limitations

- Optimized specifically for MySQL syntax (may not work with other SQL dialects)
- Best performance on queries similar to training data patterns
- May require fine-tuning for highly specialized domain vocabularies

## Citation

If you use this model in your research or applications, please cite:

```bibtex
@misc{mysql-query-generator-from-scratch,
  title={MySQL Query Generator: A GPT-style Transformer Trained From Scratch},
  author={Anonymous},
  year={2025},
  howpublished={\url{https://huggingface.co/karthik-2905/nl2sql-pretrained}},
  note={Natural Language to SQL Query Generation}
}
```

## License

This model is released under the **Apache 2.0 License**, allowing both commercial and non-commercial use.

## Community & Support

- **Open Source**: Community-driven development
- **Contributions Welcome**: Feel free to submit improvements
- **Issues & Feedback**: Report problems or suggest enhancements
- **Educational Use**: Well suited for learning NL2SQL concepts

---

**If you find this model useful, please give it a star and share it with others!**