Update README.md
README.md
CHANGED
@@ -25,6 +25,50 @@ AI2SQL is a specialized LLM fine-tuned from Falcon-7b-instruct with PEFT- LoRA t
AI2SQL is designed for data analysts, business intelligence professionals, and developers to facilitate the conversion of natural language questions into SQL queries. This tool aids those who are not proficient in SQL, enabling easier database querying. AI2SQL's performance is inherently tied to the characteristics of its training data. While it has been trained on a diverse and substantial dataset, it may not account for all possible SQL dialects or database structures. Careful review of the generated SQL queries is recommended.

## Inference

### Model Deployment
AI2SQL is designed for efficient real-time inference, making it suitable for interactive applications where users query databases using natural language.

### Computational Requirements
- **Hardware Requirements**: AI2SQL performs optimally on [specify CPU/GPU requirements].
- **Memory Footprint**: The model requires [X] GB of RAM for inference.
- **Latency**: The average response time for generating a SQL query is approximately [X] milliseconds.
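
The latency figure above is still a placeholder. One way to obtain it for your own hardware is to time repeated calls. The following is a minimal sketch, assuming a hypothetical `generate_sql` callable that wraps the model; a stub stands in for it here so the snippet runs without the model loaded.

```python
import statistics
import time
from typing import Callable, Dict, List


def measure_latency_ms(generate_sql: Callable[[str], str],
                       questions: List[str],
                       warmup: int = 1) -> Dict[str, float]:
    """Time each call and return simple latency statistics in milliseconds."""
    # Warm-up calls are excluded so one-time setup cost does not skew results.
    for question in questions[:warmup]:
        generate_sql(question)

    timings = []
    for question in questions:
        start = time.perf_counter()
        generate_sql(question)
        timings.append((time.perf_counter() - start) * 1000.0)

    return {
        "runs": len(timings),
        "mean_ms": statistics.mean(timings),
        "median_ms": statistics.median(timings),
        "max_ms": max(timings),
    }


def stub_generate_sql(question: str) -> str:
    # Hypothetical stand-in for the real model call.
    return "SELECT 1;"


stats = measure_latency_ms(
    stub_generate_sql,
    ["How many products were sold last month?"] * 5,
)
print(stats)
```

Replacing `stub_generate_sql` with a function that calls the loaded model would give numbers that reflect the actual deployment.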

### Usage Guidelines
To use AI2SQL for generating SQL queries, follow these steps:

1. **Preparation**: Ensure that your system meets the hardware and software requirements for running the model.
2. **Input Formatting**: Format your natural language questions clearly and concisely for best results.
3. **Model Invocation**: Call the AI2SQL model with the natural language question as input. The model returns the corresponding SQL query as output.
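
Steps 2 and 3 can be sketched as follows. The prompt layout, the `build_prompt` and `extract_sql` helpers, and the optional schema argument are assumptions made for illustration; this README does not specify the prompt format the model was trained on.

```python
from typing import Optional


def build_prompt(question: str, schema: Optional[str] = None) -> str:
    """Normalize whitespace and assemble a prompt (hypothetical layout)."""
    cleaned = " ".join(question.split())
    if not cleaned:
        raise ValueError("question must not be empty")
    parts = []
    if schema:
        parts.append("Schema:\n" + schema.strip())
    parts.append("Question: " + cleaned)
    parts.append("SQL:")
    return "\n".join(parts)


def extract_sql(generated_text: str, prompt: str) -> str:
    """Drop the echoed prompt, if present, and keep text up to the first semicolon."""
    if generated_text.startswith(prompt):
        text = generated_text[len(prompt):]
    else:
        text = generated_text
    text = text.strip()
    end = text.find(";")
    return text[: end + 1] if end != -1 else text


prompt = build_prompt(
    "  How many products were\n sold last month? ",
    schema="CREATE TABLE sales (product_id INT, sold_at DATE);",
)
print(prompt)
```

The prompt would then be passed to the model, and `extract_sql` applied to the raw output before the query is reviewed.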

### Example Code for Inference
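```python
from transformers import pipeline

# Initialize the AI2SQL model.
# 'text-to-sql' is not a built-in transformers pipeline task; a Falcon-based
# causal LM is normally served through the 'text-generation' task.
# 'ai2sql' is a placeholder: replace it with the actual Hub repo id or a local path.
ai2sql = pipeline('text-generation', model='ai2sql')

# Example natural language question
question = "How many products were sold last month?"

# Generate the SQL query. The pipeline returns a list of dicts, and
# 'generated_text' includes the prompt by default.
outputs = ai2sql(question, max_new_tokens=128)
sql_query = outputs[0]['generated_text']
print("Generated SQL Query:", sql_query)
```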

### Scalability
AI2SQL is scalable and can handle concurrent requests, making it suitable for deployment in high-demand environments.
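
As a rough illustration of serving several requests at once, the sketch below fans questions out over a thread pool. The `generate_many` helper and the stub model call are hypothetical; whether threads actually improve throughput depends on the serving backend, and batching requests may suit a single-GPU deployment better.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def stub_generate_sql(question: str) -> str:
    # Hypothetical stand-in for the real model call.
    return f"-- SQL for: {question}"


def generate_many(generate_sql: Callable[[str], str],
                  questions: List[str],
                  max_workers: int = 4) -> List[str]:
    """Run the generator over all questions concurrently, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(generate_sql, questions))


results = generate_many(
    stub_generate_sql,
    ["How many products were sold last month?", "List all customers."],
)
print(results)
```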

### Error Handling
The model includes robust error handling for invalid inputs and provides meaningful error messages to guide users in correcting their queries.
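
At the application level, this kind of handling might look like the sketch below. It is an assumption about how a caller could wrap the model, not a description of the model's internals: `InvalidQuestionError`, `validate_question`, and `check_sql` are hypothetical names, and the SQL check uses an in-memory SQLite database, so it only reflects SQLite's dialect.

```python
import sqlite3
from typing import Optional


class InvalidQuestionError(ValueError):
    """Raised when the natural language input cannot be processed."""


def validate_question(question: str, max_chars: int = 500) -> str:
    """Return the cleaned question or raise with a message the user can act on."""
    if not isinstance(question, str):
        raise InvalidQuestionError("The question must be a string.")
    cleaned = question.strip()
    if not cleaned:
        raise InvalidQuestionError("The question is empty; describe what you want to query.")
    if len(cleaned) > max_chars:
        raise InvalidQuestionError(
            f"The question has {len(cleaned)} characters; the limit is {max_chars}."
        )
    return cleaned


def check_sql(sql: str, schema_ddl: str) -> Optional[str]:
    """Return None if SQLite can compile the query against the schema, else a message."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema_ddl)  # build an empty copy of the schema
        conn.execute("EXPLAIN " + sql)  # compiles the statement without running the query
        return None
    except (sqlite3.Error, sqlite3.Warning) as exc:
        return f"Generated SQL was rejected: {exc}"
    finally:
        conn.close()


ddl = "CREATE TABLE sales (product_id INTEGER, sold_at TEXT);"
print(check_sql("SELECT COUNT(*) FROM sales", ddl))
print(check_sql("SELECT COUNT(*) FROM orders", ddl))
```

A check like this catches syntax errors and references to missing tables or columns before a query reaches a real database, which complements the manual review recommended above.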

### Security Considerations
Users should be aware of security implications when using AI2SQL, especially when dealing with sensitive data or integrating the model into secure environments. Ensure all data handling complies with relevant privacy and security regulations.
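
One common precaution is to refuse anything other than a single read-only statement before executing generated SQL. The guard below is a heuristic sketch with a hypothetical `is_read_only` helper: it can reject valid queries whose string literals contain semicolons or blocked keywords, and it is not a substitute for running the model's output under a database account with read-only permissions.

```python
import re

# Keywords that modify data, schema, or permissions (not an exhaustive list).
FORBIDDEN = re.compile(
    r"\b(INSERT|UPDATE|DELETE|DROP|ALTER|CREATE|TRUNCATE|GRANT|REVOKE|ATTACH|PRAGMA)\b",
    re.IGNORECASE,
)


def is_read_only(sql: str) -> bool:
    """Return True only for a single statement that starts with SELECT or WITH."""
    statement = sql.strip().rstrip(";").strip()
    if not statement or ";" in statement:
        return False  # empty input or more than one statement
    if not re.match(r"(SELECT|WITH)\b", statement, re.IGNORECASE):
        return False
    return FORBIDDEN.search(statement) is None


for candidate in [
    "SELECT COUNT(*) FROM sales;",
    "SELECT 1; DROP TABLE sales",
    "DELETE FROM sales",
]:
    print(candidate, "->", is_read_only(candidate))
```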

## Training and evaluation data

Trained on a comprehensive dataset comprising 262,000 rows of paired natural language questions and SQL queries sourced from Text-to-SQL Dataset, covering a wide array of domains and question complexities.