polyglot-ko-1b-txt2sql
polyglot-ko-1b-txt2sql is a text-generation model fine-tuned to convert Korean natural-language questions into SQL queries.
It is based on EleutherAI/polyglot-ko-1.3b and was fine-tuned in a lightweight fashion with LoRA.
Model Information
- Base model: EleutherAI/polyglot-ko-1.3b
- Fine-tuning: QLoRA (4-bit quantization + PEFT); see the configuration sketch after this list
- Task: Text2SQL (natural language → SQL conversion)
- Tokenizer: the base model's tokenizer, used unchanged
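
The training script itself is not part of this card. As a rough illustration, a QLoRA setup for this base model could look like the sketch below; the quantization settings and the LoRA hyperparameters (r, lora_alpha, target_modules) are assumptions for illustration, not the values used for this checkpoint.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_id = "EleutherAI/polyglot-ko-1.3b"

# Load the frozen base weights in 4-bit NF4 so they fit on a small GPU.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    base_id, quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(base_id)

# Attach small trainable LoRA adapter matrices; only these are updated.
model = prepare_model_for_kbit_training(model)
lora_config = LoraConfig(
    r=8,                                 # assumed rank
    lora_alpha=32,                       # assumed scaling factor
    lora_dropout=0.05,
    target_modules=["query_key_value"],  # the GPT-NeoX attention projection used by polyglot-ko
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()

Training then proceeds with an ordinary causal-LM objective (for example transformers.Trainer or trl's SFTTrainer) over the question-SQL pairs described in the next section.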
Training Data
The model was fine-tuned on natural-language question-SQL pairs designed for the Korean Text2SQL task; a prompt-formatting sketch follows the source list below.
The data was built from the following two sources:
- A subset of the shangrilar/ko_text2sql dataset
- Synthetic Korean question-SQL pairs generated with an OpenAI GPT-family LLM
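
The preprocessing code is likewise not published. Assuming each record carries a DDL statement, a Korean question, and the target SQL query, a plausible way to render it into the DDL / Question / SQL layout used at inference time is the following sketch; format_example is a hypothetical helper, not part of the released code.

# Hypothetical helper: renders one (ddl, question, sql) record into the same
# "### DDL / ### Question / ### SQL" layout shown in the usage example below.
def format_example(ddl: str, question: str, sql: str) -> str:
    return (
        "당신은 SQL 전문가입니다.\n"  # "You are an SQL expert."
        f"### DDL:\n{ddl}\n"
        f"### Question:\n{question}\n"
        f"### SQL:\n{sql}"
    )

text = format_example(
    ddl="CREATE TABLE players (player_id INT PRIMARY KEY, username VARCHAR(255));",
    question="사용자 이름에 'admin'이 포함된 계정 수는?",  # "How many accounts have a username containing 'admin'?"
    sql="SELECT COUNT(*) FROM players WHERE username LIKE '%admin%';",
)
print(text)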
Usage Example
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

# Replace "your-username/polyglot-ko-1b-txt2sql" with the actual repository id.
model = AutoModelForCausalLM.from_pretrained("your-username/polyglot-ko-1b-txt2sql", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("your-username/polyglot-ko-1b-txt2sql")
generator = pipeline("text-generation", model=model, tokenizer=tokenizer)
# The prompt follows the DDL / Question / SQL layout used during fine-tuning.
# Korean in the prompt: "You are an SQL expert."; the question asks how many
# accounts have a username containing 'admin'.
prompt = """
당신은 SQL 전문가입니다.
### DDL:
CREATE TABLE players (
player_id INT PRIMARY KEY AUTO_INCREMENT,
username VARCHAR(255) UNIQUE NOT NULL,
email VARCHAR(255) UNIQUE NOT NULL,
password_hash VARCHAR(255) NOT NULL,
date_joined DATETIME NOT NULL,
last_login DATETIME
);
### Question:
사용자 이름에 'admin'이 포함된 계정 수는?
### SQL:
"""
# Greedy decoding; the returned text includes the prompt itself.
outputs = generator(prompt, do_sample=False, max_new_tokens=128)
print(outputs[0]["generated_text"])
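
Because the pipeline returns the prompt together with the completion, it is usually convenient to keep only the SQL part. A minimal post-processing sketch (the stop-marker handling is an assumption; adjust it to your prompt layout):

# Keep only what follows the "### SQL:" marker and drop anything the model
# generates after a new "###" section header.
def extract_sql(generated_text: str) -> str:
    sql_part = generated_text.split("### SQL:")[-1]
    sql_part = sql_part.split("###")[0]
    return sql_part.strip()

print(extract_sql(outputs[0]["generated_text"]))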