 ---
 license: apache-2.0
+tags:
+- smollm
+- python
+- code-generation
+- instruct
+- qlora
+- fine-tuned
+- code
+- nf4
+datasets:
+- flytech/python-codes-25k
+model-index:
+- name: HF-SmolLM-1.7B-0.5B-4bit-coder
+  results: []
+language:
+- en
+pipeline_tag: text-generation
 ---
+# HF-SmolLM-1.7B-0.5B-4bit-coder
+## Model Summary
+**HF-SmolLM-1.7B-0.5B-4bit-coder** is a fine-tuned variant of [SmolLM-1.7B](https://huggingface.co/HuggingFaceTB/SmolLM-1.7B), optimized for **instruction-following in Python code generation tasks**.
+It was trained on a **1,500-sample subset** of the [flytech/python-codes-25k](https://huggingface.co/datasets/flytech/python-codes-25k) dataset using **parameter-efficient fine-tuning (QLoRA 4-bit)**.
+The model is suitable for:
+- Generating Python code snippets from natural language instructions
+- Completing short code functions
+- Educational prototyping of fine-tuned LMs
+⚠️ This is **not a production-ready coding assistant**. Generated outputs must be manually reviewed before execution.
+---
+## Intended Uses & Limitations
+### ✅ Intended
+- Research on parameter-efficient fine-tuning
+- Educational demos of instruction-tuning workflows
+- Prototype code generation experiments
+### ❌ Not Intended
+- Deployment in production coding assistants
+- Safety-critical applications
+- Long-context multi-file programming tasks
+---
+## Training Details
+### Base Model
+- **Name:** [HuggingFaceTB/SmolLM-1.7B](https://huggingface.co/HuggingFaceTB/SmolLM-1.7B)
+- **Architecture:** Decoder-only causal LM
+- **Total Parameters:** 1.72B
+- **Fine-tuned Trainable Parameters:** ~9M (0.53%)
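The trainable fraction quoted above can be checked with a few lines of arithmetic. The sketch below also shows how a LoRA adapter's size follows from its rank. The rank, the list of adapted projections, and the layer dimensions are illustrative assumptions, not values stated in this card.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Parameters added by one LoRA adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


# Figures quoted in this card ("~9M" is approximate).
TOTAL_PARAMS = 1.72e9
TRAINABLE_PARAMS = 9e6
trainable_pct = 100 * TRAINABLE_PARAMS / TOTAL_PARAMS

# Hypothetical adapter setup (an assumption, NOT confirmed by this card):
# rank 8 on the q/k/v/o and gate/up/down projections of a 24-layer model
# with hidden size 2048 and MLP size 8192.
HIDDEN, INTERMEDIATE, LAYERS, RANK = 2048, 8192, 24, 8
attn_params = 4 * lora_param_count(HIDDEN, HIDDEN, RANK)        # q, k, v, o
mlp_params = 3 * lora_param_count(HIDDEN, INTERMEDIATE, RANK)   # gate, up, down
hypothetical_total = LAYERS * (attn_params + mlp_params)

print(f"Quoted trainable share: {trainable_pct:.2f}%")
print(f"Hypothetical adapter size: {hypothetical_total:,} parameters")
```

Because "~9M" is rounded, the computed share lands slightly below the quoted 0.53%. The hypothetical configuration is only one setup that gives a count of roughly that size.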
+### Dataset
+- **Source:** [flytech/python-codes-25k](https://huggingface.co/datasets/flytech/python-codes-25k)
+- **Subset Used:** 1,500 randomly sampled examples
+- **Content:** Instruction + optional input → Python code output
+- **Formatting:** Converted into `chat` format with `user` / `assistant` roles
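The conversion to chat format could look roughly like the sketch below. The field names `instruction`, `input`, and `output` are assumed from the dataset description above, and `to_chat` is a hypothetical helper, not the exact preprocessing used for this model.

```python
def to_chat(example: dict) -> dict:
    """Turn one instruction record into a user/assistant message pair."""
    instruction = example["instruction"].strip()
    extra = (example.get("input") or "").strip()
    # Append the optional input to the instruction only when it is non-empty.
    user_content = f"{instruction}\n\n{extra}" if extra else instruction
    return {
        "messages": [
            {"role": "user", "content": user_content},
            {"role": "assistant", "content": example["output"].strip()},
        ]
    }


record = {
    "instruction": "Write a function that doubles a number.",
    "input": "",
    "output": "def double(x):\n    return 2 * x",
}
chat = to_chat(record)
print(chat["messages"][0]["role"], "->", chat["messages"][1]["role"])
```

A function of this shape can be applied to each sampled record before the records are passed to the trainer.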
+### Training Procedure
+- **Framework:** Hugging Face Transformers + TRL (SFTTrainer)
+- **Quantization:** 4-bit QLoRA (nf4) with bfloat16 compute when available
+- **Effective Batch Size:** 6 (per-device batch size × gradient accumulation steps)
+- **Optimizer:** AdamW
+- **Scheduler:** Cosine decay with warmup ratio 0.05
+- **Epochs:** 3
+- **Learning Rate:** 2e-4
+- **Max Seq Length:** 64 tokens (training)
+- **Mixed Precision:** FP16
+- **Gradient Checkpointing:** Enabled
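The hyperparameters above imply a fairly short run. The sketch below works out the step counts and the learning rate at each step under a cosine schedule with linear warmup. It assumes all 1,500 examples are used for training (no held-out split) and one optimizer step per effective batch; the trainer's own rounding may differ slightly.

```python
import math

NUM_EXAMPLES = 1500      # assumption: the whole subset is used for training
EFFECTIVE_BATCH = 6
EPOCHS = 3
PEAK_LR = 2e-4
WARMUP_RATIO = 0.05

steps_per_epoch = math.ceil(NUM_EXAMPLES / EFFECTIVE_BATCH)
total_steps = steps_per_epoch * EPOCHS
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)


def lr_at(step: int) -> float:
    """Learning rate at a given optimizer step: linear warmup, then cosine decay to zero."""
    if step < warmup_steps:
        return PEAK_LR * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


print(f"{total_steps} optimizer steps, {warmup_steps} of them warmup")
for step in (0, warmup_steps, total_steps // 2, total_steps):
    print(step, f"{lr_at(step):.2e}")
```

Under these assumptions the run takes 750 optimizer steps, with the peak rate of 2e-4 reached after 38 warmup steps.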
+---
+## Evaluation
+No formal benchmark evaluation has been conducted yet.
+In informal manual testing, the model:
+- Produces syntactically valid Python code for simple tasks
+- Adheres to given instructions with reasonable accuracy
+- Struggles with multi-step reasoning and long code outputs
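Until a formal benchmark is run, the syntactic-validity observation can at least be spot-checked automatically. The sketch below uses Python's standard `ast` module. The `is_valid_python` helper and the sample strings are illustrative and are not outputs of this model.

```python
import ast


def is_valid_python(source: str) -> bool:
    """Return True if the text parses as Python; the code is never executed."""
    try:
        ast.parse(source)
    except (SyntaxError, ValueError):
        return False
    return True


# Hand-written stand-ins for generated outputs.
samples = [
    "def add(a, b):\n    return a + b",
    "def add(a, b) return a + b",   # missing colon
]
valid = sum(is_valid_python(s) for s in samples)
print(f"{valid}/{len(samples)} samples parse as Python")
```

Parsing only confirms syntax. It says nothing about whether the code is correct or safe to run, so manual review is still required.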
+---
+## Example Usage
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+
+repo = "sweatSmile/HF-SmolLM-1.7B-0.5B-4bit-coder"
+tokenizer = AutoTokenizer.from_pretrained(repo)
+model = AutoModelForCausalLM.from_pretrained(repo, device_map="auto")
+
+# Wrap the instruction in the same chat format used during fine-tuning.
+prompt = "Write a Python function that checks if a number is prime."
+inputs = tokenizer.apply_chat_template(
+    [{"role": "user", "content": prompt}],
+    add_generation_prompt=True,
+    return_tensors="pt",
+    return_dict=True,  # also returns the attention mask
+).to(model.device)
+
+# Generate, then decode only the newly generated tokens (not the prompt).
+outputs = model.generate(**inputs, max_new_tokens=150)
+new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
+print(tokenizer.decode(new_tokens, skip_special_tokens=True))
+```