---
|
license: apache-2.0 |
|
tags: |
|
- chat |
|
- chatbot |
|
- LoRA |
|
- instruction-tuning |
|
- conversational |
|
- tinyllama |
|
- transformers |
|
language: |
|
- en |
|
datasets: |
|
- tatsu-lab/alpaca |
|
- databricks/databricks-dolly-15k |
|
- knkarthick/dialogsum |
|
- Anthropic/hh-rlhf |
|
- OpenAssistant/oasst1 |
|
- nomic-ai/gpt4all_prompt_generations |
|
- sahil2801/CodeAlpaca-20k |
|
- Open-Orca/OpenOrca |
|
model-index: |
|
- name: chatbot-v2 |
|
results: [] |
|
--- |
|
|
|
# 🤖 chatbot-v2: TinyLlama Instruction-Tuned Chatbot (LoRA)
|
|
|
`chatbot-v2` is a lightweight, instruction-following conversational AI model based on **TinyLlama** and fine-tuned with **LoRA** adapters. It was trained on a curated mixture of open datasets covering assistant-style responses, code generation, summarization, safety alignment, and dialogue reasoning.
|
|
|
The model is well suited to low-resource deployments: it can back mobile or edge apps through a small inference server, or be exposed directly via an API.
|
|
|
--- |
|
|
|
## 🧠 Base Model
|
|
|
- **Model**: [`TinyLlama/TinyLlama-1.1B-Chat`](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat) |
|
- **Architecture**: Decoder-only Transformer (GPT-style) |
|
- **Fine-tuning method**: LoRA (low-rank adapters) |
|
- **LoRA Parameters** (a config sketch follows this list):
|
- `r=16` |
|
- `alpha=32` |
|
- `dropout=0.05` |
|
- Target modules: `q_proj`, `v_proj` |
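For reference, these hyperparameters map onto a `peft` `LoraConfig` roughly as below. This is an illustrative reconstruction, not the actual training script; `bias` and `task_type` are assumed defaults rather than documented values:

```python
from peft import LoraConfig

# Illustrative reconstruction of the adapter config; only r, alpha,
# dropout, and the target modules are documented in this card.
lora_config = LoraConfig(
    r=16,                                 # rank of the low-rank update matrices
    lora_alpha=32,                        # scaling factor applied to the update
    lora_dropout=0.05,                    # dropout on the LoRA branch
    target_modules=["q_proj", "v_proj"],  # attention projections being adapted
    bias="none",                          # assumption: common default
    task_type="CAUSAL_LM",                # decoder-only language modeling
)
```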
|
|
|
--- |
|
|
|
## 📚 Training Datasets
|
|
|
The model was fine-tuned on the following instruction-following, summarization, and dialogue datasets; a sketch of one way to combine them follows the list:
|
|
|
- [`tatsu-lab/alpaca`](https://huggingface.co/datasets/tatsu-lab/alpaca) – Stanford Alpaca dataset

- [`databricks/databricks-dolly-15k`](https://huggingface.co/datasets/databricks/databricks-dolly-15k) – Dolly instruction data

- [`knkarthick/dialogsum`](https://huggingface.co/datasets/knkarthick/dialogsum) – Dialogue summarization

- [`Anthropic/hh-rlhf`](https://huggingface.co/datasets/Anthropic/hh-rlhf) – Helpful/harmless alignment data

- [`OpenAssistant/oasst1`](https://huggingface.co/datasets/OpenAssistant/oasst1) – OpenAssistant dialogues

- [`nomic-ai/gpt4all_prompt_generations`](https://huggingface.co/datasets/nomic-ai/gpt4all_prompt_generations) – Instructional prompt-response pairs

- [`sahil2801/CodeAlpaca-20k`](https://huggingface.co/datasets/sahil2801/CodeAlpaca-20k) – Programming/code-generation instructions

- [`Open-Orca/OpenOrca`](https://huggingface.co/datasets/Open-Orca/OpenOrca) – High-quality responses to complex questions
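The exact preprocessing and mixing recipe is not documented in this card. As a rough illustration, here is one way two of these datasets could be normalized into a shared instruction/response schema before training; the column names match the public datasets, but the mapping itself is an assumption:

```python
from datasets import concatenate_datasets, load_dataset

# Two of the instruction datasets (mixing weights are not documented)
alpaca = load_dataset("tatsu-lab/alpaca", split="train")
dolly = load_dataset("databricks/databricks-dolly-15k", split="train")

# Normalize both to a common {"instruction", "response"} schema
alpaca = alpaca.map(
    lambda ex: {"instruction": ex["instruction"], "response": ex["output"]},
    remove_columns=alpaca.column_names,
)
dolly = dolly.map(
    lambda ex: {"instruction": ex["instruction"], "response": ex["response"]},
    remove_columns=dolly.column_names,
)

# Concatenate and shuffle into a single training mixture
mixed = concatenate_datasets([alpaca, dolly]).shuffle(seed=42)
```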
|
|
|
--- |
|
|
|
## 🧭 Intended Use
|
|
|
This model is best suited for: |
|
|
|
- **Conversational agents / chatbots** |
|
- **Instruction-following assistants** |
|
- **Lightweight AI on edge devices (with inference offloaded to a server)**
|
- **Educational tools and experiments** |
|
|
|
--- |
|
|
|
## 🚫 Limitations
|
|
|
- This model is **not suitable for production use** without safety reviews. |
|
- It may generate **inaccurate or biased responses**, as training data is from public sources. |
|
- It is **not safe for sensitive or medical domains**. |
|
|
|
--- |
|
|
|
## 💬 Example Prompt
|
|
|
**Instruction:**

Explain the difference between supervised and unsupervised learning.

**Response:**

Supervised learning uses labeled data to train models, while unsupervised learning uses unlabeled data to discover patterns or groupings in the data…
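The prompt template used during fine-tuning is not documented here. If you want to reproduce the example above, an Alpaca-style template is a reasonable assumption given the training mixture, but it is only an assumption:

```python
# Hypothetical Alpaca-style formatting; substitute the real template if known
prompt = (
    "### Instruction:\n"
    "Explain the difference between supervised and unsupervised learning.\n\n"
    "### Response:\n"
)
```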
|
|
|
--- |
|
|
|
## 📥 How to Load the Adapters
|
|
|
To use this model, load the base TinyLlama model and apply the LoRA adapters on top:
|
|
|
```python |
|
from peft import PeftModel |
|
from transformers import AutoModelForCausalLM, AutoTokenizer |
|
|
|
base_model = AutoModelForCausalLM.from_pretrained( |
|
"TinyLlama/TinyLlama-1.1B-Chat", |
|
torch_dtype="auto", |
|
device_map="auto" |
|
) |
|
tokenizer = AutoTokenizer.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat") |
|
|
|
model = PeftModel.from_pretrained(base_model, "sahil239/chatbot-v2")
```
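Once the adapters are applied, the model generates like any `transformers` causal LM. The sampling settings below are illustrative rather than recommended values:

```python
prompt = "Explain the difference between supervised and unsupervised learning."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=200,  # illustrative cap on response length
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)

# Decode only the newly generated tokens, skipping the echoed prompt
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```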
|
|
|
---

## 📄 License

This model is distributed under the Apache 2.0 License.
|
|
|
---

## 🙏 Acknowledgements

Thanks to the open-source datasets and projects that made this model possible: Alpaca, Dolly, OpenAssistant, Anthropic, OpenOrca, CodeAlpaca, GPT4All, and Hugging Face.