---
license: apache-2.0
tags:
- chat
- chatbot
- LoRA
- instruction-tuning
- conversational
- tinyllama
- transformers
language:
- en
datasets:
- tatsu-lab/alpaca
- databricks/databricks-dolly-15k
- knkarthick/dialogsum
- Anthropic/hh-rlhf
- OpenAssistant/oasst1
- nomic-ai/gpt4all_prompt_generations
- sahil2801/CodeAlpaca-20k
- Open-Orca/OpenOrca
model-index:
- name: chatbot-v2
results: []
---
# chatbot-v2 – TinyLLaMA Instruction-Tuned Chatbot (LoRA)
`chatbot-v2` is a lightweight, instruction-following conversational AI model based on **TinyLLaMA** and fine-tuned using **LoRA** adapters. It has been trained on a carefully curated mixture of open datasets covering assistant-like responses, code generation, summarization, safety alignment, and dialog reasoning.
It is well suited for embedding in mobile or edge apps with low-resource inference needs, or for serving behind an API.
---
## Base Model
- **Model**: [`TinyLlama/TinyLlama-1.1B-Chat`](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat)
- **Architecture**: Decoder-only Transformer (GPT-style)
- **Fine-tuning method**: LoRA (low-rank adapters)
- **LoRA Parameters** (a matching `LoraConfig` sketch follows this list):
  - `r=16`
  - `alpha=32`
  - `dropout=0.05`
  - Target modules: `q_proj`, `v_proj`
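As a reference point only (the original training script is not published here), the hyperparameters above map onto a standard `peft` configuration like this:

```python
from peft import LoraConfig, TaskType

# Illustrative sketch: a LoraConfig matching the hyperparameters listed above.
# This is not the original training script, only the equivalent configuration.
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=16,                                  # rank of the low-rank update matrices
    lora_alpha=32,                         # scaling factor for the adapter updates
    lora_dropout=0.05,                     # dropout applied inside the adapter layers
    target_modules=["q_proj", "v_proj"],   # attention projections that receive adapters
)
```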
---
## Training Datasets
The model was fine-tuned on the following instruction-following, summarization, and dialogue datasets (a loading sketch follows the list):
- [`tatsu-lab/alpaca`](https://huggingface.co/datasets/tatsu-lab/alpaca) – Stanford Alpaca dataset
- [`databricks/databricks-dolly-15k`](https://huggingface.co/datasets/databricks/databricks-dolly-15k) – Dolly instruction data
- [`knkarthick/dialogsum`](https://huggingface.co/datasets/knkarthick/dialogsum) – Summarization of dialogs
- [`Anthropic/hh-rlhf`](https://huggingface.co/datasets/Anthropic/hh-rlhf) – Harmless/helpful/honest alignment data
- [`OpenAssistant/oasst1`](https://huggingface.co/datasets/OpenAssistant/oasst1) – OpenAssistant dialogues
- [`nomic-ai/gpt4all_prompt_generations`](https://huggingface.co/datasets/nomic-ai/gpt4all_prompt_generations) – Instructional prompt-response pairs
- [`sahil2801/CodeAlpaca-20k`](https://huggingface.co/datasets/sahil2801/CodeAlpaca-20k) – Programming/code generation instructions
- [`Open-Orca/OpenOrca`](https://huggingface.co/datasets/Open-Orca/OpenOrca) – High-quality responses to complex questions
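All of these corpora are hosted on the Hugging Face Hub. The mixing ratios and preprocessing used for `chatbot-v2` are not documented here, but each dataset can be pulled with the `datasets` library; the following is a minimal sketch for two of them:

```python
from datasets import load_dataset

# Minimal sketch: load two of the corpora listed above.
# Mixing ratios and preprocessing for chatbot-v2 are not specified here.
alpaca = load_dataset("tatsu-lab/alpaca", split="train")
dolly = load_dataset("databricks/databricks-dolly-15k", split="train")

print(alpaca[0]["instruction"])  # both corpora expose an "instruction" field
print(dolly[0]["instruction"])
```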
---
## Intended Use
This model is best suited for:
- **Conversational agents / chatbots**
- **Instruction-following assistants**
- **Lightweight AI for mobile/edge apps (via server-side inference)**
- **Educational tools and experiments**
---
## Limitations
- This model is **not suitable for production use** without additional safety review.
- It may generate **inaccurate or biased responses**, since its training data comes from public sources.
- It is **not safe for use in sensitive or medical domains**.
---
## Example Prompt

**Instruction:**
Explain the difference between supervised and unsupervised learning.

**Response:**
Supervised learning uses labeled data to train models, while unsupervised learning uses unlabeled data to discover patterns or groupings in the data…
---
## How to Load the Adapters
To use this model, load the base TinyLLaMA model and apply the LoRA adapters:
```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
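# Load the frozen TinyLLaMA base model (device_map="auto" requires `accelerate`).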
base_model = AutoModelForCausalLM.from_pretrained(
"TinyLlama/TinyLlama-1.1B-Chat",
torch_dtype="auto",
device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat")
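# Attach the chatbot-v2 LoRA adapters on top of the base model.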
model = PeftModel.from_pretrained(base_model, "sahil239/chatbot-v2")
```
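Once the adapters are attached, generation works like any other causal LM. The prompt template used during fine-tuning is not documented here, so the bare-string prompt below is an assumption; adapt it to whatever template your data pipeline used:

```python
# Illustrative generation sketch; the plain-text prompt format is an assumption.
prompt = "Explain the difference between supervised and unsupervised learning."
inputs = tokenizer(prompt, return_tensors="pt").to(base_model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

For deployment, `model.merge_and_unload()` can fold the LoRA weights into the base model so that inference no longer requires `peft`.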
---
## License
This model is distributed under the Apache 2.0 License.
---
## Acknowledgements
Thanks to the open-source datasets and projects: Alpaca, Dolly, OpenAssistant, Anthropic, OpenOrca, CodeAlpaca, GPT4All, and Hugging Face.