# Model Card for DeepSeek-R1-SmolTalk
This model is a fine-tuned version of deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B on the SmolTalk dataset. It is optimized for small-scale, friendly, and engaging instruction-following dialogue.
## Model Details

### Model Description

This model builds on DeepSeek's distilled Qwen-1.5B architecture and is trained for conversational tasks on the SmolTalk dataset. The goal is a lightweight, instruction-following model suitable for chatbots and assistants running on limited hardware.
- Model type: Instruction-tuned causal decoder (chat)
- Language(s): English
- License: MIT
- Finetuned from model: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
## Uses

### Direct Use
This model can be used as a lightweight assistant or chatbot in applications such as:
- Embedded conversational interfaces
- Educational or toy assistants
- Small devices or local applications
### Downstream Use
The model can be further fine-tuned or integrated into larger conversational systems, especially where resource efficiency is crucial.
### Out-of-Scope Use
- Not suitable for tasks requiring deep factual accuracy or reasoning
- Should not be used for sensitive or high-stakes decision making
- Not designed for multilingual use
## Bias, Risks, and Limitations
Due to the small model size and dataset limitations:
- May produce generic or incorrect outputs
- Can reflect biases present in the training dataset
- Not guaranteed to be safe for all user demographics or use cases
### Recommendations
- Use in controlled or sandboxed environments
- Consider integrating content moderation or rule-based filtering
- Do not deploy in contexts requiring factual correctness or ethical judgment
## How to Get Started with the Model

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "avanishd/DeepSeek-R1-Distill-Qwen-1.5B-finetuned-smoltalk-everyday-conversations"

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Generate a short reply to a simple greeting
input_text = "Hi there! What can you do?"
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
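The DeepSeek-R1-Distill base models ship with a chat template, so formatting the prompt with `apply_chat_template` may better match the fine-tuning format. A minimal sketch, assuming the tokenizer's bundled template:

```python
# Optional: wrap the prompt in the tokenizer's chat template
messages = [{"role": "user", "content": "Hi there! What can you do?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(input_ids, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```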
## Training Details

### Training Data

The model was trained on the SmolTalk dataset (HuggingFaceTB/smoltalk), a collection of lightweight, instruction-style conversations designed to help models learn concise, friendly, and helpful interactions.
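The dataset can be inspected with the `datasets` library; the `everyday-conversations` config below is an assumption inferred from this repository's name:

```python
from datasets import load_dataset

# Assumption: the "everyday-conversations" subset, inferred from the repo name
ds = load_dataset("HuggingFaceTB/smoltalk", "everyday-conversations")
print(ds["train"][0])
```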
### Training Procedure

#### Preprocessing

The DeepSeek tokenizer was used for preprocessing.

#### LoRA Configuration
- rank: 6
- alpha: 12
- dropout: 0.05
- bias: none
- target modules: all linear layers
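A minimal sketch of the equivalent `peft` configuration, assuming "all linear layers" corresponds to `target_modules="all-linear"` and a causal-LM task type:

```python
from peft import LoraConfig

# Reconstruction of the LoRA settings listed above (task type is assumed)
lora_config = LoraConfig(
    r=6,                          # LoRA rank
    lora_alpha=12,                # scaling factor
    lora_dropout=0.05,
    bias="none",
    target_modules="all-linear",  # assumption: every linear layer is adapted
    task_type="CAUSAL_LM",
)
```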
#### Training Hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-04
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 2
- gradient_clipping: 0.3
- total_train_batch_size: 128
- optimizer: adamw_torch_fused (PyTorch's fused AdamW)
- lr_scheduler_type: constant
- lr_scheduler_warmup_ratio: 0.03
- num_epochs: 1
- mixed_precision_training: bf16
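These settings map roughly onto transformers `TrainingArguments` as sketched below; the output directory is hypothetical, and the reported total train batch size of 128 implies data parallelism not documented here:

```python
from transformers import TrainingArguments

# Sketch of the hyperparameters listed above (output_dir is hypothetical)
training_args = TrainingArguments(
    output_dir="deepseek-r1-smoltalk",
    learning_rate=2e-4,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=2,
    max_grad_norm=0.3,             # gradient clipping
    optim="adamw_torch_fused",
    lr_scheduler_type="constant",
    warmup_ratio=0.03,
    num_train_epochs=1,
    bf16=True,                     # mixed-precision training
    seed=42,
)
```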
#### Speeds, Sizes, Times [optional]
[More Information Needed]
## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data
[More Information Needed]
#### Factors
[More Information Needed]
#### Metrics
[More Information Needed]
### Results

- MMLU (0-shot): 0.275 (self-reported)
#### Summary
## Model Examination [optional]
[More Information Needed]
## Environmental Impact

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
- Hardware Type: [More Information Needed]
- Hours used: [More Information Needed]
- Cloud Provider: [More Information Needed]
- Compute Region: [More Information Needed]
- Carbon Emitted: [More Information Needed]
## Technical Specifications [optional]

### Model Architecture and Objective
[More Information Needed]
### Compute Infrastructure
[More Information Needed]
#### Hardware
[More Information Needed]
#### Software
[More Information Needed]
## Citation [optional]
BibTeX:
[More Information Needed]
APA:
[More Information Needed]
## Glossary [optional]
[More Information Needed]
## More Information [optional]
[More Information Needed]
## Model Card Authors [optional]
[More Information Needed]
## Model Card Contact
[More Information Needed]