# Model Card for DeepSeek-R1-SmolTalk
This model is a fine-tuned version of deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B on the SmolTalk dataset. It is optimized for small-scale, friendly, and engaging instruction-following dialogue.
## Model Details

### Model Description

This model builds on DeepSeek's distilled Qwen-1.5B architecture and is trained for conversational tasks on the SmolTalk dataset. The goal is a lightweight, instruction-following model suitable for chatbots and assistants running on limited hardware.
- Model type: Instruction-tuned causal decoder (chat)
- Language(s): English
- License: MIT
- Finetuned from model: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
## Uses

### Direct Use
This model can be used as a lightweight assistant or chatbot in applications such as:
- Embedded conversational interfaces
- Educational or toy assistants
- Small devices or local applications
### Downstream Use
The model can be further fine-tuned or integrated into larger conversational systems, especially where resource efficiency is crucial.
### Out-of-Scope Use
- Not suitable for tasks requiring deep factual accuracy or reasoning
- Should not be used for sensitive or high-stakes decision making
- Not designed for multilingual use
## Bias, Risks, and Limitations
Due to the small model size and dataset limitations:
- May produce generic or incorrect outputs
- Can reflect biases present in the training dataset
- Not guaranteed to be safe for all user demographics or use cases
### Recommendations
- Use in controlled or sandboxed environments
- Consider integrating content moderation or rule-based filtering
- Do not deploy in contexts requiring factual correctness or ethical judgment
## How to Get Started with the Model

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "avanishd/DeepSeek-R1-Distill-Qwen-1.5B-finetuned-smoltalk-everyday-conversations"

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Generate a short reply to a simple greeting
input_text = "Hi there! What can you do?"
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
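The DeepSeek-R1-Distill base models ship with a chat template, so formatting the prompt with `apply_chat_template` may better match the fine-tuning format. A minimal sketch, assuming the tokenizer's bundled template:

```python
# Optional: wrap the prompt in the tokenizer's chat template
messages = [{"role": "user", "content": "Hi there! What can you do?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(input_ids, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```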
## Training Details

### Training Data

The model was trained on the SmolTalk dataset (HuggingFaceTB/smoltalk), a collection of lightweight, instruction-style conversations designed to help models learn concise, friendly, and helpful interactions.
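The dataset can be inspected with the `datasets` library; the `everyday-conversations` config below is an assumption inferred from this repository's name:

```python
from datasets import load_dataset

# Assumption: the "everyday-conversations" subset, inferred from the repo name
ds = load_dataset("HuggingFaceTB/smoltalk", "everyday-conversations")
print(ds["train"][0])
```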
### Training Procedure

#### Preprocessing

The DeepSeek tokenizer was used for preprocessing.

#### LoRA Configuration
- rank: 6
- alpha: 12
- dropout: 0.05
- bias: none
- target modules: all linear layers
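A minimal sketch of the equivalent `peft` configuration, assuming "all linear layers" corresponds to `target_modules="all-linear"` and a causal-LM task type:

```python
from peft import LoraConfig

# Reconstruction of the LoRA settings listed above (task type is assumed)
lora_config = LoraConfig(
    r=6,                          # LoRA rank
    lora_alpha=12,                # scaling factor
    lora_dropout=0.05,
    bias="none",
    target_modules="all-linear",  # assumption: every linear layer is adapted
    task_type="CAUSAL_LM",
)
```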
#### Training Hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-04
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 2
- gradient_clipping: 0.3
- total_train_batch_size: 128
- optimizer: adamw_torch_fused (PyTorch's fused AdamW)
- lr_scheduler_type: constant
- lr_scheduler_warmup_ratio: 0.03
- num_epochs: 1
- mixed_precision_training: bf16
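These settings map roughly onto transformers `TrainingArguments` as sketched below; the output directory is hypothetical, and the reported total train batch size of 128 implies data parallelism not documented here:

```python
from transformers import TrainingArguments

# Sketch of the hyperparameters listed above (output_dir is hypothetical)
training_args = TrainingArguments(
    output_dir="deepseek-r1-smoltalk",
    learning_rate=2e-4,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=2,
    max_grad_norm=0.3,             # gradient clipping
    optim="adamw_torch_fused",
    lr_scheduler_type="constant",
    warmup_ratio=0.03,
    num_train_epochs=1,
    bf16=True,                     # mixed-precision training
    seed=42,
)
```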
#### Speeds, Sizes, Times [optional]
[More Information Needed]
## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data
[More Information Needed]
#### Factors
[More Information Needed]
#### Metrics
[More Information Needed]
### Results

- MMLU (0-shot): 0.275 (self-reported)
#### Summary
## Model Examination [optional]
[More Information Needed]
## Environmental Impact

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
- Hardware Type: [More Information Needed]
- Hours used: [More Information Needed]
- Cloud Provider: [More Information Needed]
- Compute Region: [More Information Needed]
- Carbon Emitted: [More Information Needed]
## Technical Specifications [optional]

### Model Architecture and Objective
[More Information Needed]
### Compute Infrastructure
[More Information Needed]
#### Hardware
[More Information Needed]
#### Software
[More Information Needed]
## Citation [optional]
BibTeX:
[More Information Needed]
APA:
[More Information Needed]
## Glossary [optional]
[More Information Needed]
## More Information [optional]
[More Information Needed]
## Model Card Authors [optional]
[More Information Needed]
## Model Card Contact
[More Information Needed]