Model Card for DeepSeek-R1-SmolTalk

This model is a fine-tuned version of deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B on the SmolTalk dataset. It is optimized for small-scale, friendly, and engaging instruction-following dialogue.

Model Details

Model Description

This model builds on DeepSeek's distilled Qwen-1.5B architecture and is trained for conversational tasks using the SmolTalk dataset. The goal is a lightweight, instruction-following model suitable for chatbots and assistants running on limited hardware.

  • Model type: Instruction-tuned causal decoder (chat)
  • Language(s): English
  • License: MIT
  • Finetuned from model: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B

Uses

Direct Use

This model can be used as a lightweight assistant or chatbot in applications such as:

  • Embedded conversational interfaces
  • Educational or toy assistants
  • Small devices or local applications

Downstream Use

The model can be further fine-tuned or integrated into larger conversational systems, especially where resource efficiency is crucial.

Out-of-Scope Use

  • Not suitable for tasks requiring deep factual accuracy or reasoning
  • Should not be used for sensitive or high-stakes decision making
  • Not designed for multilingual use

Bias, Risks, and Limitations

Due to the small model size and dataset limitations:

  • May produce generic or incorrect outputs
  • Can reflect biases present in the training dataset
  • Not guaranteed to be safe for all user demographics or use cases

Recommendations

  • Use in controlled or sandboxed environments
  • Consider integrating content moderation or rule-based filtering
  • Do not deploy in contexts requiring factual correctness or ethical judgment

How to Get Started with the Model

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
model = AutoModelForCausalLM.from_pretrained("avanishd/DeepSeek-R1-Distill-Qwen-1.5B-finetuned-smoltalk-everyday-conversations")
tokenizer = AutoTokenizer.from_pretrained("avanishd/DeepSeek-R1-Distill-Qwen-1.5B-finetuned-smoltalk-everyday-conversations")

# Tokenize a prompt, generate up to 100 new tokens, and decode the result
input_text = "Hi there! What can you do?"
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
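
Because the base model is a chat model, prompts are typically formatted with the tokenizer's chat template rather than passed as raw text. A minimal sketch of that variant, assuming the fine-tuned tokenizer inherits the base model's chat template:

# Format the prompt as a chat turn (chat template assumed to be
# inherited from the DeepSeek-R1-Distill base model)
messages = [{"role": "user", "content": "Hi there! What can you do?"}]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
outputs = model.generate(input_ids, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))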

Training Details

Training Data

The model was fine-tuned on the everyday-conversations subset of the SmolTalk dataset, a collection of lightweight, instruction-style conversations designed to teach models concise, friendly, and helpful interactions.
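
For reference, the conversations can be loaded with the datasets library. A minimal sketch, assuming the HuggingFaceTB/smoltalk repository on the Hub and its everyday-conversations subset (names inferred from this model's repository name):

from datasets import load_dataset

# Load the everyday-conversations subset of SmolTalk (repository and
# subset names are assumptions based on this model's repository name)
dataset = load_dataset("HuggingFaceTB/smoltalk", "everyday-conversations", split="train")
print(dataset[0]["messages"])  # each example is a list of chat messages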

Training Procedure

Preprocessing

Tokenization used the DeepSeek tokenizer inherited from the base model.

LoRA Configuration

  • rank: 6
  • alpha: 12
  • dropout: 0.05
  • bias: none
  • target modules: all linear layers
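
Expressed with the peft library, the values above correspond roughly to the following LoraConfig. This is a minimal sketch, not the author's exact training script; the use of peft is an assumption:

from peft import LoraConfig

# LoRA adapter configuration mirroring the values listed above
lora_config = LoraConfig(
    r=6,                          # rank
    lora_alpha=12,                # scaling factor
    lora_dropout=0.05,
    bias="none",
    target_modules="all-linear",  # all linear layers
    task_type="CAUSAL_LM",
)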

Training Hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-04
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • gradient_accumulation_steps: 2
  • gradient_clipping: 0.3
  • total_train_batch_size: 128
  • optimizer: adamw_torch_fused (PyTorch fused AdamW)
  • lr_scheduler_type: constant
  • lr_scheduler_warmup_ratio: 0.03
  • num_epochs: 1
  • mixed_precision_training: bf16
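
These settings map onto transformers TrainingArguments roughly as follows. A minimal sketch: argument names follow the transformers API, the output directory is a placeholder, and the exact training script is not given.

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="deepseek-r1-smoltalk",  # placeholder path
    learning_rate=2e-4,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    gradient_accumulation_steps=2,
    max_grad_norm=0.3,                  # gradient clipping
    optim="adamw_torch_fused",
    lr_scheduler_type="constant",
    warmup_ratio=0.03,
    num_train_epochs=1,
    bf16=True,                          # mixed-precision training
)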

Speeds, Sizes, Times

[More Information Needed]

Evaluation

Testing Data, Factors & Metrics

Testing Data

[More Information Needed]

Factors

[More Information Needed]

Metrics

[More Information Needed]

Results

[More Information Needed]

Summary

[More Information Needed]

Model Examination

[More Information Needed]

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

  • Hardware Type: [More Information Needed]
  • Hours used: [More Information Needed]
  • Cloud Provider: [More Information Needed]
  • Compute Region: [More Information Needed]
  • Carbon Emitted: [More Information Needed]

Technical Specifications

Model Architecture and Objective

A causal decoder-only transformer based on DeepSeek's distilled Qwen-1.5B architecture, with approximately 1.78B parameters stored as FP16 safetensors. It is trained with a causal language-modeling objective on conversational data.

Compute Infrastructure

[More Information Needed]

Hardware

[More Information Needed]

Software

[More Information Needed]

Citation

BibTeX:

[More Information Needed]

APA:

[More Information Needed]

Glossary

[More Information Needed]

More Information

[More Information Needed]

Model Card Authors

[More Information Needed]

Model Card Contact

[More Information Needed]
