README: Chatbot Training with BART

Overview

This project builds a chatbot by fine-tuning the facebook/bart-large-cnn model from Hugging Face's Transformers library on a dataset of question-answer pairs. The resulting model generates responses to user queries.

Dependencies

Ensure you have the following libraries installed before running the script:

pip install transformers datasets torch

Dataset

The chatbot is trained on a CSV dataset (dataset.csv) containing two columns:

  • question: The input question.
  • answer: The corresponding answer.

The dataset is loaded using the Hugging Face datasets library.
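A minimal loading sketch, assuming dataset.csv sits next to the script and uses the column names above:

from datasets import load_dataset

# A single CSV file is exposed under the "train" split by default.
dataset = load_dataset("csv", data_files="dataset.csv")["train"]
print(dataset.column_names)  # expected: ['question', 'answer']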

Training Process

  1. Tokenization:

    • Uses AutoTokenizer to process text.
    • Truncates and pads input to a maximum length of 256 tokens.
  2. Data Splitting:

    • The dataset is split into a training set (80%) and an evaluation set (20%).
  3. Training Configuration:

    • Uses the Trainer API for fine-tuning.
    • Trains for 10 epochs with a batch size of 12.
    • Saves checkpoints every epoch.
    • Loads the best model at the end.
  4. Model Saving:

    • The trained model and tokenizer are saved in ./saved_model (a full pipeline sketch follows this list).
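A minimal sketch of steps 1-4, assuming recent versions of transformers and datasets (the eval_strategy and text_target arguments are version-dependent); the directory names match the Model Storage section below:

from datasets import load_dataset
from transformers import (AutoModelForSeq2SeqLM, AutoTokenizer,
                          Trainer, TrainingArguments)

MODEL_NAME = "facebook/bart-large-cnn"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)

def tokenize(batch):
    # Step 1: truncate/pad questions and answers to 256 tokens.
    enc = tokenizer(batch["question"], max_length=256,
                    truncation=True, padding="max_length")
    labels = tokenizer(text_target=batch["answer"], max_length=256,
                       truncation=True, padding="max_length")
    # Mask label padding with -100 so it is ignored by the loss.
    enc["labels"] = [
        [(t if t != tokenizer.pad_token_id else -100) for t in seq]
        for seq in labels["input_ids"]
    ]
    return enc

data = load_dataset("csv", data_files="dataset.csv")["train"]
data = data.map(tokenize, batched=True, remove_columns=data.column_names)

# Step 2: 80/20 train/eval split.
split = data.train_test_split(test_size=0.2)

# Step 3: Trainer configuration matching the settings above.
args = TrainingArguments(
    output_dir="./results",
    logging_dir="./logs",
    num_train_epochs=10,
    per_device_train_batch_size=12,
    per_device_eval_batch_size=12,
    eval_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,
)
trainer = Trainer(model=model, args=args,
                  train_dataset=split["train"],
                  eval_dataset=split["test"])
trainer.train()

# Step 4: persist the best model and the tokenizer.
trainer.save_model("./saved_model")
tokenizer.save_pretrained("./saved_model")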

Inference (Generating Responses)

After training, you can generate responses with the generate_text() function (a sketch follows this list). It exposes generation parameters such as:

  • temperature: Controls the randomness of sampled responses.
  • top_p: Nucleus sampling threshold for response diversity.
  • repetition_penalty: Penalizes repeated tokens to reduce repetition.
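The exact implementation lives in the script; the following is a minimal sketch of such a helper, assuming the model was saved to ./saved_model as described above:

import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("./saved_model")
model = AutoModelForSeq2SeqLM.from_pretrained("./saved_model")
model.eval()

def generate_text(prompt, temperature=0.7, top_p=0.9, repetition_penalty=1.2):
    inputs = tokenizer(prompt, return_tensors="pt",
                       truncation=True, max_length=256)
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_new_tokens=128,
            do_sample=True,  # sampling must be on for temperature/top_p to take effect
            temperature=temperature,
            top_p=top_p,
            repetition_penalty=repetition_penalty,
        )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)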

Interactive Chatbot Mode

The script includes an interactive mode where users can input queries:

python chatbot.py

To exit, type exit.
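The loop is roughly of this shape (a sketch reusing the generate_text() helper above; the exact prompts in chatbot.py may differ):

if __name__ == "__main__":
    print("Chatbot ready. Type 'exit' to quit.")
    while True:
        query = input("You: ")
        if query.strip().lower() == "exit":
            break
        print("Bot:", generate_text(query))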

Model Storage

  • The trained model and tokenizer are stored in ./saved_model.
  • Checkpoints are written to ./results and training logs to ./logs.

Future Improvements

  • Train on a larger dataset.
  • Experiment with other pretrained variants, such as facebook/bart-large-xsum.
  • Integrate a web-based frontend.

Author

This project was created for research and development in chatbot training using transformer-based models.
