📈 Stock Price Forecasting - DataSynthis ML Job Task

This repository contains implementations of time-series forecasting for stock prices using both traditional statistical models (ARIMA, Prophet) and deep learning (LSTM).
The project demonstrates model comparison, rolling-window evaluation, and deployment to Hugging Face Hub.

Project Overview

  • Dataset: Daily closing stock prices.
  • Models Implemented:
    • ARIMA (AutoRegressive Integrated Moving Average)
    • Prophet (Additive Time Series Forecasting by Meta)
    • LSTM (Long Short-Term Memory Neural Network)
  • Evaluation:
    • Rolling-window forecasts
    • Metrics: RMSE, MAPE
  • Deployment:
    • Models and results shared on Hugging Face Hub.
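As a rough illustration of the evaluation step, the sketch below runs a rolling-window forecast and scores it with RMSE and MAPE. It is not taken from the notebook: the fixed-size sliding window, the one-step horizon, the `naive_forecast` stand-in (which simply repeats the last observed value), and the sample prices are all assumptions made for this example. In practice an ARIMA, Prophet, or LSTM predictor would take the place of `forecast_fn`.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


def naive_forecast(history):
    """Hypothetical stand-in model: predict that the next value equals the last one."""
    return history[-1]


def rolling_window_eval(series, window, forecast_fn=naive_forecast):
    """Slide a fixed-size window over the series and forecast one step ahead each time."""
    actual, predicted = [], []
    for end in range(window, len(series)):
        history = series[end - window:end]  # only data strictly before the target point
        predicted.append(forecast_fn(history))
        actual.append(series[end])
    return rmse(actual, predicted), mape(actual, predicted)


# Made-up closing prices, for illustration only.
prices = [100.0, 102.0, 101.0, 104.0, 106.0, 105.0]
score_rmse, score_mape = rolling_window_eval(prices, window=3)
print(f"RMSE={score_rmse:.4f}  MAPE={score_mape:.4f}%")
```

Because each forecast only sees data before the point it predicts, the scores approximate out-of-sample error rather than in-sample fit.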

Repository Contents

  • lstm_model.h5 – Trained LSTM model
  • scaler.pkl – Scaler used for preprocessing
  • performance_summary.csv – Comparison of ARIMA, Prophet, and LSTM performance
  • stock_forecasting_notebook.ipynb – Full notebook with preprocessing, training, evaluation, and plots
  • upload_to_hf.py – Script for uploading to Hugging Face Hub

Quick start

  1. Create and activate a Python environment (recommended: conda or venv) and install the dependencies:
    python -m venv venv
    source venv/bin/activate   # Linux/macOS
    venv\Scripts\activate    # Windows
    pip install -r requirements.txt
    
  2. Start Jupyter and open the notebook:
    jupyter notebook stock_forecasting_notebook.ipynb
    
  3. The notebook contains cells that download real stock data via yfinance (requires an internet connection) or load the included sample_stock.csv for an offline demo.
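The online/offline choice in step 3 could look roughly like the sketch below. It is not the notebook's code: the function names, the `Close` column name in sample_stock.csv, and the default start date are assumptions. yfinance is imported lazily so that the offline path works without it installed.

```python
import csv
import io


def load_close_from_csv(source, close_column="Close"):
    """Read closing prices from an open CSV text stream (column name is an assumption)."""
    reader = csv.DictReader(source)
    return [float(row[close_column]) for row in reader if row.get(close_column)]


def load_close_prices(ticker=None, csv_path="sample_stock.csv", start="2020-01-01"):
    """Try yfinance first when a ticker is given; otherwise fall back to the local CSV."""
    if ticker is not None:
        try:
            import yfinance as yf  # optional dependency, needs internet access

            frame = yf.download(ticker, start=start, progress=False)
            if not frame.empty:
                return [float(v) for v in frame["Close"].to_numpy().ravel()]
        except Exception:
            pass  # no network or no yfinance: use the offline sample instead
    with open(csv_path, newline="") as handle:
        return load_close_from_csv(handle)


# Offline demonstration with an in-memory CSV (made-up values).
demo = io.StringIO("Date,Close\n2024-01-02,101.5\n2024-01-03,102.25\n")
print(load_close_from_csv(demo))
```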

Hugging Face deployment (notes)

  • Use upload_to_hf.py to push the saved model files to the HF repo DataSynthis_ML_JobTask. Either create the repo on the Hugging Face website first, or provide a valid token and let the script create it for you.
  • Create an HF token at https://huggingface.co/settings/tokens, then set the HF_TOKEN environment variable or pass --token to the script.
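For orientation, an upload script along these lines could be built on the huggingface_hub client. This is a sketch, not the contents of upload_to_hf.py: the helper names `resolve_token` and `upload_files` and the choice of which files to push are assumptions. `HfApi.create_repo(..., exist_ok=True)` covers both cases described above, since it creates the repo if it is missing and leaves it untouched otherwise.

```python
import os

REPO_NAME = "DataSynthis_ML_JobTask"
# Assumed file list, taken from the repository contents.
FILES = ["lstm_model.h5", "scaler.pkl", "performance_summary.csv"]


def resolve_token(cli_token=None, env=None):
    """Prefer an explicit --token value, then fall back to the HF_TOKEN variable."""
    env = os.environ if env is None else env
    token = cli_token or env.get("HF_TOKEN")
    if not token:
        raise ValueError("No token found: pass --token or set HF_TOKEN")
    return token


def upload_files(token, repo_name=REPO_NAME, files=FILES):
    """Create the repo if needed, then upload each file to its root."""
    from huggingface_hub import HfApi  # imported lazily; requires huggingface_hub

    api = HfApi(token=token)
    repo_id = api.create_repo(repo_id=repo_name, exist_ok=True).repo_id
    for path in files:
        api.upload_file(
            path_or_fileobj=path,
            path_in_repo=os.path.basename(path),
            repo_id=repo_id,
        )
    return repo_id
```

A call such as `upload_files(resolve_token())` would then push the files using the token from the environment.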

Results

The performance of the three models on stock price forecasting is summarized below:

Model      RMSE      MAPE (%)
ARIMA      3.3748    1.8973
Prophet    4.7650    3.1859
LSTM       2.0890    1.2516
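To make the gaps between models easier to read, the snippet below ranks them by RMSE and expresses the best model's error as a percentage reduction relative to each of the others. The numbers are copied from the table above. The dictionary layout and the `relative_reduction` helper are illustrative only and do not reflect the structure of performance_summary.csv.

```python
# Values copied from the results table; the layout is an assumption for this example.
results = {
    "ARIMA": {"rmse": 3.3748, "mape": 1.8973},
    "Prophet": {"rmse": 4.7650, "mape": 3.1859},
    "LSTM": {"rmse": 2.0890, "mape": 1.2516},
}


def relative_reduction(baseline, candidate):
    """Percentage by which `candidate` is lower than `baseline`."""
    return 100.0 * (baseline - candidate) / baseline


# Lower RMSE is better, so the first entry after sorting is the best model.
ranking = sorted(results, key=lambda name: results[name]["rmse"])
best = ranking[0]

for name in ranking[1:]:
    drop = relative_reduction(results[name]["rmse"], results[best]["rmse"])
    print(f"{best} RMSE is {drop:.1f}% lower than {name}")
```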

Key Insights

  • LSTM achieved the lowest RMSE and MAPE, showing the best accuracy.
  • ARIMA performed reasonably well but was less effective with non-linear trends.
  • Prophet captured trends and seasonality but had higher errors.
  • Overall, LSTM was the most accurate of the three models on this task by both metrics.