{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# YOLOv11 Training with Roboflow Dataset\n", "\n", "This notebook demonstrates how to train a YOLOv11 model using a dataset from Roboflow. It includes:\n", "- Automatic GPU/CPU detection\n", "- Configurable training parameters\n", "- Training visualization and analysis\n", "\n", "## Step 1: Install Dependencies\n", "First, we'll install the required packages." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# For Training\n", "!pip install ultralytics roboflow \n", "\n", "# For Storage\n", "!pip install minio" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 2: Import Libraries\n", "Import all necessary libraries for training and analysis." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Common\n", "import os\n", "\n", "# For Dataset manipulation\n", "import yaml\n", "from roboflow import Roboflow\n", "\n", "# For training\n", "import torch\n", "from ultralytics import YOLO\n", "\n", "# For Storage\n", "from minio import Minio\n", "from minio.error import S3Error" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 3: Download Dataset from Roboflow\n", "Connect to Roboflow and download the dataset. Make sure to use your own API key and project details.\n", "\n", "**Remember to replace the placeholders with your values**." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "rf = Roboflow(api_key=\"xxxxxxxxxxxxxxxxx\") # Replace with your API key\n", "project = rf.workspace(\"yyyyyyyyyyyyyy\").project(\"zzzzzzzzzzzzzzzzzzz\") # Replace with your workspace and project names\n", "version = project.version(1111111111111111111111111111) # Replace with your version number\n", "dataset = version.download(\"yolov11\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You'll need to explicitly specify the paths to each data split (training, validation, and test) in your configuration. This ensures YOLO can correctly locate and utilize your dataset files.\n", "\n", "This is done in the `data.yaml` file. If you open that file you will see these paths that you need to update:\n", "\n", "```\n", "train: ../train/images\n", "val: ../valid/images\n", "test: ../test/images\n", "```" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(f\"Dataset downloaded to: {dataset.location}\")\n", "\n", "dataset_yaml_path = f\"{dataset.location}/data.yaml\"\n", "\n", "with open(dataset_yaml_path, \"r\") as file:\n", " data_config = yaml.safe_load(file)\n", "\n", "data_config[\"train\"] = f\"{dataset.location}/train/images\"\n", "data_config[\"val\"] = f\"{dataset.location}/valid/images\"\n", "data_config[\"test\"] = f\"{dataset.location}/test/images\"\n", "\n", "with open(dataset_yaml_path, \"w\") as file:\n", " yaml.safe_dump(data_config, file)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 4: Configure Hyperparameters\n", "Set up GPU/CPU detection (code automatically detects and use GPU if available)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "device = torch.device(\"cuda:0\" if torch.cuda.is_available() else \"cpu\")\n", "print(f\"Using device: {device} ({'GPU' if device.type == 'cuda' else 'CPU'})\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Define all training parameters in a single configuration dictionary." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "\n", "CONFIG = {\n", " 'name': 'yolo_hardhat',\n", " 'model': 'yolo11m.pt', # Model size options: n, s, m, l, x\n", " 'data': dataset.location + \"/data.yaml\",\n", " 'epochs': 1, # Set the number of epochs (keep 1 for Mock Training)\n", " 'batch': 1 , # Adjust batch size based on device\n", " 'imgsz': 640,\n", " 'patience': 15,\n", " 'device': device,\n", " \n", " # Optimizer settings\n", " 'optimizer': 'SGD',\n", " 'lr0': 0.001,\n", " 'lrf': 0.005,\n", " 'momentum': 0.9,\n", " 'weight_decay': 0.0005,\n", " 'warmup_epochs': 3,\n", " 'warmup_bias_lr': 0.01,\n", " 'warmup_momentum': 0.8,\n", " 'amp': False,\n", " \n", " # Data augmentation settings\n", " 'augment': True,\n", " 'hsv_h': 0.015, # HSV-Hue augmentation\n", " 'hsv_s': 0.7, # HSV-Saturation augmentation\n", " 'hsv_v': 0.4, # HSV-Value augmentation\n", " 'degrees': 10, # Image rotation (+/- deg)\n", " 'translate': 0.1, # Image translation\n", " 'scale': 0.3, # Image scale\n", " 'shear': 0.0, # Image shear\n", " 'perspective': 0.0, # Image perspective\n", " 'flipud': 0.1, # Flip up-down\n", " 'fliplr': 0.1, # Flip left-right\n", " 'mosaic': 1.0, # Mosaic augmentation\n", " 'mixup': 0.0, # Mixup augmentation\n", "}\n", "\n", "# Configure PyTorch for GPU memory allocation\n", "os.environ[\"PYTORCH_CUDA_ALLOC_CONF\"] = \"expandable_segments:True\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 5: Load Model\n", "Initialize the YOLO model." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "model = YOLO(CONFIG['model'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 6: Start Training\n", "\n", "Begin the training process. By default, the `train` method handles both \"training\" and \"validation\" sets. " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "results_train = model.train(\n", " name=CONFIG['name'],\n", " data=CONFIG['data'],\n", " epochs=CONFIG['epochs'],\n", " batch=CONFIG['batch'],\n", " imgsz=CONFIG['imgsz'],\n", " patience=CONFIG['patience'],\n", " device=CONFIG['device'],\n", " verbose=True,\n", " \n", " # Optimizer parameters\n", " optimizer=CONFIG['optimizer'],\n", " lr0=CONFIG['lr0'],\n", " lrf=CONFIG['lrf'],\n", " momentum=CONFIG['momentum'],\n", " weight_decay=CONFIG['weight_decay'],\n", " warmup_epochs=CONFIG['warmup_epochs'],\n", " warmup_bias_lr=CONFIG['warmup_bias_lr'],\n", " warmup_momentum=CONFIG['warmup_momentum'],\n", " amp=CONFIG['amp'],\n", " \n", " # Augmentation parameters\n", " augment=CONFIG['augment'],\n", " hsv_h=CONFIG['hsv_h'],\n", " hsv_s=CONFIG['hsv_s'],\n", " hsv_v=CONFIG['hsv_v'],\n", " degrees=CONFIG['degrees'],\n", " translate=CONFIG['translate'],\n", " scale=CONFIG['scale'],\n", " shear=CONFIG['shear'],\n", " perspective=CONFIG['perspective'],\n", " flipud=CONFIG['flipud'],\n", " fliplr=CONFIG['fliplr'],\n", " mosaic=CONFIG['mosaic'],\n", " mixup=CONFIG['mixup'],\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 7: Evaluate Model\n", "\n", " Evaluate the model on the test set." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "results_test = model.val(data=CONFIG['data'], split='test', device=CONFIG['device'], imgsz=CONFIG['imgsz'])\n", "\n", "#print(\"Test Results:\", results_test)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 8: (Optional) Model Export\n", "\n", "Export the trained YOLO model to ONNX format for deployment." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "model.export(format='onnx', imgsz=CONFIG['imgsz'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Export the trained YOLO model to TorchScript" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#model.export(format=\"torchscript\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 9: Store the Model\n", "\n", "Save the trained model to the Object Storage system configured in your Workbench connection. \n", "\n", "Start by getting the credentials and configuring variables for accessing Object Storage." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "AWS_S3_ENDPOINT_NAME = os.getenv(\"AWS_S3_ENDPOINT\", \"\").replace('https://', '').replace('http://', '')\n", "AWS_ACCESS_KEY_ID = os.getenv(\"AWS_ACCESS_KEY_ID\")\n", "AWS_SECRET_ACCESS_KEY = os.getenv(\"AWS_SECRET_ACCESS_KEY\")\n", "AWS_S3_BUCKET = os.getenv(\"AWS_S3_BUCKET\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Define the S3 client." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "client = Minio(\n", " AWS_S3_ENDPOINT_NAME,\n", " access_key=AWS_ACCESS_KEY_ID,\n", " secret_key=AWS_SECRET_ACCESS_KEY,\n", " secure=True\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Select files to be uploaded (files generated while training and validating the model)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "model_path_train = results_train.save_dir\n", "weights_path = os.path.join(model_path_train, \"weights\")\n", "model_path_test = results_test.save_dir\n", "\n", "files_train = [os.path.join(model_path_train, f) for f in os.listdir(model_path_train) if os.path.isfile(os.path.join(model_path_train, f))]\n", "files_models = [os.path.join(weights_path, f) for f in os.listdir(weights_path) if os.path.isfile(os.path.join(weights_path, f))]\n", "files_test = [os.path.join(model_path_test, f) for f in os.listdir(model_path_test) if os.path.isfile(os.path.join(model_path_test, f))]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Upload the files." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "directory_name= os.path.basename(model_path_train)\n", "\n", "for file_path_train in files_train:\n", " try:\n", " client.fput_object(AWS_S3_BUCKET, \"prototype/notebook/\" + directory_name + \"/train-val/\" + os.path.basename(file_path_train), file_path_train)\n", " print(f\"'{os.path.basename(file_path_train)}' is successfully uploaded as object to bucket '{AWS_S3_BUCKET}'.\")\n", " except S3Error as e:\n", " print(\"Error occurred: \", e)\n", "\n", "for file_path_model in files_models:\n", " try:\n", " client.fput_object(AWS_S3_BUCKET, \"prototype/notebook/\" + directory_name + \"/\" + os.path.basename(file_path_model), file_path_model)\n", " print(f\"'{os.path.basename(file_path_model)}' is successfully uploaded as object to bucket '{AWS_S3_BUCKET}'.\")\n", " except S3Error as e:\n", " print(\"Error occurred: \", e)\n", "\n", "for file_path_test in files_test:\n", " try:\n", " client.fput_object(AWS_S3_BUCKET, \"prototype/notebook/\" + directory_name + \"/test/\" + os.path.basename(file_path_test), file_path_test)\n", " print(f\"'{os.path.basename(file_path_test)}' is successfully uploaded as object to bucket '{AWS_S3_BUCKET}'.\")\n", " except S3Error as e:\n", " print(\"Error occurred: \", e)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 10: Remove local files\n", "\n", "Once you uploaded the Model data to the Object Storage, you can remove the local files to save disk space." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!rm -rf {model_path_train}\n", "!rm -rf {model_path_test}" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.7" } }, "nbformat": 4, "nbformat_minor": 4 }