---
tags:
- reinforcement-learning
- q-learning
- gymnasium
- cart-pole
library_name: gymnasium
license: apache-2.0
video_preview: replay.mp4
---

# Q-Learning Model for CartPole

This project implements a tabular Q-learning agent for the CartPole-v1 environment using Gymnasium. The agent learns to balance a pole on a moving cart through trial and error. Training uses an epsilon-greedy strategy: the agent starts by exploring random actions and gradually shifts toward exploiting learned actions as training progresses.

Key features of the model:

- **Discretization**: Continuous state variables (cart position, cart velocity, pole angle, and pole angular velocity) are discretized into bins so they can index a tabular Q-table.
- **Q-learning algorithm**: The agent updates its Q-values with the Bellman equation, learning from the reward it receives after each action.
- **Epsilon-greedy strategy**: The agent balances exploration and exploitation by decaying epsilon over the course of training.

## Files

- `train.py`: Code for training the agent.
- `cartPole_qtable.npy`: The trained Q-table.
- `replay.mp4`: A video showing the agent's performance.

## How to Reproduce

1. Install the dependencies:
   ```bash
   pip install gymnasium numpy imageio
   ```
2. Run the training script:
   ```bash
   python train.py
   ```
3. Use the saved Q-table (`cartPole_qtable.npy`) to evaluate the model.
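
## Example: Core Q-Learning Pieces

The discretization, Bellman update, and epsilon-greedy action selection described above can be sketched as follows. This is an illustrative sketch only: the bin counts, clipping bounds, and hyperparameters here are assumptions for demonstration, not the values used in `train.py`.

```python
import numpy as np

# Assumed discretization settings (illustrative, not the trained values):
N_BINS = (10, 10, 10, 10)  # bins per state variable
BOUNDS = [(-2.4, 2.4), (-3.0, 3.0), (-0.21, 0.21), (-3.0, 3.0)]  # clipping ranges

def discretize(obs):
    """Map a continuous CartPole observation to a tuple of bin indices."""
    idx = []
    for value, (lo, hi), n in zip(obs, BOUNDS, N_BINS):
        value = min(max(value, lo), hi)          # clip to the assumed range
        idx.append(int((value - lo) / (hi - lo) * (n - 1)))
    return tuple(idx)

def q_update(q_table, s, a, reward, s_next, alpha=0.1, gamma=0.99):
    """One Bellman update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    td_target = reward + gamma * np.max(q_table[s_next])
    q_table[s + (a,)] += alpha * (td_target - q_table[s + (a,)])

def epsilon_greedy(q_table, s, epsilon, rng):
    """Explore with probability epsilon, otherwise exploit the best known action."""
    if rng.random() < epsilon:
        return int(rng.integers(2))  # CartPole has two actions: push left or right
    return int(np.argmax(q_table[s]))
```

During training, epsilon would typically be annealed from near 1.0 toward a small floor so that early episodes explore broadly while later episodes mostly exploit the learned Q-table.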