---
tags:
- reinforcement-learning
- q-learning
- gymnasium
- cart-pole
library_name: gymnasium
license: apache-2.0
video_preview: replay.mp4
---

# Q-Learning Model for CartPole

This project implements a tabular Q-learning agent for the CartPole-v1 environment using Gymnasium. The agent learns to balance a pole on a moving cart through trial and error. Training uses an epsilon-greedy strategy: the agent starts by exploring random actions and gradually shifts toward exploiting learned actions as training progresses.

Key features of the model:

- **Discretization**: Continuous state variables (cart position, cart velocity, pole angle, and pole angular velocity) are discretized into bins so they can index a tabular Q-table.
- **Q-learning algorithm**: The agent updates its Q-values with the Bellman equation, learning from the reward it receives after each action.
- **Epsilon-greedy strategy**: The agent balances exploration and exploitation by decaying epsilon over the course of training.

## Files

- `train.py`: Code for training the agent.
- `cartPole_qtable.npy`: The trained Q-table.
- `replay.mp4`: A video showing the agent's performance.

## How to Reproduce

1. Install the dependencies:
   ```bash
   pip install gymnasium numpy imageio
   ```
2. Run the training script:
   ```bash
   python train.py
   ```
3. Use the saved Q-table (`cartPole_qtable.npy`) to evaluate the model.
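
## Example: Core Q-Learning Pieces

The discretization, Bellman update, and epsilon-greedy action selection described above can be sketched as follows. This is an illustrative sketch only: the bin counts, clipping bounds, and hyperparameters here are assumptions for demonstration, not the values used in `train.py`.

```python
import numpy as np

# Assumed discretization settings (illustrative, not the trained values):
N_BINS = (10, 10, 10, 10)  # bins per state variable
BOUNDS = [(-2.4, 2.4), (-3.0, 3.0), (-0.21, 0.21), (-3.0, 3.0)]  # clipping ranges

def discretize(obs):
    """Map a continuous CartPole observation to a tuple of bin indices."""
    idx = []
    for value, (lo, hi), n in zip(obs, BOUNDS, N_BINS):
        value = min(max(value, lo), hi)          # clip to the assumed range
        idx.append(int((value - lo) / (hi - lo) * (n - 1)))
    return tuple(idx)

def q_update(q_table, s, a, reward, s_next, alpha=0.1, gamma=0.99):
    """One Bellman update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    td_target = reward + gamma * np.max(q_table[s_next])
    q_table[s + (a,)] += alpha * (td_target - q_table[s + (a,)])

def epsilon_greedy(q_table, s, epsilon, rng):
    """Explore with probability epsilon, otherwise exploit the best known action."""
    if rng.random() < epsilon:
        return int(rng.integers(2))  # CartPole has two actions: push left or right
    return int(np.argmax(q_table[s]))
```

During training, epsilon would typically be annealed from near 1.0 toward a small floor so that early episodes explore broadly while later episodes mostly exploit the learned Q-table.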