Soft Actor-Critic (SAC) on Ant-v5: Modernized OpenAI Spinning Up
This repository presents a fully trained Soft Actor-Critic (SAC) agent on the Ant-v5 environment using a modernized PyTorch-based version of OpenAI's Spinning Up in Deep RL educational framework.
Developed, trained, and maintained by MoniGarr, a self-directed AI researcher focused on NLP, multimodal systems, and RL control frameworks.
Project Mission
This work contributes to the revitalization of OpenAI's highly respected Spinning Up in Deep RL codebase. The original repository does not support Python 3.8+, recent MuJoCo releases, or gymnasium. This project patches those limitations and showcases a reproducible, high-performing SAC agent for the modern Ant-v5 benchmark.
It also supports my broader mission: to demonstrate technical excellence and creativity in deep reinforcement learning and AI research while advancing open and inclusive access to intelligent systems. Some of my online students and clients use my demos for learning purposes.
Model Details
Attribute | Value |
---|---|
Algorithm | Soft Actor-Critic (SAC) |
Framework | PyTorch (Modernized Spinning Up) |
Environment | Ant-v5 via gymnasium[mujoco] |
Epochs | 250 |
Action Space | Continuous (Box) |
Observation Space | Continuous (Box) |
Command Used | python -m spinup.run sac --env Ant-v5 --epochs 250 --exp_name experiment_sac_antv5_july_20_2025 |
Training Metrics Summary
Metric | Description |
---|---|
AverageEpRet | Average return per episode (training) |
StdEpRet | Std deviation of return |
MaxEpRet | Max episode return in this run |
MinEpRet | Min episode return in this run |
AverageTestEpRet | Average return on test episodes |
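As a rough illustration of how the return metrics in the table relate to raw episode returns, here is a minimal sketch. The function name `summarize_returns` and the sample numbers are hypothetical, and the use of a population standard deviation (dividing by n) is an assumption about how the logger computes `StdEpRet`.

```python
import math


def summarize_returns(episode_returns):
    """Compute summary statistics over a list of episode returns."""
    n = len(episode_returns)
    if n == 0:
        raise ValueError("need at least one episode return")
    mean = sum(episode_returns) / n
    # Population standard deviation (divide by n); assumed, not confirmed.
    variance = sum((r - mean) ** 2 for r in episode_returns) / n
    return {
        "AverageEpRet": mean,
        "StdEpRet": math.sqrt(variance),
        "MaxEpRet": max(episode_returns),
        "MinEpRet": min(episode_returns),
    }


# Made-up returns for illustration only; not taken from the actual run.
example = summarize_returns([10.0, 20.0, 30.0])
```

`AverageTestEpRet` would be the same mean, computed over returns from separate evaluation episodes rather than training episodes.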
Full logs:
https://github.com/monigarr/spinningup/tree/monigarr-dev/data/experiment_sac_antv5_july_20_2025/progress.txt
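For readers who want to inspect the log themselves, the sketch below parses a Spinning Up style `progress.txt`, assuming it is a tab-separated file with a header row and numeric columns. The helper `read_progress` is hypothetical, and the sample rows are invented stand-ins, not values from this run.

```python
import csv
import io


def read_progress(fileobj):
    """Parse a tab-separated progress log into a list of dicts of floats."""
    reader = csv.DictReader(fileobj, delimiter="\t")
    return [{key: float(value) for key, value in row.items()} for row in reader]


# Invented rows standing in for progress.txt; the real file has more columns.
sample = io.StringIO(
    "Epoch\tAverageEpRet\tAverageTestEpRet\n"
    "1\t-50.0\t-40.0\n"
    "2\t120.5\t150.0\n"
)
rows = read_progress(sample)

# Pick the epoch with the highest average test return.
best = max(rows, key=lambda row: row["AverageTestEpRet"])
```

With a downloaded copy of the log, the same function could be called as `read_progress(open("progress.txt", newline=""))`.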
Research Observations
- Policy performance stabilized after ~200 epochs
- Reward-to-noise ratio improved with tuned entropy coefficient (α = 0.2)
- Robust gait developed for complex terrain and perturbations
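To make the role of the entropy coefficient concrete, here is a minimal sketch of the entropy-regularized target that SAC critics regress toward, written for a single transition with plain floats. The function name `soft_bellman_target` is hypothetical, and the discount `gamma=0.99` is an assumed value; only α = 0.2 comes from the observations above.

```python
def soft_bellman_target(reward, done, q1_next, q2_next, logp_next,
                        gamma=0.99, alpha=0.2):
    """Entropy-regularized TD target for one transition."""
    # Clipped double-Q: use the smaller of the two target critic estimates.
    min_q_next = min(q1_next, q2_next)
    # Subtracting alpha * log pi adds an entropy bonus to the next-state value.
    soft_value = min_q_next - alpha * logp_next
    # No bootstrapping past a terminal state.
    return reward + gamma * (1.0 - float(done)) * soft_value


# Made-up numbers for illustration only.
target = soft_bellman_target(
    reward=1.0, done=False, q1_next=10.0, q2_next=12.0, logp_next=-2.0
)
```

A larger α weights the entropy bonus more heavily and so favors exploration; a smaller α pushes the policy toward more deterministic actions.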
Research Context
This experiment is part of a broader initiative to:
- Modernize and benchmark deep RL frameworks
- Create reproducible SAC baselines for MuJoCo control tasks
- Prepare high-quality artifacts for hybrid/remote AI research roles (RL, multimodal AI, language models)
I am currently pursuing research roles, residencies, and collaborations with a focus on intelligent control systems and language-grounded agents. I bring 30+ years of technical experience (previously lead mobile software architect, engineer, and developer; XR producer; 3D technical artist), speak Kanien'kéha dialects (Mohawk language), and have a long-standing record of building ethical, useful, and inclusive AI.
Quickstart: Run the Model
# Install required libraries
pip install torch "gymnasium[mujoco]"
# Clone this repo (or download model + config)
git clone https://huggingface.co/MoniGarr/sac-antv5-modernized
cd sac-antv5-modernized
# Launch the SAC agent (interactive render)
python run_agent.py --env Ant-v5 --model_path ./pyt_save/model.pt
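The sketch below shows the kind of rollout loop an evaluation script could run, assuming the gymnasium convention that `reset()` returns `(obs, info)` and `step()` returns a five-tuple. It is not the contents of `run_agent.py`; `run_episode` and the stand-in `CountdownEnv` are hypothetical names used so the loop can run without MuJoCo installed.

```python
def run_episode(env, policy, max_steps=1000):
    """Roll out one episode and return (total_return, episode_length)."""
    obs, _info = env.reset()
    total_return, steps = 0.0, 0
    for _ in range(max_steps):
        action = policy(obs)
        obs, reward, terminated, truncated, _info = env.step(action)
        total_return += float(reward)
        steps += 1
        # Stop on either a true terminal state or a time-limit truncation.
        if terminated or truncated:
            break
    return total_return, steps


class CountdownEnv:
    """Tiny stand-in environment (hypothetical) that ends after `horizon` steps."""

    def __init__(self, horizon=3):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return 0.0, {}

    def step(self, action):
        self.t += 1
        terminated = self.t >= self.horizon
        # Observation is the step count; reward is a constant 1.0 per step.
        return float(self.t), 1.0, terminated, False, {}


episode_return, episode_length = run_episode(CountdownEnv(), policy=lambda obs: 0.0)
```

With gymnasium and MuJoCo installed, an environment created by `gymnasium.make("Ant-v5", render_mode="human")` could be passed in place of the stand-in, together with a policy function that wraps the loaded model.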
Author & Contact
MoniGarr
- AI Researcher: NLP · RL · Multimodal AI
- Based in Akwesasne / Massena, New York
- [email protected] | github.com/monigarr
I'm looking to collaborate with ethical AI teams, remote research labs, and mission-driven builders of intelligent systems.