jacksonferrigno/Grid_AI

	---
	tags:
	- stable-baselines3
	- power-grid
	- ppo
	- lstm
	- electricity
	- reinforcement-learning
	- forecasting
	- tensorflow
	- gym
	license: mit
	---

	# ⚡ Power Grid Optimization with LSTM + PPO

	This repository showcases a hybrid deep learning + reinforcement learning system for power grid optimization in Lauderdale County, AL. The system forecasts demand using a weather-informed LSTM model and trains a PPO-based agent to maintain stability and minimize blackout risk under stress.

	---

	## 📈 Models

	- LSTM Demand Predictor:
	  A deep bidirectional LSTM with attention, trained on four years (2021–2024) of TVA demand and weather data.

	- PPO Grid Policy:
	  Trained in a custom `PowerGridEnv` with generator output, transformer tap, and load shedding control.
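
As a rough illustration of how a PPO policy like this could be trained with Stable-Baselines3, the sketch below wires up the settings listed under Training (400,000 timesteps, `VecNormalize`, 5 evaluation episodes every 2,048 steps). It is not the author's training script. The `make_env` callable, the `"MlpPolicy"` choice, the single-environment `DummyVecEnv`, and the `save_path` file names are all assumptions, and the constructor arguments of `PowerGridEnv` are left to the caller because they are not documented here.

```python
# Settings taken from the Training section of this README.
TRAIN_CONFIG = {
    "total_timesteps": 400_000,
    "eval_freq": 2_048,        # evaluate every 2,048 environment steps
    "n_eval_episodes": 5,
}


def train(make_env, config=TRAIN_CONFIG, save_path="ppo_grid"):
    """Train PPO on the environment returned by `make_env`.

    `make_env` is a zero-argument callable that builds a Gym-compatible
    environment, e.g. `lambda: PowerGridEnv()` (hypothetical call).
    `save_path` is a hypothetical output prefix.
    """
    # Imported lazily so this module loads even without SB3 installed.
    from stable_baselines3 import PPO
    from stable_baselines3.common.callbacks import EvalCallback
    from stable_baselines3.common.vec_env import DummyVecEnv, VecNormalize

    # Normalize observations and rewards during training.
    train_env = VecNormalize(DummyVecEnv([make_env]))

    # The evaluation env must not update the running statistics or rescale
    # rewards; EvalCallback syncs the statistics from the training env.
    eval_env = VecNormalize(
        DummyVecEnv([make_env]), training=False, norm_reward=False
    )

    eval_callback = EvalCallback(
        eval_env,
        eval_freq=config["eval_freq"],
        n_eval_episodes=config["n_eval_episodes"],
    )

    # "MlpPolicy" is an assumption; the README does not name the policy class.
    model = PPO("MlpPolicy", train_env, verbose=1)
    model.learn(total_timesteps=config["total_timesteps"], callback=eval_callback)

    # Save the policy together with the normalization statistics it needs.
    model.save(save_path)
    train_env.save(f"{save_path}_vecnormalize.pkl")
    return model
```

Saving the `VecNormalize` statistics next to the policy matters because a policy trained on normalized observations will behave poorly if it is later fed raw ones.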

	---

	## 🧠 Dataset Overview

	- Demand Data:
	  Sourced from the U.S. EIA (TVA region, 2021–2024)
	  - Demand, Net Generation, Day-Ahead Forecasts, Interchange

	- Weather Data:
	  Daily min/max temperatures + precipitation
	  - From 5 major TVA-region airports via NOAA
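
The daily series above feed the engineered inputs listed under Key Features in the LSTM Model section (demand lags, weekly mean demand, change rate, rolling temperature windows, temperature volatility, extreme flags). The sketch below shows one plausible way to derive such features in plain Python. The function name, the feature names, the 7-day window, the 35-degree threshold, and the use of the same day's maximum temperature for the extreme flag are illustrative assumptions, not the values used in this project.

```python
from statistics import mean, pstdev


def build_features(demand, temp_max, window=7, hot_threshold=35.0):
    """Derive per-day features from aligned daily series.

    demand        : daily demand values (MWh)
    temp_max      : daily maximum temperatures (unit is an assumption)
    window        : look-back length in days (7 gives a weekly mean)
    hot_threshold : hypothetical cut-off for the extreme-heat flag

    Returns one dict per day, starting at index `window` so that every
    rolling statistic has a full history behind it.
    """
    if len(demand) != len(temp_max):
        raise ValueError("demand and temp_max must cover the same days")
    if window < 2:
        raise ValueError("window must be at least 2")

    rows = []
    for t in range(window, len(demand)):
        past_demand = demand[t - window:t]   # the `window` days before day t
        past_temp = temp_max[t - window:t]
        rows.append({
            "demand_lag_1": demand[t - 1],
            "demand_weekly_mean": mean(past_demand),
            # Relative day-over-day change of the two most recent days.
            "demand_change_rate": (demand[t - 1] - demand[t - 2]) / demand[t - 2],
            "temp_rolling_mean": mean(past_temp),
            "temp_volatility": pstdev(past_temp),
            "extreme_heat": int(temp_max[t] >= hot_threshold),
        })
    return rows
```

Only past demand enters each row, so the features for day `t` never leak the value being forecast.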

	---

	## 🧮 LSTM Model

	- Architecture:
	  2-layer bidirectional LSTM + attention, followed by global pooling and dense layers.

	- Key Features:
	  - Rolling temperature windows, demand lags
	  - Weekly mean demand, change rate
	  - Temp volatility, extreme flags

	- Metrics:

	  | Metric             | Value                          |
	  |--------------------|--------------------------------|
	  | R²                 | 0.911                          |
	  | RMSE               | 19,565 MWh                     |
	  | Mean Error         | 713 MWh (over-prediction bias) |
	  | Beats TVA Forecast | 70.08% of days                 |
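
For readers who want to reproduce scores of this kind, the sketch below computes the four reported quantities from three aligned daily series. It assumes that mean error is defined as prediction minus actual (so a positive value means over-prediction) and that "beats" means a strictly smaller absolute error than the baseline forecast on that day. Both definitions are assumptions, since the README does not spell them out, and `forecast_metrics` is a hypothetical helper name.

```python
from math import sqrt


def forecast_metrics(actual, predicted, baseline):
    """Score a forecast against actual demand and a baseline forecast.

    All three inputs are equal-length sequences of daily values (MWh).
    """
    n = len(actual)
    if n == 0 or len(predicted) != n or len(baseline) != n:
        raise ValueError("inputs must be non-empty and of equal length")

    # Signed errors: positive means the model predicted too much.
    errors = [p - a for a, p in zip(actual, predicted)]
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)                    # residual sum of squares
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)   # total sum of squares

    # Days on which the model is strictly closer to the truth than the baseline.
    wins = sum(
        abs(p - a) < abs(b - a)
        for a, p, b in zip(actual, predicted, baseline)
    )

    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": sqrt(ss_res / n),
        "mean_error": sum(errors) / n,
        "beat_rate": wins / n,
    }
```

Passing the TVA day-ahead forecast as `baseline` would yield a share comparable to the "Beats TVA Forecast" row under these assumptions.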

	---

	## 🤖 PPO DRL Agent

	- Environment:
	  PyPSA-based Lauderdale County grid
	  - 6 generators (Nuclear, Hydro, CCGT)
	  - Load centers with realistic demand shares
	  - Thermal constraints, ramp limits, marginal costs

	- Action Space:
	  - Generator control
	  - Transformer tap shift
	  - Load shedding (up to 20%)

	- Reward Design:
	  - ✅ Reward balanced supply and demand and low thermal overload
	  - ❌ Penalize instability, overloads, and excessive cost

	- Training:
	  - Algorithm: PPO (SB3)
	  - Timesteps: 400,000
	  - Normalization: VecNormalize
	  - Evaluation: 5 episodes every 2,048 steps

	- Metrics:

	  | Metric             | Value      |
	  |--------------------|------------|
	  | Mean Reward        | ~1480      |
	  | Explained Variance | Up to 0.85 |
	  | Blackout Risk      | < 5%       |
	  | Load Shedding      | < 3% avg   |
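
To make the action space and reward design above more concrete, here is a minimal sketch of how a normalized action vector could be decoded into physical controls and how a per-step reward could combine the listed terms. The action layout (one entry per generator, then tap shift, then load shedding), the tap range, all weights, the explicit load-shedding penalty, and both function names are assumptions. Only the 20% shedding cap comes from this README, and the real `PowerGridEnv` reward uses a different scale.

```python
def decode_action(action, p_min, p_max, max_tap=2, max_shed=0.20):
    """Map an action vector in [-1, 1] to physical controls.

    Assumed layout: one entry per generator, then tap shift, then shedding.
    p_min / p_max are per-generator output limits (MW).
    """
    n_gen = len(p_min)
    if len(p_max) != n_gen or len(action) != n_gen + 2:
        raise ValueError("action must have one entry per generator plus two")

    clipped = [max(-1.0, min(1.0, a)) for a in action]

    # Rescale each generator entry from [-1, 1] to [p_min, p_max].
    setpoints = [
        lo + (a + 1.0) / 2.0 * (hi - lo)
        for a, lo, hi in zip(clipped, p_min, p_max)
    ]
    tap_shift = round(clipped[n_gen] * max_tap)               # integer tap steps
    shed_fraction = (clipped[n_gen + 1] + 1.0) / 2.0 * max_shed  # 0 to 20%
    return setpoints, tap_shift, shed_fraction


def step_reward(supply, demand, shed_fraction, max_line_loading, cost,
                w_balance=1.0, w_overload=1.0, w_shed=1.0, w_cost=1e-4):
    """Reward balance and low loading; penalize overload, shedding, cost.

    max_line_loading is the worst line loading as a fraction of its
    thermal limit (1.0 means exactly at the limit).
    """
    served = demand * (1.0 - shed_fraction)
    imbalance = abs(supply - served) / demand
    balance_bonus = max(0.0, 1.0 - imbalance)
    overload = max(0.0, max_line_loading - 1.0)
    return (w_balance * balance_bonus
            - w_overload * overload
            - w_shed * shed_fraction
            - w_cost * cost)
```

Clipping before rescaling keeps out-of-range policy outputs from exceeding generator limits or the shedding cap.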

	---