SOCO Energy Demand Forecasting with Weather, Grid Operations Data & Machine Learning
This project builds an end-to-end machine learning workflow to forecast electricity demand for the SOCO balancing authority 24–48 hours ahead using real grid operations data, historical weather observations, calendar effects, and domain-driven lag features.
The work emphasizes production-style ML: a modular data pipeline, automated quality checks, exploratory time-series analysis, feature engineering, classical forecasting baselines, Optuna-tuned models, MLflow experiment tracking, and a public Streamlit portfolio app.
Background & Problem Statement
Electric utilities rely on day-ahead demand forecasts to schedule generation, plan transmission dispatch, coordinate market bids, and maintain reserve margins. Forecast errors carry direct operational costs: overforecasting can lead to excess reserve commitments, while underforecasting may require emergency re-dispatch when demand is higher than expected.
Problem Statement: Can weather data, calendar structure, and horizon-safe historical demand features be combined into a lightweight ML forecasting system that outperforms classical time-series baselines for 24–48 hour SOCO electricity demand prediction?
Dataset & Data Sources
The target variable is demand_imputed_pudl_mwh, an hourly PUDL-imputed demand series for the SOCO balancing authority. The cleaned dataset contains 92,833 hourly rows and spans July 2015 through February 2026.
Energy operations data comes from PUDL / EIA-930, including demand, day-ahead demand forecasts, net generation, and interchange fields. Weather data comes from the Open-Meteo Historical API for seven representative SOCO-region cities: Albany, Atlanta, Birmingham, Huntsville, Meridian, Mobile, and Savannah.
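To make the weather-ingestion step concrete, here is a minimal sketch of how one city's hourly history could be requested from the Open-Meteo historical archive endpoint. The coordinates, variable list, and helper name are illustrative assumptions, not code taken from the repository.

```python
# Sketch of a weather-history request for one SOCO-region city.
# The endpoint, coordinates, and hourly variable names below are
# illustrative assumptions, not the project's actual ingestion code.
ARCHIVE_URL = "https://archive-api.open-meteo.com/v1/archive"

def build_weather_request(lat: float, lon: float, start: str, end: str) -> dict:
    """Assemble query parameters for one city's hourly weather history."""
    return {
        "latitude": lat,
        "longitude": lon,
        "start_date": start,            # ISO dates, e.g. "2015-07-01"
        "end_date": end,
        "hourly": "temperature_2m,dew_point_2m",
        "timezone": "UTC",
    }

# Example (Atlanta, approximate coordinates); the network call itself
# would be e.g. requests.get(ARCHIVE_URL, params=params).json()
params = build_weather_request(33.749, -84.388, "2015-07-01", "2026-02-28")
```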
Data Pipeline & Quality Gate
The project uses a modular pipeline before modeling so that ingestion, validation, and cleaning can be rerun consistently from scripts or imported as library functions.
- Loader: parses the raw merged CSV, fixes timezone-aware datetimes, and enforces expected dtypes.
- Quality checks: runs an automated quality gate that returns structured success, failure, warning, and statistics outputs.
- Cleaner: removes near-fully-null columns, deduplicates timestamps, drops null targets, and forward-fills missing day-ahead forecast values.
- Cleaning outcome: 86 raw columns reduced to 82 cleaned columns, 3,899 day-ahead forecast nulls forward-filled, zero null targets, and quality gate passed.
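The cleaning steps above can be sketched in a few lines of pandas. Column names and the near-null threshold here are hypothetical stand-ins for the project's actual schema and configuration.

```python
import pandas as pd

# Minimal sketch of the cleaner described above; column names and the
# near-null threshold are hypothetical stand-ins for the real pipeline.
def clean_hourly(df: pd.DataFrame,
                 target: str = "demand_imputed_pudl_mwh",
                 da_forecast: str = "demand_forecast_mwh",
                 null_frac: float = 0.99) -> pd.DataFrame:
    df = df.sort_index()
    df = df[~df.index.duplicated(keep="first")]    # deduplicate timestamps
    df = df.loc[:, df.isna().mean() < null_frac]   # drop near-fully-null columns
    df = df.dropna(subset=[target])                # drop null targets
    df[da_forecast] = df[da_forecast].ffill()      # forward-fill day-ahead forecast
    return df
```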
Exploratory Data Analysis
EDA focused on the drivers of hourly electricity demand and the forecast horizon constraints that matter for operational load forecasting. Demand exhibits strong daily and weekly autocorrelation, a dual-peak seasonal pattern, lower weekend and holiday usage, and a U-shaped relationship with temperature due to heating and cooling load.
- Diurnal cycle: demand typically troughs overnight and peaks in the afternoon.
- Seasonality: summer cooling demand and winter heating demand create a bimodal annual pattern.
- Weather signal: temperature and dew point carry meaningful predictive information for HVAC-driven load.
- Autocorrelation: 24-hour and 168-hour structure motivates lag and rolling-window features.
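The 24-hour and 168-hour structure is easy to verify with a lag-autocorrelation check. The snippet below uses a synthetic demand-like series with diurnal and weekly cycles purely for illustration; the real analysis runs on the cleaned SOCO series in the repository.

```python
import numpy as np
import pandas as pd

# Illustrative check of the 24 h / 168 h autocorrelation structure using
# a synthetic demand-like series (the real EDA uses the SOCO data).
rng = np.random.default_rng(0)
hours = np.arange(24 * 7 * 8)                        # eight weeks of hourly points
demand = (30_000
          + 4_000 * np.sin(2 * np.pi * hours / 24)   # diurnal cycle
          + 1_500 * np.sin(2 * np.pi * hours / 168)  # weekly cycle
          + rng.normal(0, 300, hours.size))          # noise
s = pd.Series(demand)

# Strong correlation at daily and weekly lags motivates lag features
print(round(s.autocorr(lag=24), 2), round(s.autocorr(lag=168), 2))
```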
Feature Engineering
Feature engineering converts the cleaned hourly dataset into a model-ready table while avoiding forecast-horizon leakage. The final pipeline reduces the cleaned dataset to 45 retained features after engineered feature creation and two-pass redundancy pruning.
- Lag features: 24h, 48h, and 168h demand lags plus a 24h lag of the balancing authority forecast.
- Rolling statistics: 24h and 168h shifted rolling mean, standard deviation, minimum, and maximum values.
- Weather transforms: heating degree hours and cooling degree hours to represent the U-shaped temperature-demand curve.
- Regional aggregates: weather variables averaged across seven SOCO-region cities to reduce multicollinearity.
- Interactions: terms such as temperature × hour and weekend × hour to capture time-of-day load shape changes.
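A horizon-safe version of these transforms can be sketched as follows. The 18 °C degree-hour base, column names, and the specific rolling windows shown are illustrative assumptions; note how every rolling statistic is shifted by 24 hours so a 24–48 hour ahead forecast never sees observations from inside its own horizon.

```python
import pandas as pd

# Horizon-safe feature sketch (assumes a DatetimeIndex); the 18 °C
# degree-hour base and column names are illustrative assumptions.
def add_features(df: pd.DataFrame, target: str = "demand_imputed_pudl_mwh",
                 temp: str = "temperature_c", base: float = 18.0) -> pd.DataFrame:
    out = df.copy()
    for lag in (24, 48, 168):                        # demand lags
        out[f"{target}_lag{lag}h"] = out[target].shift(lag)
    # Rolling stats shifted by 24 h so the forecast horizon stays leak-free
    r24 = out[target].shift(24).rolling(24)
    out[f"{target}_roll24_mean"] = r24.mean()
    out[f"{target}_roll24_std"] = r24.std()
    # Heating / cooling degree hours capture the U-shaped temperature curve
    out["hdh"] = (base - out[temp]).clip(lower=0)
    out["cdh"] = (out[temp] - base).clip(lower=0)
    out["temp_x_hour"] = out[temp] * out.index.hour  # interaction term
    return out
```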
Modeling Approach
The modeling workflow compares three approaches on the same 80/20 chronological test split. SARIMAX(1,1,1)(1,0,1,24) serves as the classical time-series baseline, evaluated with a rolling 24-hour horizon: the model forecasts one day ahead, then re-anchors to actual observations before issuing the next day's forecast. Prophet (Tuned) provides an Optuna-optimized probabilistic baseline using built-in seasonality decomposition plus 9 weather and calendar regressors. XGBoost uses the full set of engineered lag, rolling, weather, calendar, and interaction features in a supervised regression setup, exploiting tabular structure that the time-series models cannot access natively.
- SARIMAX: seasonal baseline with exogenous weather and holiday regressors.
- Prophet: Fourier seasonality, US federal holidays, and selected weather regressors.
- Prophet tuning: Optuna TPE search over changepoint, seasonality, holiday, mode, and changepoint-range parameters.
- XGBoost tuning: Optuna TPE search with TimeSeriesSplit cross-validation across tree depth, estimators, learning rate, sampling, and regularization parameters.
Results & Forecast Performance
The tuned XGBoost model is the final winner. It achieves RMSE = 1,147 MWh and MAPE = 2.90%, reducing RMSE by 37% compared with the SARIMAX baseline and substantially outperforming the tuned Prophet model.
Final Model Comparison (Full Test Set)
| Model | MAE (MWh) | RMSE (MWh) | MAPE | vs SARIMAX RMSE |
|---|---|---|---|---|
| XGBoost (Tuned) | 817 | 1,147 | 2.90% | −37% |
| Prophet (Tuned) | 1,494 | 1,956 | 5.42% | +6.9% |
| SARIMAX(1,1,1)(1,0,1,24) | 1,305 | 1,829 | 4.8% | — |
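The "vs SARIMAX RMSE" column is simply the relative RMSE change against the baseline, which can be recomputed directly from the table's RMSE values:

```python
# Recompute the comparison column from the table's RMSE values: the
# "vs SARIMAX" figure is the relative RMSE change against the baseline.
def pct_vs_baseline(rmse: float, baseline_rmse: float) -> float:
    return 100.0 * (rmse - baseline_rmse) / baseline_rmse

print(round(pct_vs_baseline(1147, 1829), 1))  # XGBoost → -37.3
print(round(pct_vs_baseline(1956, 1829), 1))  # Prophet → 6.9
```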
Experiment Tracking & Portfolio App
The project goes beyond notebook modeling by logging trained model runs, parameters, metrics, and artifacts with MLflow. The final outputs are packaged into a public Streamlit portfolio dashboard with pages for project overview, EDA, model results, and methodology.
- MLflow: tracks SARIMAX, Prophet, and XGBoost experiments with metrics and model artifacts.
- Streamlit app: presents the workflow as an interactive portfolio showcase for hiring managers and technical reviewers.
- Testing: includes pytest smoke tests for model code and pipeline components.
- Future work: planned FastAPI inference endpoint for serving forecasts programmatically.
Operational & Portfolio Value
This project demonstrates a practical forecasting workflow that connects energy-sector domain knowledge with production-minded ML engineering.
- Utility relevance: frames demand forecasting around dispatch, reserve planning, and day-ahead operational decisions.
- Domain-driven features: uses autocorrelation, HVAC weather effects, calendar structure, and horizon-safe lags.
- Model comparison discipline: benchmarks XGBoost against classical time-series models rather than reporting a single model in isolation.
- Reproducible portfolio asset: combines data pipeline, modeling, tracking, testing, documentation, and a live demo.
GitHub Repository & Live Demo
The full implementation and public dashboard are available through the links below.
🔗 View Project Repository on GitHub
The repository includes the data pipeline, feature engineering code, trained model artifacts, MLflow logging scripts, exported EDA figures, tests, and the Streamlit app used for the portfolio demo.