Photo by Nazrin Babashova on Unsplash

Grid Reliability with Renewable Energy

Samet Girgin
Published in PursuitOfData
9 min read · May 19, 2024


Demand & Production Forecasting, Optimization, and Environmental-Economic Impact Assessment with ML Algorithms

A smart city uses technology and data to improve public services, address urban challenges, and raise the quality of life of its residents. This can include using sensors and data analytics to optimize traffic flow, manage energy use, and improve public safety. This study uses Python and ML libraries to analyze historical data, identify energy patterns, forecast demand and renewable generation, and design optimization strategies for reliable grid management.

The aim is to improve grid stability while maximizing the use of renewable energy by evaluating both environmental and economic impacts.

The steps and outputs:

1- Data Analysis: Demand and supply data are analyzed for patterns, with particular attention to the reliability and variability of solar and wind output. The correlation between weather and renewable generation is also examined.

2- Forecasting:

  • Solar and wind energy production is estimated from historical weather data. RMSE (Root Mean Square Error) is used to evaluate model performance.
  • A time series model is built to forecast demand for the next 7 days.

3- Optimization Strategy: An optimization problem is solved to minimize the cost of the electricity supplied.

4- Impact Assessment: The reduction in carbon emissions achieved with the optimization strategy is measured, using an emission factor of 0.5 kg CO2/kWh for natural gas.

  • The economic impact is evaluated by calculating the savings achieved by reducing dependence on peak-load power plants (gas, etc.) and increasing the use of renewable energy.

1- Install PuLP Library: The PuLP library in Python is a popular open-source linear programming (LP) modeling package. It’s used to solve optimization problems, particularly those related to linear programming, mixed-integer programming, and other related areas. PuLP provides a user-friendly syntax for describing linear programs and interfaces with a variety of solvers for actually solving these problems.

!pip install pulp

2- Import Libraries and Load Data: The Python libraries for data analysis, data visualization, machine learning models, time series analysis, model evaluation, and optimization are imported.

Two datasets are also loaded from the GitHub repo.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from math import sqrt
from statsmodels.tsa.statespace.sarimax import SARIMAX
import pulp
# Load the datasets

weather_data = pd.read_csv("https://raw.githubusercontent.com/sametgirgin/Hibrit-Enerji-Sistemi/main/WeatherData.csv")
energy_data = pd.read_csv("https://raw.githubusercontent.com/sametgirgin/Hibrit-Enerji-Sistemi/main/EnergyData.csv")

3- Preliminary Data Analysis:

  • Plotting the monthly averages of energy demand and of wind and solar production
  • Merging the two datasets and creating a correlation matrix
# Monthly averages of energy demand and production
energy_data["Date"] = pd.to_datetime(energy_data["Date"])
energy_data.set_index('Date', inplace=True)
energy_data.groupby(energy_data.index.month).mean().plot(kind='bar', y=['Demand_MW', 'Solar_MW', 'Wind_MW'])
plt.title('Monthly Average Energy Demand and Production')
plt.show()
# Reset the index so that Date is a column again for the merge
energy_data.reset_index(inplace=True)
# Convert the weather dates to datetime as well
weather_data['Date'] = pd.to_datetime(weather_data['Date'])
# Merge the two datasets on Date
merged_data = pd.merge(energy_data, weather_data, on='Date')
# Solar production is recorded as 0 on some days; filter those rows out
merged_data = merged_data[merged_data['Solar_MW'] > 0]

# Correlation between energy production and weather
correlation = merged_data[['AvgTemp_Celsius', 'AvgWindSpeed_kmh', 'Solar_MW', 'Wind_MW']].corr()
plt.figure(figsize=(8, 6))
sns.heatmap(correlation, annot=True, cmap='coolwarm', linewidths=.5)
plt.show()
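
The values in the heatmap are Pearson correlation coefficients, which is what DataFrame.corr computes by default. As a rough sketch of what each cell means, the coefficient can be computed by hand. The numbers below are hypothetical and are not taken from the datasets.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical values, for illustration only (not taken from the datasets)
wind_speed_kmh = [10.0, 14.0, 18.0, 22.0, 26.0]
wind_mw = [11.0, 15.5, 18.0, 23.5, 26.0]
temp_celsius = [22.0, 15.0, 24.0, 17.0, 20.0]

# A value near +1 means the two series rise together; near 0 means little linear relation
print(round(pearson(wind_speed_kmh, wind_mw), 3))
print(round(pearson(temp_celsius, wind_mw), 3))
```

A coefficient close to 1 between wind speed and wind output would suggest that wind speed is a useful feature for the wind model, while a weak coefficient only rules out a linear relationship, not a nonlinear one.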

4- Forecasting Power Production and Demand

RMSE (Root Mean Square Error) indicates the accuracy of the model's predictions. Specifically, it measures the typical magnitude of the errors between predicted and actual observed values.

Understanding RMSE:

  • Magnitude of Errors: RMSE is the square root of the mean of the squared differences between predicted and actual values. Lower RMSE values indicate a better fit and more accurate predictions.
  • Sensitivity to Outliers: Because errors are squared before they are averaged, RMSE gives disproportionate weight to large errors. This makes RMSE sensitive to outliers.
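
The outlier sensitivity can be seen in a small worked example. The sketch below uses hypothetical demand values and compares RMSE with the mean absolute error (MAE), which is included here only as a point of comparison and is not used elsewhere in this study.

```python
from math import sqrt

def rmse(actual, predicted):
    """Root mean square error: square the errors, average them, take the root."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared_errors) / len(squared_errors))

def mae(actual, predicted):
    """Mean absolute error, shown only for comparison."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical demand values in MW, for illustration only
actual = [100.0, 102.0, 98.0, 101.0]
small_errors = [101.0, 101.0, 99.0, 100.0]  # every prediction is off by 1 MW
one_outlier = [100.0, 102.0, 98.0, 105.0]   # three exact, one off by 4 MW

# Both prediction sets have the same MAE, but the single large error doubles the RMSE
print(mae(actual, small_errors), rmse(actual, small_errors))  # 1.0 1.0
print(mae(actual, one_outlier), rmse(actual, one_outlier))    # 1.0 2.0

# RMSE as a percentage of the mean actual value
rmse_pct = rmse(actual, one_outlier) / (sum(actual) / len(actual)) * 100
print(round(rmse_pct, 2))
```

Dividing by the mean actual value, as in the last step, makes the error comparable across series with different scales, such as demand and wind output.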

Random Forest Regressor:

A random forest is a meta-estimator that fits a number of decision tree regressors on various sub-samples of the dataset and uses averaging to improve the predictive accuracy and control over-fitting. Trees in the forest use the best-split strategy, i.e. equivalent to passing splitter="best" to the underlying DecisionTreeRegressor.
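
The "sub-samples plus averaging" idea can be sketched without scikit-learn. The toy ensemble below is not the library's implementation: it uses one-split trees (stumps) on a single feature, the helper names are hypothetical, and the data is made up. A real random forest grows much deeper trees and can also subsample features.

```python
import random

def fit_stump(xs, ys):
    """Fit a one-split regression tree: pick the threshold with the lowest squared error."""
    best = None
    for threshold in sorted(set(xs))[1:]:
        left = [y for x, y in zip(xs, ys) if x < threshold]
        right = [y for x, y in zip(xs, ys) if x >= threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:  # all x values identical: always predict the overall mean
        overall = sum(ys) / len(ys)
        return (float("inf"), overall, overall)
    return best[1:]

def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x < threshold else right_mean

def fit_forest(xs, ys, n_trees=25, seed=42):
    """Train each stump on a bootstrap sample (rows drawn with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return forest

def predict_forest(forest, x):
    """Average the predictions of the individual trees."""
    return sum(predict_stump(stump, x) for stump in forest) / len(forest)

# Hypothetical wind speed (km/h) and wind output (MW), for illustration only
wind_speed = [5, 8, 10, 12, 15, 18, 20, 22, 25, 28]
wind_mw = [4, 6, 9, 10, 13, 15, 18, 19, 22, 24]

forest = fit_forest(wind_speed, wind_mw)
print(round(predict_forest(forest, 11), 2))
print(round(predict_forest(forest, 24), 2))
```

Each stump on its own can only output two values, but because every stump sees a different bootstrap sample and picks a slightly different threshold, the averaged prediction is smoother and less dependent on any single sample.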

# Forecast setup (X: features, y: target variables)
X = merged_data[['AvgTemp_Celsius', 'AvgWindSpeed_kmh']]
y_talep = merged_data['Demand_MW']
y_gunes = merged_data['Solar_MW']
y_ruzgar = merged_data['Wind_MW']

# Demand: create training and test sets, train the model, and predict
X_train, X_test, y_train, y_test = train_test_split(X, y_talep, test_size=0.2, random_state=42)
talep_model = RandomForestRegressor(random_state=42)
talep_model.fit(X_train, y_train)
y_talep_tahmin = talep_model.predict(X_test)

# Solar and wind: a separate model instance is created for each
gunes_model = RandomForestRegressor(random_state=42)
ruzgar_model = RandomForestRegressor(random_state=42)

# After filtering out zero solar production, the indices need to be realigned
X = X.reset_index(drop=True)
y_gunes = y_gunes.reset_index(drop=True)
y_ruzgar = y_ruzgar.reset_index(drop=True)

# Solar and wind: create training and test sets, train the models, and predict
X_train_gunes, X_test_gunes, y_train_gunes, y_test_gunes = train_test_split(X, y_gunes, test_size=0.2, random_state=42)
X_train_ruzgar, X_test_ruzgar, y_train_ruzgar, y_test_ruzgar = train_test_split(X, y_ruzgar, test_size=0.2, random_state=42)
gunes_model.fit(X_train_gunes, y_train_gunes)
ruzgar_model.fit(X_train_ruzgar, y_train_ruzgar)
y_gunes_tahmin = gunes_model.predict(X_test_gunes)
y_ruzgar_tahmin = ruzgar_model.predict(X_test_ruzgar)

# Set predictions to zero wherever the actual solar production is zero.
# Note: energy_data and X_test_gunes are numbered differently after the filter
# and reset above, so this mask assumes their row indices line up.
actual_zeros = energy_data[energy_data['Solar_MW'] == 0].index
mask_zeros_in_actuals = X_test_gunes.index.isin(actual_zeros)
y_gunes_tahmin_corrected = np.where(mask_zeros_in_actuals, 0, y_gunes_tahmin)
# Model evaluation
# RMSE expressed as a percentage of the mean actual value
def rmse_percentage(true_values, predicted_values):
    rmse = sqrt(mean_squared_error(true_values, predicted_values))
    average = np.mean(true_values)
    return (rmse / average) * 100

# RMSE for energy demand and for solar and wind production
talep_rmse = rmse_percentage(y_test, y_talep_tahmin)
gunes_rmse = rmse_percentage(y_test_gunes, y_gunes_tahmin)
ruzgar_rmse = rmse_percentage(y_test_ruzgar, y_ruzgar_tahmin)

# Print the RMSE percentages
print(f"Demand Forecast RMSE (%): {talep_rmse}")
print(f"Solar Production Forecast RMSE (%): {gunes_rmse}")
print(f"Wind Production Forecast RMSE (%): {ruzgar_rmse}")

Demand Forecast RMSE (%): 5.943996634938721
Solar Production Forecast RMSE (%): 9.011440280631595
Wind Production Forecast RMSE (%): 8.31763762351461

5- Visualization of predicted and actual values: These scatter plots help evaluate how well the models handle the natural variability in demand and in solar and wind production, which is driven by factors such as sunlight availability and weather conditions.

fig, axes = plt.subplots(1, 3, figsize=(20, 7))

# Energy demand: actual vs. predicted
sns.scatterplot(x=y_test, y=y_talep_tahmin, color='blue', label='Predicted', ax=axes[0])
axes[0].set_title('Demand: Actual vs. Predicted', fontsize=16)
axes[0].set_xlabel('Actual Demand', fontsize=14)
axes[0].set_ylabel('Predicted Demand', fontsize=14)
axes[0].legend(fontsize=12)

# Solar production: actual vs. predicted
sns.scatterplot(x=y_test_gunes, y=y_gunes_tahmin, color='green', label='Predicted', ax=axes[1])
axes[1].set_title('Solar Production: Actual vs. Predicted', fontsize=16)
axes[1].set_xlabel('Actual Solar Production', fontsize=14)
axes[1].set_ylabel('Predicted Solar Production', fontsize=14)
axes[1].legend(fontsize=12)

# Wind production: actual vs. predicted
sns.scatterplot(x=y_test_ruzgar, y=y_ruzgar_tahmin, color='purple', label='Predicted', ax=axes[2])
axes[2].set_title('Wind Production: Actual vs. Predicted', fontsize=16)
axes[2].set_xlabel('Actual Wind Production', fontsize=14)
axes[2].set_ylabel('Predicted Wind Production', fontsize=14)
axes[2].legend(fontsize=12)

plt.tight_layout()
plt.show()

6- Daily Energy Forecasting with SARIMAX:

SARIMAX (Seasonal AutoRegressive Integrated Moving Average with eXogenous regressors) can model both seasonal and non-seasonal components of a time series. This lets it capture more complex patterns, which supports the forecast accuracy that energy management depends on.

  • energy_data is resampled to a daily time series (daily averages) to prepare it for time series analysis.
  • The SARIMAX model is configured with two parameters, order and seasonal_order, which define the autoregressive (AR), differencing, and moving average (MA) components for the non-seasonal and seasonal parts of the series. The seasonal_order parameter uses a 7-day period to model the weekly seasonality common in energy demand data. No exogenous regressors are passed to the model here.
# Daily energy demand forecast with SARIMAX
energy_data_daily = energy_data.set_index('Date').resample('D').mean()
sarima_model = SARIMAX(energy_data_daily['Demand_MW'], order=(1, 1, 1), seasonal_order=(1, 1, 1, 7))
sarima_results = sarima_model.fit(disp=False)
sarima_forecast = sarima_results.forecast(steps=7)
print("SARIMAX forecast for the next 7 days:")
print(sarima_forecast)

SARIMAX forecast for the next 7 days:
2024-01-01 101.253215
2024-01-02 102.083820
2024-01-03 101.868729
2024-01-04 102.082013
2024-01-05 102.392957
2024-01-06 102.757862
2024-01-07 101.833041
Freq: D, Name: predicted_mean, dtype: float64
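
The middle value in order=(1, 1, 1) and in seasonal_order=(1, 1, 1, 7) requests one regular difference and one seasonal difference at a lag of 7 days. The sketch below shows what those two differencing steps do to a series, using a hypothetical demand series built from a weekly pattern plus a steady trend. It illustrates only the differencing; the AR and MA terms that SARIMAX then fits are not reproduced here.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t where both values exist."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

# Hypothetical daily demand (MW): a weekly pattern plus a trend of 1 MW per day
weekly_pattern = [0, 2, 3, 3, 2, -4, -6]
demand = [100 + t + weekly_pattern[t % 7] for t in range(28)]

# Seasonal difference (lag 7) cancels the repeating weekly pattern
seasonal_diff = difference(demand, lag=7)
# Regular difference (lag 1) then removes the remaining linear trend
both_diffs = difference(seasonal_diff, lag=1)

print(seasonal_diff[:5])  # [7, 7, 7, 7, 7]
print(both_diffs[:5])     # [0, 0, 0, 0, 0]
```

On real data the differenced series is not exactly constant, and the AR and MA terms model whatever structure is left after differencing.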

7- Optimization with PuLP:

A linear programming library such as PuLP can determine the combination of solar, wind, and grid supply that meets the average daily energy demand at minimum cost.

  • Inputs: Average solar production, wind production, and demand are calculated from the forecast results. These averages set the supply requirement that the optimization model has to meet.
  • Decision Variables: These are the values the model solves for. Here, sebeke_tedarik (grid supply), gunes_tedarik (solar supply), and ruzgar_tedarik (wind supply) give the amount of energy to be supplied from each source.
  • Bounds: Solar and wind supply are capped at their average forecast production, so that solutions stay realistic and do not exceed the estimated generation potential.
  • Objective Function: The goal is to minimize the total cost of energy supply. The function combines the per-kWh costs of solar, wind, and grid-supplied electricity (assumed to come from natural gas).
  • Constraints: These are the rules that any solution must satisfy. The main constraint here is that the total energy supplied by solar, wind, and the grid must be at least equal to the average daily demand.
  • Solution and Output: The code prints the optimal supply amount for each energy source and warns the user if no optimal solution is found.
  • Emission and Economic Impact Assessment: The CO2 emission reduction and the economic savings from using renewable sources instead of grid supply are calculated.
# Optimization with PuLP
ort_gunes_MWh_corrected = np.mean(y_gunes_tahmin_corrected)
ort_ruzgar_MWh = np.mean(y_ruzgar_tahmin)
ort_günlük_talep_MW = np.mean(y_talep_tahmin)

# Constants for the impact assessment
sebeke_emisyon_faktoru = 0.5  # kg CO2 per kWh (natural gas)
gunes_maliyeti_per_kWh = 0.05
ruzgar_maliyeti_per_kWh = 0.07
gaz_maliyeti_per_kWh = 0.15

# Define the optimization problem
problem = pulp.LpProblem("Enerji_Optimizasyonu", pulp.LpMinimize)

# Decision variables: solar and wind are capped at their average forecast production
sebeke_tedarik = pulp.LpVariable("Şebeke_Tedarik", lowBound=0)
gunes_tedarik = pulp.LpVariable("Güneş_Tedarik", lowBound=0, upBound=ort_gunes_MWh_corrected)
ruzgar_tedarik = pulp.LpVariable("Rüzgar_Tedarik", lowBound=0, upBound=ort_ruzgar_MWh)

# Objective function: total supply cost
problem += (gunes_maliyeti_per_kWh * gunes_tedarik + ruzgar_maliyeti_per_kWh * ruzgar_tedarik + gaz_maliyeti_per_kWh * sebeke_tedarik, "Toplam Maliyet")
# Constraint: total supply must meet the average demand
problem += (sebeke_tedarik + gunes_tedarik + ruzgar_tedarik >= ort_günlük_talep_MW, "Talep Karşılama")
problem.solve()

# Print the optimal supply amounts
if pulp.LpStatus[problem.status] == 'Optimal':
    print(f"Optimal grid supply: {sebeke_tedarik.varValue:.2f} MW")
    print(f"Optimal solar supply: {gunes_tedarik.varValue:.2f} MW")
    print(f"Optimal wind supply: {ruzgar_tedarik.varValue:.2f} MW")
else:
    print("The optimization could not find an optimal solution")

# Emission and economic impact assessment
emisyon_azaltma_kg = (gunes_tedarik.varValue + ruzgar_tedarik.varValue) * sebeke_emisyon_faktoru
gaz_maliyeti = ort_günlük_talep_MW * gaz_maliyeti_per_kWh
optimizasyonlu_maliyet = pulp.value(problem.objective)
tasarruf = gaz_maliyeti - optimizasyonlu_maliyet

print(f"Emission reduction: {emisyon_azaltma_kg:.2f} kg CO2")
print(f"Economic savings: ${tasarruf:.2f}")

Optimal grid supply: 57.64 MW
Optimal solar supply: 29.99 MW
Optimal wind supply: 13.63 MW
Emission reduction: 21.81 kg CO2
Economic savings: $4.09
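
Because this linear program has a single demand constraint and simple upper bounds on each source, its optimum can be checked by hand: fill demand from the cheapest source first and move up the cost ranking until demand is met. The sketch below does that with a hypothetical helper, merit_order_dispatch. It is a cross-check for this particular problem, not a description of how PuLP's solver works, and it uses the rounded values from the printed output, so it only matches to two decimals.

```python
def merit_order_dispatch(demand, sources):
    """Fill demand from the cheapest source upward.

    sources: list of (name, cost_per_unit, capacity); capacity None means unlimited.
    """
    remaining = demand
    dispatch = {}
    for name, cost, capacity in sorted(sources, key=lambda s: s[1]):
        supplied = remaining if capacity is None else min(capacity, remaining)
        dispatch[name] = supplied
        remaining -= supplied
    return dispatch

# Assumption: demand is taken as the sum of the three printed supply values
demand_mw = 57.64 + 29.99 + 13.63
sources = [
    ("grid", 0.15, None),    # no upper bound on grid supply
    ("solar", 0.05, 29.99),  # capped at average forecast solar production
    ("wind", 0.07, 13.63),   # capped at average forecast wind production
]

dispatch = merit_order_dispatch(demand_mw, sources)
total_cost = sum(dispatch[name] * cost for name, cost, _ in sources)

for name in ("grid", "solar", "wind"):
    print(f"{name}: {dispatch[name]:.2f} MW")
print(f"Total cost: {total_cost:.2f}")
```

Solar and wind are both cheaper than grid power, so they are used up to their caps and the grid covers the rest, which is the same allocation the solver reports.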

8- Impact Assessment:

  • Calculates the CO2 emissions saved by using renewable energy sources (solar and wind) instead of relying solely on the grid supply, which is assumed to have a higher emission factor.
  • Computes the economic savings by comparing the cost of meeting demand with natural gas to the optimized cost involving renewables and grid energy.
  • Prints the estimated CO2 emissions savings and economic benefits.
  • Visualization of Energy Supply Distribution: Creates a bar chart showing the distribution of energy supplied by the grid, solar, and wind sources based on the optimization results. This visual helps stakeholders understand the contribution of each energy source to meeting demand.
  • Visualization of Environmental and Economic Impact: Displays a bar chart of the CO2 emissions saved and the economic savings achieved through optimization. This visualization highlights the dual benefit of optimizing energy supply: reducing environmental impact and lowering costs.
# Set up subplots
fig, axes = plt.subplots(1, 2, figsize=(20, 6))

# Visualization 1: Optimized Energy Supply Distribution
axes[0].bar(['Grid', 'Solar', 'Wind'], [sebeke_tedarik.varValue, gunes_tedarik.varValue, ruzgar_tedarik.varValue], color=['blue', 'orange', 'green'])
axes[0].set_title('Optimized Energy Supply Distribution')
axes[0].set_xlabel('Energy Source')
axes[0].set_ylabel('Supply (MW)')

# Visualization 2: Environmental and Economic Impact of Optimization
axes[1].bar(['CO2 Emission Savings (kg)', 'Economic Savings ($)'], [emisyon_azaltma_kg, tasarruf], color=['red', 'green'])
axes[1].set_title('Environmental and Economic Impact of Optimization')

plt.tight_layout()
plt.show()

In summary, in the optimized configuration the grid remains the primary energy source, supplying 57.64 MW, with significant contributions from solar (29.99 MW) and wind (13.63 MW). Solar and wind are used up to their forecast capacity because they cost less per kWh than grid power, and the grid covers the remainder. This distribution maximizes the use of renewable energy while still meeting overall demand.

The emission reduction and economic savings underscore the advantages of the optimization approach. The strategy yields a reduction of 21.81 kg of CO2, supporting environmental preservation efforts, along with an economic saving of about $4.09. Note that the code applies the per-kWh emission and cost factors directly to average values in MW, so these figures are best read as relative indicators rather than absolute totals. While seemingly modest, the economic benefit can accumulate significantly over time.
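
To put the figures on a consistent unit basis, the per-kWh factors can be applied to energy rather than power. The sketch below is one way to do that and rests on an assumption that is not in the original code: each MW value is treated as a constant average power held for 24 hours. The resulting numbers are illustrative and depend entirely on that assumption.

```python
# Values printed by the optimization step
solar_mw, wind_mw, grid_mw = 29.99, 13.63, 57.64

# Factors from the article, quoted per kWh
emission_factor_kg_per_kwh = 0.5
cost_per_kwh = {"solar": 0.05, "wind": 0.07, "gas": 0.15}

# Assumption (not in the original code): each MW value is a constant average
# power held for 24 hours, so energy in kWh = MW * 24 h * 1000 kWh per MWh
HOURS_PER_DAY = 24
KWH_PER_MWH = 1000

def daily_kwh(power_mw):
    """Convert an average power in MW to daily energy in kWh."""
    return power_mw * HOURS_PER_DAY * KWH_PER_MWH

# Emissions avoided by the energy that solar and wind supply instead of gas
renewable_kwh = daily_kwh(solar_mw + wind_mw)
emissions_avoided_kg = renewable_kwh * emission_factor_kg_per_kwh

# Cost of meeting all demand with gas vs. the optimized mix
gas_only_cost = daily_kwh(solar_mw + wind_mw + grid_mw) * cost_per_kwh["gas"]
optimized_cost = (
    daily_kwh(solar_mw) * cost_per_kwh["solar"]
    + daily_kwh(wind_mw) * cost_per_kwh["wind"]
    + daily_kwh(grid_mw) * cost_per_kwh["gas"]
)
savings = gas_only_cost - optimized_cost

print(f"Emissions avoided: {emissions_avoided_kg:,.0f} kg CO2 per day")
print(f"Savings: ${savings:,.0f} per day")
```

The ratio between the optimized and gas-only scenarios is unchanged by this conversion; only the absolute scale of the emission and cost figures differs.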

Moreover, a linear program of this size solves almost instantly. That speed is an advantage for real-time grid management, where prompt decision-making is crucial.


Samet Girgin
PursuitOfData

Data Analyst, Petroleum & Natural Gas Engineer, PMP®