Optimal Control of Microgrids with Multi-stage Mixed-integer Nonlinear Programming Guided Q-learning Algorithm


YOLDAŞ Y. , GÖREN S. , ÖNEN A.

JOURNAL OF MODERN POWER SYSTEMS AND CLEAN ENERGY, vol.8, no.6, pp.1151-1159, 2020 (Journal Indexed in SCI)

  • Publication Type: Article
  • Volume: 8 Issue: 6
  • Publication Date: 2020
  • DOI: 10.35833/mpce.2020.000506
  • Journal: JOURNAL OF MODERN POWER SYSTEMS AND CLEAN ENERGY
  • Page Numbers: pp.1151-1159
  • Keywords: Heuristic algorithms, Simulation, Microgrids, Programming, Minimization, Real-time systems, Batteries, Cost minimization, energy management system, microgrid, real-time optimization, reinforcement learning, OPTIMAL ENERGY MANAGEMENT, DEMAND RESPONSE, OPTIMIZATION, RESOURCE, SYSTEMS

Abstract

This paper proposes an energy management system (EMS) for the real-time operation of a pilot stochastic and dynamic microgrid on a university campus in Malta, consisting of a diesel generator, photovoltaic panels, and batteries. The objective is to minimize the total daily operation cost, which includes the degradation cost of the batteries, the cost of energy bought from the main grid, the fuel cost of the diesel generator, and the emission cost. The optimization problem is modeled as a finite Markov decision process (MDP) that incorporates network and technical constraints, and a Q-learning algorithm is adopted to solve the sequential decision subproblems. The proposed algorithm decomposes a multi-stage mixed-integer nonlinear programming (MINLP) problem into a series of single-stage problems so that each subproblem can be solved using Bellman's equation. To demonstrate the effectiveness of the proposed algorithm, three case studies are considered: (1) minimizing the daily energy cost; (2) minimizing the emission cost; (3) minimizing the daily energy cost and emission cost simultaneously. Moreover, each case is run under different battery operation conditions to investigate the battery lifetime. Finally, performance comparisons are carried out against a conventional Q-learning algorithm.
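The core idea in the abstract — decomposing a multi-stage scheduling problem into single-stage subproblems solved with Bellman's equation via Q-learning — can be sketched on a toy battery-dispatch problem. This is not the paper's model (it omits the diesel generator, PV, emissions, and degradation costs entirely); the discretization, price profile, and hyperparameters below are all illustrative assumptions.

```python
import random

random.seed(0)

SOC_LEVELS = 5            # discretized battery state of charge: 0..4
ACTIONS = [-1, 0, 1]      # discharge, idle, charge (change in SOC level)
HORIZON = 24              # 24 hourly decision stages
ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1

# Hypothetical hourly grid price (cheap at night, expensive during the day)
price = [1.0 if 8 <= h < 20 else 0.3 for h in range(HORIZON)]

def step(soc, action, hour):
    """Apply an action; return (next_soc, reward). Charging buys energy at
    the current price; discharging avoids buying it. Moves outside the SOC
    range are clipped to idle."""
    nxt = min(max(soc + action, 0), SOC_LEVELS - 1)
    applied = nxt - soc
    reward = -applied * price[hour]   # pay to charge, save by discharging
    return nxt, reward

# Q-table indexed by (hour, SOC, action index): one single-stage
# subproblem per (hour, SOC) pair, as in the stage-wise decomposition
Q = [[[0.0] * len(ACTIONS) for _ in range(SOC_LEVELS)] for _ in range(HORIZON)]

for episode in range(2000):
    soc = 2
    for h in range(HORIZON):
        if random.random() < EPS:                  # epsilon-greedy exploration
            a = random.randrange(len(ACTIONS))
        else:
            a = max(range(len(ACTIONS)), key=lambda i: Q[h][soc][i])
        nxt, r = step(soc, ACTIONS[a], h)
        future = max(Q[h + 1][nxt]) if h + 1 < HORIZON else 0.0
        # Bellman update for the current stage's subproblem
        Q[h][soc][a] += ALPHA * (r + GAMMA * future - Q[h][soc][a])
        soc = nxt

# Inspect the greedy action at hour 0 from a mid-level SOC
best = ACTIONS[max(range(len(ACTIONS)), key=lambda i: Q[0][2][i])]
print("greedy action at hour 0, SOC=2:", best)
```

With a finite horizon, indexing the Q-table by stage (hour) makes each update a backward-recursion step over one stage, which mirrors how the multi-stage problem collapses into single-stage problems; the paper's actual formulation additionally handles mixed-integer and nonlinear constraints.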