Abstract:
Electromechanical oscillations threaten the secure operation of power systems and, if not controlled effectively, can lead to generator outages, line tripping and even large-scale blackouts. Damping devices such as Power System Stabilizers (PSSs) and Thyristor Controlled Series Compensators (TCSCs) are installed to damp these oscillations.
This thesis proposes a trajectory-based supplementary control to improve the damping effect of existing controllers. It treats damping control as a multi-step optimal control problem with discrete dynamics and costs. At each control time, it collects the current system states, solves the optimal control problem and superimposes the computed supplementary inputs on the outputs of existing damping controllers in order to enhance the damping. These supplementary signals are continuously updated, which makes it possible to adaptively adjust and coordinate a subset of the existing damping controllers, and eventually all of them. Two methods, Model Predictive Control (MPC) and Reinforcement Learning (RL), are used to implement the proposed supplementary damping control.
First, a fully centralized MPC scheme is designed based on a linearized, discrete-time, complete state-space model. Its performance is evaluated both in ideal conditions and in the presence of realistic state estimation errors and computation and communication delays. The effects of the number and type of available damping controllers are also studied. This scheme is then extended into a distributed scheme with the aim of making it more viable for very large-scale or multi-area systems. Different ways of decoupling and coordinating the subsystems are analyzed. Finally, a robust hierarchical multi-area MPC scheme is proposed, which introduces a second layer of MPC-based controllers at the level of individual power plants and transmission lines.
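As a rough illustration of the receding-horizon idea behind the centralized scheme, the following minimal sketch applies an unconstrained finite-horizon quadratic-cost controller to a two-state oscillatory mode. It is not the thesis's implementation: the matrices `A`, `B`, the weights `Q`, `R`, the horizon of 20 steps and all function names are hypothetical, `A` is assumed to already include the effect of the existing damping controllers, and constraints, estimation errors and delays are ignored.

```python
import math

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def combine(X, Y, sign=1.0):
    """Elementwise X + sign * Y."""
    return [[a + sign * b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

def first_step_gain(A, B, Q, r, horizon):
    """Backward Riccati recursion for a single-input finite-horizon LQ problem.

    Returns the feedback gain K of the first step of the horizon, so that the
    supplementary input to apply now is u = -K x.
    """
    P = Q                      # terminal cost weight (assumption: equal to Q)
    K = [[0.0] * len(A)]
    for _ in range(horizon):
        BtP = matmul(transpose(B), P)
        s = r + matmul(BtP, B)[0][0]               # scalar because there is one input
        K = [[v / s for v in matmul(BtP, A)[0]]]
        closed_loop = combine(A, matmul(B, K), -1.0)
        P = combine(Q, matmul(matmul(transpose(A), P), closed_loop))
    return K

def simulate(A, B, x0, steps, controller=None):
    """Run the discrete model and return the final state norm."""
    x = [[v] for v in x0]
    for _ in range(steps):
        u = controller(x) if controller else 0.0   # supplementary signal
        x = combine(matmul(A, x), [[b[0] * u] for b in B])
    return math.hypot(x[0][0], x[1][0])

# Hypothetical lightly damped mode: a rotation scaled by RHO < 1.
RHO, THETA = 0.99, 0.3
A = [[RHO * math.cos(THETA), RHO * math.sin(THETA)],
     [-RHO * math.sin(THETA), RHO * math.cos(THETA)]]
B = [[0.0], [0.1]]
Q = [[1.0, 0.0], [0.0, 1.0]]
R = 0.1

def mpc_input(x):
    # Re-solve the horizon problem at every control time, apply the first input only.
    K = first_step_gain(A, B, Q, R, horizon=20)
    return -matmul(K, x)[0][0]

open_norm = simulate(A, B, [1.0, 0.0], steps=50)
mpc_norm = simulate(A, B, [1.0, 0.0], steps=50, controller=mpc_input)
```

Because the toy model is time-invariant and unconstrained, re-solving at each step always returns the same gain; the loop is written this way only to mirror the collect, solve, apply pattern described above.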
Second, a tree-based batch-mode RL algorithm is applied to carry out the proposed supplementary damping control. Using a set of four-tuples that sample the system dynamics and rewards, it constructs an approximation of the optimal $Q$-function over a given temporal horizon. The actions that are greedy with respect to this $Q$-function are applied as supplementary signals to existing damping controllers. The scheme is first tested on a single generator, and then on multiple generators. Different reward signals and damping levels are also considered. Finally, the combined control effects of MPC and RL are investigated.
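The batch-mode idea of building a $Q$-function from four-tuples and then acting greedily can be sketched as follows. This is only a toy under stated assumptions: the scalar sign-flipping dynamics in `step`, the reward, the action set and the discount factor are all hypothetical, and a per-key averaging table (`fit_average`) stands in for the tree-based regressor, which is not reproduced here.

```python
from collections import defaultdict

ACTIONS = (-1, 0, 1)       # hypothetical discrete supplementary signals
STATES = range(-5, 6)      # hypothetical discretized oscillation amplitude

def step(x, u):
    """Toy dynamics: the deviation flips sign each step; u shifts it.

    Returns (reward, next_state); the reward penalizes the remaining deviation.
    """
    x_next = max(-5, min(5, -x + u))
    return -abs(x_next), x_next

def collect_four_tuples():
    """Batch of (state, action, reward, next_state) samples."""
    return [(x, u, *step(x, u)) for x in STATES for u in ACTIONS]

def fit_average(samples):
    """Stand-in regressor: average the targets for each (state, action) key."""
    sums = defaultdict(lambda: [0.0, 0])
    for key, target in samples:
        sums[key][0] += target
        sums[key][1] += 1
    return {key: total / count for key, (total, count) in sums.items()}

def fitted_q_iteration(four_tuples, horizon, gamma=0.95):
    """Iteratively regress Q_n on r + gamma * max_a Q_{n-1}(x_next, a)."""
    q = {}                  # Q_0 is zero everywhere
    for _ in range(horizon):
        samples = [
            ((x, u), r + gamma * max(q.get((x_next, a), 0.0) for a in ACTIONS))
            for x, u, r, x_next in four_tuples
        ]
        q = fit_average(samples)
    return q

def greedy_action(q, x):
    """Action with the largest approximate Q-value in state x."""
    return max(ACTIONS, key=lambda u: q.get((x, u), 0.0))

q = fitted_q_iteration(collect_four_tuples(), horizon=10)

# Roll out the greedy policy from an initial deviation of 4.
x = 4
for _ in range(6):
    _, x = step(x, greedy_action(q, x))
final_state = x
```

In this toy the greedy action always pushes the deviation one unit closer to zero, so the rollout settles at zero after four steps. With continuous states, the averaging table would be replaced by a regressor that generalizes across states, which is the role the tree-based method plays in the text.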