PhD Oral Exam - Neshat Elhami Fard, Electrical and Computer Engineering
Control of Multi-agent Reinforcement Learning Systems Under Adversarial Attacks
This event is free
School of Graduate Studies
When studying for a doctoral degree (PhD), candidates submit a thesis that provides a critical review of the current state of knowledge of the thesis subject as well as the student’s own contributions to the subject. The distinguishing criterion of doctoral graduate research is a significant and original contribution to knowledge.
Once accepted, the candidate presents the thesis orally. This oral exam is open to the public.
This PhD dissertation studies the control of multi-agent reinforcement learning (MARL) and multi-agent deep reinforcement learning (MADRL) systems under adversarial attacks. Various attacks are investigated, and several defence algorithms (mitigation approaches) are proposed to support consensus control and reliable data transmission.
We studied the consensus problem of a leaderless, homogeneous MARL system using actor-critic algorithms, with and without malicious agents. We considered various distance-based immediate reward functions to improve the system's performance. In addition to proposing four different immediate reward functions based on Euclidean, n-norm, and Chebyshev distances, we rigorously demonstrated which reward function performs better based on a cumulative reward for each agent and the entire team of agents. The claims have been proven theoretically, and the simulation confirmed theoretical findings.
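As a rough illustration of what a distance-based immediate reward can look like, the sketch below computes a reward from Euclidean, n-norm, and Chebyshev distances. It is not the thesis's formulation: the function names, the choice to sum distances over neighbours, and the negation (a smaller distance gives a larger reward) are all assumptions made for this example.

```python
def minkowski_distance(a, b, n):
    """n-norm (Minkowski) distance; n=2 gives the Euclidean distance."""
    return sum(abs(x - y) ** n for x, y in zip(a, b)) ** (1.0 / n)


def chebyshev_distance(a, b):
    """Chebyshev distance: the largest coordinate-wise difference."""
    return max(abs(x - y) for x, y in zip(a, b))


def distance_reward(agent_state, neighbour_states, metric="euclidean", n=3):
    """Hypothetical immediate reward: negative summed distance to neighbours.

    Assumption: agents are rewarded for moving closer to their neighbours,
    which is one simple way to encourage consensus.
    """
    total = 0.0
    for state in neighbour_states:
        if metric == "euclidean":
            total += minkowski_distance(agent_state, state, 2)
        elif metric == "n_norm":
            total += minkowski_distance(agent_state, state, n)
        elif metric == "chebyshev":
            total += chebyshev_distance(agent_state, state)
        else:
            raise ValueError(f"unknown metric: {metric}")
    return -total
```

For example, `distance_reward((0.0, 0.0), [(3.0, 4.0)], "chebyshev")` uses only the largest coordinate gap, whereas the Euclidean variant combines both coordinates.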
We examined whether modifying the malicious agent's neural network (NN) structure, as well as providing a compatible combination of the mean squared error (MSE) loss function and the sigmoid activation function, can mitigate the destructive effects of the malicious agent on the performance of the leaderless, homogeneous MARL system. In addition to the theoretical support, the simulation confirmed the findings of the theory.
We studied the gradient-based adversarial attacks on cluster-based, heterogeneous MADRL systems with time-delayed data transmission using deep Q-network (DQN) algorithms. We introduced two novel observations, termed on-time and time-delay observations, considered when the data transmission channel is idle, and the data is transmitted on-time or time-delayed. By considering the distance between the neighbouring agents, we presented a novel immediate reward function that appends a distance-based reward to the previously utilized reward to improve the MADRL system performance. We considered three types of gradient-based attacks to investigate the robustness of the proposed system data transmission. Two defence methods were proposed to reduce the effects of the discussed malicious attacks. The theoretical results are illustrated and verified with simulation examples.
We also investigated the data transmission robustness between agents of a cluster-based, heterogeneous MADRL system under a gradient-based adversarial attack. An algorithm using a DQN approach and a proportional feedback controller to defend against the fast gradient sign method (FGSM) attack and improve the DQN agent performance was proposed. Simulation results are included to verify the presented results.
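For readers unfamiliar with FGSM, the following minimal sketch shows the general idea of the attack on a Q-value agent: the observation is shifted by a small step epsilon in the direction of the sign of the loss gradient. The linear Q-function, the loss (the negated Q-value of the agent's preferred action), and all names here are assumptions chosen to keep the example self-contained; the thesis uses a DQN, and its defence algorithm is not reproduced here.

```python
def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def q_values(weights, obs):
    """Toy linear Q-function (an assumption): Q(obs, a) = weights[a] . obs."""
    return [sum(w * o for w, o in zip(row, obs)) for row in weights]


def fgsm_perturb(weights, obs, epsilon):
    """Perturb an observation with one FGSM step.

    The loss is taken as -Q(obs, a*), where a* is the action the agent
    currently prefers. For the linear Q-function above, the gradient of
    this loss with respect to obs is simply -weights[a*].
    """
    q = q_values(weights, obs)
    best_action = max(range(len(q)), key=q.__getitem__)
    grad = [-w for w in weights[best_action]]
    # FGSM update: obs_adv = obs + epsilon * sign(gradient of loss wrt obs)
    return [o + epsilon * sign(g) for o, g in zip(obs, grad)]
```

With a DQN, the gradient would instead be obtained by backpropagating through the network, but the update rule on the observation has the same form.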