Thesis defences

PhD Oral Exam - Mohammad Esrafilian Najafabadi, Building Engineering

Self-learning building HVAC control system based on dynamic occupancy patterns: A predictive approach using deep Q-networks and transfer learning

Thursday, August 18, 2022 (all day)

This event is free


School of Graduate Studies


Daniela Ferrer



When studying for a doctoral degree (PhD), candidates submit a thesis that provides a critical review of the current state of knowledge of the thesis subject as well as the student’s own contributions to the subject. The distinguishing criterion of doctoral graduate research is a significant and original contribution to knowledge.

Once accepted, the candidate presents the thesis orally. This oral exam is open to the public.


This dissertation reports the development of a self-learning control system that adapts the building setpoint temperature to dynamic occupancy schedules, aiming to maximize energy savings and thermal comfort. The controller interacts with the environment and learns the occupancy patterns and the lag time of heating, ventilation, and air-conditioning (HVAC) systems, without the need to develop online models of occupancy or of the building. This process aims to minimize the runtime of HVAC systems during vacancy periods to save energy while providing thermally comfortable conditions upon the occupants' arrival.

The control framework also leverages the knowledge of pre-trained controllers to accelerate training in new, unseen buildings. This transfer learning method is based on an inter-building similarity analysis that applies unsupervised learning to the occupancy profiles. It is intended to minimize the thermal discomfort caused by the trial-and-error nature of the self-learning algorithm. The proposed system uses a double deep Q-network (DDQN) algorithm to find the optimal control policy. Moreover, an optimal feature selection algorithm is integrated into the framework to identify irrelevant and redundant features and thereby further improve the training process.

The merit of the controller is quantified by comparing its performance with that of model-based predictive control (MPC), a well-established occupancy-based control method. The results demonstrate that the control system provides superior thermal comfort for occupants by taking occupancy forecasting uncertainty into account in the decision-making process. This ability improves thermal comfort by 7.87% on average relative to the MPC benchmark. The use of the transfer learning method enhances thermal comfort by 68% during the training process of the algorithm.
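For readers unfamiliar with the double deep Q-network (DDQN) named in the abstract, the following is a minimal sketch of the generic double DQN target rule, not the thesis implementation. The abstract does not specify the action set, reward, discount factor, or network architecture, so the setpoints, Q-values, reward, and `gamma` below are hypothetical illustration values. The sketch shows the defining idea: the online network selects the next action, and the target network evaluates it.

```python
from typing import Sequence


def ddqn_target(reward: float,
                next_q_online: Sequence[float],
                next_q_target: Sequence[float],
                gamma: float = 0.99,
                done: bool = False) -> float:
    """Double DQN bootstrap target for a single transition."""
    if done:
        # Terminal transition: no bootstrapping from the next state.
        return reward
    # Action selection uses the online network's Q-values...
    best_action = max(range(len(next_q_online)),
                      key=lambda a: next_q_online[a])
    # ...while the value of that action comes from the target network.
    return reward + gamma * next_q_target[best_action]


def dqn_target(reward: float,
               next_q_target: Sequence[float],
               gamma: float = 0.99,
               done: bool = False) -> float:
    """Standard DQN target, shown for comparison only."""
    if done:
        return reward
    # The same network both selects and evaluates the action.
    return reward + gamma * max(next_q_target)


# Hypothetical example: three candidate setpoints (assumed, not from the thesis).
setpoints_c = [18.0, 20.0, 22.0]
next_q_online = [0.2, 1.0, 0.7]   # assumed online-network estimates
next_q_target = [0.4, 0.5, 0.9]   # assumed target-network estimates
reward = -1.0                     # assumed penalty for energy use or discomfort

y_ddqn = ddqn_target(reward, next_q_online, next_q_target, gamma=0.9)
y_dqn = dqn_target(reward, next_q_target, gamma=0.9)
print(y_ddqn, y_dqn)
```

With these assumed numbers the double DQN target is lower than the standard DQN target, because decoupling action selection from action evaluation reduces the overestimation that comes from taking a maximum over noisy estimates.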


© Concordia University