PhD Oral Exam - Rabiah A. Al-qudah, Computer Science
Peripheral Blood Smear Analyses Using Deep Learning
This event is free
School of Graduate Studies
When studying for a doctoral degree (PhD), candidates submit a thesis that provides a critical review of the current state of knowledge of the thesis subject as well as the student’s own contributions to the subject. The distinguishing criterion of doctoral graduate research is a significant and original contribution to knowledge.
Once accepted, the candidate presents the thesis orally. This oral exam is open to the public.
Peripheral Blood Smear (PBS) analysis is a vital routine test carried out by hematologists to assess aspects of a patient's health. Manual PBS analysis is prone to human error, and computer-based analysis can greatly improve both its accuracy and its cost. Recent learning approaches, such as deep learning, are data hungry, and because labeled medical images are scarce, researchers have had to find alternative ways to increase the size of the available datasets. Synthetic datasets are a promising answer to data scarcity; however, the complexity of the natural structure of blood smears makes them harder to synthesize. In this thesis, we propose a methodology that uses Locality Sensitive Hashing (LSH) to create a novel, balanced dataset of synthetic blood smears. This dataset, which was annotated automatically during the generation phase, covers 17 essential categories of blood cells. Five experienced hematologists also approved the dataset as meeting the general standards for preparing thin blood smears.
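The abstract does not say which hash family or image features the thesis uses, so the following is only a minimal sketch of the general LSH idea, assuming random-hyperplane hashing over hypothetical feature vectors. Vectors that point in similar directions tend to receive the same bit signature and so land in the same bucket, which is the property that makes LSH useful for grouping similar items. All function names and the toy vectors are illustrative, not taken from the thesis.

```python
import random
from collections import defaultdict


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random hyperplanes (normal vectors) in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def lsh_signature(vector, hyperplanes):
    """One bit per hyperplane: 1 if the vector lies on its non-negative side."""
    bits = []
    for plane in hyperplanes:
        dot = sum(v * p for v, p in zip(vector, plane))
        bits.append(1 if dot >= 0 else 0)
    return tuple(bits)


def bucket_by_signature(vectors, hyperplanes):
    """Group vector indices by their LSH signature."""
    buckets = defaultdict(list)
    for idx, vec in enumerate(vectors):
        buckets[lsh_signature(vec, hyperplanes)].append(idx)
    return dict(buckets)


# Hypothetical feature vectors (assumption: real inputs would be image features).
features = [
    [1.0, 2.0, 3.0],
    [2.0, 4.0, 6.0],     # same direction as the first vector
    [-1.0, -2.0, -3.0],  # opposite direction
]
planes = make_hyperplanes(num_bits=8, dim=3)
buckets = bucket_by_signature(features, planes)
```

Here the first two vectors share a bucket because positive scaling never changes the sign of a dot product, while the third falls on the opposite side of every hyperplane.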
Moreover, a platelet classifier and a white blood cell (WBC) classifier were trained on the synthetic dataset. For classifying platelets, a hybrid approach that combines deep learning with image processing techniques is proposed. This approach improved the platelet classification accuracy from 82.6% to 98.6% and the macro-average precision from 76.6% to 97.6%. For WBC classification, a novel scheme for training deep networks is proposed, namely Enhanced Incremental Training, which automatically recognises and handles classes that confuse and negatively affect neural network predictions. To handle the confusable classes, we also propose a procedure called "training revert". Applying the proposed method improved the classification accuracy from 61.5% to 95% and the macro-average precision from 76.6% to 94.27%.
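For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how accuracy and macro-average precision are conventionally computed. It uses the standard definitions, not code from the thesis, and the class labels and predictions are made up for illustration. Macro-averaging takes the unweighted mean of the per-class precisions, so rare classes count as much as common ones.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def macro_precision(y_true, y_pred):
    """Unweighted mean of per-class precision (TP / all predictions of that class)."""
    classes = sorted(set(y_true) | set(y_pred))
    per_class = []
    for c in classes:
        predicted_c = sum(1 for p in y_pred if p == c)
        true_pos = sum(1 for t, p in zip(y_true, y_pred) if p == c and t == c)
        # Convention assumed here: a class that is never predicted scores 0.
        per_class.append(true_pos / predicted_c if predicted_c else 0.0)
    return sum(per_class) / len(per_class)


# Hypothetical labels for a two-class toy problem.
y_true = ["a", "a", "a", "a", "b", "b"]
y_pred = ["a", "a", "a", "a", "a", "b"]

acc = accuracy(y_true, y_pred)             # 5 of 6 correct
macro_p = macro_precision(y_true, y_pred)  # mean of 4/5 (class a) and 1/1 (class b)
```

In this toy case the accuracy is 5/6 while the macro-average precision is 0.9, which shows that the two figures can diverge when classes are imbalanced.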
In addition, the feasibility of using animal reticulocyte cells to compensate for the shortage of human data is investigated. Animal cells are integrated by employing multiple deep classifiers that use transfer learning in different experimental setups, in a procedure that mimics the protocol followed in experimental medical labs. Three measures are defined to compare the effectiveness of these setups: the pretraining boost, the dataset similarity boost, and the dataset size boost. All the experiments in this work were conducted on a novel public human reticulocyte dataset, and the best performing model achieved 98.9%, 98.9%, and 98.6% average accuracy, average macro precision, and average macro F-score, respectively.
Finally, this work provides a comprehensive framework for two main blood smear analyses that are still performed manually in labs. To automate the analysis process, a novel method for constructing synthetic whole-slide blood smear datasets is proposed. Moreover, to classify blood cells, covering eighteen blood cell types and abnormalities, two novel techniques are proposed: Enhanced Incremental Training and animal-to-human cell transfer learning. The outcomes of this work were published in six reputable international conferences and journals, including Computers in Biology and Medicine and IEEE Access.