Thesis defences

PhD Oral Exam - Munira Alballa, Computer Science

Predicting transporter proteins and their substrate specificity

Date & time
Friday, June 5, 2020
9:30 a.m. – 12:30 p.m.

This event is free


School of Graduate Studies


Daniela Ferrer



When studying for a doctoral degree (PhD), candidates submit a thesis that provides a critical review of the current state of knowledge of the thesis subject as well as the student’s own contributions to the subject. The distinguishing criterion of doctoral graduate research is a significant and original contribution to knowledge.

Once accepted, the candidate presents the thesis orally. This oral exam is open to the public.


The publication of numerous genome projects has resulted in an abundance of protein sequences, a significant number of which are still unannotated. Membrane proteins such as transporters, receptors, and enzymes are among the least characterized proteins due to their hydrophobic surfaces and lack of conformational stability. This research aims to build a proteome-wide system to determine transporter substrate specificity, which involves three phases: 1) distinguishing membrane proteins, 2) differentiating transporters from other functional types of membrane proteins, and 3) detecting the substrate specificity of the transporters.

To distinguish membrane from non-membrane proteins, we propose a novel tool, TooT-M, that combines the predictions from transmembrane topology prediction tools with those of a selected set of classifiers in which protein samples are represented by pseudo position-specific scoring matrix (Pse-PSSM) vectors. The results suggest that the proposed tool outperforms all state-of-the-art methods in terms of overall accuracy and Matthews correlation coefficient (MCC).
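For readers unfamiliar with the two evaluation measures named above, the following is a minimal sketch of how overall accuracy and MCC are computed from a binary confusion matrix. It uses the standard textbook definitions; the function names and the example counts are illustrative assumptions and are not taken from the thesis.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all samples that were classified correctly."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary confusion matrix.

    Ranges from -1 (total disagreement) through 0 (no better than chance)
    to +1 (perfect prediction). By the usual convention, 0.0 is returned
    when any marginal sum is zero and the denominator vanishes.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Made-up counts: a predictor that labels every protein "non-membrane"
# on an imbalanced set of 5 membrane and 95 non-membrane proteins.
print(accuracy(tp=0, tn=95, fp=0, fn=5))
print(mcc(tp=0, tn=95, fp=0, fn=5))
```

In this made-up case the accuracy is 0.95 while the MCC is 0.0, which illustrates why MCC is commonly reported alongside accuracy when the classes are imbalanced.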

To distinguish transporters from other proteins, we propose an ensemble classifier, TooT-T, that is trained to optimally combine the predictions from homology annotation transfer and machine-learning methods. The homology annotation transfer components detect transporters by searching against the Transporter Classification Database (TCDB) using different thresholds. The machine-learning methods include three models in which the protein sequences are encoded using a novel encoding, psi-composition. The results show that TooT-T outperforms all state-of-the-art de novo transporter predictors in terms of overall accuracy and MCC.
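One way to picture how homology searches at several thresholds and machine-learning scores can feed a single ensemble is sketched below. This is an illustration under stated assumptions, not the TooT-T implementation: the abstract does not specify the threshold type or values, so the sketch assumes E-value cut-offs, and the function names, thresholds, and scores are all hypothetical. The trained meta-classifier that would consume the resulting vector is left out.

```python
from typing import List, Optional, Sequence


def homology_features(best_hit_evalue: Optional[float],
                      thresholds: Sequence[float]) -> List[int]:
    """Return one binary feature per threshold.

    A feature is 1 when the best database hit is at least as significant
    as the threshold (a smaller E-value is more significant), else 0.
    `None` means the search returned no hit at all.
    """
    if best_hit_evalue is None:
        return [0] * len(thresholds)
    return [1 if best_hit_evalue <= t else 0 for t in thresholds]


def meta_features(best_hit_evalue: Optional[float],
                  thresholds: Sequence[float],
                  model_scores: Sequence[float]) -> List[float]:
    """Concatenate homology votes and ML scores into one input vector
    for a downstream meta-classifier (not shown here)."""
    votes = [float(v) for v in homology_features(best_hit_evalue, thresholds)]
    return votes + list(model_scores)


# Hypothetical values, for illustration only.
THRESHOLDS = (1e-20, 1e-8, 1e-3)   # assumed E-value cut-offs
ML_SCORES = [0.91, 0.40, 0.77]     # assumed outputs of three ML models
print(meta_features(1e-10, THRESHOLDS, ML_SCORES))
```

Keeping one feature per threshold lets a trained combiner learn how much to trust strict versus permissive homology hits, instead of fixing a single cut-off in advance.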

To detect the substrate specificity of a transporter, we propose a novel tool, TooT-SC, that combines compositional, evolutionary, and positional information to represent protein samples. TooT-SC can efficiently classify transport proteins into eleven classes according to their transported substrate, which is the highest number of predicted substrates offered by any de novo prediction tool. Our results indicate that TooT-SC significantly outperforms all of the state-of-the-art methods. Further analysis of the locations of the informative positions reveals that there are more statistically significant informative positions in the transmembrane segments (TMSs) than in the non-TMS regions, and more in regions close to the TMSs than in regions far from them.


© Concordia University