PhD Oral Exam - Parsa Bagherzadeh, Computer Science
Studies on Decoupled Modules for Integration of Extant Knowledge Sources
This event is free
School of Graduate Studies
When studying for a doctoral degree (PhD), candidates submit a thesis that provides a critical review of the current state of knowledge of the thesis subject as well as the student’s own contributions to the subject. The distinguishing criterion of doctoral graduate research is a significant and original contribution to knowledge.
Once accepted, the candidate presents the thesis orally. This oral exam is open to the public.
The field of Natural Language Processing (NLP) has undergone a drastic paradigm shift in the past few years. Deep neural networks have delivered significant improvements on a variety of NLP tasks and have become the dominant approach. As a consequence, fine-tuning pre-trained language models on a target task (transfer learning) is now standard practice. Recent studies, however, show that pre-trained language models often fail to demonstrate an understanding of common-sense, specialized, or syntactic knowledge. To address this issue, many studies have focused on incorporating external knowledge sources (experts), which have well-established benefits for AI systems. The current dominant practice for incorporating a knowledge source (KS) is to use an existing language model as a backbone and inject the KS into it, often through a custom-designed model. Most knowledge-enhanced models currently incorporate a single KS. Many NLP tasks, however, benefit from multiple KSs, and custom-designing a model for every new KS is laborious. Moreover, KSs can differ widely in their expertise: they may overlap or even contradict one another. Once accommodated, such KSs need to interoperate well. We argue that KSs, including language models, are self-contained experts that should be able to contribute their knowledge without being restricted by one another.
Inspired by classic frameworks such as the Blackboard architecture, as well as recent work on conditional computation, this thesis advocates a decoupled integration of KSs. Under a decoupled framework, different KSs can remain anonymous to one another, and adding a new KS imposes no changes on the existing ones. This allows easy transfer and recycling of extant knowledge sources. Moreover, because a decoupled approach assumes no pre-determined order for KSs, they can be applied in different orders conditioned on the input. We present two frameworks, multi-input Recurrent Independent Mechanisms (mi-RIM) and Decoupled Assorted Modules (DAM), for integrating KSs such as pre-trained language models, pre-trained graph embeddings of ontological resources, and grammatical information. Both frameworks allow sparse activity of KSs, enabling the model to ignore a KS deemed irrelevant for a task or an input. The proposed frameworks are evaluated extensively on 11 language processing tasks: 7 from the biomedical domain and 4 general-language tasks. Compared to three knowledge-enhanced models, mi-RIM and DAM achieve superior performance on all tasks, as well as results competitive with the state of the art.
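The decoupling idea described above can be illustrated with a minimal sketch: each KS is a self-contained module, the pool knows its members only by name, and a per-input gate activates just the most relevant KSs. All names here (KnowledgeSource, DecoupledPool, the toy relevance score) are illustrative assumptions, not the actual mi-RIM or DAM implementation.

```python
# Hypothetical sketch of decoupled knowledge sources (KSs) with
# input-conditioned sparse activation. Not the thesis's actual code.
from typing import Callable, Dict, List


class KnowledgeSource:
    """A self-contained expert: encodes an input into a feature vector."""

    def __init__(self, name: str, encode: Callable[[str], List[float]]):
        self.name = name
        self.encode = encode


class DecoupledPool:
    """Holds KSs that stay anonymous to one another; adding a KS
    requires no change to the existing ones."""

    def __init__(self, top_k: int = 2):
        self.sources: Dict[str, KnowledgeSource] = {}
        self.top_k = top_k

    def add(self, ks: KnowledgeSource) -> None:
        self.sources[ks.name] = ks  # no other KS is touched

    def relevance(self, ks: KnowledgeSource, text: str) -> float:
        # Toy relevance score; a real system would learn this gate.
        return sum(abs(v) for v in ks.encode(text))

    def forward(self, text: str) -> Dict[str, List[float]]:
        # Sparse activation: only the top-k most relevant KSs contribute;
        # the rest are ignored for this input.
        ranked = sorted(self.sources.values(),
                        key=lambda ks: self.relevance(ks, text),
                        reverse=True)
        return {ks.name: ks.encode(text) for ks in ranked[: self.top_k]}


if __name__ == "__main__":
    pool = DecoupledPool(top_k=1)
    # Two toy experts; each could be swapped or extended independently.
    pool.add(KnowledgeSource("lm", lambda t: [float(len(t))]))
    pool.add(KnowledgeSource("ontology",
                             lambda t: [1.0 if "gene" in t else 0.0]))
    print(pool.forward("gene expression"))
```

Because the pool only iterates over whatever modules it holds, registering a third KS would change neither the other experts nor the gating logic, which is the key property of a decoupled design.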