Thesis defences

PhD Oral Exam - Elias Bou-Harb, Computer Science

Approaches and Techniques for Fingerprinting and Attributing Probing Activities by Observing Network Telescopes

Date & time

Thursday, June 25, 2015
1 p.m. – 4 p.m.

Cost

This event is free

Organization

School of Graduate Studies

Contact

Sharon Carey
514-848-2424 ext. 3802

Where

Engineering, Computer Science and Visual Arts Integrated Complex
1515 St. Catherine W.
Room 1.162

Accessible location

Yes

When studying for a doctoral degree (PhD), candidates submit a thesis that provides a critical review of the current state of knowledge of the thesis subject as well as the student’s own contributions to the subject. The distinguishing criterion of doctoral graduate research is a significant and original contribution to knowledge.

Once accepted, the candidate presents the thesis orally. This oral exam is open to the public.

Abstract

The explosive growth, complexity, adoption and dynamism of cyberspace over the last decade has radically altered the globe. A plethora of nations have been at the very forefront of this change, fully embracing the opportunities provided by the advancements in science and technology in order to fortify the economy and to increase the productivity of everyday's life. However, the significant dependence on cyberspace has indeed brought new risks that often compromise, exploit and damage invaluable data and systems. Thus, the capability to proactively infer malicious activities is of paramount importance. In this context, generating cyber threat intelligence related to probing or scanning activities render an effective tactic to achieve the latter.

In this thesis, we investigate such malicious activities, which are typically the precursors of various amplified, debilitating and disrupting cyber attacks. To achieve this task, we analyze real Internet-scale traffic targeting network telescopes or darknets, which are defined by routable, allocated yet unused Internet Protocol addresses.

First, we present a comprehensive survey of the entire probing topic. Specifically, we categorize this topic by elaborating on the nature, strategies and approaches of such probing activities. Additionally, we provide the reader with a classification and an exhaustive review of various techniques that could be employed in such malicious activities. Finally, we depict a taxonomy of the current literature by focusing on distributed probing detection methods.

Second, we focus on the problem of fingerprinting probing activities. To this end, we design, develop and validate approaches that can identify such activities targeting enterprise networks as well as those targeting the Internet-space. On one hand, the corporate probing detection approach uniquely exploits the information that could be leaked to the scanner, inferred from the internal network topology, to perform the detection. On the other hand, the more darknet tailored probing fingerprinting approach adopts a statistical approach to not only detect the probing activities but also identify the exact technique that was employed in the such activities.

Third, for attribution purposes, we propose a correlation approach that fuses probing activities with malware samples. The approach aims at detecting whether Internet-scale machines are infected or not as well as pinpointing the exact malware type/family, if the machines were found to be compromised. To achieve the intended goals, the proposed approach initially devises a probabilistic model to filter out darknet misconfiguration traf- fic. Consequently, probing activities are correlated with malware samples by leveraging fuzzy hashing and entropy based techniques. To this end, we also investigate and report a rare Internet-scale probing event by proposing a multifaceted approach that correlates darknet, malware and passive dns traffic.

Fourth, we focus on the problem of identifying and attributing large-scale probing campaigns, which render a new era of probing events. These are distinguished from previous probing incidents as (1) the population of the participating bots is several orders of magnitude larger, (2) the target scope is generally the entire Internet Protocol (IP) address space, and (3) the bots adopt well-orchestrated, often botmaster coordinated, stealth scan strategies that maximize targets' coverage while minimizing redundancy and overlap. To this end, we propose and validate three approaches. On one hand, two of the approaches rely on a set of behavioral analytics that aim at scrutinizing the generated traffic by the probing sources. Subsequently, they employ data mining and graph theoretic techniques to systematically cluster the probing sources into well-defined campaigns possessing similar behavioral similarity. The third approach, on the other hand, exploit time series interpolation and prediction to pinpoint orchestrated probing campaigns and to filter out non-coordinated probing ows.

We conclude this thesis by highlighting some research gaps that pave the way for future work.

Events