Seminar by Dr. Essam Mansour (Qatar Computing Research Institute)
Speaker: Dr. Essam Mansour (Qatar Computing Research Institute)
Title: Data Science Infrastructures: from Scalable Systems to Effective Discovery
Date: Friday May 4, 2018
Time: 10:30 am - 12 noon
Room: Ev 1.162
The core of data science infrastructures is the ability to analyze big data at scale. This talk discusses several challenges in big data systems, such as cache efficiency, a high degree of parallelism, and automatic performance tuning. These challenges must be addressed carefully in order to develop highly scalable systems for big data. In particular, we consider how to address them effectively on multi-core workstations, cloud resources, and supercomputers with thousands of CPUs. Throughout the talk, we present a set of novel algorithms that we developed to build a highly scalable system for large-scale analytics on strings. Our experiments show that our system scales to 16,384 cores on an IBM Blue Gene/P supercomputer (with more than 80% speedup efficiency), supports 3 orders of magnitude more data, and is orders of magnitude faster than existing solutions. The talk concludes with the challenges of analytics on geo-distributed datasets and the importance of effective data discovery on large graphs.
Dr. Essam Mansour has been a Senior Research Engineer at QCRI since 2013. Before that, he was a Research Fellow at KAUST from 2009 to 2013 and a Research Fellow at the International University in Germany, Bruchsal, from 2008 to 2009. Essam received his Ph.D. in Computer Science from Dublin Institute of Technology (DIT), Ireland, in 2008. He obtained his B.Sc. and M.Sc. in Computer Science from Cairo University, Egypt, in 2000 and 2003, respectively. Essam has spent more than 8 years doing world-class research in the areas of databases, parallel/distributed systems, big data analytics, and querying geo-distributed graphs. He develops and optimizes big data systems to work at scale on supercomputers and cloud resources. During these years, his research contributions have led to 24 conference papers and 7 journal articles, mostly in top-tier venues such as VLDBJ, PVLDB, SIGMOD, ICDE, EDBT, and CIKM. He has been invited as a reviewer for top journals, such as ACM Transactions on Database Systems (TODS), VLDB Journal, and IEEE Transactions on Knowledge and Data Engineering (TKDE). Essam has also served as a program committee member for several top conferences, such as VLDB 2019, 2018, 2017, SIGMOD 2016, and ICDE 2016. More information can be found at http://emansour.com/.