Master's Thesis Defense: Zhenhao Li
Speaker: Zhenhao Li
Supervisors: Drs. T.-H. Chen, W. Shang
Drs. O. Ormandjieva, J. Rilling, N. Tsantalis (Chair)
Title: Characterizing and Detecting Duplicate Logging Code Smells
Date: Thursday, August 1, 2019
Time: 11:00 a.m.
Place: EV 2.260
Developers rely on software logs for a wide variety of tasks, such as debugging, testing, program comprehension, verification, and performance analysis. Despite the importance of logs, prior studies show that there is no industry standard on how to write logging statements. Recent research on logs often considers the appropriateness of a log only as an individual item (e.g., a single logging statement), even though logs are typically analyzed in tandem. In this thesis, we focus on studying duplicate logging statements, which are logging statements that have the same static text message. Such duplications in the text message are potential indications of logging code smells, which may affect developers’ understanding of the dynamic view of the system. We manually studied over 3K duplicate logging statements and their surrounding code in four large-scale open source systems: Hadoop, CloudStack, ElasticSearch, and Cassandra. We uncovered five patterns of duplicate logging code smells. For each instance of a code smell, we further manually identified the problematic (i.e., requiring fixes) and justifiable (i.e., not requiring fixes) cases. We then contacted developers to verify the results of our manual study. We integrated these results and the developers’ feedback into our automated static analysis tool, DLFinder, which automatically detects problematic duplicate logging code smells. We evaluated DLFinder on the four manually studied systems and on four additional systems: Kafka, Flink, Camel, and Wicket. In total, combining the results of DLFinder and our manual analysis, we reported 91 problematic code smell instances to developers, and all of them have been fixed. This thesis provides an initial step toward creating a logging guideline that helps developers improve the quality of logging code. DLFinder is also able to detect duplicate logging code smells with high precision and recall.
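To make the definition of a duplicate logging statement concrete, the following is a minimal sketch of how statements sharing the same static text message could be located in Java source. It is not DLFinder and does not reproduce its analysis: the regular expression, the function name find_duplicate_log_messages, and the sample Java snippet are all hypothetical, and the sketch only compares the first string literal of each logging call, line by line, without parsing the code.

```python
import re
from collections import defaultdict

# Hypothetical pattern (an assumption, not taken from DLFinder): matches calls
# such as LOG.info("...") or logger.error("...", e) and captures the first
# string literal, which stands in for the static text message.
LOG_CALL = re.compile(
    r'\b(?:log|logger)\.(?:trace|debug|info|warn|error|fatal)\s*\(\s*"((?:[^"\\]|\\.)*)"',
    re.IGNORECASE,
)


def find_duplicate_log_messages(source: str) -> dict[str, list[int]]:
    """Map each static log message that occurs more than once to its line numbers."""
    locations = defaultdict(list)
    for line_no, line in enumerate(source.splitlines(), start=1):
        for match in LOG_CALL.finditer(line):
            locations[match.group(1)].append(line_no)
    # Keep only messages shared by at least two logging statements.
    return {msg: lines for msg, lines in locations.items() if len(lines) > 1}


# Illustrative Java input (made up for this example): two catch blocks log the
# same static text, so the log output alone cannot tell the failures apart.
JAVA_SNIPPET = '''
try {
    connect();
} catch (IOException e) {
    LOG.error("Failed to connect", e);
} catch (TimeoutException e) {
    LOG.error("Failed to connect", e);
}
LOG.info("Connected");
'''

print(find_duplicate_log_messages(JAVA_SNIPPET))
# → {'Failed to connect': [5, 7]}
```

A text-based scan like this only surfaces candidate duplicates. Deciding whether a duplicate is problematic or justifiable depends on the surrounding code, which is the part the thesis addresses through manual study and static analysis.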