When studying for a doctoral degree (PhD), candidates submit a thesis that provides a critical review of the current state of knowledge of the thesis subject as well as the student’s own contributions to the subject. The distinguishing criterion of doctoral graduate research is a significant and original contribution to knowledge.
Once the thesis is accepted, the candidate presents it orally. This oral exam is open to the public.
Abstract
Software undergoes constant changes to support new requirements, address bugs, enhance performance, and ensure maintainability. As a result, developers spend a large portion of their workday understanding and reviewing code changes. Abstract Syntax Tree (AST) diff tools were developed to overcome the limitations of line-based diff tools, which are still the default for most developers. Despite their advantages in capturing structural changes, existing AST diff tools suffer from serious limitations, such as lacking multi-mapping support, matching semantically incompatible nodes, ignoring language-specific clues, lacking refactoring awareness, and offering no commit-level diff support.
To address these issues, we propose a novel AST diff tool based on RefactoringMiner that resolves all of the aforementioned limitations. We improve statement mapping accuracy and introduce an algorithm that produces commit-level AST diffs using refactoring instances and matched program elements. Our evaluation demonstrates significant improvements in both precision and recall, while maintaining competitive execution times.
To facilitate objective and reproducible assessment of diff quality, we introduce a benchmarking framework that measures the precision and recall of existing tools against a curated ground truth of AST node mappings. This infrastructure supports rigorous comparisons and enables deeper investigations into the impact of AST representations and algorithm design choices.
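As a rough illustration of what measuring precision and recall against a ground truth of node mappings could look like, the sketch below treats each mapping as a pair of node identifiers and compares a tool's reported set with the curated set. The identifier format, the function name `precision_recall`, and the sample data are assumptions made for this example; they are not taken from the thesis or from any tool's API.

```python
from typing import Set, Tuple

# A mapping pairs a node in the old AST with a node in the new AST.
# The string identifiers used here are hypothetical placeholders.
Mapping = Tuple[str, str]


def precision_recall(reported: Set[Mapping], truth: Set[Mapping]) -> Tuple[float, float]:
    """Compare a tool's reported mappings with a ground-truth set."""
    true_positives = len(reported & truth)
    # Precision: fraction of reported mappings that are correct.
    precision = true_positives / len(reported) if reported else 0.0
    # Recall: fraction of ground-truth mappings that the tool found.
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Made-up sample data, for illustration only.
ground_truth = {("old:1", "new:1"), ("old:2", "new:3"), ("old:4", "new:4")}
tool_output = {
    ("old:1", "new:1"),
    ("old:2", "new:2"),  # wrong target node
    ("old:4", "new:4"),
    ("old:5", "new:5"),  # mapping absent from the ground truth
}

precision, recall = precision_recall(tool_output, ground_truth)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this made-up case two of the four reported mappings are correct, and two of the three ground-truth mappings are recovered. A real benchmark would also need a rule for multi-mappings, where one node legitimately maps to several counterparts.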
Finally, we investigate the relationship between edit script length and diff quality by combining metric-based analysis with human feedback, revealing that minimizing edit length is not a reliable indicator of developer-preferred diffs.