PhD Oral Exam - Stuart Thiel, Computer Science
The Diverting Fast Radix Algorithm
This event is free
School of Graduate Studies
When studying for a doctoral degree (PhD), candidates submit a thesis that provides a critical review of the current state of knowledge of the thesis subject as well as the student’s own contributions to the subject. The distinguishing criterion of doctoral graduate research is a significant and original contribution to knowledge.
Once accepted, the candidate presents the thesis orally. This oral exam is open to the public.
Radix sort is a classical algorithm to sort $N$ records with integer keys. The keys are represented using digits of a radix. The classical algorithm executes a pass through the records to count the frequency of each digit to determine the digit’s bucket size, and then a pass to deal the records to the corresponding bucket. The algorithm may proceed from the most significant digit to the least significant, or from the least significant digit to the most significant.
This thesis presents the Fast Radix and the Diverting Fast Radix algorithms, which replace all but one count pass with estimation of bucket sizes and correction of overflow, and may, in addition, divert to another sorting algorithm such as insertion sort when the subproblem size is below a threshold.
Extensive performance experiments compare the implementations of the two algorithms against the native std::sort in the C++ Standard Library, radix sort, and RADULS2, which is currently considered the fastest radix sort available.
We develop a mathematical model of the average-case performance, when keys have a uniform distribution of digits, that support the observed performance improvements. The modeling demonstrates that the costs of estimation and correction quickly approach zero, falling under 5% of the costs of traditional counting for N as low as 11100. The modeling also provides a formula to calculate the thresholds for diversion that conform to the experimental results.
Our focus is on algorithms, so our implementation is fixed-digit, single-threaded and avoids optimizations such as parallelism, cache, translation lookaside buffer (TLB), or the specific input distribution, which are utilised by RADULS2. Nevertheless, Diverting Fast Radix is more than twice as fast as std::sort on inputs of size up to one billion (and beyond), and approximately 10% faster than RADULS2 for one million records (even faster on smaller inputs). It is notable that the compiled code size of Diverting Fast Radix is two orders of magnitude smaller than RADULS2.