- Can we do better than O(n log n) for sorting?
  - What if I told you that the data was between 1 and k?

    CountingSort1(A, k)
        make an array of booleans of size k, call it B
        initialize all entries in B to false
        for i from 1 to A.length
            B[A[i]] = true
        count = 1
        for i from 1 to k
            if B[i]
                A[count] = i
                count++

  - What is the running time of this algorithm?
    - \Theta(k + n)
  - Is this better than O(n log n)?
    - only if k grows more slowly than n log n
    - e.g. if k is O(n), then \Theta(n)
  - Is it correct?
    - what does it assume about the data?
      - assumes the data is unique (duplicates collapse into a single entry)
    - how could you relax this constraint?
  - memory usage?
    - O(k)
  - The problem with counting sort is that both its running time and its memory depend on k. If we're dealing with a large range of numbers, k can be much larger than n
- radix sort
  - let's say we're trying to sort the following

    COW DOG RUG ROW TAG BOX BAR DIG TAB TAR

  - how can we do it? can we do it a digit at a time? (here each letter is a "digit")
  - idea 1:
    - bin them based on the first digit
    - sort the first digit
    - repeat within each bin on the next digit

      C  OW
      D  OG
         IG
      R  UG
         OW
      T  AG
         AB
         AR
      etc.

    - what is the running time?
      - use counting sort on each digit
      - Worst case:
        - they all start with the same prefix
        - \Theta(d(n + k))
      - Average case:
        - still \Theta(d(n + k))
    - how hard is it to implement?
      - keeping track of the different bins is challenging
  - idea 2:
    - start at the lowest digit and sort them
    - then move up through the digits

      COW DOG RUG ROW TAG BOX BAR DIG TAB TAR    (input)
      TAB DOG RUG TAG DIG BAR TAR COW ROW BOX    (sorted on the last digit)
      TAB TAG BAR TAR DIG DOG COW ROW BOX RUG    (then on the middle digit)
      BAR BOX COW DIG DOG ROW RUG TAB TAG TAR    (then on the first digit)

    - what is the running time?
      - \Theta(d(n + k))
    - what does it require of the sorting algorithm?
      - that it is stable!
    - memory usage?
      - depends on the intermediate sorting algorithm!
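
A minimal Python sketch of two points raised in these notes: relaxing the uniqueness assumption in counting sort, and using a stable per-digit sort for radix sort (idea 2). This is one possible way to do it, not necessarily the version intended by the notes. The names `counting_sort_by_key` and `radix_sort_words` are made up for illustration, keys are assumed to be integers in 0..k-1 (0-based rather than the 1..k of the pseudocode), and the words are assumed to be equal-length strings of uppercase A-Z.

```python
def counting_sort_by_key(items, k, key):
    """Stable counting sort; key(item) must be an int in range(k)."""
    # Count occurrences of each key instead of storing a boolean,
    # so duplicates are kept rather than collapsed.
    counts = [0] * k
    for item in items:
        counts[key(item)] += 1

    # starts[v] = number of items whose key is smaller than v,
    # i.e. the output index where the first item with key v goes.
    starts = [0] * k
    total = 0
    for v in range(k):
        starts[v] = total
        total += counts[v]

    # Scanning the input left to right keeps equal keys in their
    # original relative order, which is what makes the sort stable.
    out = [None] * len(items)
    for item in items:
        v = key(item)
        out[starts[v]] = item
        starts[v] += 1
    return out


def radix_sort_words(words, width):
    """Radix sort from the lowest digit up (idea 2), A-Z letters only."""
    for pos in range(width - 1, -1, -1):
        words = counting_sort_by_key(
            words, 26, lambda w, p=pos: ord(w[p]) - ord("A")
        )
    return words


words = ["COW", "DOG", "RUG", "ROW", "TAG", "BOX", "BAR", "DIG", "TAB", "TAR"]
print(radix_sort_words(words, 3))
```

Under these assumptions each pass takes \Theta(n + k) time and uses \Theta(n + k) extra memory (the output list plus the two arrays of size k), so d passes take \Theta(d(n + k)) time.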