CS136, Lecture 18

  1. QuickSort
  2. Comparing Sorts
  3. Queues
     1. Interface
     2. Linked List Implementation
     3. Vector implementation
     4. Clever array implementation

QuickSort

There is one last divide and conquer sorting algorithm: Quicksort.

While MergeSort divides the array in half, sorts each half, and then merges them (all of the work is in the merge), QuickSort works in the opposite order.

That is, QuickSort splits the array (doing lots of work), sorts each part, and then puts the parts together (trivially).

/** 
  POST -- "elementArray" sorted into non-decreasing order  
**/
public void quicksort(Comparable[] elementArray)
{
    Q_sort(0, elementArray.length - 1, elementArray);   
}

/**
  PRE -- left <= right are legal indices of table.            
  POST -- table[left..right] sorted in non-decreasing order
**/
protected void Q_sort (int left, int right, Comparable[] table)
{
    if (right > left)   // More than 1 elt in table
    {
        int pivotIndex = partition(left,right,table);
        // table[left..pivotIndex-1] <= table[pivotIndex] <= table[pivotIndex+1..right]  
        Q_sort(left, pivotIndex-1, table);  // Quicksort small elts
        Q_sort(pivotIndex+1, right, table); // Quicksort large elts
    }
}

If partition works, then Q_sort (and hence quicksort) clearly works. Note that it always makes its recursive calls on strictly smaller pieces of the array (this is easy to get wrong, and if a call is ever made on a piece that is not smaller, the recursion never terminates).

Partition is a little trickier. The algorithm below starts by ensuring that the element at the left edge of the table is <= the one at the right edge. This lets the guards on the inner while loops be simpler and speeds up the algorithm by about 20% or more. Other optimizations can make it even faster.

/**
    post: returns pivotIndex with table[pivotIndex] == pivot (originally table[left]),
            table[left..pivotIndex-1] <= pivot,
            and pivot <= table[pivotIndex+1..right]
**/
protected int partition (int left, int right, 
                                        Comparable[] table)
{
        Comparable tempElt;         // used for swaps
        int smallIndex = left;      
            // index of current posn in left (small elt) partition
        int bigIndex = right;           
            // index of current posn in right (big elt) partition
        
        if (table[bigIndex].lessThan(table[smallIndex]))    
        {   // put sentinel at table[bigIndex] so don't 
            // walk off right edge of table in loop below
            tempElt = table[bigIndex];
            table[bigIndex] = table[smallIndex];
            table[smallIndex] = tempElt;
        } 
        
        Comparable pivot = table[left]; // pivot is fst elt 
        // Now table[smallIndex] = pivot <= table[bigIndex]
        do
        {
            do                          // scan right from smallIndex 
                smallIndex++;   
            while (table[smallIndex].lessThan(pivot));

            do                          // scan left from bigIndex
                bigIndex--;
            while (pivot.lessThan(table[bigIndex]));
            
            // Now table[smallIndex] >= pivot >= table[bigIndex]
             
            if (smallIndex < bigIndex)   
            {   // if big elt to left of small element, swap them
                tempElt = table[smallIndex]; 
                table[smallIndex] = table[bigIndex];
                table[bigIndex] = tempElt;
            } // if 
        } while (smallIndex < bigIndex); 
        // Move pivot into correct pos'n bet'n small & big elts
        
        int pivotIndex = bigIndex;    // pivot goes where bigIndex got stuck
        
        // swap pivot elt w/small elt at pivotIndex
        tempElt = table[pivotIndex];            
        table[pivotIndex] = table[left];    
        table[left] = tempElt;
        
        return pivotIndex;  
}

The basic idea of the algorithm is to start with smallIndex and bigIndex at the left and right edges of the array. Move each of them toward the middle until smallIndex is on a "big" element (one >= the pivot) and bigIndex is on a small one. As long as the indices haven't crossed (i.e., as long as smallIndex < bigIndex), swap the two elements so that the small elt goes on the small side and the big elt on the big side. When the indices cross, swap the rightmost small elt (pointed to by bigIndex) with the pivot element and return its new index to Q_sort. Clearly at the end,

        table[left..pivotIndex-1] <= pivot <= table[pivotIndex+1..right]
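
For example, here is a trace of partition(0, 5, table) on a small made-up array table = [3, 6, 1, 5, 2, 4] (the values are illustrative, not from the text):

        table[5] = 4 is not less than table[0] = 3, so no initial swap; pivot = 3
        smallIndex stops at 1 (6 >= 3), bigIndex stops at 4 (2 <= 3); swap them: [3, 2, 1, 5, 6, 4]
        smallIndex stops at 3 (5 >= 3), bigIndex stops at 2 (1 <= 3); indices have crossed, so no swap
        pivotIndex = 2; swap table[2] with table[0]: [1, 2, 3, 5, 6, 4]; return 2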

The complexity of QuickSort is harder to evaluate than that of MergeSort because the pivotIndex need not always be in the middle of the array (in the worst case pivotIndex = left or pivotIndex = right).

Partition is clearly O(n) because every comparison results in smallIndex or bigIndex moving toward the other, and the loops quit when the two indices cross.

In the best case the pivot element is always in the middle and the analysis results in
O(n log n), exactly like MergeSort.

In the worst case (for example, an array that is already sorted) the pivot always ends up at one end of the subarray, so each partition removes only a single element, and QuickSort behaves like SelectionSort, giving O(n^2).
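
In recurrence form (a sketch, writing c for the constant per-element cost of partition):

        best case:   T(n) = 2 T(n/2) + c n                          which gives O(n log n)
        worst case:  T(n) = T(n-1) + c n = c (n + (n-1) + ... + 1)   which gives O(n^2)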

Careful analysis shows that QuickSort is O(n log n) in the average case (under reasonable assumptions about the distribution of the elements of the array). (The proof uses integration!)

Comparing Sorts

Compare the algorithms with real data:

Sort         100 elts    100 elts    500 elts    500 elts    1000 elts   1000 elts
             unordered   ordered     unordered   ordered     unordered   ordered
Insertion    0.033       0.002       0.75        0.008       3.2         0.017
Selection    0.051       0.051       1.27        1.31        5.2         5.3
Merge        0.016       0.015       0.108       0.093       0.24        0.20
Quick        0.009       0.044       0.058       1.12        0.13        4.5

Notice that for Insertion and Selection sorts, doubling the size of the list increases the time by a factor of about 4 (in the unordered case), whereas for Merge and Quick sorts it only slightly more than doubles the time. To see why, calculate (1000 log 1000) / (500 log 500) = 2 * (log 1000 / log 500) ~ 2 * (10/9) ~ 2.2.

Queues

Queues are FIFO (first in, first out) structures. Applications include waiting lines, jobs queued for a printer, and processes scheduled by an operating system.


Interface

public interface Queue extends Linear {
    public void enqueue(Object value);
    // post: the value is added to the tail of the structure

    public Object dequeue();
    // pre: the queue is not empty
    // post: the head of the queue is removed and returned

    public void add(Object value);
    // post: the value is added to the tail of the structure

    public Object remove();
    // pre: the queue is not empty
    // post: the head of the queue is removed and returned

    public Object peek();
    // pre: the queue is not empty
    // post: the element at the head of the queue is returned

    public int size();
    // post: returns the number of elements in the queue.

    public void clear();
    // post: removes all elements from the queue.
    
    public boolean isEmpty();
    // post: returns true iff the queue is empty
}
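
For example, a client might use the interface like this (using the QueueArray class developed below as the concrete implementation):

    Queue q = new QueueArray(10);
    q.enqueue("Alice");
    q.enqueue("Bob");
    System.out.println(q.dequeue());   // prints "Alice" -- first in, first out
    System.out.println(q.peek());      // prints "Bob" without removing it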

Linked List Implementation

We can easily hold the queue as a linked list where we keep pointers to the front and rear of the queue.

Which way should pointers go? From front to rear or vice-versa?
If the pointers point from the rear to the front, it will be quite difficult to remove the front element, since finding the node that becomes the new front would require traversing the entire list from the rear. As a result, we orient the pointers to point from the front to the rear.

All of the operations are O(1).

The text and the structures package implementation use a doubly-linked list (a typo resulted in the actual code being for a singly-linked list). The doubly-linked list wastes space unnecessarily, much as our doubly-linked circular list for Josephus did when a singly-linked list would have sufficed.
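
Here is a minimal sketch of a singly-linked implementation with head and tail references (the class and field names are illustrative; this is not the structures package code, and it omits the rest of the Queue interface):

public class ListQueue
{
    // A node in the singly-linked list; each node points toward the rear.
    protected static class Node
    {
        Object value;
        Node next;
        Node(Object value) { this.value = value; }
    }

    protected Node head;    // front of the queue (next element to dequeue)
    protected Node tail;    // rear of the queue (most recently enqueued)
    protected int count;    // number of elements in the queue

    public void enqueue(Object value)
    // post: the value is added to the tail of the structure
    {
        Node node = new Node(value);
        if (tail == null)           // queue was empty
            head = tail = node;
        else
        {   tail.next = node;       // link new node behind the old rear
            tail = node;
        }
        count++;
    }

    public Object dequeue()
    // pre: the queue is not empty
    // post: the head of the queue is removed and returned
    {
        Object value = head.value;
        head = head.next;
        if (head == null)           // queue is now empty
            tail = null;
        count--;
        return value;
    }

    public Object peek()     { return head.value; }
    public int size()        { return count; }
    public boolean isEmpty() { return count == 0; }
}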

Vector implementation

We can keep the queue in a Vector, with the head toward index 0 and the tail to the right. New elements are added at the tail using the addElement method, while deletions set the slot at the head to null and move the head index one place to the right.

The main problem with this is that deletions and additions will result in elements being removed from the left side and added to the right side. Thus the queue will "walk" off to the right over time, even if it stays the same size (e.g., imagine what happens after 100 adds and 100 removes). The Vector will keep growing to compensate, and each time it grows, the growth will cost twice as much.

Alternatively, we could change the deletions so that the element at the head (index 0) is actually removed from the Vector. However, this would make the remove method O(n), since all of the remaining elements must be shifted left one position.
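
A minimal sketch of this second (remove-at-the-front) variant, using java.util.Vector (illustrative code, not the structures package implementation):

import java.util.Vector;

public class VectorQueue
{
    protected Vector data = new Vector();   // element 0 is the head of the queue

    public void enqueue(Object value)       // O(1) amortized: append at the tail
    {
        data.addElement(value);
    }

    public Object dequeue()                 // pre: the queue is not empty
    {
        Object value = data.elementAt(0);
        data.removeElementAt(0);            // O(n): shifts every remaining element left
        return value;
    }

    public Object peek()     { return data.elementAt(0); }
    public int size()        { return data.size(); }
    public boolean isEmpty() { return data.isEmpty(); }
}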

Clever array implementation

There is also an array implementation, but it is a bit trickier!

Suppose we can set an upper bound on the maximum size of the queue.

How can we solve the problem of the queue "walking" off one end of the array?

Instead we try a 'Circular' Array Implementation w/ "references" (subscripts) referring to the head and tail of the list.

We increase the subscripts by one as we add or remove elements: add 1 to front when an element is removed, and add 1 to rear when an element is added. If nothing else is done, we soon bump up against the end of the array, even if there is lots of space at the beginning (space that used to hold elements which have since been removed from the queue).

To avoid this, we become more clever: when an index walks off the end of the array, we wrap it back around to the beginning. Use

    index = (index + 1) mod MaxQueue 
to walk forward (for example, with MaxQueue = 5 an index steps 3, 4, 0, 1, ...). This avoids the problem of falling off the end of the array.

Exercise: Before reading further, see if you can figure out what a full queue and an empty queue look like in this representation, and how you can tell them apart.

Notice that the representation of a full queue and an empty queue can have identical values for front and rear.

The only way we can keep track of whether it is full or empty is to keep track of the number of elements in the queue. (But notice that we then don't need to keep track of the rear of the queue, since the rear is at (front + count - 1) mod the array length.)

There is an alternative way of keeping track of the front and rear which allows you to determine whether the queue is full or empty without keeping a count of the elements. Always keep the rear pointer pointing to the slot where the next element added will be put. Thus an empty queue will have front = rear, and we will say that the queue is full if front = rear + 1 (mod queue_length). When this is true there is still one empty slot in the array, but we sacrifice that slot in order to be able to distinguish an empty queue from a full one.
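
A minimal sketch of that count-free alternative (illustrative code, not the implementation used below):

public class CircularQueue
{
    protected Object[] data;
    protected int front;    // index of the next element to dequeue
    protected int rear;     // index where the next element will be stored

    public CircularQueue(int size)
    // post: create a queue capable of holding at most size values
    {
        data = new Object[size + 1];    // one extra slot so that full != empty
        front = 0;
        rear = 0;
    }

    public boolean isEmpty() { return front == rear; }
    public boolean isFull()  { return front == (rear + 1) % data.length; }

    public void enqueue(Object value)   // pre: !isFull()
    {
        data[rear] = value;
        rear = (rear + 1) % data.length;
    }

    public Object dequeue()             // pre: !isEmpty()
    {
        Object value = data[front];
        data[front] = null;             // drop the reference so it can be collected
        front = (front + 1) % data.length;
        return value;
    }
}

The QueueArray class below takes the other approach, keeping head together with a count of the elements.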

public class QueueArray implements Queue
{
    protected Object data[];    // array of the data
    protected int head;         // next dequeue-able value
    protected int count;        // # elts in queue

    public QueueArray(int size)
    // post: create a queue capable of holding at most size values.
    {
        data = new Object[size];
        head = 0;
        count = 0;
    }

    public void enqueue(Object value)
    // post: the value is added to the tail of the structure
    {
        Assert.pre(!isFull(),"Queue is not full.");
        int tail = (head + count) % data.length;
        data[tail] = value;
        count++;
    }

    public Object dequeue()
    // pre: the queue is not empty
    // post: the element at the head of the queue is removed and returned
    {
        Assert.pre(!isEmpty(),"The queue is not empty.");
        Object value = data[head];
        head = (head + 1) % data.length;
        count--;
        return value;
    }

    public Object peek()
    // pre: the queue is not empty
    // post: the element at the head of the queue is returned
    {
        Assert.pre(!isEmpty(),"The queue is not empty.");
        return data[head];
    }

    ...

    public int size()
    // post: returns the number of elements in the queue.
    {
        return count;
    }

    public void clear()
    // post: removes all elements from the queue.
    {
        // we could null out every slot here, but resetting head and count suffices
        count = 0;
        head = 0;
    }
    
    public boolean isFull()
    // post: returns true if the queue is at the capacity.
    {
        return count == data.length;
    }

    public boolean isEmpty()
    // post: returns true iff the queue is empty
    {
        return count == 0;
    }
}

The complexity of operations for the array implementation of the queue is the same as for the linked list implementation.

There are the same trade-offs between the two implementations in terms of space and time as with stacks above. Notice that we do not bother to set the array entry for a dequeued element to null, and similarly with clear. Thus the garbage collector cannot reclaim removed elements, even if they are no longer in use elsewhere. It would probably make sense to be more consistent with Vector and clean up after ourselves.
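
For example, here is a sketch of dequeue and clear with that cleanup added (a modification of the code above, not the structures package version):

    public Object dequeue()
    // pre: the queue is not empty
    // post: the head of the queue is removed and returned; its slot is nulled out
    {
        Assert.pre(!isEmpty(),"The queue is not empty.");
        Object value = data[head];
        data[head] = null;                  // drop our reference so it can be collected
        head = (head + 1) % data.length;
        count--;
        return value;
    }

    public void clear()
    // post: removes all elements from the queue and nulls out their slots
    {
        for (int i = 0; i < count; i++)     // null out every occupied slot
            data[(head + i) % data.length] = null;
        count = 0;
        head = 0;
    }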