CS136, Lecture 14

More Sorting
1. Quicksort
Linear Structures
1. Stacks
2. Stack Implementations

More Sorting

Quicksort

There is one last divide and conquer sorting algorithm: Quicksort.

While mergesort divided in half, sorted each half, and then merged (where all work is in the merge), Quicksort works in the opposite order.

That is, Quicksort splits the array (with lots of work), sorts each part, and then puts the parts back together (trivially).

/** 
  POST -- "elementArray" sorted into non-decreasing order  
**/
public void quicksort(Comparable[] elementArray)
{
    Q_sort(0, elementArray.length - 1, elementArray);   
}

/**
  PRE -- left..right is a (possibly empty) range of legal indices of table.
  POST -- table[left..right] sorted in non-decreasing order
**/
protected void Q_sort (int left, int right, Comparable[] table)
{
    if (right > left)    // More than 1 elt in table[left..right]
    {
        int pivotIndex = partition(left,right,table);
        // table[left..pivotIndex-1] <= table[pivotIndex] <= table[pivotIndex+1..right]
        Q_sort(left, pivotIndex-1, table);      // Quicksort small elts
        Q_sort(pivotIndex+1, right, table);     // Quicksort large elts
    }
}

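If partition works then Q_sort (and hence quicksort) clearly works. Note that every recursive call is on a strictly smaller array, since the pivot itself is excluded from both parts. (It is easy to get this wrong, so that a recursive call is not on a smaller array, and then the recursion never terminates.)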

Partition is a little trickier. The algorithm below starts out by ensuring that the elt at the left edge of the table is <= the one at the right. This allows the guards on the inner loops to be simpler (no bounds checks are needed) and speeds up the algorithm by about 20% or more. Other optimizations can make it even faster.

/**
    post: table[left..pivotIndex-1] <= pivot 
            and pivot <= table[pivotIndex+1..right]  
**/
protected int partition (int left, int right, Comparable[] table)
{
        Comparable tempElt;         // used for swaps
        int smallIndex = left;      // index of current posn in left (small elt) partition
        int bigIndex = right;       // index of current posn in right (big elt) partition
        
        if (table[bigIndex].lessThan(table[smallIndex]))    
        {   // put sentinel at table[bigIndex] so don't 
            // walk off right edge of table in loop below
            tempElt = table[bigIndex];
            table[bigIndex] = table[smallIndex];
            table[smallIndex] = tempElt;
        } 
        
        Comparable pivot = table[left]; // pivot is fst elt 
        // Now table[smallIndex] = pivot <= table[bigIndex]
        do
        {
            do                          // scan right from smallIndex 
                smallIndex++;   
            while (table[smallIndex].lessThan(pivot));

            do                          // scan left from bigIndex
                bigIndex--;
            while (pivot.lessThan(table[bigIndex]));
            
            // Now table[smallIndex] >= pivot >= table[bigIndex]
             
            if (smallIndex < bigIndex)   
            {   // if big elt to left of small element, swap them
                tempElt = table[smallIndex]; 
                table[smallIndex] = table[bigIndex];
                table[bigIndex] = tempElt;
            } // if 
        } while (smallIndex < bigIndex); 
        // Move pivot into correct pos'n bet'n small & big elts
        
        int pivotIndex = bigIndex;      // pivot goes where bigIndex got stuck
        
        // swap pivot elt w/small elt at pivotIndex
        tempElt = table[pivotIndex];            
        table[pivotIndex] = table[left];    
        table[left] = tempElt;
        
        return pivotIndex;  
}

The basic idea of the algorithm is to start with smallIndex and bigIndex at the left and right edges of the array. Move each of them toward the middle until smallIndex is on a "big" element (one >= pivot) and bigIndex is on a "small" one (one <= pivot). As long as the indices haven't crossed (i.e. as long as smallIndex < bigIndex), swap the two elements so that the small elt goes on the small side and the big elt on the big side. When the indices cross, swap the rightmost small elt (pointed to by bigIndex) with the pivot element and return its new index to Q_sort. Clearly at the end,

table[left..pivotIndex-1] <= pivot <= table[pivotIndex+1..right]
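
To see the scheme on concrete data, here is a minimal, self-contained sketch of the same partition step on an int array. The names PartitionSketch, swap, and isPartitioned are made up for this example, and ints stand in for the Comparable elements used above; it is an illustration, not the course code.

```java
import java.util.Arrays;

class PartitionSketch {
    /** Swap table[i] and table[j]. */
    static void swap(int[] table, int i, int j) {
        int temp = table[i];
        table[i] = table[j];
        table[j] = temp;
    }

    /**
     * PRE  -- left < right are legal indices of table.
     * POST -- table[left..p-1] <= table[p] <= table[p+1..right],
     *         where p is the returned index.
     */
    static int partition(int left, int right, int[] table) {
        // Sentinel: make sure table[left] <= table[right] so the
        // rightward scan below cannot run past index right.
        if (table[right] < table[left]) {
            swap(table, left, right);
        }
        int pivot = table[left];        // pivot is the first element
        int smallIndex = left;
        int bigIndex = right;
        do {
            do { smallIndex++; } while (table[smallIndex] < pivot);  // scan right
            do { bigIndex--; } while (pivot < table[bigIndex]);      // scan left
            if (smallIndex < bigIndex) {
                swap(table, smallIndex, bigIndex);  // big elt left of small elt
            }
        } while (smallIndex < bigIndex);
        swap(table, left, bigIndex);    // pivot goes where bigIndex got stuck
        return bigIndex;
    }

    /** Check the postcondition directly (for illustration only). */
    static boolean isPartitioned(int[] table, int left, int right, int p) {
        for (int i = left; i < p; i++) {
            if (table[i] > table[p]) return false;
        }
        for (int i = p + 1; i <= right; i++) {
            if (table[i] < table[p]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        int[] table = {5, 8, 2, 9, 1, 7, 3};
        int p = partition(0, table.length - 1, table);
        // prints: pivotIndex = 2, table = [2, 1, 3, 9, 8, 7, 5]
        System.out.println("pivotIndex = " + p + ", table = " + Arrays.toString(table));
        System.out.println("postcondition holds: "
                + isPartitioned(table, 0, table.length - 1, p));
    }
}
```

In this run the sentinel swap first exchanges 5 and 3, so the pivot is 3; one swap (8 with 1) happens before the indices cross, and the pivot lands at index 2.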

The complexity of QuickSort is harder to evaluate than MergeSort because the pivotIndex need not always be in the middle of the array (in the worst case pivotIndex = left or right).

Partition is clearly O(n) because every comparison results in smallIndex or bigIndex moving toward the other, and the loop quits when they cross.

In the best case the pivot element always ends up in the middle and the analysis results in O(n log n), exactly like MergeSort.

In the worst case the pivot is at the ends and QuickSort behaves like SelectionSort, giving O(n²).

Careful analysis shows that QuickSort is O(n log n) in the average case (under reasonable assumptions on distribution of elements of array). (Proof uses integration!)
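
The gap between the worst case and the typical case can be observed by counting comparisons. The sketch below is an assumption-laden illustration, not the course code: it reuses the partition scheme above on int arrays, and the names QuicksortCount, less, and countComparisons are invented for the example. It sorts an already ordered array and a shuffled copy of it and reports how many element comparisons each took.

```java
import java.util.Random;

class QuicksortCount {
    static long comparisons = 0;    // element comparisons made so far

    /** Compare two elements and count the comparison. */
    static boolean less(int a, int b) {
        comparisons++;
        return a < b;
    }

    static void swap(int[] table, int i, int j) {
        int temp = table[i];
        table[i] = table[j];
        table[j] = temp;
    }

    /** Same scheme as partition above: first element is the pivot. */
    static int partition(int left, int right, int[] table) {
        if (less(table[right], table[left])) {
            swap(table, left, right);       // sentinel at the right edge
        }
        int pivot = table[left];
        int smallIndex = left;
        int bigIndex = right;
        do {
            do { smallIndex++; } while (less(table[smallIndex], pivot));
            do { bigIndex--; } while (less(pivot, table[bigIndex]));
            if (smallIndex < bigIndex) {
                swap(table, smallIndex, bigIndex);
            }
        } while (smallIndex < bigIndex);
        swap(table, left, bigIndex);
        return bigIndex;
    }

    static void qSort(int left, int right, int[] table) {
        if (right > left) {                 // more than 1 elt
            int p = partition(left, right, table);
            qSort(left, p - 1, table);
            qSort(p + 1, right, table);
        }
    }

    /** Sort a copy of input and return the number of comparisons used. */
    static long countComparisons(int[] input) {
        int[] copy = input.clone();
        comparisons = 0;
        qSort(0, copy.length - 1, copy);
        return comparisons;
    }

    public static void main(String[] args) {
        int n = 1000;
        int[] ordered = new int[n];
        for (int i = 0; i < n; i++) ordered[i] = i;

        // Shuffle a copy (Fisher-Yates) with a fixed seed for repeatability.
        int[] shuffled = ordered.clone();
        Random rng = new Random(136);
        for (int i = n - 1; i > 0; i--) {
            swap(shuffled, i, rng.nextInt(i + 1));
        }

        System.out.println("ordered:   " + countComparisons(ordered));
        System.out.println("unordered: " + countComparisons(shuffled));
    }
}
```

On ordered input the pivot always stays at the left end, so each call to partition peels off only one element and the count grows roughly like n²/2. On shuffled input the count should be far smaller, in line with the O(n log n) average case.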

Compare the algorithms with real data:

Sort        100 elts    100 elts    500 elts    500 elts    1000 elts   1000 elts
            unordered   ordered     unordered   ordered     unordered   ordered
Insertion   0.033       0.002       0.75        0.008       3.2         0.017
Selection   0.051       0.051       1.27        1.31        5.2         5.3
Merge       0.016       0.015       0.108       0.093       0.24        0.20
Quick       0.009       0.044       0.058       1.12        0.13        4.5

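Notice that for Insertion or Selection sorts, doubling the size of the list increases the time by about 4 times (in the unordered case), whereas for Merge and Quick sorts it a bit more than doubles the time. Calculate (1000 log 1000) / (500 log 500) = 2 * (log 1000 / log 500) ~ 2 * (10/9) ~ 2.2 (logs base 2). Notice also that on ordered data Quick sort shows the quadratic growth of its worst case (0.044, 1.12, 4.5), since this version always picks the first element as the pivot.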

Linear Structures

Lists allow insertion and deletion at multiple spots: at least at the head and tail, and often in the interior. Linear structures are more restricted, allowing only a single add method and a single remove method. More restrictions on a structure generally allow for a more efficient implementation of its operations.

public interface Linear extends Container
{  // get size, isEmpty, & clear from Container.

    public void add(Object value);
    // pre: value is non-null
    // post: the value is added to the collection,
    //       the replacement policy not specified.

    public Object remove();
    // pre: structure is not empty.
    // post: removes an object from container
}

We will look at two particular, highly restricted linear structures:

Stack: all additions and deletions at same end: LIFO (last-in, first-out)

Queue: all additions at one end, all deletions from the other: FIFO (first-in, first-out)
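
The difference between the two policies shows up in the order in which values come back out. The sketch below uses java.util.ArrayDeque from the standard Java library as a stand-in for both structures (it is not one of the course's classes); the class and method names LifoFifoDemo, throughStack, and throughQueue are invented for the example.

```java
import java.util.ArrayDeque;

class LifoFifoDemo {
    /** Add all values to a stack, then remove them all: LIFO order. */
    static String throughStack(int[] values) {
        ArrayDeque<Integer> stack = new ArrayDeque<>();
        for (int v : values) {
            stack.push(v);                      // add at the top
        }
        StringBuilder out = new StringBuilder();
        while (!stack.isEmpty()) {
            out.append(stack.pop()).append(' '); // remove from the top
        }
        return out.toString().trim();
    }

    /** Add all values to a queue, then remove them all: FIFO order. */
    static String throughQueue(int[] values) {
        ArrayDeque<Integer> queue = new ArrayDeque<>();
        for (int v : values) {
            queue.addLast(v);                   // add at the tail
        }
        StringBuilder out = new StringBuilder();
        while (!queue.isEmpty()) {
            out.append(queue.removeFirst()).append(' '); // remove from the head
        }
        return out.toString().trim();
    }

    public static void main(String[] args) {
        int[] values = {1, 2, 3};
        System.out.println("stack: " + throughStack(values)); // prints: stack: 3 2 1
        System.out.println("queue: " + throughQueue(values)); // prints: queue: 1 2 3
    }
}
```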

Stacks

Stacks can be described recursively:

A stack is either empty or has a top element sitting on a stack.

Here is the picture of a stack of integers:

All additions take place at the top of the stack and all deletions take place there as well.

Traditionally, addition is referred to as "Push" and removal as "Pop", in analogy with a spring-loaded stack of trays. Here we'll use both sets of names interchangeably:

public interface Stack extends Linear {
    public void push(Object item);
    // post: item is added to stack
    //       will be popped next if no further push

    public Object pop();
    // pre: stack is not empty
    // post: most recently pushed item is removed & returned

    public void add(Object item);
    // post: item is added to stack
    //       will be popped next if no further add

    public Object remove();
    // pre: stack is not empty
    // post: most recently added item is removed and returned

    public Object peek();
    // pre: stack is not empty
    // post: top value (next to be popped) is returned

    public boolean empty();
    // post: returns true if and only if the stack is empty

    public boolean isEmpty();
    // post: returns true if and only if the stack is empty
}

Stack Implementations

As we saw earlier, stacks are essentially restricted lists and therefore have similar representations.

Array-based implementation

Since all operations are at the top of the stack, the array implementation is now much, much better than it was for general lists: no elements ever need to be shifted.

See on-line StackArray implementation:

public class StackArray implements Stack
{
    protected int top;
    protected Object data[];
    ...

The array implementation keeps the bottom of the stack at the beginning of the array. It grows toward the end of the array.

The only problem arises if you attempt to push an element when the array is full. In that case

    Assert.pre(!isFull(),"Stack is not full.");

will fail, raising an exception. Thus it makes more sense to implement the stack with a Vector (see StackVector) to allow unbounded growth (at the cost of occasional O(n) delays).

All operations are O(1), with two exceptions: the occasional push that forces a Vector to grow, and clear, which should replace all entries by null in order to let them be garbage-collected.
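
The points above can be put together in a short sketch of a fixed-capacity array stack. This is not the on-line StackArray: the class name ArrayStackSketch is made up, only the fields top and data are taken from the fragment above, and a plain IllegalStateException stands in for the course's Assert.pre. It is assumed here that top is the index of the top element (-1 when the stack is empty).

```java
class ArrayStackSketch {
    protected int top;          // index of the top element, -1 when empty
    protected Object[] data;    // data[0] is the bottom of the stack

    ArrayStackSketch(int capacity) {
        data = new Object[capacity];
        top = -1;
    }

    boolean isEmpty() { return top == -1; }

    boolean isFull() { return top == data.length - 1; }

    int size() { return top + 1; }

    /** O(1): the stack grows toward the end of the array. */
    void push(Object item) {
        if (isFull()) {
            throw new IllegalStateException("Stack is full.");
        }
        data[++top] = item;
    }

    /** O(1): remove and return the most recently pushed item. */
    Object pop() {
        if (isEmpty()) {
            throw new IllegalStateException("Stack is empty.");
        }
        Object item = data[top];
        data[top--] = null;     // drop the reference so it can be collected
        return item;
    }

    /** O(1): return the top item without removing it. */
    Object peek() {
        if (isEmpty()) {
            throw new IllegalStateException("Stack is empty.");
        }
        return data[top];
    }

    /** O(n): null out every occupied slot. */
    void clear() {
        while (top >= 0) {
            data[top--] = null;
        }
    }

    public static void main(String[] args) {
        ArrayStackSketch s = new ArrayStackSketch(4);
        s.push("a");
        s.push("b");
        s.push("c");
        System.out.println(s.pop());    // prints: c
        System.out.println(s.peek());   // prints: b
        System.out.println(s.size());   // prints: 2
    }
}
```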

Linked list implementation

The linked list implementation is singly-linked with references pointing from the top to the bottom of the stack.

See on-line StackList implementation.
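
For comparison, here is a minimal sketch of a singly-linked stack in which each node refers to the node below it. It is not the on-line StackList: the names LinkedStackSketch and Node are invented for the example, and an IllegalStateException stands in for the course's precondition checks.

```java
class LinkedStackSketch {
    /** One cell of the list; next refers to the element below this one. */
    private static class Node {
        final Object value;
        final Node next;

        Node(Object value, Node next) {
            this.value = value;
            this.next = next;
        }
    }

    private Node top;       // null when the stack is empty
    private int count;      // number of elements on the stack

    boolean isEmpty() { return top == null; }

    int size() { return count; }

    /** O(1): the new node becomes the top and points at the old top. */
    void push(Object item) {
        top = new Node(item, top);
        count++;
    }

    /** O(1): unlink the top node and return its value. */
    Object pop() {
        if (isEmpty()) {
            throw new IllegalStateException("Stack is empty.");
        }
        Object item = top.value;
        top = top.next;
        count--;
        return item;
    }

    /** O(1): return the top value without removing it. */
    Object peek() {
        if (isEmpty()) {
            throw new IllegalStateException("Stack is empty.");
        }
        return top.value;
    }

    /** O(1): dropping the reference to the top frees the whole chain. */
    void clear() {
        top = null;
        count = 0;
    }

    public static void main(String[] args) {
        LinkedStackSketch s = new LinkedStackSketch();
        s.push(1);
        s.push(2);
        s.push(3);
        System.out.println(s.pop());    // prints: 3
        System.out.println(s.peek());   // prints: 2
        System.out.println(s.size());   // prints: 2
    }
}
```

Because nothing is ever copied or shifted, no operation here depends on the number of elements, at the price of one extra reference per element.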

Analyzing the implementations:

Operations: peek, pop, and isEmpty are all O(1) for the Array, Vector, and Linked List implementations. push can be O(n) in the worst case for Vector (though the average is O(1)); for the other implementations it is always O(1). clear is O(1) for the linked list and O(n) for the array or vector representation.

Arrays use a fixed amount of space: this wastes space if you reserve too much, while the program won't run if there is too little.

Vector provides more flexibility, but at the cost of occasional significant delays (though the average cost of push is O(1)).

The linked list implementation has all operations O(1) in worst case, but needs extra space for the links.
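
The claim that the average cost of a push into a Vector is O(1) can be checked with a small count. The sketch below assumes that the underlying array doubles in capacity whenever it is full (that growth policy is an assumption here, as are the names GrowthCost and copiesForPushes); it counts how many element copies n pushes cause in total.

```java
class GrowthCost {
    /**
     * Number of element copies caused by n pushes into an array that
     * starts with the given capacity and doubles whenever it is full.
     */
    static long copiesForPushes(int n, int initialCapacity) {
        int capacity = initialCapacity;
        int size = 0;
        long copies = 0;
        for (int i = 0; i < n; i++) {
            if (size == capacity) {
                copies += size;     // every element moves to the new array
                capacity *= 2;
            }
            size++;                 // the push itself
        }
        return copies;
    }

    public static void main(String[] args) {
        // Growth happens at sizes 1, 2, 4, ..., 512: 1023 copies in all.
        System.out.println(copiesForPushes(1000, 1));       // prints: 1023
        System.out.println(copiesForPushes(1000000, 1));    // prints: 1048575
    }
}
```

Under this assumption the total number of copies stays below 2n, so although a single push may cost O(n), the cost averaged over all n pushes is a constant.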