CS136, Lecture 25

  1. Sorting with Trees
      1. Tree sort:
      2. Heap Sort
      3. Comparisons of Advanced sorts:
  2. Binary Search Trees

Sorting with Trees

Tree sort:

We can build a binary search tree (as explained in the next chapter of the text) and then do an inorder traversal. Since the cost of entering an element into a (balanced) binary tree of size n is log n, the cost of building the tree is

(log 1) + (log 2) + (log 3) + ... + (log n) = O(n log n) compares.

Traversal is O(n). Total cost is O(n log n) in both the best and average cases.

The worst case occurs when the list is already in order. Each insertion then extends one long branch, so tree sort behaves like an insertion sort and can cost as much as O(n^2) compares.
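The difference between the two cases can be seen in a minimal sketch of tree sort. The Node class and the method names here are assumptions made for illustration, not the BinaryTreeNode code from the text, and the tree is deliberately left unbalanced.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal tree sort sketch: insert into an unbalanced BST, then traverse inorder.
class TreeSortSketch {
    // Hypothetical node type for this sketch (not the BinaryTreeNode of the text).
    static final class Node {
        final int value;
        Node left, right;
        Node(int value) { this.value = value; }
    }

    // Walk down from the root and attach the new value as a leaf.
    static Node insert(Node root, int value) {
        Node fresh = new Node(value);
        if (root == null) return fresh;
        Node cur = root;
        while (true) {
            if (value < cur.value) {
                if (cur.left == null) { cur.left = fresh; return root; }
                cur = cur.left;
            } else {
                if (cur.right == null) { cur.right = fresh; return root; }
                cur = cur.right;
            }
        }
    }

    static Node build(int[] values) {
        Node root = null;
        for (int v : values) root = insert(root, v);
        return root;
    }

    // Inorder traversal appends the values in ascending order.
    static void inorder(Node node, List<Integer> out) {
        if (node == null) return;
        inorder(node.left, out);
        out.add(node.value);
        inorder(node.right, out);
    }

    // Height counted in nodes: 0 for an empty tree.
    static int height(Node node) {
        if (node == null) return 0;
        return 1 + Math.max(height(node.left), height(node.right));
    }

    static List<Integer> treeSort(int[] values) {
        List<Integer> out = new ArrayList<>();
        inorder(build(values), out);
        return out;
    }

    public static void main(String[] args) {
        int[] mixed = {25, 46, 19, 58, 21, 23, 12};
        int[] sorted = {12, 19, 21, 23, 25, 46, 58};
        System.out.println(treeSort(mixed));
        System.out.println("height, mixed input:  " + height(build(mixed)));
        System.out.println("height, sorted input: " + height(build(sorted)));
    }
}
```

On the already sorted input every new value goes to the right of the previous one, so the height equals the number of elements and each insertion walks the whole branch.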

The heap sort described below avoids this worst case, since it automatically keeps the tree in balance.

Heap Sort

We build a heap with the smallest element at top (taking <= (n/2) log n compares)

Once the heap is established, remove elements one at a time, putting the smallest at the end, the second smallest next to the end, etc., so the list ends up in descending order.

In detail:

Swap top with last element, sift down, do heap sort on remaining n-1 elements.

Ex. 25, 46, 19, 58, 21, 23, 12

public void heapSort (VectorHeap aheap)
{
    int last = aheap.size()-1;      // index of the last element of the list

    // Construct the initial heap.  Push down elts starting w/parent of last elt.
    for (int index = (last-1) / 2; index >= 0; index--)
        aheap.pushDownRoot(index);

    // Extract the elements in sorted order.
    // Elements 0..index-1 still form the heap; index..last are already sorted.
    for (int index = last; index > 0; index--)
    {
        aheap.Swap(0, index);
        // Assumes pushDownRoot sifts only within the remaining heap
        // (elements 0..index-1), leaving the sorted tail untouched.
        aheap.pushDownRoot(0);
    }
}
We've cheated here since pushDownRoot is not public, but we could make it so.
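To try the algorithm on the example list without access to VectorHeap, here is a self-contained sketch on a plain int array. The helper pushDown and its explicit size parameter are assumptions for illustration; the size parameter is what keeps the sift from disturbing the part of the array that is already sorted.

```java
import java.util.Arrays;

// Heap sort sketch on an int array, using a min-heap as in the notes,
// so the array ends up in descending order.
class HeapSortSketch {
    static void swap(int[] data, int i, int j) {
        int tmp = data[i];
        data[i] = data[j];
        data[j] = tmp;
    }

    // Sift data[root] down within data[0..size-1]; elements at index
    // size and beyond are never touched.
    static void pushDown(int[] data, int root, int size) {
        while (true) {
            int child = 2 * root + 1;                 // left child
            if (child >= size) return;                // root is a leaf
            if (child + 1 < size && data[child + 1] < data[child])
                child++;                              // pick the smaller child
            if (data[root] <= data[child]) return;    // heap property holds
            swap(data, root, child);
            root = child;
        }
    }

    static void heapSort(int[] data) {
        int last = data.length - 1;

        // Construct the initial heap, starting with the parent of the last element.
        for (int index = (last - 1) / 2; index >= 0; index--)
            pushDown(data, index, data.length);

        // Move the current smallest to position index, then repair the heap
        // that remains in data[0..index-1].
        for (int index = last; index > 0; index--) {
            swap(data, 0, index);
            pushDown(data, 0, index);
        }
    }

    public static void main(String[] args) {
        int[] data = {25, 46, 19, 58, 21, 23, 12};
        heapSort(data);
        System.out.println(Arrays.toString(data)); // [58, 46, 25, 23, 21, 19, 12]
    }
}
```

Using a heap with the largest element at the top instead would give ascending order with the same code structure.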

Each sift down takes <= log n steps.

Therefore total compares <= (n/2) log n + n log n = (3n/2) log n in the worst case. The average is about the same.

No extra space needed!

Actually, with a little extra work we can show that the initial "heapifying" of the list can be done in O(n) compares. The key is that we only call SiftDown on the first half of the elements of the list. That is, no calls of SiftDown are made on subscripts corresponding to leaves of the tree (corresponding to n/2 of the elements). For those elements sitting just above the leaves (n/4 of the elements), we only go through the loop once (and thus we make only two comparisons of priorities). For those in the next layer (n/8 of the elements) we only go through the loop twice (4 comparisons), and so on. Thus we make 2*(n/4) + 4*(n/8) + 6*(n/16) + ... + 2*(log n)*(1) total comparisons.

We can rewrite this as n*( 1/2^1 + 2/2^2 + 3/2^3 + ... + (log n)/2^(log n) ). (Of course in the last term, 2^(log n) = n, so this works out as above.) The sum inside the parentheses can be rewritten as Sum for i=1 to log n of (i/2^i). This is clearly bounded above by the infinite sum, Sum for i=1 to infinity of i/2^i. With some work the infinite sum can be shown to be equal to 2. The trick is to arrange the terms in a triangle:

         1/2 +  1/4  +  1/8 +  1/16 + ... =  1
                1/4  +  1/8 +  1/16 + ... =  1/2
                        1/8 +  1/16 + ... =  1/4
                               1/16 + ... =  1/8
                                      ... = ..
                 -------------------------------
         Sum for i=1 to infinity of i/2^i = 2

Thus n*( 1/2^1 + 2/2^2 + 3/2^3 + ... + (log n)/2^(log n) ) <= 2n, and hence the time to heapify an array is O(n).
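The bound can also be checked empirically. The following sketch, whose class and method names are assumptions for illustration, counts the priority comparisons made while heapifying random arrays bottom-up and prints them next to 2n.

```java
import java.util.Random;

// Sketch: count the priority comparisons made while heapifying bottom-up.
class HeapifyCountSketch {
    static long comparisons;   // running count for the current heapify call

    // Sift data[root] down within data[0..size-1] (min-heap), counting compares.
    static void pushDown(int[] data, int root, int size) {
        while (true) {
            int child = 2 * root + 1;
            if (child >= size) return;             // leaf: no comparisons made
            if (child + 1 < size) {
                comparisons++;                     // left child vs right child
                if (data[child + 1] < data[child]) child++;
            }
            comparisons++;                         // parent vs smaller child
            if (data[root] <= data[child]) return;
            int tmp = data[root];
            data[root] = data[child];
            data[child] = tmp;
            root = child;
        }
    }

    // Heapify in place and return the number of comparisons used.
    static long heapify(int[] data) {
        comparisons = 0;
        for (int index = (data.length - 2) / 2; index >= 0; index--)
            pushDown(data, index, data.length);
        return comparisons;
    }

    // Check that every element is no smaller than its parent.
    static boolean isMinHeap(int[] data) {
        for (int i = 1; i < data.length; i++)
            if (data[(i - 1) / 2] > data[i]) return false;
        return true;
    }

    public static void main(String[] args) {
        Random rng = new Random(136);  // fixed seed so runs are repeatable
        for (int n = 1000; n <= 1_000_000; n *= 10) {
            int[] data = new int[n];
            for (int i = 0; i < n; i++) data[i] = rng.nextInt();
            long used = heapify(data);
            System.out.println("n = " + n + ", compares = " + used + ", 2n = " + 2L * n);
        }
    }
}
```

Each pass through the loop makes at most two comparisons, matching the count used in the argument above, so the printed totals should stay below 2n.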

Comparisons of Advanced sorts:

Quicksort is fastest on average, O(n log n), but bad in the worst case, O(n^2), and takes O(log n) extra space. HeapSort takes O(n log n) in the average and worst cases, with no extra space. MergeSort takes O(n log n) in the average and worst cases, with O(n) extra space. All suffer from copying of large elements except insertion and merge sorts of linked lists.

Selection sort is least affected, since its number of copies is O(n); for the rest it is the same as the number of compares.

Binary Search Trees

We discussed binary search trees earlier.

Definition: A binary tree is a binary search tree iff it is empty or if the value of every node is both greater than or equal to every value in its left subtree and less than or equal to every value in its right subtree.
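The definition constrains each node against its whole subtrees, not just its two children. A minimal sketch of a checker makes this concrete; the Node class and isBst are hypothetical names for illustration, and int values stand in for Comparable objects.

```java
// Sketch: check the binary search tree property by passing down the
// range of values allowed in each subtree.
class BstCheckSketch {
    // Hypothetical node type for this sketch.
    static final class Node {
        final int value;
        final Node left, right;
        Node(int value, Node left, Node right) {
            this.value = value;
            this.left = left;
            this.right = right;
        }
    }

    // True iff every value in the subtree lies in [lo, hi] and the subtree
    // itself satisfies the definition. Bounds are inclusive because the
    // definition allows equal values on either side.
    static boolean isBst(Node node, long lo, long hi) {
        if (node == null) return true;             // the empty tree is a BST
        if (node.value < lo || node.value > hi) return false;
        return isBst(node.left, lo, node.value)
            && isBst(node.right, node.value, hi);
    }

    static boolean isBst(Node root) {
        return isBst(root, Long.MIN_VALUE, Long.MAX_VALUE);
    }

    public static void main(String[] args) {
        Node good = new Node(10, new Node(5, null, new Node(7, null, null)),
                                 new Node(15, null, null));
        // 12 is a legal right child of 5, but it sits in the left subtree of 10.
        Node bad = new Node(10, new Node(5, null, new Node(12, null, null)),
                                new Node(15, null, null));
        System.out.println(isBst(good)); // true
        System.out.println(isBst(bad));  // false
    }
}
```

Checking only that each node lies between its own two children would wrongly accept the second tree.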

Because it stores items in order, it is an implementation of OrderedStructure. Skip implementation of obvious methods, to focus on harder: add, get, & remove. Add protected methods locate, predecessor, and removeTop to make add, get, and remove easier.

public class BinarySearchTree implements OrderedStructure
{
    protected BinaryTreeNode root; 
    protected int count;

    public BinarySearchTree()
    // post: constructs an empty binary search tree.
    {
        root = null;
        count = 0;
    }

    public boolean isEmpty()
    // post: returns true iff binary search tree is empty.

    public void clear()
    // post: removes all elements from binary search tree

    public int size()
    // post: returns number of elements in binary search tree

    protected BinaryTreeNode locate(BinaryTreeNode subRoot,
                    Comparable value)
    // pre: subRoot and value are non-null
    // post: returned: 1 - existing tree node with the desired value, or
    //                 2 - node to which value should be added.
    {
        Comparable subRootValue = (Comparable)subRoot.value();
        BinaryTreeNode child;

        // found at root: done
        if (subRootValue.equals(value)) return subRoot;
        // look left, if less, right if more
        if (subRootValue.lessThan(value))
          child = subRoot.right();
        else
          child = subRoot.left();
        // no child there: not in tree, return this node,
        // else keep searching
        if (child == null) 
          return subRoot;
        else
          return locate(child, value);
    }

    protected BinaryTreeNode predecessor(BinaryTreeNode root)
    // pre: tree is not empty, root node has left child.
    // post: returns pointer to predecessor of root
    {
        Assert.pre(root != null, "No predecessor to middle value.");
        Assert.pre(root.left() != null, "Root has left child.");
        BinaryTreeNode result = root.left();
        while (result.right() != null)
          result = result.right();
        return result;
    }

    protected BinaryTreeNode successor(BinaryTreeNode root)
    // pre: tree is not empty, root node has right child.
    // post: returns pointer to successor of root
    {
        Assert.pre(root != null, "Tree is non-null.");
        Assert.pre(root.right() != null,"Root has right child.");
        BinaryTreeNode result = root.right();
        while (result.left() != null)
          result = result.left();
        return result;
    }

    public void add(Object val)
    // post: adds a value to the binary search tree.
    {
        BinaryTreeNode newNode = new BinaryTreeNode(val);

        // add value to binary search tree 
        // if there's no root, create value at root.
        if (root == null)
          root = newNode;
        else {
          Comparable value = (Comparable)val;
          BinaryTreeNode insertLocation = locate(root,value);
          Comparable nodeValue = (Comparable)insertLocation.value();
          // Location returned is the successor or predecessor
          // of the to-be-inserted value.
          if (nodeValue.lessThan(value))
            insertLocation.setRight(newNode);
          else {
            if (insertLocation.left() != null)
            // if value is in tree, we insert just before
              predecessor(insertLocation).setRight(newNode);
            else
              insertLocation.setLeft(newNode);
          }
        }
        count++;
    }

    public boolean contains(Object val)
    // post: returns true iff val is found within the tree
    {
        if (root == null) return false;

        BinaryTreeNode possibleLocation = locate(root,(Comparable)val);
        return val.equals(possibleLocation.value());
    }

    public Object get(Object val)
    // post: returns object found in tree, or null
    {
        if (root == null) return null;

        BinaryTreeNode possibleLocation = locate(root,(Comparable)val);
        if (val.equals(possibleLocation.value()))
          return possibleLocation.value();
        else
          return null;
    }

    public Object remove(Object val) 
    // post: removes one instance of val, if found
    {
        // remove value from a binary search tree
        // no root, just quit
        Comparable cval = (Comparable)val;

        if (isEmpty()) return null;
      
        if (val.equals(root.value())) // delete root value
        {
          BinaryTreeNode newroot = removeTop(root);
          count--;
          Object result = root.value();
          root = newroot;
          return result;
        } else {
          BinaryTreeNode location = locate(root,cval);

          if (cval.equals(location.value())) {
            count--;
            BinaryTreeNode parent = location.parent();
            if (parent.right() == location)
              parent.setRight(removeTop(location));
            else 
              parent.setLeft(removeTop(location));
            return location.value();
          }
        }
        return null;
    }

    protected BinaryTreeNode removeTop(BinaryTreeNode topNode)
    // pre: tree is not empty.
    // post: root of tree (topNode) is disconnected from tree 
    //       & new root is returned, new root has no parent.

    {
        // remove topmost BinaryTreeNode from binary search tree
        BinaryTreeNode left  = topNode.left();
        BinaryTreeNode right = topNode.right();
        // disconnect top node
        topNode.setLeft(null);
        topNode.setRight(null);
        // Case a, no left BinaryTreeNode
        //   easy: right subtree is new tree
        if (left == null)  return right; 
        // Case b, no right BinaryTreeNode
        //   easy: left subtree is new tree
        if (right == null) return left;
        // Case c, left node has no right subtree
        //   easy: make right subtree of left
        BinaryTreeNode predecessor = left.right();
        if (predecessor == null)
        {
          left.setRight(right);
          return left;
        }
        // General case, slide down left tree
        //   harder: predecessor of root becomes new root
        //           parent always points to parent of predecessor
        BinaryTreeNode parent = left;
        while (predecessor.right() != null)
        {
          parent = predecessor;
          predecessor = predecessor.right();
        }
        // Assert: predecessor is predecessor of root
        parent.setRight(predecessor.left());
        predecessor.setLeft(left);
        predecessor.setRight(right);
        return predecessor;
    }

    ...
}
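The removeTop logic above can be tried out in isolation. The sketch below re-creates its four cases on a minimal node type with int values and no parent pointers; the Node class, insert, and the recursive remove are assumptions made so the example is self-contained, not part of the BinarySearchTree class.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of removal by promoting the predecessor of the removed node.
class RemoveTopSketch {
    // Hypothetical node type for this sketch.
    static final class Node {
        final int value;
        Node left, right;
        Node(int value) { this.value = value; }
    }

    static Node insert(Node root, int value) {
        if (root == null) return new Node(value);
        if (value < root.value) root.left = insert(root.left, value);
        else root.right = insert(root.right, value);
        return root;
    }

    // Disconnect top and return the root of the tree that remains.
    static Node removeTop(Node top) {
        Node left = top.left, right = top.right;
        top.left = null;
        top.right = null;
        if (left == null) return right;       // case a: no left subtree
        if (right == null) return left;       // case b: no right subtree
        Node predecessor = left.right;
        if (predecessor == null) {            // case c: left child is the predecessor
            left.right = right;
            return left;
        }
        Node parent = left;                   // general case: slide down to the right
        while (predecessor.right != null) {
            parent = predecessor;
            predecessor = predecessor.right;
        }
        parent.right = predecessor.left;      // unhook the predecessor
        predecessor.left = left;
        predecessor.right = right;
        return predecessor;
    }

    // Remove one instance of value, if present, and return the new subtree root.
    static Node remove(Node root, int value) {
        if (root == null) return null;
        if (value < root.value) root.left = remove(root.left, value);
        else if (value > root.value) root.right = remove(root.right, value);
        else return removeTop(root);
        return root;
    }

    static void inorder(Node node, List<Integer> out) {
        if (node == null) return;
        inorder(node.left, out);
        out.add(node.value);
        inorder(node.right, out);
    }

    public static void main(String[] args) {
        Node root = null;
        for (int v : new int[]{50, 30, 70, 20, 40, 35, 45, 60, 80})
            root = insert(root, v);
        root = remove(root, 50);              // general case: predecessor is 45
        List<Integer> out = new ArrayList<>();
        inorder(root, out);
        System.out.println("new root: " + root.value); // new root: 45
        System.out.println(out); // [20, 30, 35, 40, 45, 60, 70, 80]
    }
}
```

Promoting the predecessor keeps the inorder sequence intact, since it is the largest value in the left subtree and so belongs immediately before everything in the right subtree.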

The complexity of add, get, contains, and remove is proportional to the height of the tree. If the tree is balanced this is O(log n); otherwise it is O(n) in the worst case.

Can we guarantee that methods have complexity O(log n)?

AVL trees (which keep a measure of the differences of heights of subtrees in each node) are guaranteed to remain balanced, and therefore their operations are fast.

Splay trees need not be balanced, but their performance averaged over a sequence of operations (amortized) is guaranteed to be O(log n) per operation (like vector additions!).