CSCI 256
Design and analysis of algorithms
Solutions to Assignment 4

Due Wednesday, 2/28/2001


Numbered problems are from Cormen, Leiserson, and Rivest. You should do all of the problems, but only turn in those from the second section.

Practice problems:

  1. Prove that any binary tree with height d has at most 2^d leaves.

    Proof: By induction on d. If d = 0 then the tree is a single node, so it has at most 1 = 2^0 leaves.

    Suppose the claim holds for d; we show it for d+1. Let T have height d+1, and let T' be the tree obtained from T by removing all of the leaves at level d+1. By induction, T' has at most 2^d leaves, and therefore at most 2^d nodes at level d. When we add the erased leaves back as children of the newly uncovered leaves, each newly uncovered leaf stops being a leaf, but contributes at most 2 children. Thus the total number of leaves in T is at most 2 * 2^d = 2^(d+1).
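The bound above is easy to sanity-check mechanically. Here is a small sketch in Python (the assignment contains no code, and the tree representation and function names are my own): a tree is a `(left, right)` pair, with `(None, None)` standing for a leaf, and we verify `leaves(t) <= 2 ** height(t)` on a batch of randomly generated trees.

```python
import random

def random_tree(depth):
    """Build a random binary tree of height at most `depth`.
    A node is a (left, right) pair; (None, None) is a leaf."""
    if depth == 0 or random.random() < 0.3:
        return (None, None)
    left = random_tree(depth - 1)
    right = random_tree(depth - 1) if random.random() < 0.8 else None
    return (left, right)

def height(t):
    if t == (None, None):
        return 0
    return 1 + max(height(c) for c in t if c is not None)

def leaves(t):
    if t == (None, None):
        return 1
    return sum(leaves(c) for c in t if c is not None)

# Spot-check the theorem: every tree of height d has at most 2^d leaves.
for _ in range(1000):
    t = random_tree(5)
    assert leaves(t) <= 2 ** height(t)
```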

  2. Suppose you are given 12 coins, exactly one of which is counterfeit. The counterfeit coin is either lighter or heavier than the other coins, which all weigh the same amount. Using only a balance scale, please show how to find the counterfeit coin using only 3 weighings. Please express your algorithm as a decision tree. This problem will be discussed in lecture on Friday.

    Solution: The algorithm is most easily expressed via a decision tree, but that is hard to draw in HTML! Label the coins from 1 to 12.

    1. Put coins 1 to 4 in one basket and 5 to 8 in the other.
    2. If the left side is light then
      1. Compare {1,2,5} with {3,4,6}
      2. If the left side is light then compare 1 vs. 2
        1. If the left side is light then coin 1 is light
        2. If they are equal then 6 is heavy
        3. If the right side is light then coin 2 is light
      3. If they are equal then compare 7 vs. 8. The heavier side contains the bad coin, and it is heavy.
      4. If the right side is light then proceed similarly: compare 3 vs. 4; the lighter side contains the bad coin and it is light, and if they are equal then 5 is heavy.
    3. If they are equal then compare {9,10} vs. {11,1} (note that we now know coin 1 is good!)
      1. If the left side is light then
        1. Compare 9 and 10
        2. If they are equal then 11 is the bad coin and it is heavy.
        3. Otherwise the lighter side contains the bad coin and it is light.
      2. If they are equal then compare 12 and 1. 12 is the bad coin; it is light if lighter than 1 and heavy if heavier than 1.
      3. If the right side is light then behave similarly: compare 9 and 10; if they are equal then 11 is light, and otherwise the heavier side contains the bad coin and it is heavy.
    4. If the right side is light in the first weighing then proceed as in step 2, with the roles of light and heavy reversed.
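As a check, the branch of the tree where the first weighing balances (so the counterfeit is among coins 9-12) can be coded up directly and run against all eight possibilities. This is a sketch in Python; the function name and the `weight` dictionary representation are my own, not part of the assignment.

```python
def find_in_9_to_12(weight):
    """Locate the counterfeit among coins 9-12, given that coins 1-8 are
    genuine (the first weighing balanced).  `weight` maps coin -> weight;
    coin 1 serves as a known-good reference.  Uses exactly two weighings,
    following step 3 of the decision tree.  Returns (coin, 'light'/'heavy')."""
    def weigh(left, right):  # -1: left side lighter, 0: balanced, 1: left heavier
        l = sum(weight[c] for c in left)
        r = sum(weight[c] for c in right)
        return (l > r) - (l < r)

    first = weigh([9, 10], [11, 1])
    if first == 0:                       # 9, 10, 11 all genuine: coin 12 is bad
        return (12, 'light' if weigh([12], [1]) < 0 else 'heavy')
    second = weigh([9], [10])
    if first < 0:                        # {9,10} light: 9 or 10 light, or 11 heavy
        if second == 0:
            return (11, 'heavy')
        return (9, 'light') if second < 0 else (10, 'light')
    else:                                # {9,10} heavy: 9 or 10 heavy, or 11 light
        if second == 0:
            return (11, 'light')
        return (9, 'heavy') if second > 0 else (10, 'heavy')

# Exhaustively check all eight cases consistent with a balanced first weighing.
for bad in (9, 10, 11, 12):
    for delta, tag in ((-1, 'light'), (1, 'heavy')):
        w = {c: 10 for c in range(1, 13)}
        w[bad] += delta
        assert find_in_9_to_12(w) == (bad, tag)
```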

  3. In class, we found an iterative algorithm to find the min and max of an array in ceiling(3n/2)-2 comparisons. Find a recursive divide and conquer algorithm (i.e., start by making recursive calls to the first and second halves of the list and then ...) to find the min and max. Write down a recurrence relation for the number of comparisons, T(n), and show that T(n) <= ceiling(3n/2) - 2 for n >= 2. You need only show the bound holds for n a power of 2 (which means you can ignore the ceiling).

    Solution: If the list has 1 element then return it as both max and min. If it has 2 elements then compare them and return the larger as max and the smaller as min. When the number of elements is > 2, split the list in half and recursively find the max and min of the left and right halves. Compare the two maxes and return the larger; compare the two mins and return the smaller.

    T(1) = 0, T(2) = 1, T(n) = 2T(n/2) + 2.

        T(n) = 2T(n/2) + 2
             = 2(2T(n/2^2) + 2) + 2
             = 2^2 T(n/2^2) + 2^2 + 2
         ... = 2^k T(n/2^k) + 2^k + ... + 2

    Let n = 2^(k+1). Then

        T(n) = (n/2)T(2) + (n/2 + n/4 + ... + 2)
             = (n/2) + (n - 2) = 3n/2 - 2
        
    It is easy to prove this for all n > 1 that are powers of 2 by induction on k. (It clearly holds for n = 2.)
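The recursion and its comparison count are easy to check empirically. Below is a sketch in Python (the function name and the convention of returning the comparison count alongside the result are my own); for n a power of 2 the count comes out to exactly 3n/2 - 2.

```python
import random

def min_max(a, lo, hi):
    """Return (min, max, comparisons) for the subarray a[lo..hi]."""
    if lo == hi:                         # one element: no comparisons
        return a[lo], a[lo], 0
    if hi == lo + 1:                     # two elements: one comparison
        return (a[lo], a[hi], 1) if a[lo] < a[hi] else (a[hi], a[lo], 1)
    mid = (lo + hi) // 2
    lmin, lmax, lc = min_max(a, lo, mid)
    rmin, rmax, rc = min_max(a, mid + 1, hi)
    # combine: one comparison for the mins, one for the maxes
    return min(lmin, rmin), max(lmax, rmax), lc + rc + 2

# Verify T(n) = 3n/2 - 2 for n = 2, 4, ..., 128.
for k in range(1, 8):
    n = 2 ** k
    a = [random.random() for _ in range(n)]
    mn, mx, comps = min_max(a, 0, n - 1)
    assert (mn, mx) == (min(a), max(a))
    assert comps == 3 * n // 2 - 2
```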

  4. Design an algorithm to determine whether two arrays of real numbers are disjoint (i.e., there is no overlap between the elements of the first and the second). State the complexity of your algorithm in terms of the sizes m and n of the given sets. Be sure to consider the case where m is substantially smaller than n.

    Solution: Suppose m <= n, and let A be the smaller list and B the larger. Sort A in time O(m lg m). For each element of B, do a binary search of A to see if that element occurs in A. If you find it, return not disjoint. If you don't find any, return disjoint.

    The cost of the sort is O(m lg m). Each binary search takes O(lg m), and there are n of them, giving O(n lg m). Therefore the total cost is O(m lg m) + O(n lg m) = O((n+m) lg m). Notice that it would be more expensive if we sorted the larger list!

Problems to be turned in:

  1. Prove that any comparison-based search algorithm must have running time at least Omega(lg n), even if the input is held in a sorted array! Hint: emulate the proof for comparison-based sorts. Of course, binary search is an example of a comparison-based search with running time Theta(lg n).

    Solution: As in the proof of the lower bound for comparison-based sorts: the decision tree for searching must have at least n leaves, one for each possible position of the target in the array. Therefore the height of the tree, and hence the worst-case number of comparisons, must be at least lg n.

  2. The input to the following problems is an array S containing n real numbers, and a separate real number x.

    1. Design an algorithm to determine whether there are two elements of S whose sum is exactly x. The algorithm should run in time O(n lg n) (and you should provide an argument that it does!).

      Solution: Sort the elements of S. For each element y of S, perform a binary search for x - y (taking care that the element found is at a different position than y itself, in case x = 2y). Sorting takes O(n lg n), and each of the n possible searches takes O(lg n). Therefore the total is O(n lg n) + O(n lg n) = O(n lg n). Alternatively, sort S and then use the solution below.
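A sketch of the sort-and-search version in Python (the function name is my own). The one subtle point is the x = 2y case: `bisect_left` returns the first occurrence of x - y, and if that occurrence is y itself we must look at the next position for a duplicate.

```python
from bisect import bisect_left

def has_pair_sum(s, x):
    """True iff two elements of s (at distinct positions) sum to exactly x.
    Sort once in O(n lg n), then do n binary searches of O(lg n) each."""
    a = sorted(s)
    n = len(a)
    for i, y in enumerate(a):
        j = bisect_left(a, x - y)        # first position where x - y could sit
        if j == i:                       # that position is y itself; try the
            j += 1                       # next slot, which holds any duplicate
        if j < n and a[j] == x - y:
            return True
    return False
```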

    2. Suppose now that the set S is given in a sorted order. Design an algorithm to solve the problem in time O(n). Again argue that it works in that time.

      Solution: Start with one index at the bottom and one at the top of the array, and take the sum of those two elements. If the sum is x then return true. If the sum is less than x, increase the lower index and continue [this is safe: the lower element can never be part of a pair summing to x, since even added to the largest remaining element its sum falls short of x]. If the sum is greater than x, decrease the upper index and continue [valid for the symmetric reason]. Stop when the two indices meet, and at that point return false. Each step discards one index, so there are at most n - 1 iterations, giving O(n).
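The two-pointer scan above is short enough to write out in full. A sketch in Python (the function name is mine; the input list is assumed sorted, per the problem statement):

```python
def has_pair_sum_sorted(a, x):
    """a is sorted ascending.  Return True iff two elements of a (at
    distinct positions) sum to x.  Each loop iteration discards one
    index, so there are at most len(a) - 1 iterations: O(n) time."""
    lo, hi = 0, len(a) - 1
    while lo < hi:
        s = a[lo] + a[hi]
        if s == x:
            return True
        if s < x:      # a[lo] can't pair with anything: a[hi] is the largest left
            lo += 1
        else:          # a[hi] is too big even for the smallest remaining element
            hi -= 1
    return False
```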

  3. Problem 8-4 on page 168 of CLR. For part a, an informal argument as to why the algorithm works the same as recursive quicksort is fine. I'd like a convincing argument for part c (and be sure to maintain the Theta( n lg n ) expected running time).

    (a) Solution: The algorithm is exactly the same as the fully recursive quicksort because, after the recursive call to quicksort returns, it resets the variable p marking the beginning of the subarray and goes back through the loop, so that it next sorts the portion of the array to the right of the partition element. This hand-waving argument can be made precise via a proof by induction on r - p.

    (b) Solution: It is easy to get this backwards! Each recursive call pushes an activation record onto the stack and each return pops it off, so the worst case happens when many recursive calls stack up. This happens when the partition repeatedly divides the array into a very large first part and a small second part. Presuming the partition element is always the first element of the subarray, this occurs when the original list is in reverse order: each time through, the partition element is put in the last slot of the subarray being worked on. In particular, with a list of size n, n - 1 nested recursive calls are made before reaching a list of size 1, which terminates the recursion, so the stack depth is Theta(n).

    (c) Solution: Based on the above discussion, it seems clear that you want to make the recursive calls on as small a subarray as possible. Thus change the algorithm so that immediately after partition, it compares the sizes of the left and right sides and makes the recursive call on the smaller, leaving the larger to be handled by the loop upon return. (I won't write out the algorithm, but it should be easy to write down from this description.)

    Notice this solution takes essentially the same amount of time (we just add one extra comparison of integers each time through), so the Theta(n lg n) expected running time is preserved. I claim the stack never holds more than O(lg n) activation records (each of constant size), because each time we make a recursive call, the subarray passed is at most half the size of the array being worked on. Thus if the initial call is on an array of size n, the next nested call is on at most n/2 elements, the next on at most n/4, etc. As we saw earlier, you can only cut a number in half floor(lg n) times before you get down to 1. Thus the stack of activation records waiting to be resumed includes at most lg n records.
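The modified algorithm can be sketched as follows in Python. This is my own rendering, not CLR's pseudocode: the loop replaces the tail call, the recursion always goes into the smaller side, and a `max_depth` counter (my addition, purely for illustration) records how deep the recursion ever gets so the lg n stack bound can be observed directly.

```python
import math
import random

def partition(a, p, r):
    """Standard partition using a[r] as the pivot; returns the pivot's index."""
    pivot = a[r]
    i = p - 1
    for j in range(p, r):
        if a[j] <= pivot:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]
    return i + 1

def quicksort(a, p, r, depth, max_depth):
    """Quicksort with tail recursion eliminated: recurse into the smaller
    side of the partition and loop on the larger.  `max_depth` is a
    one-element list recording the deepest recursion level reached."""
    while p < r:
        q = partition(a, p, r)
        max_depth[0] = max(max_depth[0], depth)
        if q - p < r - q:                    # left side smaller: recurse on it
            quicksort(a, p, q - 1, depth + 1, max_depth)
            p = q + 1                        # then loop on the larger right side
        else:                                # right side smaller: recurse on it
            quicksort(a, q + 1, r, depth + 1, max_depth)
            r = q - 1

# The recursion depth stays within lg n even on a large random input.
a = [random.random() for _ in range(1024)]
d = [0]
quicksort(a, 0, len(a) - 1, 1, d)
assert a == sorted(a)
assert d[0] <= int(math.log2(1024)) + 1
```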
