CS62 - Spring 2010 - Lecture 5

  • Problem 3.5: Write a method, indexOf, that returns the index of an object in the Vector. What should the method return if no object equal to the input object is found? What does java.util.Vector do in this case? How long does this operation take to perform on average?
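
One possible answer to Problem 3.5, as a minimal sketch. The class SimpleVector and its field names are hypothetical stand-ins for the textbook's Vector, not its actual code. Like java.util.Vector, the sketch returns -1 when no equal element is found. The scan is linear: about n/2 elements are inspected on average when the object is present, and all n when it is absent.

```java
import java.util.Arrays;

// Hypothetical, simplified array-backed vector (illustration only).
class SimpleVector {
    private Object[] elementData = new Object[10]; // backing array
    private int elementCount = 0;                  // number of slots in use

    void add(Object o) {
        // Double the backing array when it is full.
        if (elementCount == elementData.length) {
            elementData = Arrays.copyOf(elementData, 2 * elementData.length);
        }
        elementData[elementCount++] = o;
    }

    // Returns the index of the first element equal to target, or -1 if none.
    int indexOf(Object target) {
        for (int i = 0; i < elementCount; i++) {
            boolean match = (target == null)
                    ? elementData[i] == null
                    : target.equals(elementData[i]);
            if (match) {
                return i;
            }
        }
        return -1; // same convention as java.util.Vector.indexOf
    }
}
```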

  • What's the difference between "equals" and "=="? When should we use one or the other?
       - every class has an equals method that it inherits from Object, but the inherited version only compares references (just like ==), so it may not do what you expect
       - if you write your own class and plan to use equals, you should override the equals method
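
A small sketch of the difference, using a hypothetical GridPoint class (not part of the course code). Here == asks whether two variables refer to the same object, while the overridden equals asks whether two objects hold the same values. hashCode is overridden too, since Java expects equal objects to have equal hash codes.

```java
import java.util.Objects;

// Hypothetical value class used only for illustration.
class GridPoint {
    private final int x;
    private final int y;

    GridPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;                  // same object
        if (!(other instanceof GridPoint)) return false; // also handles null
        GridPoint p = (GridPoint) other;
        return x == p.x && y == p.y;                     // compare contents
    }

    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }
}

class EqualsDemo {
    public static void main(String[] args) {
        GridPoint a = new GridPoint(1, 2);
        GridPoint b = new GridPoint(1, 2);
        System.out.println(a == b);      // false: two distinct objects
        System.out.println(a.equals(b)); // true: same coordinates
    }
}
```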

  • We talked a bit about running time when we looked at different approaches for growing the Vector class

  • We need a way to talk about the computational cost of an algorithm that focuses on the essential parts, ignores irrelevant details, and is somewhat agnostic to the underlying hardware

  • Asymptotic notation
       - Precisely calculating the actual steps of a method is tedious and not generally useful
       - Different operations take different amounts of time. Even from run to run, effects such as caching change the measured time
       - Want to identify categories of algorithmic runtimes
       - Compare different algorithms:
          - method1 takes n^2 steps
          - method2 takes 3n + 1 steps
          - method3 takes 2n + 100 steps
       - Which algorithm is better? Is the difference between method2 and method3 important/significant?
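
To get a feel for the comparison, the three step counts above can simply be tabulated. This is a rough sketch: the class and method names are made up, and the "methods" are just the formulas from the list. For large n the quadratic count dwarfs both linear ones, while the two linear counts stay within a small constant factor of each other.

```java
// Hypothetical step-count formulas taken from the list above.
class StepCounts {
    static long method1(long n) { return n * n; }       // n^2 steps
    static long method2(long n) { return 3 * n + 1; }   // 3n + 1 steps
    static long method3(long n) { return 2 * n + 100; } // 2n + 100 steps

    public static void main(String[] args) {
        long[] sizes = {1, 10, 100, 1000, 10000};
        for (long n : sizes) {
            System.out.printf("n=%d  n^2=%d  3n+1=%d  2n+100=%d%n",
                    n, method1(n), method2(n), method3(n));
        }
    }
}
```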

  • Big-O: O(g(n))
       - O(g(n)) = {f(n): there exist positive constants c and n_0 such that 0 <= f(n) <= cg(n) for all n >= n_0}
          - we have a function f(n) and we're trying to upper bound it by g(n), i.e. O(g(n))
          - we can linearly scale g(n)
          - and eventually (i.e. for large enough n) that scaled version of g(n) is ALWAYS at least as large as f(n)

  • draw some pictures

  • What are some examples?
       - n is O(n)
       - n is O(n^2)
       - 2n + 20 is O(n)
       - 5n^3 + 100n^2 + 5000 is O(n^3)
       - 50 n log 100 is O(n log n) (and, since log 100 is a constant, it is also O(n))
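
Two of the examples above, worked against the definition. The constants c and n_0 chosen here are just one valid choice; any larger values work as well.

```latex
\begin{align*}
2n + 20 &\le 2n + n = 3n
  && \text{for all } n \ge 20, \text{ so } c = 3,\ n_0 = 20 \\
5n^3 + 100n^2 + 5000 &\le 5n^3 + 100n^3 + 5000n^3 = 5105\,n^3
  && \text{for all } n \ge 1, \text{ so } c = 5105,\ n_0 = 1
\end{align*}
```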

  • How does Big-O notation allow us to ignore irrelevant details?

  • Show running time table (see pg. 2 of Big-O notes from algorithms: http://www.cs.pomona.edu/~dkauchak/classes/algorithms/lectures/2_BigO.pdf)

  • look at search1 method in BinarySearchExamples code
       - what does this method do?
       - what is the running time, using Big-O notation?
          - depends! sometimes a method/approach always has the same running time
             - look at search1a method in BinarySearchExamples code
          - In general, we'll talk about three different things for a method:
             - best case running time
             - worst case running time
             - average case running time
       - what are the best, worst and average case running times?
          - best: O(1), constant, when the example is the first element
          - worst: O(n), linear, when the example is the last element
          - average: O(n), linear; on average we'll need to traverse half of the elements (~n/2), which is still O(n)
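
The BinarySearchExamples code is not reproduced in these notes, so the following is only a sketch of what an iterative linear search might look like. The class name, the method signature, and the choice to return an index (or -1) are all assumptions.

```java
import java.util.List;

// Hypothetical stand-in for an iterative linear search.
class LinearSearchSketch {
    // Returns the index of findMe in nums, or -1 if it is not present.
    static int search(List<Integer> nums, int findMe) {
        for (int i = 0; i < nums.size(); i++) {
            // The Integer is unboxed here, so == compares numeric values.
            if (nums.get(i) == findMe) {
                return i; // best case: found at i == 0 after one comparison
            }
        }
        return -1; // worst case: all n elements were examined
    }
}
```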

  • look at search2 method in BinarySearchExamples code
       - what does this method do?
          - uses a helper method
          - does the same thing as the previous, but using recursion
       - which version is better?
       - what is the running time, using Big-O notation?
          - same as for the iterative version
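
A sketch of how the recursive variant with a helper method might look. The names are hypothetical and this is not the course's search2. The Big-O running time matches the iterative version, but each recursive call adds a stack frame, so a very long list could overflow the call stack. That is one argument for preferring the loop here.

```java
import java.util.List;

// Hypothetical recursive linear search using a helper method.
class RecursiveLinearSearchSketch {
    // Public entry point: start the search at index 0.
    static int search(List<Integer> nums, int findMe) {
        return searchHelper(nums, findMe, 0);
    }

    // Searches nums from position index onward.
    private static int searchHelper(List<Integer> nums, int findMe, int index) {
        if (index >= nums.size()) {
            return -1;                    // base case: ran off the end
        }
        if (nums.get(index) == findMe) {  // unboxed numeric comparison
            return index;                 // base case: found it
        }
        return searchHelper(nums, findMe, index + 1); // recurse on the rest
    }
}
```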

  • can we do better than either of these procedures?
       - without any preprocessing of the data, no

  • number guessing game

  • what if I told you the data was in sorted order?
       - how do you find information in a phonebook (where the data is, in essence, sorted)?

  • show binarySearch method in BinarySearchExamples code
       - what does the code do?
          - keeps a low and a high value
          - we know that if findMe is in the array, then nums.get(low) <= findMe <= nums.get(high)
             - this is what's called a loop invariant
             - it's always true throughout the loop's lifetime
          - picks the middle element between low and high
          - compares that middle element to our value and then either finds the data or DISCARDS HALF OF THE REMAINING DATA
       - an example
          - 1, 3, 7, 15, 16, 18, 21, 40, 45, 50
       - what's the running-time?
          - best case: O(1), when it's the midpoint of the ArrayList
          - worst case: not found
             - how many times do we iterate through the while loop?
             - let's consider the case where the number of elements is a power of 2
                - we could always pad it up to the next largest power of 2
             - at each iteration we throw away half of the data, n/2, n/4, n/8
             - when will it be done?
                - when n/2^i = 1
                log (n/2^i) = log 1
                log n - log 2^i = 0
                log n - i log 2 = 0
                log n = i log 2
                i = log n / log 2 = log_2 n
             - runtime is O(log_2 n)
          - average case: only about one iteration fewer than the worst case, since roughly half of the elements are reached only on the final iteration... still O(log_2 n)
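
The steps described above can be sketched as follows. This is an illustrative version under assumed names and an assumed return convention (index or -1), not the BinarySearchExamples code itself. The usage example runs it on the sorted list from the notes.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical iterative binary search over a sorted list.
class BinarySearchSketch {
    // Returns an index of findMe in the sorted list nums, or -1 if absent.
    static int binarySearch(List<Integer> nums, int findMe) {
        int low = 0;
        int high = nums.size() - 1;
        // Invariant: if findMe is present, its index lies in [low, high].
        while (low <= high) {
            int mid = low + (high - low) / 2; // middle of the remaining range
            int midVal = nums.get(mid);
            if (midVal == findMe) {
                return mid;       // found it
            } else if (midVal < findMe) {
                low = mid + 1;    // discard the lower half
            } else {
                high = mid - 1;   // discard the upper half
            }
        }
        return -1; // range is empty: not found
    }

    public static void main(String[] args) {
        List<Integer> nums = Arrays.asList(1, 3, 7, 15, 16, 18, 21, 40, 45, 50);
        System.out.println(binarySearch(nums, 16)); // 4 (first midpoint: best case)
        System.out.println(binarySearch(nums, 2));  // -1 (not found)
    }
}
```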

  • how would we write a recursive version?
       - we'd need a helper method
       - rather than keeping low and high as variables, they'll be parameters
       - then, rather than adjusting them, we just recursively call our method with a smaller range
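
A sketch of the recursive structure just described, with low and high passed as parameters to a helper. The names and the -1 convention are assumptions, and this is not the course's binarySearchRecursive.

```java
import java.util.List;

// Hypothetical recursive binary search over a sorted list.
class RecursiveBinarySearchSketch {
    // Public entry point: search the whole list.
    static int binarySearch(List<Integer> nums, int findMe) {
        return binarySearchHelper(nums, findMe, 0, nums.size() - 1);
    }

    // Searches only the index range [low, high].
    private static int binarySearchHelper(List<Integer> nums, int findMe,
                                          int low, int high) {
        if (low > high) {
            return -1; // base case: empty range, not found
        }
        int mid = low + (high - low) / 2;
        int midVal = nums.get(mid);
        if (midVal == findMe) {
            return mid;
        } else if (midVal < findMe) {
            return binarySearchHelper(nums, findMe, mid + 1, high); // upper half
        } else {
            return binarySearchHelper(nums, findMe, low, mid - 1);  // lower half
        }
    }
}
```

Since the range halves on every call, the recursion depth is only about log_2 n.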

  • show binarySearchRecursive in BinarySearchExamples code

  • why is binary search useful? We'll find out next time that the cost to sort data is O(n log n), which is more expensive than a single linear search.

  • Last comments on binary search
       - It’s easy to get indices wrong, so be careful!
          - "Although the basic idea of binary search is comparatively straightforward, the details can be surprisingly tricky..." Professor Donald Knuth (taken from http://en.wikipedia.org/wiki/Binary_search_algorithm)
       - http://googleresearch.blogspot.com/2006/06/extra-extra-read-all-about-it-nearyly.html
       - How hard is it? http://portal.acm.org/citation.cfm?doid=358476.358484, only accessible on campus
       - What if we wanted to return the first one in the list?
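
One way to return the first occurrence when the sorted list contains duplicates, as a sketch under assumed names: on a match, remember the index and keep searching to the left. The midpoint is computed as low + (high - low) / 2, because the Google Research post linked above describes how (low + high) / 2 can overflow an int on very large arrays.

```java
import java.util.List;

// Hypothetical binary search variant that finds the FIRST match.
class FirstOccurrenceSketch {
    // Returns the smallest index holding findMe in sorted nums, or -1.
    static int firstIndexOf(List<Integer> nums, int findMe) {
        int low = 0;
        int high = nums.size() - 1;
        int result = -1; // best (leftmost) match seen so far
        while (low <= high) {
            int mid = low + (high - low) / 2; // avoids int overflow of low + high
            int midVal = nums.get(mid);
            if (midVal == findMe) {
                result = mid;   // remember this match...
                high = mid - 1; // ...but keep looking further left
            } else if (midVal < findMe) {
                low = mid + 1;
            } else {
                high = mid - 1;
            }
        }
        return result;
    }
}
```

The loop still halves the range every iteration, so the running time stays O(log n) even when many duplicates are present.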

  • methods in the different Big-O categories:
       - O(1), constant time: regardless of the size of the input, there is a fixed amount of work
          - add two 32 bit numbers
          - determine if a number is even or odd
          - sum the first 20 elements of an array
       - O(log n) - logarithmic: at each iteration of the algorithm, discard some proportion of the input (often half)
       - O(n) - linear: do some constant amount of work on each element of the input
          - find an item in a Vector (or ArrayList)
          - determine the largest element in the array
       - O(n log n): divide and conquer algorithms with a linear amount of work to recombine
          - sorting a list of numbers (with some algorithms)
          - FFT
       - O(n^2): double nested loops that iterate over the data
          - matrix addition
          - some other sorting algorithms
       - O(2^n)
          - Enumerate all possible subsets
          - Traveling salesman using dynamic programming
       - O(n!)
          - Enumerate all permutations
          - determinant of a matrix with expansion by minors
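
A few of the categories above, sketched as small Java methods. All names are hypothetical and the code is only meant to make the loop shapes concrete. For the subset example, 2^n counts the subsets produced; building each one adds a further factor of up to n.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical examples of methods in different Big-O categories.
class BigOCategoriesSketch {
    // O(1): a fixed amount of work regardless of the input size.
    static boolean isEven(int n) {
        return n % 2 == 0;
    }

    // O(n): constant work per element. Assumes a non-empty array.
    static int largest(int[] a) {
        int best = a[0];
        for (int x : a) {
            if (x > best) best = x;
        }
        return best;
    }

    // O(n^2) for n x n matrices: doubly nested loop over the entries.
    static int[][] matrixAdd(int[][] a, int[][] b) {
        int n = a.length;
        int[][] sum = new int[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                sum[i][j] = a[i][j] + b[i][j];
            }
        }
        return sum;
    }

    // 2^n subsets: each bit of mask says whether items[i] is included.
    // Assumes items.length is small (below 31) so that 1 << n fits in an int.
    static List<List<Integer>> allSubsets(int[] items) {
        int n = items.length;
        List<List<Integer>> subsets = new ArrayList<>();
        for (int mask = 0; mask < (1 << n); mask++) {
            List<Integer> subset = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                if ((mask & (1 << i)) != 0) subset.add(items[i]);
            }
            subsets.add(subset);
        }
        return subsets;
    }
}
```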