CS62 - Spring 2011 - Lecture 37

  • A few random things:
       - Use a global variable for empty. In BinaryTree.cpp, you can declare a global (file-scope) variable at the top:

          BinaryTree* empty = new BinaryTree();

          There is only one instance of this variable (i.e. only one empty tree) and we can use it throughout the code.
       - In lab, we looked at an attempt to do a linked list. What was the problem?
          - Hard to do without a node class
          - Encapsulate the linked list behavior inside a class and only provide add, get, set methods so we can control the data structure
          - As a reminder, look at linkedlistimproved.cpp code
       - Graders for CS (see previous e-mail). Applications due Sunday.
       - CS liaisons
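Going back to the shared empty tree from the first item above, here is a minimal sketch of how a single global sentinel might be used. The class below is a hypothetical, stripped-down BinaryTree that stores ints; the members value_, left_, right_, isEmpty() and size() are assumptions for illustration and are not taken from the course's BinaryTree.cpp.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, stripped-down BinaryTree (ints only); the course's class may differ.
class BinaryTree {
public:
    BinaryTree();  // builds the empty tree
    BinaryTree(int value, BinaryTree* left, BinaryTree* right);
    bool isEmpty() const;
    int size() const;
    int value() const { return value_; }

private:
    int value_;
    BinaryTree* left_;
    BinaryTree* right_;
};

// The one shared empty tree, defined once at file scope.
BinaryTree* empty = new BinaryTree();

BinaryTree::BinaryTree() : value_(0), left_(NULL), right_(NULL) {}

BinaryTree::BinaryTree(int value, BinaryTree* left, BinaryTree* right)
    : value_(value), left_(left), right_(right) {
    // Children are never NULL in this sketch; pass empty instead.
    assert(left != NULL && right != NULL);
}

// A tree is empty exactly when it is the shared sentinel.
bool BinaryTree::isEmpty() const { return this == empty; }

// Number of non-empty nodes in the tree.
int BinaryTree::size() const {
    if (isEmpty()) return 0;
    return 1 + left_->size() + right_->size();
}
```

With this setup, a leaf is built as new BinaryTree(5, empty, empty), and recursive methods can test isEmpty() rather than checking for NULL.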

  • random thing in C++ for the day :)
       - arrays in C++
          - arrays in C++ inherit most of their functionality (and weirdness) from C
          - my advice... just don't use them... use the vector class and call it a day
          - If for some reason you come across them...
             - int myArray[50];
                - allocates an array of ints with 50 elements in it
                - the [] have to come AFTER the variable name (unlike in Java where either before or after the variable is fine)
                - string myArray[50], etc.
             - What is an array in Java?
                - reference
                   - one hint is that we use "new" to create it, so it is probably created on the heap
                - how could you check this?
                   - I've told you that Java always uses call-by value
                   - make an array
                   - pass it to a method
                   - change an array entry in the method
                   - see if the change is seen outside the method
             - Creating a new array does the following:
                - allocates enough memory for the (say, 50) objects/items
                   - the "sizeof" operator returns the size in bytes of a type or object, if you're curious
                      sizeof(int)
                      sizeof(string)
                      sizeof(MyClass)
                - the array name then acts like a pointer to the beginning of that chunk of memory (in most expressions it is converted to a pointer to its first element)
                   - when you access the array, say,
                      myArray[i]

                   this is just shorthand for:
                      *(myArray+i)

                      - add i "int"s worth to the pointer myArray (remember adding to a pointer adds the size in bytes of the object, in this case sizeof(int), typically 4)
                         - if it were a different object, for example:
                            IntCell myIntCellArray[50];
                         then each step would advance by sizeof(IntCell) bytes, which may differ from the size of an int
                      - the * gets the value referenced by this new pointer
                   - For example, the following are equivalent:

                   for( int i = 0; i < 50; i++ ){
                      myArray[i] = 0;
                   }

                   int* ptr;  // myArray holds ints, so the pointer must be an int*

                   for( ptr = myArray; ptr < myArray+50; ptr++ ){
                      *ptr = 0;
                   }
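The two loops above can be packaged into functions and checked against each other. This is a minimal sketch; the names zeroByIndex, zeroByPointer and byteOffset are made up for illustration. byteOffset measures how far (in bytes) the pointer moves when i is added to it, which should be i * sizeof(int).

```cpp
#include <cassert>
#include <cstddef>

// Zero the first n elements using index notation.
void zeroByIndex(int* arr, int n) {
    for (int i = 0; i < n; i++) {
        arr[i] = 0;
    }
}

// Zero the first n elements by walking a pointer through the array.
void zeroByPointer(int* arr, int n) {
    for (int* ptr = arr; ptr < arr + n; ptr++) {
        *ptr = 0;
    }
}

// Number of bytes between element 0 and element i.
std::size_t byteOffset(const int* arr, int i) {
    assert(i >= 0);
    const char* start = reinterpret_cast<const char*>(arr);
    const char* target = reinterpret_cast<const char*>(arr + i);
    return static_cast<std::size_t>(target - start);
}
```

For an int array a, byteOffset(a, 3) equals 3 * sizeof(int), and &a[7] is the same address as a + 7.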

             - other differences
                - there is no .length member variable for arrays, and once an array has been passed to a function (as a pointer) there is no way of telling how long it is, so you need to pass along the length
                - there is no bounds checking, so be careful with your array indices   
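To illustrate the two differences above, here is a small sketch that passes the length alongside a raw array and, for comparison, uses a vector, which knows its own size and offers the bounds-checked at() method. The function names sumArray, sumVector and isValidIndex are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// A raw array arrives as a pointer, so the caller must supply the length.
int sumArray(const int* arr, std::size_t length) {
    assert(arr != NULL || length == 0);
    int total = 0;
    for (std::size_t i = 0; i < length; i++) {
        total += arr[i];
    }
    return total;
}

// A vector carries its own length via size().
int sumVector(const std::vector<int>& v) {
    int total = 0;
    for (std::size_t i = 0; i < v.size(); i++) {
        total += v[i];
    }
    return total;
}

// at() is the bounds-checked form of []; it throws std::out_of_range
// when the index is past the end.
bool isValidIndex(const std::vector<int>& v, std::size_t i) {
    try {
        v.at(i);
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}
```

In the scope where the array is declared, sizeof(data) / sizeof(data[0]) gives the element count; inside sumArray that trick no longer works because only a pointer is visible.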

       - iterators
          - what methods does an Iterator have in Java?
             - next()
             - hasNext()
             - remove() // optional
          - how are they used?
             - to traverse through a data set
          - In C++, iterators are implemented in a fashion similar to pointers
          - unlike Java, each class has its own iterator type
             - vector<int>::iterator
             - map<int, int>::iterator
             - map<int, list<pair<int,int> > >::iterator
          - Look at vector_iterator() method in iterator.cpp code
             - we start out at the beginning of our elements with the begin() method
                - returns an iterator at the beginning of the data to iterate through
             - we can access the elements we're iterating through via the iterator variable
                - notice that we're given a pointer to that object, so we can actually modify the object
                - "it" is LIKE a pointer, though, so you need to dereference it or use -> if you want to call methods
             - incrementing the iterator is just like incrementing a pointer
                - it++
             - the end() method also returns an iterator that is just past the end of the data
                - we use this to see if we've iterated through all of the data
          - just to make it clear, iterators are NOT pointers, but by using operator overloading, they're made to function like pointers
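The begin()/end()/it++ pattern described above can be sketched as follows. This is not the vector_iterator() code from iterator.cpp; doubleAll and totalLength are hypothetical functions written only to show dereferencing with * and calling a method with ->.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Modify every element in place through the iterator.
void doubleAll(std::vector<int>& v) {
    for (std::vector<int>::iterator it = v.begin(); it != v.end(); it++) {
        *it = *it * 2;  // dereference, just like a pointer
    }
}

// Call a method on each element using ->.
std::size_t totalLength(std::vector<std::string>& words) {
    std::size_t total = 0;
    for (std::vector<std::string>::iterator it = words.begin(); it != words.end(); it++) {
        total += it->size();  // same as (*it).size()
    }
    return total;
}
```

Note that the loop condition compares against end() with != rather than <, since not every iterator type supports <.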
          - What does the map_iterator() method do in the iterator.cpp code?
             - first, we create a map object
                - the key is an int
                - the value is a pair of ints
             - next, we add things to that object
                - the key is i
                - the value is (i, i)
             - finally, we make an iterator and traverse the data
                - again, each different type has a different iterator type
                - the iterator for a map acts like a pointer to a key/value pair
                - it->first is the key
                - it->second is the value (which is itself a pair)
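The three steps above (create the map, fill it, traverse it) might look roughly like this. It is a sketch under the description given here, not the actual map_iterator() from iterator.cpp; the typedef PairMap and the functions buildMap and sumAll are assumed names.

```cpp
#include <cassert>
#include <map>
#include <utility>

// Key is an int, value is a pair of ints.
typedef std::map<int, std::pair<int, int> > PairMap;

// Add the entries i -> (i, i) for i = 0 .. n-1.
PairMap buildMap(int n) {
    PairMap m;
    for (int i = 0; i < n; i++) {
        m[i] = std::make_pair(i, i);
    }
    return m;
}

// Traverse the map and add up every key and both halves of every value.
int sumAll(PairMap& m) {
    int total = 0;
    for (PairMap::iterator it = m.begin(); it != m.end(); it++) {
        total += it->first;          // the key
        total += it->second.first;   // the value is itself a pair
        total += it->second.second;
    }
    return total;
}
```

A map iterator visits the entries in increasing key order.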
          - look at map_iterator(const ...) method in iterator.cpp code
             - often we want to pass objects by constant references
             - in this case, we can't use a normal iterator!
                - a normal iterator would allow us to modify the values
             - instead, we use a const_iterator
             - almost all classes that have iterators also implement const_iterator
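A minimal sketch of the const case: when the map comes in by const reference, begin() and end() hand back const_iterators, so that is the type the loop variable needs. PairMap and sumKeys are hypothetical names, not the ones used in iterator.cpp.

```cpp
#include <cassert>
#include <map>
#include <utility>

typedef std::map<int, std::pair<int, int> > PairMap;

// The map is passed by const reference, so we must use a const_iterator.
int sumKeys(const PairMap& m) {
    int total = 0;
    for (PairMap::const_iterator it = m.begin(); it != m.end(); it++) {
        total += it->first;
        // it->second.first = 0;  // would not compile: values are read-only here
    }
    return total;
}
```

Declaring the loop variable as PairMap::iterator instead would fail to compile, because a const_iterator cannot be converted to a plain iterator.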


  • finding cycles
       - given an undirected graph, how can we determine if it has a cycle in it?
          - or, given an undirected graph, determine that it is not a tree
       - what is the definition of a cycle?
          - a simple path, where the endpoints are the same
       - idea:
          - start at a node, go down a path
             - stop when either we find a vertex on the path that we've already seen
             - or when we hit a dead-end
          - if we hit a dead-end, backtrack and find another path
          - if we visit all of the nodes, without finding a repeat vertex, it's acyclic
       - does this sound like anything we've seen before?
          - depth first search!

       void dfs(vertex u, visited) {
          if(!visited(u)){    
             visited.add(u);
             
             for (v: neighbors of u){
                if (!visited(v)){
                   dfs(v, visited);
                }
             }
          }
       }
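For reference, the dfs pseudocode above could be written in C++ along these lines. The sketch assumes vertices are ints, the graph is a map from each vertex to a list of its neighbors, and visited is a set; the typedef name Graph is an assumption.

```cpp
#include <cassert>
#include <list>
#include <map>
#include <set>

// Assumed representation: vertex -> list of neighboring vertices.
typedef std::map<int, std::list<int> > Graph;

// Depth first search from u; every vertex reached is added to visited.
void dfs(const Graph& graph, int u, std::set<int>& visited) {
    if (visited.count(u) > 0) {
        return;  // already seen
    }
    visited.insert(u);

    Graph::const_iterator entry = graph.find(u);
    if (entry == graph.end()) {
        return;  // u has no adjacency list
    }

    const std::list<int>& nbrs = entry->second;
    for (std::list<int>::const_iterator v = nbrs.begin(); v != nbrs.end(); v++) {
        if (visited.count(*v) == 0) {
            dfs(graph, *v, visited);
        }
    }
}
```

After dfs(g, 0, visited) returns, visited holds exactly the vertices reachable from 0.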

       - what modifications need to be made?
          - if we visit a node that we've already visited, then we've found a cycle
          - what about where we just came from?
             - need to know where we came from so we can avoid calling that a cycle
          - want to return true if we find a cycle, false otherwise

       bool dfsCycle(vertex u, vertex parent, visited) {
          bool result = false;
          visited.add(u);

          for(v: neighbors of u){
             if(!visited(v)){
                result = result || dfsCycle(v, u, visited);
             } else if (v != parent){
                result = true;
             }
          }

          return result;
       }

       - observations:
          - what does it do?
             - runs depth first search
             - if it finds a visited node that was not its parent (i.e. a cycle) returns true
             - otherwise, false
          - how is this different from DFS that we saw before?
             - we have the additional else if to see if we've found a cycle
             - why do we need the parent as a parameter?
                - so we can distinguish finding a visited node in a cycle vs. a visited node where we just came from
          - what does the "result = result || dfsCycle(...)" line do?
             - if we find a cycle (i.e. dfsCycle returned true below), then result will be set to true and we will eventually return true
       - walk through an example
       - let's try and actually implement our boolean cycle detector
          - how can we represent a vertex?
             - simplest is just use a number, i.e. an int
          - we'll use an adjacency list representation
             - in C++ there is a "list" class in the STL
          - we have a few options for declaring the graph type:
             - what if we wanted to use a vector to store the vertices?
                - vector<list<int> > graph
                - what is one downside to this approach?
                   - assumes the vertices are sequential, that is 0, 1, 2, ...
             - what is another option?
                - map<int, list<int> >
             - what if we wanted to add weights?
                - map<int, list<pair<int, int> > >
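The map-based options above can be sketched as follows. The helper names addEdge, addWeightedEdge and degree are hypothetical, and storing each weighted entry as a (neighbor, weight) pair is an assumption about how the pair would be used.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <map>
#include <utility>

typedef std::map<int, std::list<int> > Graph;
// Assumed convention: each list entry is (neighbor, weight).
typedef std::map<int, std::list<std::pair<int, int> > > WeightedGraph;

// Undirected edge: record it in both adjacency lists.
void addEdge(Graph& g, int u, int v) {
    g[u].push_back(v);
    g[v].push_back(u);
}

void addWeightedEdge(WeightedGraph& g, int u, int v, int weight) {
    g[u].push_back(std::make_pair(v, weight));
    g[v].push_back(std::make_pair(u, weight));
}

// Number of neighbors of u (0 if u is not in the graph).
std::size_t degree(const Graph& g, int u) {
    Graph::const_iterator entry = g.find(u);
    return entry == g.end() ? 0 : entry->second.size();
}
```

Because the keys are arbitrary ints, vertices such as 10, 20 and 30 work without wasting space on 0 through 9, which the vector version would require.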
          - look at dfs_hasCycles in graph_algorithms.cpp code
             - what does "list<int> nbrList = adjMap.find(v)->second" do?
                - gets the adjacency list associated with vertex v
                - why the "->second"?
                   - recall, a map iterator (which is what find() returns) acts like a pointer to a key/value pair
                   - "->first" would give us the key (in this case, just v)
                - why can't we write "list<int> nbrList = adjMap[v]"?
                   - operator[] is not a const method, so it can't be called on a const map
                   - it can be used to change the map (and it inserts a default value if the key is missing)
                      adjMap[v] = ...
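Here is a small sketch of the find()->second lookup on a const map. The function neighborsOf is hypothetical; unlike the one-line version quoted above, it also checks for end() so that a missing vertex yields an empty list instead of undefined behavior.

```cpp
#include <cassert>
#include <list>
#include <map>

typedef std::map<int, std::list<int> > Graph;

// Returns a copy of v's neighbor list, or an empty list if v is not a key.
std::list<int> neighborsOf(const Graph& adjMap, int v) {
    // adjMap[v] would not compile here: operator[] is a non-const method
    // and adjMap is a const reference.
    Graph::const_iterator entry = adjMap.find(v);
    if (entry == adjMap.end()) {
        return std::list<int>();
    }
    return entry->second;  // ->first would be the key, v
}
```

On a non-const map, writing g[5] for a missing key quietly inserts an empty list under 5, which is exactly the kind of modification a const method is not allowed to make.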
             - use an iterator to iterate through the neighbor list
             - what does "visited.find(*nbr) == 0" do?
                - checks whether *nbr is already in the visited set (we only recurse on neighbors that are not)
             - note the recursive call to dfs_hasCycles
          - look at graph_hasCycles in graph_algorithms.cpp code
             - takes a graph
                - why passed by reference?
                   - to avoid copying
             - the "set" class is useful for keeping track of which nodes we've visited
             - why do we have the for loop?
                - graphs may not be connected!
                - still could have a cycle though
                - need to make sure we've visited all possible sections of the graph when looking for cycles
             - notice the use of a const_iterator (and the type of that iterator)
          - you will be implementing a version of this that actually returns the cycle
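Putting the pieces together, a boolean cycle detector along the lines discussed above might look like the sketch below. It is not the code from graph_algorithms.cpp: the names dfsHasCycle, graphHasCycle and addEdge are hypothetical, and it assumes a simple undirected graph (no parallel edges) whose vertices are non-negative ints, so that -1 can stand for "no parent".

```cpp
#include <cassert>
#include <list>
#include <map>
#include <set>

typedef std::map<int, std::list<int> > Graph;

// Hypothetical helper: add an undirected edge.
void addEdge(Graph& g, int u, int v) {
    g[u].push_back(v);
    g[v].push_back(u);
}

// DFS from u; returns true if a visited vertex other than the parent is found.
bool dfsHasCycle(const Graph& adjMap, int u, int parent, std::set<int>& visited) {
    visited.insert(u);

    Graph::const_iterator entry = adjMap.find(u);
    if (entry == adjMap.end()) {
        return false;  // u has no neighbors
    }

    const std::list<int>& nbrList = entry->second;
    bool result = false;
    for (std::list<int>::const_iterator nbr = nbrList.begin(); nbr != nbrList.end(); nbr++) {
        if (visited.count(*nbr) == 0) {
            result = result || dfsHasCycle(adjMap, *nbr, u, visited);
        } else if (*nbr != parent) {
            result = true;  // reached a visited vertex by a different route
        }
    }
    return result;
}

// The graph may be disconnected, so start a DFS from every unvisited vertex.
bool graphHasCycle(const Graph& adjMap) {
    std::set<int> visited;
    for (Graph::const_iterator it = adjMap.begin(); it != adjMap.end(); it++) {
        if (visited.count(it->first) == 0 &&
            dfsHasCycle(adjMap, it->first, -1, visited)) {
            return true;
        }
    }
    return false;
}
```

For example, a graph with edges 0-1, 1-2, 1-3 is a tree and gives false; adding a separate triangle 10-11, 11-12, 12-10 makes graphHasCycle return true even though the triangle is not reachable from vertex 0.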