CS150 - Fall 2012 - Class 10

  • Pi video
       http://www.youtube.com/watch?v=jG7vhMMXagQ

  • quiz problem 1c.

  • admin
       - Lab on Friday can be done with a partner
          - must both be there when you're working on it
          - should only be working on one computer
          - just submit one file with both your names on it
       - Test project 1 out soon
          - honor code
             - must work alone
             - may only use: book, your notes, class notes, python.org documentation
             - may NOT: get help from other students, get help from the tutors (except for file issues, etc), look online for solutions
          - 3 problems
          - required to do some extra credit
             - 63 points total, but only 60 for just doing what I stated
          - More than a third of the points come from code style and commenting
          - follow instructions carefully!

  • lists are objects and therefore have methods. What methods might we want?
       - http://docs.python.org/tutorial/datastructures.html

       - append: add a value on to the end of the list
          >>> my_list = [15, 2, 1, 20, 5]
          >>> my_list.append(100)
          >>> my_list
          [15, 2, 1, 20, 5, 100]

          - notice that append does NOT return a new list, it modifies the existing list!

       - pop: remove a value off of the end of the list and return it
          >>> my_list.pop()
          100
          >>> my_list
          [15, 2, 1, 20, 5]
          
          - notice that it both modifies the list and returns a value
          - if you want to use this value, you need to store it!
             >>> x = my_list.pop()
             >>> x
             5
          - pop also has another version where you can specify the index
       
             >>> my_list = [15, 2, 1, 20, 5]
             >>> my_list.pop(2)
             1
             >>> my_list
             [15, 2, 20, 5]
       - insert: inserts a value at a particular index
          >>> my_list = [15, 2, 1, 20, 5]
          >>> my_list.insert(2, 100)
          >>> my_list
          [15, 2, 100, 1, 20, 5]

          - again, lists are mutable, so insert does not return a new list, but modifies the underlying one
       - sort: sorts the list in place (smallest to largest, or alphabetically for strings)
          >>> my_list = [15, 2, 1, 20, 5]
          >>> my_list.sort()
          >>> my_list
          [1, 2, 5, 15, 20]
          >>> my_list = ["these", "are", "some", "words", "to", "sort"]
          >>> ["these", "are", "some", "words", "to", "sort"].sort()
          >>> my_list = ["these", "are", "some", "words", "to", "sort"]
          >>> my_list.sort()
          >>> my_list
          ['are', 'some', 'sort', 'these', 'to', 'words']
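
       - a common mistake that follows from the notes above: storing the result of append, insert, or sort. A small sketch (the variable names are made up for illustration; sorted is the built-in to use when you want a new list back):

```python
# append, insert, and sort change the list in place and return None
scores = [15, 2, 1, 20, 5]

result = scores.sort()      # sorts scores itself
print(result)               # None: there is no new list to store
print(scores)               # [1, 2, 5, 15, 20]

# sorted() leaves the original list alone and returns a new sorted list
words = ["these", "are", "some", "words"]
in_order = sorted(words)
print(in_order)             # ['are', 'some', 'these', 'words']
print(words)                # ['these', 'are', 'some', 'words']
```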

  • lists are mutable
       - what does that mean?
          - we can change (or mutate) the values in a list
       
       - we can mutate lists with methods, but we can also change particular indices
       
          >>> my_list = [15, 2, 1, 20, 5]
          >>> my_list
          [15, 2, 1, 20, 5]
          >>> my_list[2] = 100
          >>> my_list
          [15, 2, 100, 20, 5]

  • back to our grades program: look at scores-lists.py code
       - there is a function called get_scores. That gets the scores and returns them as a list. How?
          - starts with an empty list
          - uses append to add them on to the end of the list
          - returns the list when the loop finishes
       - average function
          - has a single parameter, but this parameter will represent a list
          - inelegant_average
             - calculates the sum and divides by the number of entries
                - uses a for loop to iterate over the values
                - often, we'll use something besides "i" as a variable name that makes our program more readable
          - is there a better way to do this?
             - look at fancy_average
                - use the built-in sum function on the list
       - median function
          - sorts the values
             - notice again that sort does NOT return a value, but sorts the list that it is called on
          - returns the middle entry
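
       - scores-lists.py itself is not reproduced in these notes, so the following is only a sketch of what the functions described above might look like. The function names come from the notes, but the bodies are assumptions. In particular, get_scores here takes a list of typed-in strings (a hypothetical stand-in for reading from the user) so that it runs without any input.

```python
def get_scores(entries):
    """Build a list of scores from text entries, stopping at the first blank one."""
    scores = []                        # start with an empty list
    for entry in entries:
        if entry == "":
            break
        scores.append(float(entry))    # add each score on to the end
    return scores                      # return the list when the loop finishes


def inelegant_average(scores):
    """Sum the values with a for loop, then divide by the number of entries."""
    total = 0.0
    for score in scores:               # "score" reads better than "i"
        total += score
    return total / len(scores)


def fancy_average(scores):
    """Same result, using the built-in sum function."""
    return sum(scores) / float(len(scores))


def median(scores):
    """Sort the list in place and return the middle entry.

    For an even number of entries this returns the upper of the two middle values.
    Note that the caller's list ends up sorted too, since sort modifies it.
    """
    scores.sort()
    return scores[len(scores) // 2]


print(get_scores(["90", "80", "", "70"]))   # [90.0, 80.0]
print(inelegant_average([1, 2, 3, 4]))      # 2.5
print(fancy_average([1, 2, 3, 4]))          # 2.5
print(median([15, 2, 1, 20, 5]))            # 5
```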


  • aliasing
       - what will be the value of my_list after doing the following:

          >>> my_list = [1, 2, 3, 4, 5]
          >>> other_list = my_list
          >>> other_list[2] = 100
          >>> other_list
          [1, 2, 100, 4, 5]
          >>> my_list

       - [1, 2, 100, 4, 5] ... why?
          - my_list and other_list are just references to the SAME object
             - this is called aliasing, since other_list is an alias (another name) for my_list
          - saying other_list = my_list does not copy anything; that is, it does NOT create a new list that is a copy of the original
          - draw a picture
       
       - notice that if I make changes to either one, changes will be seen in the other
          >>> my_list
          [1, 2, 100, 4, 5]
          >>> other_list
          [1, 2, 100, 4, 5]
          >>> my_list[0] = 0
          >>> other_list[1] = 1000
          >>> my_list
          [0, 1000, 100, 4, 5]
          >>> other_list
          [0, 1000, 100, 4, 5]      

       - aliasing can also show up in other places
          def mystery(x):
             x[0] = 1000

          >>> my_list = [1, 2, 3, 4, 5]
          >>> my_list
          [1, 2, 3, 4, 5]
          >>> mystery(my_list)
          >>> my_list
          [1000, 2, 3, 4, 5]

       - when a list is passed to a function, the parameter is an alias for the same list object (no copy is made)
          - "parameter passing" describes how the values that are input to the function (i.e. the arguments) are bound to the parameters inside the function
          - be careful!
          - why do you think this is done?
             - copying a large list every time can be a lot of work
             - also allows us to write functions that manipulate the parameter (which we may or may not do)
          - notice that a function cannot change which object the caller's variable references (it can only mutate that object)
          
             def mystery(alist):
                alist = [0]*10
                print(alist)

             >>> my_list = [1, 2, 3, 4, 5]
             >>> mystery(my_list)
             [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
             >>> my_list
             [1, 2, 3, 4, 5]      

       - slicing does create a new copy
          >>> my_list = [1, 2, 3, 4, 5]
          >>> other_list = my_list[2:4]
          >>> other_list
          [3, 4]
          >>> other_list[0] = 100
          >>> other_list
          [100, 4]
          >>> my_list
          [1, 2, 3, 4, 5]
       
       - given this, how could we create a copy of all of my_list (rather than an alias)?
          >>> my_list = [1, 2, 3, 4, 5]
          >>> other_list = my_list[:]
          >>> other_list[3] = 100
          >>> other_list
          [1, 2, 3, 100, 5]
          >>> my_list
          [1, 2, 3, 4, 5]
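
       - two follow-ups to the aliasing examples above, as a small sketch. First, the is operator checks whether two names refer to the same object. Second, a slice copies only the outer list (a shallow copy), which is all you need for a list of numbers; for a list of lists, the standard copy module's deepcopy function copies the inner lists as well. The variable names here (alias, grid, sliced, deep) are made up for illustration.

```python
import copy

my_list = [1, 2, 3]
alias = my_list
print(alias is my_list)         # True: two names, one object
print(my_list[:] is my_list)    # False: the slice is a new list
print(my_list[:] == my_list)    # True: ...but its contents are equal

grid = [[1, 2], [3, 4]]
sliced = grid[:]                # new outer list, but the inner lists are shared
deep = copy.deepcopy(grid)      # new outer list AND new inner lists

sliced[0][0] = 100              # mutates an inner list that grid also references
sliced.append([5, 6])           # only changes the new outer list

print(grid)                     # [[100, 2], [3, 4]]
print(sliced)                   # [[100, 2], [3, 4], [5, 6]]
print(deep)                     # [[1, 2], [3, 4]]
```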


  • run the sentence_stats function from word-stats.py code
       - similar idea to our scores functions except now we're doing it over strings instead of numbers
       - the string class has a "split" method that splits up a sentence into a list by splitting on whitespace
          
          >>> "this is a sentence".split()
          ['this', 'is', 'a', 'sentence']

       - optionally, can specify what to split on (though this is much more rare)

          >>> "this is a sentence".split("s")
          ['thi', ' i', ' a ', 'entence']
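
       - word-stats.py is not shown in these notes, so this is only a guess at the shape of sentence_stats: split the sentence into words, then loop over the words the same way we looped over scores. The body and the returned tuple are assumptions, and the sketch assumes the sentence has at least one word.

```python
def sentence_stats(sentence):
    """Return (count, longest, shortest, average length) for the words in sentence."""
    words = sentence.split()          # split on whitespace
    longest = words[0]
    shortest = words[0]
    total_length = 0
    for word in words:
        total_length += len(word)
        if len(word) > len(longest):
            longest = word
        if len(word) < len(shortest):
            shortest = word
    average = total_length / float(len(words))
    return len(words), longest, shortest, average


print(sentence_stats("this is a sentence"))   # (4, 'sentence', 'a', 3.75)
```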

  • files
       - what is a file?
          - a chunk of data stored on the hard disk
       - why do we need files?
          - hard drives persist state regardless of whether the power is on or not
          - when a program is running, all the data it is generating/processing is in main memory (e.g. RAM)
             - main memory is faster, but doesn't persist when the power goes off

  • reading files
       - to read a file in Python we first need to open it

          file = open("some_file_name", "r")

          - open is another function that has two parameters
          - the first parameter is a string identifying the filename
             - be careful about the path/directory. Unless you give a full path, Python looks for the file relative to the current working directory (usually the directory of the program (.py file) when you run it from there)
          - the second parameter is another string telling Python what you want to do with the file
             - "r" stands for "read", that is, we're going to read some data from the file
          - open returns a "file" object that we can use later on for reading purposes
             - above, I've saved that in a variable called "file", but I could have called it anything else

             >>> open("english.txt", "r")
             <open file 'english.txt', mode 'r' at 0x10120a030>
             >>> type(open("english.txt", "r"))
             <type 'file'>

       - once we have a file open, we can read a line at a time from the file using a for loop:

          for <variable> in <file_variable>:
             # do something

          - for each line in the file, the loop will get run
          - each time the variable will get assigned to the next line in the file
             - the line will be of type string
             - the line will also have a newline character at the end of it, which you'll often want to get rid of (the string strip() method is often good for this)
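
       - a minimal sketch of the read loop described above. So that it runs anywhere, it first writes a tiny sample file into a temporary directory. That set-up, the file name sample.txt, and the variable names are assumptions for illustration only. Calling close() when finished is good practice.

```python
import os
import tempfile

# Set-up only: write a tiny sample file so the example can run anywhere.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
out = open(path, "w")
out.write("apple\nbanana\ncherry\n")
out.close()

in_file = open(path, "r")          # "r" means we are going to read
lines = []
for line in in_file:
    # line is a string such as "apple\n"; strip() removes the trailing newline
    lines.append(line.strip())
in_file.close()

print(lines)                       # ['apple', 'banana', 'cherry']
```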
       
  • look at the file_stats function in word-stats.py code
       - what does it do?
          - opens a file
          - reads a line at a time
          - appends each entry in the file to a list called words (stripping off the end-of-line character)
          - prints out the statistics of the word file

       - in this same directory I have a file called "english.txt" that has a large list of English words

          >>> file_stats("english.txt")
          Number of words: 47158
          Longest word: antidisestablishmentarianism
          Shortest word: Hz
          Avg. word length: 8.37891768099

          - notice how quickly it can process through the file
             - computers are fast!
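
       - the file_stats steps described above might be sketched as follows. The real function in word-stats.py is not shown here, so the body is an assumption. It uses the built-in max and min with key=len to pick the longest and shortest words, and a three-word temporary file (tiny_words.txt, a made-up name) stands in for english.txt.

```python
import os
import tempfile


def file_stats(filename):
    """Print and return (count, longest, shortest, average length)
    for a file with one word per line."""
    words = []
    word_file = open(filename, "r")
    for line in word_file:
        words.append(line.strip())     # drop the end-of-line character
    word_file.close()

    longest = max(words, key=len)      # compare words by their length
    shortest = min(words, key=len)
    total_length = 0
    for word in words:
        total_length += len(word)
    average = total_length / float(len(words))

    print("Number of words: " + str(len(words)))
    print("Longest word: " + longest)
    print("Shortest word: " + shortest)
    print("Avg. word length: " + str(average))
    return len(words), longest, shortest, average


# Set-up only: a three-word stand-in for english.txt
path = os.path.join(tempfile.mkdtemp(), "tiny_words.txt")
out = open(path, "w")
out.write("cat\nHz\nantidisestablishmentarianism\n")
out.close()

stats = file_stats(path)
```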