Department of Computer Science
Middlebury College
CS 451 - Machine Learning and Big Data
Fall 2013

instructor: Dave Kauchak
e-mail: [first_initial][last_name]
office hours: MBH 635
  M/W 3-4pm
  T/Th 10-11am
  and by appointment

time: MWF 11:15-12:05pm
location: MWF - MBH 538
web page: (or go/cs451)

Computer lab help:
For computer issues in the lab (MBH 505), e-mail cshelp @ middlebury edu.


general information:

Machine learning is the study of methods that discover patterns in data and learn from it. This course introduces the most common problems in machine learning and the techniques used to tackle them. It also explores current tools for working with large amounts of data, particularly data sets too large to process on a single computer.



Note: This is a tentative schedule and is subject to change.
Date | Topic | Reading | Assignment | Misc
9/9 | introduction (ppt) | Daumé Ch. 1-1.3 | assignment 1 (solution) |
9/11 | decision trees (ppt) | Tan Ch. 4.3-4.3.5 | |
9/13 | decision trees (ppt) | Daumé Ch. 1.4-1.10 | assignment 2 |
9/16 | geometric view of data (ppt) | Daumé Ch. 2 (2.4 optional) | |
9/18 | perceptron (ppt) | Daumé Ch. 3-3.3 | |
9/20 | advanced perceptron (ppt) | Daumé Ch. 3.5-3.7 | |
9/23 | features (ppt) | Daumé Ch. 4-4.4 | assignment 3 |
9/25 | feature pre-processing (ppt) | | |
9/27 | evaluation (ppt) | Daumé Ch. 4.5-4.8 | |
9/30 | imbalanced data (ppt) | Daumé Ch. 5-5.1 | assignment 4 |
10/2 | multiclass classification (ppt) | Daumé Ch. 5.2 | |
10/4 | multiclass (ppt) | Daumé Ch. 5.3-5.4 | |
10/7 | ranking (ppt) | | assignment 5 |
10/9 | gradient descent (ppt) | Daumé Ch. 6-6.2 | |
10/11 | regularization (ppt) | Daumé Ch. 6.3-6.5 (6.6 optional) | |
10/14 | large margin classifiers (ppt) | Daumé Ch. 6.7 | | midterm
10/16 | soft large margins (ppt) | | |
10/18 | SVM Lab | | |
10/21 | NO CLASS | | |
10/23 | probability basics (ppt) | Optional probability refresher: Movallen pg. 7-23 | assignment 6 |
10/25 | probabilistic models (ppt) | Daumé Ch. 7-7.3 | |
10/28 | Naive Bayes (ppt) | Daumé Ch. 7.4-7.5 | assignment 7 |
10/30 | priors and logistic regression (ppt) | Daumé Ch. 7.6-7.7 | |
11/1 | Hadoop intro (ppt) | tutorial module 1 | |
11/4 | HDFS | tutorial 2.1-2.4, 2.5 | assignment 8 |
 | hadoop demos | tutorial 4 | |
11/8 | MapReduce lab | | |
11/11 | MapReduce: config | | |
11/13 | Advanced MapReduce; hadoop demos | tutorial 5 | |
11/15 | Machine learning and biology | | |
11/18 | ensemble learning (ppt) | Daumé Ch. 11-11.2 | final project |
11/20 | project proposals | | |
11/22 | boosting (ppt) | Daumé Ch. 11.3 | |
11/25 | k-means (ppt) | Daumé Ch. 2.4, 13-13.1 | |
11/27 | No Class | | |
11/29 | No Class | | |
12/2 | k-means2 (ppt) | Daumé Ch. 14-14.2 | |
12/4 | clustering (ppt) | | |
12/6 | project presentations | | |