CS109A: Introduction to Data Science



Fall 2018

Pavlos Protopapas and Kevin A. Rader

Welcome to CS109a/STAT121a/AC209a: Introduction to Data Science, also offered by the DCE as CSCI E-109a. This course is the first half of a one-year introduction to data science. We will focus on analyzing data to make predictions using statistical and machine learning methods. Topics include data scraping, data management, data visualization, regression and classification methods, and deep neural networks (see the schedule). You will get ample practice through weekly homework assignments. The class material integrates the five key facets of an investigation using data:

1. Data Collection - data wrangling, cleaning, and sampling to get a suitable data set
2. Data Management - accessing data quickly and reliably
3. Exploratory Data Analysis - generating hypotheses and building intuition
4. Prediction or Statistical Learning
5. Communication - summarizing results through visualization, stories, and interpretable summaries

Only one of CS 109a, AC 209a, or Stat 121a can be taken for credit. Students who have previously taken CS 109, AC 209, or Stat 121 cannot take CS 109a, AC 209a, or Stat 121a for credit.


Lectures: Mon and Wed 1:30‐2:45 pm in Harvard Northwest Building, NW B-103
Labs: Thu 4:30-6:00 pm and Fri 10:30-11:45 am in Pierce 301 (the content is identical, so students should attend only one)
Head TF: Eleni Kaxiras. DCE Head TF: Sol Girouard
Office Hours: IACS student lobby on the ground floor of Maxwell-Dworkin. Just follow the signs.
Online Office Hours zoom link: https://harvard-dce.zoom.us/j/7607382317
Class meetings have concluded! 
Thank you all for a great semester!
Guest Lecture on Nov 28th: Ethics and Critical Thinking, by Julia Stoyanovich, Assistant Professor in the Department of Computer Science and Engineering at the Tandon School of Engineering and at the Center for Data Science.
Course material can be viewed in the public GitHub repository.
   
REGULAR SECTIONS
Cover the material presented in class. Both sessions are identical.
Standard Sections have concluded. Thank you!
ADVANCED SECTIONS
Cover a different topic each week and are required for AC 209a students.
Advanced Sections have concluded. Thank you!
Instructor Office Hours
Pavlos: Mon. 4:00-5:00 pm
Kevin: Mon. 3:00-4:00 pm



TF Office Hours 
See the Weekly Schedule

Please be aware that we will not publicly release the homework assignments this year. If you want to follow the course online without registering, you can use the assignments from 2013 and 2014, available at the links below. The material from 2015 is also available.


Previous Years' Material