CS109B: Advanced Topics in Data Science
- Pavlos Protopapas (Computer Science)
- Mark Glickman (Statistics)
Katy McKeough firstname.lastname@example.org
Kevin Wu email@example.com
Eric Wu firstname.lastname@example.org
David Wihl email@example.com
Rashmi Banthia firstname.lastname@example.org
Zona Kostic email@example.com
Eleni Kaxiras firstname.lastname@example.org
Nicholas Ruta email@example.com
Sol Girouard firstname.lastname@example.org
Samuel Plank email@example.com
Welcome to Data Science 2 (DS2)! The course is listed as CS109b, STAT121b, and AC209b, and offered through the Harvard University Extension School as distance education course CSCI E-109b.
The requirements are the same for all four listings, except that students registered for AC209b may be assigned additional work.
What is this class about?
Data Science 2 is the second half of a one-year introduction to data science. Building upon the material in Data Science 1, the course introduces advanced methods for data wrangling, data visualization, and statistical modeling and prediction. Topics include big data and database management, basic Bayesian methods, nonlinear statistical models, unsupervised learning, and topic models. The final module covers several deep learning topics, such as CNNs, RNNs, and autoencoders. The main programming languages are R and Python.
This course can only be taken after successful completion of CS 109a, AC 209a, Stat 121a, or CSCI E-109a. Students who have previously taken CS 109, AC 209, Stat 121, or CSCI E-109 cannot take this class for credit.
ISLR: An Introduction to Statistical Learning by James, Witten, Hastie, Tibshirani (Springer: New York, 2013)
DL: Deep Learning by Goodfellow, Bengio and Courville.
Free electronic versions are available (ISLR, DL), or in hardcopy through Amazon ([ISLR](https://www.amazon.com/Introduction-Statistical-Learning-Applications-Statistics/dp/1461471370/ref=sr_1_1?ie=UTF8), DL).